
Remove customXML part/folder

Posted: Sun Feb 16, 2014 1:15 am
by jallen
Hi Jason,

Is there an easy way to remove the entire customXml part/folder from a Word document?
More specifically, I have a Word document containing a customXml part that declares xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography". When I try to add my own custom XML for data binding, using code similar to http://www.docx4java.org/trac/docx4j/browser/trunk/docx4j/src/main/java/org/docx4j/samples/CreateDocxWithCustomXml.java, the document ends up corrupted.

So I was thinking I could just remove the existing part and start from scratch, since the document doesn't need a bibliography. I tried something along the lines of
Code:
wordMLPackage.getCustomXmlDataStorageParts().clear()
but that didn't seem to remove anything. I have attached the Word file I am starting from.

Thanks,
Jeff

Attachment: word.docx (14.22 KiB)

Re: Remove customXML part/folder

Posted: Wed Feb 19, 2014 4:10 am
by jallen
OK so I think I have found a way to do this. I am kind of poking with a sharp stick here.

Here is the code I came up with. Do you see any issues with trying to do it this way?

Code:
         // Drop every part whose name starts with /customXml, along with its content-type entry.
         // Note: relationships that point at those parts are left in place.
         HashMap<PartName, Part> parts = wordMLPackage.getParts().getParts();

         Iterator<Map.Entry<PartName, Part>> it = parts.entrySet().iterator();
         while (it.hasNext()) {
            PartName partName = it.next().getKey();
            if (partName.getName().startsWith("/customXml")) {
               it.remove();
               wordMLPackage.getContentTypeManager().removeContentType(partName);
            }
         }

Re: Remove customXML part/folder

Posted: Thu Feb 20, 2014 6:36 pm
by jason
The bibliography custom XML part is the target of a relationship (rel) from the Main Document Part, as you can see by running the document through the webapp or PartsList:

Code:
    Part /word/document.xml [org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart] http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument containing JaxbElement:org.docx4j.wml.Document
content type: application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml

        Part /customXml/item1.xml [org.docx4j.openpackaging.parts.WordprocessingML.BibliographyPart] http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml containing JaxbElement:{http://schemas.openxmlformats.org/officeDocument/2006/bibliography}Sources is a javax.xml.bind.JAXBElement; it has declared type org.docx4j.bibliography.CTSources
content type: application/xml

            Part /customXml/itemProps1.xml [org.docx4j.openpackaging.parts.CustomXmlDataStoragePropertiesPart] http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXmlProps containing JaxbElement:org.docx4j.customXmlProperties.DatastoreItem
content type: application/vnd.openxmlformats-officedocument.customXmlProperties+xml


As with any other part, you remove it by deleting the rel that points to it:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.CustomXmlDataStoragePropertiesPart;
import org.docx4j.openpackaging.parts.Part;
import org.docx4j.openpackaging.parts.WordprocessingML.BibliographyPart;
import org.docx4j.openpackaging.parts.relationships.Namespaces;
import org.docx4j.openpackaging.parts.relationships.RelationshipsPart;
import org.docx4j.relationships.Relationship;



public class DeleteBibCXP {
       

       
        static String file = System.getProperty("user.dir") + "/Bibliography.docx";
       
       
        public static void main(String[] args) throws Exception {

                WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(file));          
               
                RelationshipsPart rp = wordMLPackage.getMainDocumentPart().getRelationshipsPart();
               
                               
                List<Relationship> deletions = new ArrayList<Relationship>();
               
                for ( Relationship r : rp.getRelationships().getRelationship() ) {
                       
                        System.out.println("For Relationship Id=" + r.getId()
                                        + " Source is " + rp.getSourceP().getPartName()
                                        + ", Target is " + r.getTarget() );
               
                        if (r.getTargetMode() != null
                                        && r.getTargetMode().equals("External") ) {
                               
                                continue;                              
                        }
                       
                        try {
                                Part part = rp.getPart(r);

                                if (part instanceof BibliographyPart) {
                                        deletions.add(r );     
                                       
                                        // it is also stored by itemId in a hashmap, so for completeness, delete it from there
                                        String itemId = getItemId((BibliographyPart)part);
                                        if (itemId!=null) {
                                                System.out.println("deleting " + itemId);
                                                wordMLPackage.getCustomXmlDataStorageParts().remove(itemId);
                                        }
                                       
                                }
                                       
                        } catch (Exception e) {
                                throw new Docx4JException("Failed to inspect the target of relationship " + r.getId(), e);
                        }
                                               
                }
               
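                // Remove the collected rels only after the loop, so the list is not modified while iterating
                for ( Relationship r : deletions) {
                        System.out.println("deleting " + r.getId() );
                        rp.removeRelationship(r);
                }

                // Save the result; without this the change exists only in memory.
                // The output file name is an assumption, change it as needed.
                wordMLPackage.save(new File(System.getProperty("user.dir") + "/Bibliography_out.docx"));
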
        }
       
        public static String getItemId(BibliographyPart entry) {
               
                String itemId = null;
                if (entry.getRelationshipsPart()==null) {
                        return null;
                } else {
                        // Look in its rels for rel of @Type customXmlProps (eg @Target="itemProps1.xml")
                        Relationship r = entry.getRelationshipsPart().getRelationshipByType(
                                        Namespaces.CUSTOM_XML_DATA_STORAGE_PROPERTIES);
                        if (r==null) {
                                System.out.println(".. but that doesn't point to a customXmlProps part");
                                return null;
                        }
                        CustomXmlDataStoragePropertiesPart customXmlProps =
                                (CustomXmlDataStoragePropertiesPart)entry.getRelationshipsPart().getPart(r);
                        if (customXmlProps==null) {
                                System.out.println(".. but the target seems to be missing?");
                                return null;
                        } else {
                                return customXmlProps.getItemId().toLowerCase();
                        }
                }
        }
               
}

 



A Bibliography part, being a custom XML part, is also "registered" by item ID during loading, as the log below shows. The code above therefore also removes it from that map (the one returned by getCustomXmlDataStorageParts()).

Code:
INFO org.docx4j.openpackaging.io.Load .registerCustomXmlDataStorageParts line 301 - Found a CustomXmlPart, named /customXml/item1.xml
DEBUG org.docx4j.openpackaging.io.Load .registerCustomXmlDataStorageParts line 306 - .. it has a rels part
INFO org.docx4j.openpackaging.parts.JaxbXmlPart .getJaxbElement line 129 - Lazily unmarshalling /customXml/itemProps1.xml
INFO org.docx4j.openpackaging.io.Load .registerCustomXmlDataStorageParts line 343 - Identified/registered ds:itemId {bececd4f-4e83-45f8-a8f4-bd5e03b7a7e5}
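
A toy model may help show why clearing the item ID map on its own removes nothing, while deleting the rel does. The sketch below uses plain Java collections, not docx4j classes. ToyPackage and its methods are made up for illustration. It assumes that only parts reachable through relationships end up in the saved package.

```java
import java.util.*;

// Toy model only: this class is hypothetical and is not part of docx4j.
class ToyPackage {
    // source part name -> names of the parts its relationships point to
    final Map<String, List<String>> rels = new LinkedHashMap<>();
    // itemId -> part name, mimicking the lookup map filled during loading
    final Map<String, String> byItemId = new HashMap<>();

    void addRel(String source, String target) {
        rels.computeIfAbsent(source, k -> new ArrayList<>()).add(target);
    }

    // Returns true if a relationship from source to target existed and was removed.
    boolean removeRel(String source, String target) {
        List<String> targets = rels.get(source);
        return targets != null && targets.remove(target);
    }

    // Parts reachable from root by following relationships.
    Set<String> reachableFrom(String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String part = todo.pop();
            if (seen.add(part)) {
                for (String t : rels.getOrDefault(part, Collections.emptyList())) {
                    todo.push(t);
                }
            }
        }
        return seen;
    }

    // Same shape as the parts list above: document -> item1 -> itemProps1.
    static ToyPackage sample() {
        ToyPackage p = new ToyPackage();
        p.addRel("/word/document.xml", "/customXml/item1.xml");
        p.addRel("/customXml/item1.xml", "/customXml/itemProps1.xml");
        p.byItemId.put("{bececd4f-4e83-45f8-a8f4-bd5e03b7a7e5}", "/customXml/item1.xml");
        return p;
    }

    public static void main(String[] args) {
        ToyPackage p = sample();
        p.byItemId.clear(); // only empties the lookup map
        System.out.println(p.reachableFrom("/word/document.xml")); // all three parts are still reachable
        p.removeRel("/word/document.xml", "/customXml/item1.xml");
        System.out.println(p.reachableFrom("/word/document.xml")); // only the main document part remains
    }
}
```

In this model, removing the single rel from the main document part detaches item1.xml and, with it, itemProps1.xml. Clearing the map is then just housekeeping.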

Re: Remove customXML part/folder

Posted: Fri Feb 21, 2014 2:36 pm
by jason
Just added a new method remove() to Part, which allows a part to delete itself from a package.

It may be useful in some cases; I anticipate it will be included in 3.0.2.
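
Whichever way the part is removed, one way to check the result is to save the package and look inside the saved file. A .docx is a zip archive, so the JDK's zip classes are enough and docx4j is not needed for the check. This is a minimal sketch: the class name CustomXmlCheck is made up, and the path passed on the command line is whatever file you saved.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: lists leftover customXml entries in a saved .docx.
class CustomXmlCheck {

    // Returns the names of all zip entries under customXml/ (entry names have no leading slash).
    static List<String> customXmlEntries(InputStream docx) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(docx)) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                if (e.getName().startsWith("customXml/")) {
                    found.add(e.getName());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the saved document; supply your own
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println("customXml entries: " + customXmlEntries(in));
        }
    }
}
```

An empty list means no customXml entries were written to the archive.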

Re: Remove customXML part/folder

Posted: Sat Feb 22, 2014 6:14 am
by jallen
jason wrote: Just added a new method remove() to Part, which allows a part to delete itself from a package.

It may be useful in some cases; I anticipate it will be included in 3.0.2.


Very cool. I will definitely take advantage of it.