
How to convert the document.xml back to docx file

PostPosted: Tue Jan 07, 2014 2:31 pm
by songlei
Hi guys,
I am fairly new to docx4j and I have a problem. I get the contents of document.xml like this:

Code:
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
String xmlContent= documentPart.getXML();


After making a lot of modifications to the XML string (complex content replacement, refreshing some data), I want to convert the modified XML string back into a docx file: that is, replace the old document.xml with my new XML content and then export the result. I have gone through some of the examples on GitHub, but I have no idea how to implement this. I can't use variableReplace() directly, because some of the replacements depend on the client's requirements and are complex, with lots of conditions. Does anyone know how to merge the XML string back? Thanks :(

Re: How to convert the document.xml back to docx file

PostPosted: Tue Jan 07, 2014 2:55 pm
by songlei
Hi guys, I think I have found the solution myself now. It is as follows:
Code:
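// replace() matches the text literally; replaceAll() would treat "colour" as a regex
String result = xmlContent.replace("colour", "xiaoleilei");
// Turn the modified XML string back into a JAXB object tree (org.docx4j.wml.Document)
Object obj = XmlUtils.unmarshalString(result);
documentPart.setJaxbElement((Document) obj);
wordMLPackage.addTargetPart(documentPart);
wordMLPackage.save(new java.io.File("C:\\workspaces\\branch2\\test\\src\\main\\java\\com\\songlei\\test\\download.docx"));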


Thanks, guys.

Re: How to convert the document.xml back to docx file

PostPosted: Tue Jan 07, 2014 4:29 pm
by jason
You don't need the line wordMLPackage.addTargetPart(documentPart) since the part is already in the pkg.

More generally (for anybody else reading and thinking of this approach), manipulating the XML string should be a last resort, since this is brittle. getXML() is mainly for debugging. If all you are doing is manipulating that, you may as well avoid docx4j altogether, and just unzip the docx...
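
To illustrate why editing the raw XML string is brittle, here is a minimal sketch that uses only the JDK's DOM API rather than docx4j. The class name TextOnlyReplace and the method replaceInTextRuns are hypothetical names made up for this example. A plain string replace also hits element and attribute names, whereas walking the w:t elements touches only the visible text:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of docx4j.
final class TextOnlyReplace {
    // Namespace of WordprocessingML elements such as w:p, w:r and w:t.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Replaces target inside w:t elements only, leaving all markup untouched. */
    static String replaceInTextRuns(String xml, String target, String replacement) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            NodeList texts = doc.getElementsByTagNameNS(W_NS, "t");
            for (int i = 0; i < texts.getLength(); i++) {
                Node t = texts.item(i);
                t.setTextContent(t.getTextContent().replace(target, replacement));
            }

            // Serialize the DOM back to a string without an XML declaration.
            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\"><w:r><w:rPr><w:color w:val=\"FF0000\"/></w:rPr>"
                + "<w:t>color me</w:t></w:r></w:p>";
        // Naive string replace: also renames the w:color element, corrupting the markup.
        System.out.println(xml.replace("color", "shade"));
        // Text-only replace: w:color survives, only the text inside w:t changes.
        System.out.println(replaceInTextRuns(xml, "color", "shade"));
    }
}
```

Even this is only a partial answer: Word often splits a single word across several w:t runs, so a search term may not appear in any one text node.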

Re: How to convert the document.xml back to docx file

PostPosted: Tue Jan 07, 2014 5:12 pm
by songlei
jason wrote: You don't need the line wordMLPackage.addTargetPart(documentPart) since the part is already in the pkg.

More generally (for anybody else reading and thinking of this approach), manipulating the XML string should be a last resort, since this is brittle. getXML() is mainly for debugging. If all you are doing is manipulating that, you may as well avoid docx4j altogether, and just unzip the docx...


Hi jason, I understand. But after I unzip the file with ZipInputStream and then read and modify document.xml, I can't restore it to a docx file, because a docx file contains not only document.xml but other parts as well, and it is hard for me to combine them again.
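
For anyone trying the unzip route, one way to put the pieces back together is to copy every entry of the archive into a new zip and substitute only the one entry you changed. The sketch below uses only java.util.zip, not docx4j. The class name DocxEntrySwap and its methods are hypothetical, and it assumes the main document part lives at word/document.xml, which is the usual location:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of docx4j.
final class DocxEntrySwap {

    /** Copies every entry of the archive unchanged, except entryName, which gets newContent. */
    static byte[] replaceEntry(byte[] docx, String entryName, byte[] newContent) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(docx));
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // A fresh ZipEntry avoids carrying over the old size and CRC values.
                zout.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(entryName)) {
                    zout.write(newContent);
                } else {
                    copy(zin, zout);
                }
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Both streams are closed here, so the archive is complete.
        return out.toByteArray();
    }

    /** Reads one entry as UTF-8 text, or returns null if the archive has no such entry. */
    static String readEntry(byte[] zip, String entryName) {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    ByteArrayOutputStream buf = new ByteArrayOutputStream();
                    copy(zin, buf);
                    return new String(buf.toByteArray(), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies the current stream contents; for a ZipInputStream this stops at the end of the entry.
    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // args: input docx, file holding the new document.xml, output docx (caller-supplied paths)
        byte[] docx = Files.readAllBytes(Paths.get(args[0]));
        byte[] xml = Files.readAllBytes(Paths.get(args[1]));
        Files.write(Paths.get(args[2]), replaceEntry(docx, "word/document.xml", xml));
    }
}
```

Because the other entries are copied as they are, the rest of the package (styles, relationships, content types) stays intact. This does nothing to check that the new document.xml is valid, which is part of why the docx4j route above is usually the more comfortable one.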