
Opening and saving Word XML document (2006 ML)

Posted: Wed Dec 10, 2008 7:25 am
by codified44
Hi All:

I would like to do the following:

1) open an XML document that is a "Word XML Document" (2006 ML), i.e. the file that MS Word 2007 creates when a document is saved as "Word XML Document" via "Save As..."

2) save a document as a Word XML document (2006 ML)

Is this possible using docx4j? If so, how would I do that?


Re: Opening and saving Word XML document (2006 ML)

Posted: Wed Dec 10, 2008 10:07 am
by jason
Hi, I think you are in luck.

Using Word 2007, I saved a document using that choice of filetype.

The result was an XML file which looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg=""><pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512"><pkg:xmlData><Relationships xmlns=""><Relationship Id="rId3" Type="" Target="docProps/app.xml"/><Relationship Id="rId2" Type="" Target="docProps/core.xml"/><Relationship Id="rId1" Type="" Target="word/document.xml"/></Relationships></pkg:xmlData></pkg:part> ....

which looks like what I've been calling "Package format". docx4j uses this as an intermediate format for XSLT transformations; docx4all shows it when the user selects "View Source".

So you should be able to import this using, and export to it using org.docx4j.convert.out.xmlPackage.XmlPackage

See also org.docx4j.samples.ImportFromPackageFormat and .ExportInPackageFormat

Let us know how you go!

Re: Opening and saving Word XML document (2006 ML)

Posted: Tue Dec 16, 2008 5:29 am
by codified44
You are right. I was able to save the document using org.docx4j.convert.out.xmlPackage.XmlPackage(wordMLPackage):

org.docx4j.convert.out.xmlPackage.XmlPackage xmlPackage = new org.docx4j.convert.out.xmlPackage.XmlPackage(wordMLPackage);
org.docx4j.xmlPackage.Package pkg = xmlPackage.get();
Marshaller marshaller = getMarshaller(); // pseudo code for getting the marshaller
marshaller.marshal(pkg, new FileOutputStream(outputfilepath));

However, the XML that docx4j saves is slightly different from the XML that MS Word 2007 generates.

MS Word 2007 generates (note the mso-application processing instruction on line 2):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="">

docx4j generates (no mso-application processing instruction):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pkg:package xmlns:pkg="">

Including the <?mso-application ?> processing instruction associates the XML file with MS Word. Is there any way docx4j can generate it?


Re: Opening and saving Word XML document (2006 ML)

Posted: Wed Dec 17, 2008 6:47 am
by jason
Marshalling the org.docx4j.xmlPackage.Package object is a job for one of the convenience methods in org.docx4j.XmlUtils, for example marshaltoString.

If you marshal with the XML declaration suppressed, it is easy enough to prepend the XML declaration plus <?mso-application progid="Word.Document"?> yourself.
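A minimal sketch of that prepend step, using only the JDK. The class name WordXmlHeader and the method withWordHeader are hypothetical, and the input string is assumed to be the result of marshalling the Package object with the XML declaration suppressed (the actual docx4j marshalling call is not shown here):

```java
public class WordXmlHeader {

    // The two header lines Word 2007 writes at the top of a "Word XML Document".
    static final String XML_DECLARATION =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";
    static final String MSO_PI =
            "<?mso-application progid=\"Word.Document\"?>";

    /**
     * Prepends the XML declaration and the mso-application processing
     * instruction to XML that was marshalled WITHOUT a declaration.
     * (Hypothetical helper, not part of docx4j.)
     */
    static String withWordHeader(String xmlWithoutDeclaration) {
        // Drop leading whitespace so the declaration is the very first thing in the file.
        String body = xmlWithoutDeclaration.replaceFirst("^\\s+", "");
        if (body.startsWith("<?xml")) {
            // A second declaration would make the document ill-formed.
            throw new IllegalArgumentException(
                    "Input already has an XML declaration; marshal with it suppressed");
        }
        return XML_DECLARATION + "\n" + MSO_PI + "\n" + body;
    }

    public static void main(String[] args) {
        // Stand-in for the marshalled package; a real one would hold pkg:part elements.
        String marshalled = "<pkg:package xmlns:pkg=\"urn:example\"/>";
        System.out.println(withWordHeader(marshalled));
    }
}
```

The returned string should be written to disk as UTF-8 so that it matches the encoding named in the declaration.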

Or am I missing something?



Re: Opening and saving Word XML document (2006 ML)

Posted: Wed Dec 17, 2008 7:39 am
by codified44
No Jason. You are not missing anything.

I am not familiar with WordML. So far, I have worked with XMLBeans and was able to get away from using JAXB.

If I could kick-start my brain, it would not ask such questions :)

On the brighter side, answers to my questions take less than 60 seconds :)