
Parsing .docx XML format

Posted: Mon Mar 01, 2010 10:53 pm
by RithanyaLaxmi
Hi,

In my current project we create Word 2007 documents at runtime. I am developing a Java component that takes Word 2007 template (docx) files as input and fetches values from external web services. The component then builds the actual Word 2007 document (docx) by merging the values into the template file (replacing placeholders with values). Since I am new to the Word 2007 Open XML file format, I am having difficulty parsing and updating the dotx/docx files.

How can I parse and update a Word 2007 file using docx4j? I have a document.xml file that contains the placeholder within the tags, like this:

<w:r><w:t xml:space="preserve"> [[</w:t></w:r><w:proofErr w:type="spellStart"/><w:proofErr w:type="gramStart"/><w:r><w:t>bodycontent</w:t></w:r><w:proofErr w:type="spellEnd"/><w:proofErr w:type="gramEnd"/><w:r><w:t>]]</w:t></w:r></w:p>

How do I parse this markup and populate the data for the placeholder [[bodycontent]]? Please shed some light on this, and please provide code or a sample that parses this XML and supplies a dynamic value for the placeholders.

Thanks,
Rithu

Re: Parsing .docx XML format

Posted: Tue Mar 02, 2010 7:50 am
by jason
Please see the Getting Started guide, especially the Text Substitution section.

There is also another approach entirely, called CustomXmlBinding. See http://dev.plutext.org/trac/docx4j/brow ... nding.java
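
For the text substitution route, note that Word has split the placeholder in the question across three runs (" [[", "bodycontent", "]]"), so searching any single w:t element for [[bodycontent]] finds nothing. Below is a minimal sketch of one way to handle that, using only the JDK's DOM API rather than docx4j. The class name PlaceholderMerge, the method fill, and the sample value are made up for illustration, and the fragment is wrapped in a w:p element with the namespace declaration so that it parses on its own.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of docx4j.
class PlaceholderMerge {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // For each w:p, joins the text of all its w:t elements, replaces every
    // [[key]] marker, and writes the result back into the first w:t.
    static String fill(String xml, Map<String, String> values) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);
                NodeList texts = p.getElementsByTagNameNS(W_NS, "t");
                if (texts.getLength() == 0) continue;

                // Rebuild the paragraph text as the user sees it in Word.
                StringBuilder joined = new StringBuilder();
                for (int j = 0; j < texts.getLength(); j++) {
                    joined.append(texts.item(j).getTextContent());
                }
                String merged = joined.toString();
                String replaced = merged;
                for (Map.Entry<String, String> e : values.entrySet()) {
                    replaced = replaced.replace("[[" + e.getKey() + "]]", e.getValue());
                }
                if (replaced.equals(merged)) continue; // no placeholder: leave runs alone

                // First w:t receives the whole text; the remaining ones are emptied.
                for (int j = 0; j < texts.getLength(); j++) {
                    texts.item(j).setTextContent(j == 0 ? replaced : "");
                }
                ((Element) texts.item(0)).setAttributeNS(
                    XMLConstants.XML_NS_URI, "xml:space", "preserve");
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\">"
            + "<w:r><w:t xml:space=\"preserve\"> [[</w:t></w:r>"
            + "<w:proofErr w:type=\"spellStart\"/>"
            + "<w:r><w:t>bodycontent</w:t></w:r>"
            + "<w:proofErr w:type=\"spellEnd\"/>"
            + "<w:r><w:t>]]</w:t></w:r></w:p>";
        // "Hello world" stands in for a value fetched from a web service.
        System.out.println(fill(xml, Map.of("bodycontent", "Hello world")));
    }
}
```

One trade-off in this sketch: in a paragraph where a replacement happens, all of the text ends up in the first run and takes that run's formatting. Typing the placeholder in one go in the template, so that Word keeps it in a single run, avoids the problem at the source.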