Traversing DocX-File

Posted: Sat Nov 16, 2013 1:02 am
by tori
Hi community,

I created a simple docx file consisting of 4 lines with simple text. I tried to manipulate some parts of the text by traversing over all org.docx4j.wml.Text elements:

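Code:
import java.util.ArrayList;
import java.util.List;

import org.docx4j.TraversalUtil;
import org.docx4j.TraversalUtil.CallbackImpl;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.wml.Text;

// Load the document and walk its main part, collecting every Text element.
WordprocessingMLPackage wpml = WordprocessingMLPackage.load(pathToDocX.toFile());
MainDocumentPart mdp = wpml.getMainDocumentPart();

Finder finder = new Finder(Text.class);
new TraversalUtil(mdp.getContent(), finder);

// Callback that collects every visited object of exactly the given class.
public static class Finder extends CallbackImpl {

    protected Class<?> typeToFind;

    public List<Object> results = new ArrayList<>();

    protected Finder(Class<?> typeToFind) {
        this.typeToFind = typeToFind;
    }

    @Override
    public List<Object> apply(Object o) {
        // Adapt as required
        if (o.getClass().equals(typeToFind)) {
            results.add(o);
        }
        return null;
    }
}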


I get all the Text elements, but not in one piece. I believe it's because of my Word document:
Code:
<w:document mc:Ignorable="w14 wp14">
            <w:body>
                <w:p>
                    <w:r>
                        <w:t>Das ist Text 1!</w:t>
                    </w:r>
                </w:p>
                <w:p>
                    <w:r>
                        <w:t>Das ist Text 2!</w:t>
                    </w:r>
                </w:p>
                <w:p>
                    <w:r>
                        <w:t>Das ist Text3!</w:t>
                    </w:r>
                </w:p>
                <w:p>
                    <w:r>
                        <w:t>Das ist Text 4!</w:t>
                    </w:r>
                </w:p>
                <w:p/>
                <w:p>
                    <w:r>
                        <w:t xml:space="preserve">Hier ein ganz langer </w:t>
                    </w:r>
                    <w:r>
                        <w:t>Text!</w:t>
                    </w:r>
                    <w:bookmarkStart w:name="_GoBack" w:id="0"/>
                    <w:bookmarkEnd w:id="0"/>
                </w:p>
                <w:sectPr>
                    <w:pgSz w:w="11906" w:h="16838"/>
                    <w:pgMar w:top="1417" w:right="1417" w:bottom="1134" w:left="1417" w:header="708" w:footer="708" w:gutter="0"/>
                    <w:cols w:space="708"/>
                    <w:docGrid w:linePitch="360"/>
                </w:sectPr>
            </w:body>
        </w:document>


How can I handle such docx files when traversing? Should I traverse over paragraphs instead?

Thanks for your help.

Re: Traversing DocX-File

Posted: Sat Nov 16, 2013 7:17 am
by jason
What is your objective?

Re: Traversing DocX-File

Posted: Mon Nov 18, 2013 9:06 pm
by tori
Thanks jason for your reply, and sorry for my unclear post.

My objective is to replace special placeholders in a docx file which I want to use as a template. For example:

Code:
This is my <placeholder1> and this is my <placeholder2>!


I traversed the docx file in order to find all Text elements. Next, I used the getValue method to check whether the value of each Text element contains a placeholder which I want to replace. I expected to get one single Text element with the value "This is my <placeholder1> and this is my <placeholder2>!". However, the Text elements I found are:

Code:
org.docx4j.wml.Text@20498030 --> with a value --> "This is "
org.docx4j.wml.Text@4a58fee6 --> with a value --> "my <placeholder1"
org.docx4j.wml.Text@4a58fee6 --> with a value --> "> and this is my <"
org.docx4j.wml.Text@4a58fee6 --> with a value --> "placeholder2>!"


I used Word 2010 and docx4j version 2.8.1.
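
To make the situation above concrete, here is a minimal sketch in plain Java. It does not use docx4j: plain strings stand in for the values of the Text elements of one paragraph, and the class and method names (SplitRunDemo, foundInSingleFragment, replaceInParagraph) are hypothetical. It shows that no single fragment contains "<placeholder1>", while the joined paragraph text does, and it illustrates one simple, lossy way to replace across fragments. It is an assumption-laden illustration, not the library's approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain strings stand in for the values of org.docx4j.wml.Text elements.
// This class and its method names are hypothetical, not docx4j API.
class SplitRunDemo {

    // True if any single fragment contains the whole placeholder.
    static boolean foundInSingleFragment(List<String> fragments, String placeholder) {
        for (String f : fragments) {
            if (f.contains(placeholder)) {
                return true;
            }
        }
        return false;
    }

    // Joins the fragments of one paragraph, replaces the placeholder in the
    // joined text, and stores the result in the first fragment. The remaining
    // fragments are emptied, so any formatting difference between runs is lost.
    static List<String> replaceInParagraph(List<String> fragments,
                                           String placeholder, String value) {
        List<String> out = new ArrayList<>();
        if (fragments.isEmpty()) {
            return out;
        }
        String joined = String.join("", fragments);
        out.add(joined.replace(placeholder, value));
        for (int i = 1; i < fragments.size(); i++) {
            out.add("");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> fragments = Arrays.asList(
                "This is ", "my <placeholder1", "> and this is my <", "placeholder2>!");
        System.out.println(foundInSingleFragment(fragments, "<placeholder1>"));
        System.out.println(replaceInParagraph(fragments, "<placeholder1>", "house"));
    }
}
```

With real docx4j objects the same idea would mean collecting the Text elements per paragraph rather than across the whole document, but that mapping is left out here.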

Re: Traversing DocX-File

Posted: Mon Nov 18, 2013 9:16 pm
by jason
Yes, the problem with variable replacement approaches is "split runs".

You may be able to work around some/most issues by running your docx through https://github.com/plutext/docx4j/blob/ ... epare.java

Alternatively, you could use content control data binding instead, which isn't affected by split runs.
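
As a rough illustration of the "split runs" workaround idea, the sketch below merges neighbouring text fragments until every placeholder sits inside a single fragment, leaving the other fragment boundaries alone. It is plain Java on strings and is not the algorithm of the class linked above; the class and method names (FragmentMerger, mergeSplitPlaceholders) and the <...> delimiter convention taken from the earlier post are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of docx4j: strings model the text of runs.
class FragmentMerger {

    // Merges consecutive fragments while a '<' has been opened but not yet
    // closed by a '>', so that each placeholder ends up in one fragment.
    // Caveat: a literal '<' in normal text also triggers merging.
    static List<String> mergeSplitPlaceholders(List<String> fragments) {
        List<String> merged = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        for (String f : fragments) {
            pending.append(f);
            // A placeholder is still open if the last '<' comes after the last '>'.
            boolean open = pending.lastIndexOf("<") > pending.lastIndexOf(">");
            if (!open) {
                merged.add(pending.toString());
                pending.setLength(0);
            }
        }
        if (pending.length() > 0) {
            merged.add(pending.toString()); // unterminated '<' at the end
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> fragments = Arrays.asList(
                "This is ", "my <placeholder1", "> and this is my <", "placeholder2>!");
        for (String s : mergeSplitPlaceholders(fragments)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

After merging, a per-fragment search like the one in the original Finder approach can see each placeholder whole. In a real document the merged runs would take on a single run's formatting, which is a trade-off of any such approach.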

Re: Traversing DocX-File

Posted: Mon Nov 18, 2013 9:42 pm
by tori
Thanks jason,

I will try to fix it with VariablePrepare. The only VariablePrepare class I can find is org.docx4j.samples.VariablePrepare, which has no static prepare() method. Can I find the other one in a jar file, or should I copy and paste the implementation you posted?