Traverse problem

Posted: Wed Sep 06, 2017 3:26 am
by lazvegas
Hello,

I am trying to use a template with docx4j. In my template I am using a placeholder like

#SessionDate#

but when I traverse the docx, the placeholder comes back split into separate pieces, like this:

#
SessionDate
#

How can I fix it? Thank you.

Re: Traverse problem

Posted: Wed Sep 06, 2017 8:03 am
by jason
Sorry, it isn't clear what you are asking here.

Is #SessionDate# a variable you are trying to replace?

Including your code and/or the XML around #SessionDate# might make your question clearer.

Re: Traverse problem

Posted: Wed Sep 06, 2017 3:15 pm
by lazvegas
Hello, I am using the final version of docx4j. I attached the template document and
the output of the traverse code.

OUTPUT:
Console: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
org.docx4j.wml.P ""
org.docx4j.wml.R ""
org.docx4j.wml.Text "#sessiond"
org.docx4j.wml.CTBookmark ""
org.docx4j.wml.CTMarkupRange ""
org.docx4j.wml.R ""
org.docx4j.wml.Text "ate#:"

and code is:

Code:
package com.sap.tap4s.rest;

import java.io.File;
import java.net.URL;
import java.util.List;

import javax.xml.bind.JAXBContext;

import org.docx4j.TraversalUtil;
import org.docx4j.XmlUtils;
import org.docx4j.TraversalUtil.Callback;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.samples.AbstractSample;
import org.docx4j.wml.Body;


/**
* To see what parts comprise your docx, try the PartsList sample.
*
* There will always be a MainDocumentPart, usually called document.xml.
* This sample shows you what objects are in that part.
*
* It also shows a general approach for traversing the JAXB object tree in
* the Main Document part.  It can also be applied to headers, footers etc.
*
* It is an alternative to XSLT, and doesn't require marshalling/unmarshalling.
*
* In many cases, the method getJAXBNodesViaXPath would be more convenient,
* but there are 3 JAXB bugs which detract from that (see Getting Started). 
*
* See related classes SingleTraversalUtilVisitorCallback
* and CompoundTraversalUtilVisitorCallback
*
* @author jharrop
*
*/
public class OpenMainDocumentAndTraverse extends AbstractSample {

   public static JAXBContext context = org.docx4j.jaxb.Context.jc;

   public static void main(String[] args) throws Exception {

      /*
       * You can invoke this from an OS command line with something like:
       *
       * java -cp dist/docx4j.jar:dist/log4j-1.2.15.jar
       * org.docx4j.samples.OpenMainDocumentAndTraverse inputdocx
       *
       * Note the minimal set of supporting jars.
       *
       * If there are any images in the document, you will also need:
       *
       * dist/xmlgraphics-commons-1.4.jar:dist/commons-logging-1.1.1.jar
       */

      try {
         getInputFilePath(args);
      } catch (IllegalArgumentException e) {
         inputfilepath = "C:\\Temp\\temp1.docx";
      }
      
      WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
            .load(new java.io.File(inputfilepath));
      MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();

      // Uncomment to see the raw XML
      //System.out.println(XmlUtils.marshaltoString(documentPart.getJaxbElement(), true, true));

      org.docx4j.wml.Document wmlDocumentEl = (org.docx4j.wml.Document) documentPart
            .getJaxbElement();
      Body body = wmlDocumentEl.getBody();

      new TraversalUtil(body,

      new Callback() {

         String indent = "";

         @Override
         public List<Object> apply(Object o) {

            String text = "";
            if (o instanceof org.docx4j.wml.Text)
               text = ((org.docx4j.wml.Text) o).getValue();

            System.out.println(indent + o.getClass().getName() + "  \""
                  + text + "\"");
            return null;
         }

         @Override
         public boolean shouldTraverse(Object o) {
            return true;
         }

         // Depth first
         @Override
         public void walkJAXBElements(Object parent) {

            indent += "    ";

            List<Object> children = getChildren(parent);
            if (children != null) {

               for (Object o : children) {

                  // if its wrapped in javax.xml.bind.JAXBElement, get its
                  // value
                  o = XmlUtils.unwrap(o);

                  this.apply(o);

                  if (this.shouldTraverse(o)) {
                     walkJAXBElements(o);
                  }

               }
            }

            indent = indent.substring(0, indent.length() - 4);
         }

         @Override
         public List<Object> getChildren(Object o) {
            return TraversalUtil.getChildrenImpl(o);
         }

      }

      );

   }

}

Re: Traverse problem

Posted: Wed Sep 06, 2017 3:19 pm
by lazvegas
I also implemented the following, but I get the same result:

Code:
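   // Needs imports: java.util.ArrayList, java.util.List, javax.xml.bind.JAXBElement,
   // org.docx4j.wml.ContentAccessor, org.docx4j.wml.SdtElement
   protected List<Object> getAllElementsFromObject(Object obj, Class<?> toSearch) {
      List<Object> result = new ArrayList<Object>();

      // Unwrap JAXBElement wrappers to get at the real object
      if (obj instanceof JAXBElement) {
         obj = ((JAXBElement<?>) obj).getValue();
      }

      // For content controls, descend into the content (a ContentAccessor)
      if (obj instanceof SdtElement) {
         obj = ((SdtElement) obj).getSdtContent();
      }

      // Guard against a null value or missing content
      if (obj == null) {
         return result;
      }

      if (obj.getClass().equals(toSearch)) {
         result.add(obj);
      } else if (obj instanceof ContentAccessor) {
         for (Object child : ((ContentAccessor) obj).getContent()) {
            result.addAll(getAllElementsFromObject(child, toSearch));
         }
      }

      return result;
   }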
   

Re: Traverse problem

Posted: Wed Sep 06, 2017 4:52 pm
by jason
You have the so-called "split run" problem, where Word splits your run of text with other markup (in this case, you have a bookmark boundary in the middle of your text, plus CTMarkupRange).

https://github.com/plutext/docx4j/blob/ ... epare.java should help; you'll have to see whether it handles all the junk. If any junk remains, you can extend VariablePrepare to handle it.

The alternative is to manually correct your input.
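
To illustrate why a per-element search misses the placeholder, here is a minimal sketch in plain Java. It is a simplified model, not docx4j code: each String stands for the text of one Text element in document order, and the names SplitPlaceholderSketch and replaceAcrossFragments are made up for this example. It joins the fragments, replaces the placeholder in the joined text, and writes the result back into the first fragment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model: each String stands for the text of one Text element,
// in document order. This is NOT the docx4j API.
class SplitPlaceholderSketch {

    /**
     * Replaces the placeholder in the concatenated text of all fragments.
     * The whole result is stored in the first fragment and the remaining
     * fragments are emptied, so the number of fragments is unchanged.
     */
    static List<String> replaceAcrossFragments(List<String> fragments,
                                               String placeholder,
                                               String value) {
        StringBuilder joined = new StringBuilder();
        for (String f : fragments) {
            joined.append(f);
        }

        List<String> result = new ArrayList<>(fragments);
        if (result.isEmpty() || joined.indexOf(placeholder) < 0) {
            return result; // placeholder not present, nothing to do
        }

        String replaced = joined.toString().replace(placeholder, value);
        result.set(0, replaced);
        for (int i = 1; i < result.size(); i++) {
            result.set(i, "");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fragments = Arrays.asList("#sessiond", "ate#:");
        // No single fragment contains the placeholder...
        System.out.println(fragments.get(0).contains("#sessiondate#")); // false
        // ...but the joined text does.
        System.out.println(replaceAcrossFragments(fragments, "#sessiondate#", "2017-09-06"));
    }
}
```

Note the trade-off in this sketch: putting all the text into the first fragment means the replaced text takes the first run's formatting, which may or may not be what you want in a real document.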

Re: Traverse problem

Posted: Wed Sep 06, 2017 5:12 pm
by lazvegas
Thank you, jason.

How can I manually correct my input? How can I delete the bookmarks and boundaries manually?

Regards.

Re: Traverse problem

Posted: Wed Sep 06, 2017 5:23 pm
by lazvegas
Hello Jason,

After executing https://github.com/plutext/docx4j/blob/ ... epare.java, the file is saved, but the CTBookmark and CTMarkupRange elements still remain. How can I extend the class?

Thank you.

Re: Traverse problem

Posted: Wed Sep 06, 2017 9:23 pm
by jason
Probably easiest to just make a copy of the class, then add:

Code:
filterSettings.setRemoveBookmarks(true);


before line 79.
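
As a rough picture of what removing the bookmarks achieves, here is a small sketch in plain Java. It does not use docx4j: Run and Bookmark are hypothetical stand-ins for the paragraph's children, and removeBookmarksAndMerge is a made-up helper. It drops the bookmark markers and joins runs that become neighbours, ignoring any formatting differences between them (an assumption a real implementation would need to check).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for paragraph children; these are not docx4j classes.
class BookmarkFilterSketch {

    static class Run {
        final String text;
        Run(String text) { this.text = text; }
    }

    // Stands in for a bookmark start or end marker.
    static class Bookmark { }

    /** Drops bookmark markers, then merges runs that end up adjacent. */
    static List<Object> removeBookmarksAndMerge(List<Object> children) {
        List<Object> out = new ArrayList<>();
        for (Object child : children) {
            if (child instanceof Bookmark) {
                continue; // discard the marker
            }
            Object last = out.isEmpty() ? null : out.get(out.size() - 1);
            if (child instanceof Run && last instanceof Run) {
                // Two runs are now neighbours: join their text into one run.
                out.set(out.size() - 1,
                        new Run(((Run) last).text + ((Run) child).text));
            } else {
                out.add(child);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors the traversal output earlier in the thread.
        List<Object> paragraph = new ArrayList<>();
        paragraph.add(new Run("#sessiond"));
        paragraph.add(new Bookmark());
        paragraph.add(new Bookmark());
        paragraph.add(new Run("ate#:"));

        List<Object> cleaned = removeBookmarksAndMerge(paragraph);
        System.out.println(cleaned.size());
        System.out.println(((Run) cleaned.get(0)).text);
    }
}
```

Once the markers are gone and the text sits in a single run, a simple per-run search for the placeholder can find it.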