Error with big Docx files

PostPosted: Fri May 27, 2016 8:07 pm
by brunoop
Hello,

I am trying to use Docx4J in a web server to generate large Docx reports. We generate an XML document and then inject it into the Docx using many conditionals and repeats. The problem is that I want to reach ~1000 pages, but I can only generate ~300-400 pages. With more than that, it shows the following error:

17:13:43.717 [default task-13] ERROR org.docx4j.model.datastorage.OpenDoPEHandler - New xpath entry overwrites existing xpath NVmAj_231_627

Why is this error appearing, and how can we fix it?

Thanks.

Re: Error with big Docx files

PostPosted: Wed Jun 01, 2016 4:06 am
by brunoop
Does nobody have any idea why this is happening? I really need this to work and can't figure it out :(

Any help will be highly appreciated...

Thanks.

Re: Error with big Docx files

PostPosted: Wed Jun 01, 2016 7:26 pm
by jason
I don't think the message "New xpath entry overwrites existing xpath" is fatal, at least in the current release.

Without checking the source code, I guess it is saying that different repeats are creating the same XPath.

Turn logging up to see what is happening? It might be that it is just taking a long time... Since 3.3.0 was released, we have done performance optimizations specifically targeting this case. See 6 May entries at https://github.com/plutext/docx4j/commi ... atastorage

I'd recommend you try a nightly release from after 6 May, which you can get from http://www.docx4java.org/docx4j/

Re: Error with big Docx files

PostPosted: Fri Jun 24, 2016 3:59 am
by brunoop
I am trying to use a nightly 3.3.1 version. I try to generate a report as I did with 3.3.0, but it just fails. Here is part of the stack trace:

Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Problems applying bindings
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:188)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:259)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:189)
at org.docx4j.Docx4J.bind(Docx4J.java:426)
... 122 more
Caused by: java.lang.IllegalArgumentException: local part cannot be "null" when creating a QName
at javax.xml.namespace.QName.<init>(QName.java:244)
at javax.xml.namespace.QName.<init>(QName.java:188)
at org.docx4j.model.datastorage.xpathtracker.Histgram.update(Histgram.java:19)
at org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:67)
at org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:61)
at org.docx4j.model.datastorage.DomToXPathMap.map(DomToXPathMap.java:35)
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:162)

Re: Error with big Docx files

PostPosted: Fri Jun 24, 2016 4:25 pm
by jason
org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:67) is

Code: Select all
sourceNode.getLocalName()


Per the javadoc, for nodes created with a DOM Level 1 method, such as Document.createElement(), this is always null.

How do you create your XML data / feed it to docx4j? Was that namespace aware?
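The javadoc behaviour mentioned above can be checked with plain JAXP, without docx4j. The following is only a minimal sketch; the class and method names are made up for illustration and are not part of docx4j.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative class name; not part of docx4j.
class LocalNameDemo {

    // Creates an empty DOM document using the default JAXP implementation.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // DOM Level 1 factory method: getLocalName() is documented to return null.
    static String localNameViaCreateElement() {
        Element e = newDocument().createElement("elem1");
        return e.getLocalName();
    }

    // DOM Level 2 factory method with no namespace: the local name is set.
    static String localNameViaCreateElementNS() {
        Element e = newDocument().createElementNS(null, "elem1");
        return e.getLocalName();
    }

    public static void main(String[] args) {
        System.out.println("createElement:   " + localNameViaCreateElement());
        System.out.println("createElementNS: " + localNameViaCreateElementNS());
    }
}
```

A QName cannot be constructed from a null local part, which matches the IllegalArgumentException in the stack trace.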

Re: Error with big Docx files

PostPosted: Fri Jun 24, 2016 10:04 pm
by brunoop
We create the XML ourselves, mainly using Document.createElement() and Element.appendChild(). We do not use namespaces in any way, so it is not namespace aware. Here is a small example of our XML:
    <rootel>
      <elem1>
        <subelem1>1</subelem1>
        <subelem2>2</subelem2>
      </elem1>
      <elem2>...</elem2>
    </rootel>

The code we use when feeding it to docx4j is the following:
Code: Select all
WordprocessingMLPackage wordMLPackage = Docx4J.load(template);
// Flags are combined with bitwise OR (|); with & they would mask each other out.
Docx4J.bind(wordMLPackage, xml, Docx4J.FLAG_BIND_INSERT_XML | Docx4J.FLAG_BIND_BIND_XML);
Docx4J.save(wordMLPackage, out, Docx4J.FLAG_NONE);




This worked fine with 3.3.0 and earlier, so I don't get what is wrong with it. Do I need to use namespaces now? Could you post a sample of what the XML should look like?

Re: Error with big Docx files

PostPosted: Sat Jun 25, 2016 2:24 pm
by jason
Assuming you are using JAXP, can you just set:

Code: Select all
   documentBuilderFactory.setNamespaceAware(true);


with no other changes, and see what happens?

Re: Error with big Docx files

PostPosted: Tue Jun 28, 2016 12:30 am
by brunoop
No change: same error with setNamespaceAware(true), and also with setValidating(true).
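
One possible reason the flag makes no difference: in JAXP, setNamespaceAware(true) configures the parsers a factory produces, while Document.createElement() still yields DOM Level 1 nodes. Below is a hedged sketch of one workaround, assuming the data is built in memory as described earlier: serialize the tree and reparse it with a namespace-aware builder. The class and method names are illustrative only, and this thread does not confirm that doing so resolves the docx4j error.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative class name; not part of docx4j.
class NamespaceAwareReparse {

    // Builds a small tree with DOM Level 1 methods, as in the post above.
    static Document buildWithCreateElement() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // affects parsing only, not createElement()
            Document doc = dbf.newDocumentBuilder().newDocument();
            Element root = doc.createElement("rootel");
            root.appendChild(doc.createElement("elem1"));
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the tree to text, then parses it with a namespace-aware builder.
    static Document reparse(Document source) {
        try {
            StringWriter xml = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(source), new StreamResult(xml));
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml.toString())));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document built = buildWithCreateElement();
        System.out.println("before: " + built.getDocumentElement().getLocalName());
        System.out.println("after:  " + reparse(built).getDocumentElement().getLocalName());
    }
}
```

The round trip costs an extra serialization pass, so for very large data sets building the tree with createElementNS(null, name) from the start may be preferable.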

Re: Error with big Docx files

PostPosted: Tue Jun 28, 2016 3:12 am
by brunoop
Could you please post an example of a working XML so I can try to debug the problem?

Thank you.

Re: Error with big Docx files

PostPosted: Fri Jul 01, 2016 12:42 am
by brunoop
I tried using createElementNS instead of createElement, with a made-up namespace, because while reading the code I found a function that assumes the existence of a QName, so I thought that was the problem. It still didn't work; now it gives a different error:

Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Problems applying bindings
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:188)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:259)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:189)
at org.docx4j.Docx4J.bind(Docx4J.java:426)
... 122 more
Caused by: javax.xml.bind.UnmarshalException: unexpected element (URI:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"tc"). The expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sdt>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}permEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}permStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveFromRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveToRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveToRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}del>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlInsRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFromRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}commentRangeStart>,<{http://schemas.openxmlformats.org/officeDocument/2006/math}oMathPara>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFrom>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlDelRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlDelRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveToRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveToRangeStart>,<{http://schemas.openxmlformats.org/officeDocument/2006/math}oMath>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFromRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bookmarkStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXml>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlInsRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}ins>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveFromRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bookmarkEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}commentRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveTo>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}proofErr>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}tr>

I hope that helps...

Re: Error with big Docx files

PostPosted: Fri Jul 01, 2016 2:37 pm
by jason
brunoop wrote: Could you please post an example of a working XML so I can try to debug the problem?


https://github.com/plutext/docx4j/blob/ ... geXML.java is a working example. You'll notice the XML in this example, at sample-docs/word/databinding/binding-simple-data.xml does not use namespaces.

Re: Error with big Docx files

PostPosted: Fri Jul 01, 2016 2:40 pm
by jason
Caused by: javax.xml.bind.UnmarshalException: unexpected element (URI:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"tc"). The expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sdt>,


That's a completely separate issue.

That's JAXB telling you that you have added a tc somewhere it isn't allowed. (A table cell belongs in a tr; w:tr/w:sdt/w:sdtContent/w:tc is also OK.)

Re: Error with big Docx files

PostPosted: Fri Jul 01, 2016 11:55 pm
by brunoop
jason wrote: https://github.com/plutext/docx4j/blob/ ... geXML.java is a working example. You'll notice the XML in this example, at sample-docs/word/databinding/binding-simple-data.xml does not use namespaces.


Are you sure that is a working example in 3.3.1? I downloaded the XML and docx used in the example and tried to generate a report, but I keep getting the same error as before, "local part cannot be null when creating a QName". I also tried with 3.3.0, and it works perfectly.

Maybe I am not using the 3.3.1 version correctly? I am installing it with the following command:
Code: Select all
mvn install:install-file -Dfile=/home/dev/nightly/docx4j-3.3.1-nightly-20160605.jar

And then changing the docx4j dependency of my pom.xml to this:
<version>3.3.1-SNAPSHOT</version>

Re: Error with big Docx files

PostPosted: Sat Jul 02, 2016 11:44 am
by jason
brunoop wrote: Are you sure that is a working example in 3.3.1?


Yes.

Please set logging to INFO for XmlUtils.

Using slf4j with log4j, for example:

Code: Select all
<logger name="org.docx4j.XmlUtils">
    <level value="info"/>
</logger>


On docx4j startup, it will report your implementations. I have:

Code: Select all
INFO org.docx4j.jaxb.Context .<clinit> line 85 - java.vendor=Sun Microsystems Inc.
INFO org.docx4j.jaxb.Context .<clinit> line 86 - java.version=1.6.0_27
:
INFO org.docx4j.XmlUtils .<clinit> line 192 - Using com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
INFO org.docx4j.XmlUtils .<clinit> line 239 - Using com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

Re: Error with big Docx files

PostPosted: Wed Jul 06, 2016 3:18 am
by brunoop
This is my output. It is similar with 3.3.0 and 3.3.1.

Code: Select all
INFO  org.docx4j.jaxb.Context - java.vendor=Oracle Corporation
INFO  org.docx4j.jaxb.Context - java.version=1.8.0_91
INFO  org.docx4j.jaxb.Context - No MOXy JAXB config found; assume not intended..
WARN  org.docx4j.jaxb.NamespacePrefixMapperUtils - name: com.sun.xml.internal.bind.namespacePrefixMapper value: org.docx4j.jaxb.NamespacePrefixMapperSunInternal@73989866 .. trying RI.
INFO  org.docx4j.jaxb.NamespacePrefixMapperUtils - Using NamespacePrefixMapper, which is suitable for the JAXB RI
INFO  org.docx4j.jaxb.Context - Using JAXB Reference Implementation
INFO  org.docx4j.jaxb.Context - Not using MOXy; using com.sun.xml.bind.v2.runtime.JAXBContextImpl
WARN  org.docx4j.utils.ResourceUtils - Couldn't get resource: docx4j.properties
WARN  org.docx4j.Docx4jProperties - Couldn't find/read docx4j.properties; docx4j.properties not found via classloader.
INFO  org.docx4j.XmlUtils - Using com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
INFO  org.docx4j.XmlUtils - Using com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
INFO  org.docx4j.openpackaging.contenttype.ContentTypeManager - Detected WordProcessingML package
INFO  org.docx4j.openpackaging.io3.Load3 - Instantiated package of type org.docx4j.openpackaging.packages.WordprocessingMLPackage
INFO  org.docx4j.utils.XPathFactoryUtil - xpath implementation: __redirected.__XPathFactory
WARN  org.docx4j.openpackaging.io3.Load3 - No JAXB model for this CustomXmlDataStorage part; null
INFO  org.docx4j.openpackaging.io3.Load3 - package read;  elapsed time: 3362 ms
INFO  org.docx4j.model.datastorage.OpenDoPEHandler - OpenDoPE XPaths part missing (ok if you are just processing w15 repeatinSection)
WARN  org.docx4j.model.datastorage.BindingHandler - OpenDoPE XPaths part missing
INFO  org.docx4j.model.datastorage.BindingHandler - Using BindingTraverserXSLT, which is slower, but fully featured

Re: Error with big Docx files

PostPosted: Sat Jul 09, 2016 9:30 am
by jason
I tried java.version=1.8.0_05 (you report _91); it still works for me.

I've added some logging to report actual javax implementations in use; you may be using something other than the property setting: https://github.com/plutext/docx4j/commi ... 41ea31257c

I'll make a nightly build including this in the next day or so.
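
In the meantime, the JAXP implementations actually picked up at runtime can be printed with standard API calls. This is only a sketch of the idea, not the logging added in the commit linked above; the class name is made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.xpath.XPathFactory;

// Illustrative class name; not part of docx4j.
class JaxpImplementationReport {

    // Each method asks JAXP for a factory and reports the concrete class behind it.
    static String documentBuilderFactoryImpl() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    static String saxParserFactoryImpl() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    static String transformerFactoryImpl() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    static String xpathFactoryImpl() {
        return XPathFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // The system property may be unset even when a non-default implementation
        // is selected some other way, so print both the property and the class.
        System.out.println("property javax.xml.parsers.DocumentBuilderFactory = "
                + System.getProperty("javax.xml.parsers.DocumentBuilderFactory"));
        System.out.println("DocumentBuilderFactory: " + documentBuilderFactoryImpl());
        System.out.println("SAXParserFactory:       " + saxParserFactoryImpl());
        System.out.println("TransformerFactory:     " + transformerFactoryImpl());
        System.out.println("XPathFactory:           " + xpathFactoryImpl());
    }
}
```

Running this inside the same web application as the report generation matters, since an application server may supply different implementations than a standalone JVM.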

Re: Error with big Docx files

PostPosted: Fri Oct 21, 2016 9:53 am
by jason

Re: Error with big Docx files

PostPosted: Sat Nov 19, 2016 3:12 am
by klausT
Hello Jason,

I am using version 3.3.1 - the nightly build from 2016/10/30 - which still produces log entries like
org.docx4j.model.datastorage.OpenDoPEHandler (createNewXPathObject:1630) - New xpath entry overwrites existing xpath mCWoX_0
org.docx4j.model.datastorage.OpenDoPEHandler (createNewXPathObject:1630) - New xpath entry overwrites existing xpath mCWoX_1


Your assumption that this is caused by repeating xpaths is absolutely correct.
The overall result looks good. Nevertheless, it would be nice if these errors could be fixed.
I am using Java 1.8.101

Thanks,
Klaus

Re: Error with big Docx files

PostPosted: Mon Nov 21, 2016 6:56 pm
by jason

Re: Error with big Docx files

PostPosted: Tue Nov 22, 2016 8:37 am
by klausT
Hi Jason,

The 3.3.2 beta results in this error:
java.lang.NullPointerException
at org.docx4j.model.datastorage.OpenDoPEHandler.processOpenDopeRepeat(OpenDoPEHandler.java:1012)
at org.docx4j.model.datastorage.OpenDoPEHandler.processBindingRoleIfAny(OpenDoPEHandler.java:821)
at org.docx4j.model.datastorage.OpenDoPEHandler.access$100(OpenDoPEHandler.java:80)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.apply(OpenDoPEHandler.java:667)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:713)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:729)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:729)
at org.docx4j.TraversalUtil.<init>(TraversalUtil.java:214)
at org.docx4j.model.datastorage.OpenDoPEHandler.preprocess(OpenDoPEHandler.java:276)


This doesn't happen when using the nightly build from 2016/10/30.

Cheers...Klaus

Re: Error with big Docx files

PostPosted: Tue Nov 22, 2016 1:56 pm
by jason
klausT wrote: This doesn't happen when using the nightly build from 2016/10/30


Oops, a stupid last minute commit. Should be OK now - since you picked this up so quickly, I have elected to replace the beta with a new jar with the same name. So please download again...

Re: Error with big Docx files

PostPosted: Tue Nov 22, 2016 7:35 pm
by klausT
The latest version is working fine!