Error with big Docx files
Posted:
Fri May 27, 2016 8:07 pm
by brunoop
Hello,
I am trying to use Docx4J on a web server to generate large Docx reports. We generate an XML document and then inject it into the Docx using many conditionals and repeats. The problem is that I want to reach ~1000 pages, but I can only generate ~300-400 pages. Beyond that, it shows the following error:
17:13:43.717 [default task-13] ERROR org.docx4j.model.datastorage.OpenDoPEHandler - New xpath entry overwrites existing xpath NVmAj_231_627
Why is this error appearing, and how can we fix it?
Thanks.
Re: Error with big Docx files
Posted:
Wed Jun 01, 2016 4:06 am
by brunoop
Does nobody have any idea why this is happening? I really need this to work and can't figure it out.
Any help will be highly appreciated...
Thanks.
Re: Error with big Docx files
Posted:
Wed Jun 01, 2016 7:26 pm
by jason
I don't think the message "New xpath entry overwrites existing xpath" is fatal, at least in the current release.
Without checking the source code, I guess it is saying that different repeats are creating the same XPath.
Turn logging up to see what is happening? It might be that it is just taking a long time... Since 3.3.0 was released, we have done performance optimizations specifically targeting this case. See 6 May entries at
https://github.com/plutext/docx4j/commi ... atastorage
I'd recommend you try a nightly release from after 6 May, which you can get from
http://www.docx4java.org/docx4j/
Re: Error with big Docx files
Posted:
Fri Jun 24, 2016 3:59 am
by brunoop
I am trying to use a nightly 3.3.1 build. I generate a report the same way as with 3.3.0, but it just fails. Here is part of the stack trace:
Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Problems applying bindings
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:188)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:259)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:189)
at org.docx4j.Docx4J.bind(Docx4J.java:426)
... 122 more
Caused by: java.lang.IllegalArgumentException: local part cannot be "null" when creating a QName
at javax.xml.namespace.QName.<init>(QName.java:244)
at javax.xml.namespace.QName.<init>(QName.java:188)
at org.docx4j.model.datastorage.xpathtracker.Histgram.update(Histgram.java:19)
at org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:67)
at org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:61)
at org.docx4j.model.datastorage.DomToXPathMap.map(DomToXPathMap.java:35)
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:162)
Re: Error with big Docx files
Posted:
Fri Jun 24, 2016 4:25 pm
by jason
org.docx4j.model.datastorage.DomToXPathMap.walkTree(DomToXPathMap.java:67) is
sourceNode.getLocalName()
Per the javadoc, for nodes created with a DOM Level 1 method, such as Document.createElement(), this is always null.
How do you create your XML data / feed it to docx4j? Was that namespace aware?
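To make the javadoc point concrete, here is a minimal sketch using only the JDK's JAXP and DOM classes. The class name LocalNameDemo and its helper methods are made up for illustration and are not part of docx4j. It compares an element created with the DOM Level 1 method createElement() against one created with createElementNS() and a null namespace URI.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of docx4j.
public class LocalNameDemo {

    // Creates an empty DOM document using the default JAXP factory.
    static Document newDocument() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    // DOM Level 1 creation: no namespace information is recorded,
    // so getLocalName() is documented to return null.
    static String level1LocalName(String name) throws Exception {
        Element e = newDocument().createElement(name);
        return e.getLocalName();
    }

    // DOM Level 2 creation: a null namespace URI means "no namespace",
    // but the local name is still recorded on the node.
    static String level2LocalName(String name) throws Exception {
        Element e = newDocument().createElementNS(null, name);
        return e.getLocalName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("createElement:   " + level1LocalName("rootel"));
        System.out.println("createElementNS: " + level2LocalName("rootel"));
    }
}
```

If the first line prints null in your environment, any code that builds a QName from getLocalName() will fail on such nodes in the way the stack trace shows.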
Re: Error with big Docx files
Posted:
Fri Jun 24, 2016 10:04 pm
by brunoop
We create the XML ourselves, mainly using Document.createElement() and Element.appendChild(). We do not use namespaces in any way, so it is not namespace aware. Here is a small example of our XML:
<rootel>
<elem1>
<subelem1>1</subelem1>
<subelem2>2</subelem2>
</elem1>
<elem2>...</elem2>
</rootel>
The code we use when feeding it to docx4j is the following:
WordprocessingMLPackage wordMLPackage = Docx4J.load(template);
Docx4J.bind(wordMLPackage, xml, Docx4J.FLAG_BIND_INSERT_XML & Docx4J.FLAG_BIND_BIND_XML);
Docx4J.save(wordMLPackage, out, Docx4J.FLAG_NONE);
This worked fine with 3.3.0 and earlier, so I don't understand what is wrong with it. Do I need to use namespaces now? Could you post a sample of what the XML should look like?
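A side note on the bind call in this post, unrelated to the QName error: the two flags are combined with & rather than |. If the Docx4J.FLAG_* values are distinct single-bit constants (an assumption, not checked against the docx4j source here), then & yields 0 rather than a value with both bits set. The sketch below uses made-up constants FLAG_A and FLAG_B purely to show the arithmetic; what docx4j does with a flags value of 0 is not covered by it.

```java
// Hypothetical demo class; FLAG_A and FLAG_B are stand-ins, not docx4j constants.
public class FlagDemo {

    static final int FLAG_A = 1; // binary 01
    static final int FLAG_B = 2; // binary 10

    // Bitwise AND keeps only the bits set in both operands.
    static int combineWithAnd() {
        return FLAG_A & FLAG_B;
    }

    // Bitwise OR keeps the bits set in either operand.
    static int combineWithOr() {
        return FLAG_A | FLAG_B;
    }

    // True when every bit of 'flag' is present in 'flags'.
    static boolean isSet(int flags, int flag) {
        return (flags & flag) == flag;
    }

    public static void main(String[] args) {
        System.out.println("A & B = " + combineWithAnd());
        System.out.println("A | B = " + combineWithOr());
        System.out.println("A set in (A | B): " + isSet(combineWithOr(), FLAG_A));
        System.out.println("A set in (A & B): " + isSet(combineWithAnd(), FLAG_A));
    }
}
```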
Re: Error with big Docx files
Posted:
Sat Jun 25, 2016 2:24 pm
by jason
Assuming you are using JAXP, can you just set:
documentBuilderFactory.setNamespaceAware(true);
with no other changes, and see what happens?
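For context on what that flag covers: setNamespaceAware(true) controls how a DocumentBuilder parses XML text. As far as I know it does not change what Document.createElement() produces, so a tree built entirely with createElement() may be unaffected. One possible workaround, sketched here with made-up names (ReparseDemo and its methods are hypothetical), is to serialize the hand-built tree and re-parse it with a namespace-aware builder before handing it on.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of docx4j.
public class ReparseDemo {

    // Builds a small tree with DOM Level 1 methods only.
    static Document buildWithCreateElement() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("rootel");
        Element child = doc.createElement("elem1");
        child.setTextContent("1");
        root.appendChild(child);
        doc.appendChild(root);
        return doc;
    }

    // Serializes the tree to text, then parses that text with a
    // namespace-aware builder so the new nodes carry local names.
    static Document reparseNamespaceAware(Document source) throws Exception {
        StringWriter xml = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(source), new StreamResult(xml));
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml.toString())));
    }

    // Convenience helper: local name of the root element after the round trip.
    static String rootLocalNameAfterReparse() throws Exception {
        return reparseNamespaceAware(buildWithCreateElement())
                .getDocumentElement().getLocalName();
    }

    public static void main(String[] args) throws Exception {
        Document before = buildWithCreateElement();
        System.out.println("before: " + before.getDocumentElement().getLocalName());
        System.out.println("after:  " + rootLocalNameAfterReparse());
    }
}
```

The round trip costs an extra serialization pass, which may matter for very large data files; building the tree with createElementNS(null, name) in the first place avoids that.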
Re: Error with big Docx files
Posted:
Tue Jun 28, 2016 12:30 am
by brunoop
Nothing changed: same error with setNamespaceAware(true), and also with setValidating(true).
Re: Error with big Docx files
Posted:
Tue Jun 28, 2016 3:12 am
by brunoop
Could you please post an example of a working XML so I can try to debug the problem?
Thank you.
Re: Error with big Docx files
Posted:
Fri Jul 01, 2016 12:42 am
by brunoop
I tried using createElementNS instead of createElement, with a made-up namespace, because while reading the code I found a function that assumes the existence of a QName, so I thought that was the problem. It still didn't work; now it gives a different error:
Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Problems applying bindings
at org.docx4j.model.datastorage.BindingTraverserXSLT.traverseToBind(BindingTraverserXSLT.java:188)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:259)
at org.docx4j.model.datastorage.BindingHandler.applyBindings(BindingHandler.java:189)
at org.docx4j.Docx4J.bind(Docx4J.java:426)
... 122 more
Caused by: javax.xml.bind.UnmarshalException: unexpected element (URI:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"tc"). The expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sdt>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}permEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}permStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveFromRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveToRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveToRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}del>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlInsRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFromRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}commentRangeStart>,<{http://schemas.openxmlformats.org/officeDocument/2006/math}oMathPara>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFrom>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlDelRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlDelRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveToRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveToRangeStart>,<{http://schemas.openxmlformats.org/officeDocument/2006/math}oMath>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveFromRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bookmarkStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXml>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlInsRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}ins>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}customXmlMoveFromRangeStart>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bookmarkEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}commentRangeEnd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}moveTo>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}proofErr>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}tr>
I hope that helps...
Re: Error with big Docx files
Posted:
Fri Jul 01, 2016 2:37 pm
by jason
brunoop wrote:Could you please post an example of a working XML so I can try to debug the problem?
https://github.com/plutext/docx4j/blob/ ... geXML.java is a working example. You'll notice the XML in this example, at sample-docs/word/databinding/binding-simple-data.xml does not use namespaces.
Re: Error with big Docx files
Posted:
Fri Jul 01, 2016 2:40 pm
by jason
brunoop wrote:Caused by: javax.xml.bind.UnmarshalException: unexpected element (URI:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"tc"). The expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sdt>,
That's a completely separate issue.
That's JAXB telling you that you have added a tc somewhere it isn't allowed. (A table cell belongs in a tr; w:tr/w:sdt/w:sdtContent/w:tc is also OK.)
Re: Error with big Docx files
Posted:
Fri Jul 01, 2016 11:55 pm
by brunoop
jason wrote:https://github.com/plutext/docx4j/blob/ ... geXML.java is a working example. You'll notice the XML in this example, at sample-docs/word/databinding/binding-simple-data.xml does not use namespaces.
Are you sure that is a working example in 3.3.1? I downloaded the XML and docx used in the example and tried to generate a report, but I keep getting the same error as before: "local part cannot be null when creating a QName". I also tried with 3.3.0, and it works perfectly.
Maybe I am not using the 3.3.1 version correctly? I am installing it with the following command:
mvn install:install-file -Dfile=/home/dev/nightly/docx4j-3.3.1-nightly-20160605.jar
And then changing the docx4j dependency of my pom.xml to this:
<version>3.3.1-SNAPSHOT</version>
Re: Error with big Docx files
Posted:
Sat Jul 02, 2016 11:44 am
by jason
brunoop wrote:Are you sure that is a working example in 3.3.1?
Yes.
Please set logging to INFO for XmlUtils.
Using slf4j with log4j, for example:
<logger name="org.docx4j.XmlUtils">
<level value="info"/>
</logger>
On docx4j startup, it will report your implementations. I have:
INFO org.docx4j.jaxb.Context .<clinit> line 85 - java.vendor=Sun Microsystems Inc.
INFO org.docx4j.jaxb.Context .<clinit> line 86 - java.version=1.6.0_27
:
INFO org.docx4j.XmlUtils .<clinit> line 192 - Using com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
INFO org.docx4j.XmlUtils .<clinit> line 239 - Using com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
Re: Error with big Docx files
Posted:
Wed Jul 06, 2016 3:18 am
by brunoop
This is my output. It is similar with 3.3.0 and 3.3.1.
INFO org.docx4j.jaxb.Context - java.vendor=Oracle Corporation
INFO org.docx4j.jaxb.Context - java.version=1.8.0_91
INFO org.docx4j.jaxb.Context - No MOXy JAXB config found; assume not intended..
WARN org.docx4j.jaxb.NamespacePrefixMapperUtils - name: com.sun.xml.internal.bind.namespacePrefixMapper value: org.docx4j.jaxb.NamespacePrefixMapperSunInternal@73989866 .. trying RI.
INFO org.docx4j.jaxb.NamespacePrefixMapperUtils - Using NamespacePrefixMapper, which is suitable for the JAXB RI
INFO org.docx4j.jaxb.Context - Using JAXB Reference Implementation
INFO org.docx4j.jaxb.Context - Not using MOXy; using com.sun.xml.bind.v2.runtime.JAXBContextImpl
WARN org.docx4j.utils.ResourceUtils - Couldn't get resource: docx4j.properties
WARN org.docx4j.Docx4jProperties - Couldn't find/read docx4j.properties; docx4j.properties not found via classloader.
INFO org.docx4j.XmlUtils - Using com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
INFO org.docx4j.XmlUtils - Using com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
INFO org.docx4j.openpackaging.contenttype.ContentTypeManager - Detected WordProcessingML package
INFO org.docx4j.openpackaging.io3.Load3 - Instantiated package of type org.docx4j.openpackaging.packages.WordprocessingMLPackage
INFO org.docx4j.utils.XPathFactoryUtil - xpath implementation: __redirected.__XPathFactory
WARN org.docx4j.openpackaging.io3.Load3 - No JAXB model for this CustomXmlDataStorage part; null
INFO org.docx4j.openpackaging.io3.Load3 - package read; elapsed time: 3362 ms
INFO org.docx4j.model.datastorage.OpenDoPEHandler - OpenDoPE XPaths part missing (ok if you are just processing w15 repeatinSection)
WARN org.docx4j.model.datastorage.BindingHandler - OpenDoPE XPaths part missing
INFO org.docx4j.model.datastorage.BindingHandler - Using BindingTraverserXSLT, which is slower, but fully featured
Re: Error with big Docx files
Posted:
Sat Jul 09, 2016 9:30 am
by jason
I tried java.version=1.8.0_05 (you report _91); it still works for me.
I've added some logging to report actual javax implementations in use; you may be using something other than the property setting:
https://github.com/plutext/docx4j/commi ... 41ea31257c
I'll make a nightly build including this in the next day or so.
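In the meantime, something along these lines can be run inside the same web application to see which implementations JAXP actually hands out. This is only a rough sketch using standard JDK calls, not the logging added in the commit above; the class name JaxpImplReport and its methods are made up.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.xpath.XPathFactory;

// Hypothetical demo class; not part of docx4j.
public class JaxpImplReport {

    // Class name of the factory that JAXP lookup resolves to at runtime.
    static String documentBuilderFactoryImpl() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Class name of the Document objects that factory produces.
    static String documentImpl() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument().getClass().getName();
    }

    static String saxParserFactoryImpl() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    static String transformerFactoryImpl() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    static String xpathFactoryImpl() {
        return XPathFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DocumentBuilderFactory: " + documentBuilderFactoryImpl());
        System.out.println("Document:               " + documentImpl());
        System.out.println("SAXParserFactory:       " + saxParserFactoryImpl());
        System.out.println("TransformerFactory:     " + transformerFactoryImpl());
        System.out.println("XPathFactory:           " + xpathFactoryImpl());
    }
}
```

Results can differ between a standalone JVM and an application server, since the container's class loaders may supply their own implementations, so it is worth running this in the environment that shows the error.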
Re: Error with big Docx files
Posted:
Sat Nov 19, 2016 3:12 am
by klausT
Hello Jason,
I am using version 3.3.1 - the nightly build from 2016/10/30 - which still produces log entries like
org.docx4j.model.datastorage.OpenDoPEHandler (createNewXPathObject:1630) - New xpath entry overwrites existing xpath mCWoX_0
org.docx4j.model.datastorage.OpenDoPEHandler (createNewXPathObject:1630) - New xpath entry overwrites existing xpath mCWoX_1
Your assumption that this is caused by repeating xpaths is absolutely correct.
The overall result looks good. Nevertheless, it would be nice if these errors could be fixed.
I am using Java 1.8.101
Thanks,
Klaus
Re: Error with big Docx files
Posted:
Tue Nov 22, 2016 8:37 am
by klausT
Hi Jason,
The 3.3.2 beta results in this error:
java.lang.NullPointerException
at org.docx4j.model.datastorage.OpenDoPEHandler.processOpenDopeRepeat(OpenDoPEHandler.java:1012)
at org.docx4j.model.datastorage.OpenDoPEHandler.processBindingRoleIfAny(OpenDoPEHandler.java:821)
at org.docx4j.model.datastorage.OpenDoPEHandler.access$100(OpenDoPEHandler.java:80)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.apply(OpenDoPEHandler.java:667)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:713)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:729)
at org.docx4j.model.datastorage.OpenDoPEHandler$ShallowTraversor.walkJAXBElements(OpenDoPEHandler.java:729)
at org.docx4j.TraversalUtil.<init>(TraversalUtil.java:214)
at org.docx4j.model.datastorage.OpenDoPEHandler.preprocess(OpenDoPEHandler.java:276)
This doesn't happen when using the nightly build from 2016/10/30.
Cheers...Klaus
Re: Error with big Docx files
Posted:
Tue Nov 22, 2016 1:56 pm
by jason
klausT wrote:This doesn't happen when using the nightly build from 2016/10/30
Oops, a stupid last minute commit. Should be OK now - since you picked this up so quickly, I have elected to replace the beta with a new jar with the same name. So please download again...
Re: Error with big Docx files
Posted:
Tue Nov 22, 2016 7:35 pm
by klausT
The latest version is working fine!