docx4j JaxbValidationEventHandler logs Error

Posted: Thu Apr 05, 2018 11:29 pm
by admin2
docx4j JaxbValidationEventHandler openpackaging.parts.JaxbXmlPart

docx4j printed an exception (not thrown) while simply loading a docx file. The program continued, but it seems docx4j can't handle the pPr and rPr tags in /word/document.xml.

test code:
Code:
public static void main(String[] args) throws Exception {
        // docxTemplate is the already-loaded docx package (its loading code is not part of this snippet)
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), PPr.class).size());  // returns 0!!
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), RPr.class).size());  // returns 0!!
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), Text.class).size());
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), P.class).size());
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), R.class).size());
}


Console output:

20:05:22.955 [main] DEBUG org.docx4j.openpackaging.parts.JaxbXmlPart - Lazily unmarshalling /word/document.xml
20:05:22.955 [main] DEBUG org.docx4j.openpackaging.parts.JaxbXmlPartXPathAware - For org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart, unmarshall (no binder)
20:05:22.995 [main] WARN org.docx4j.jaxb.JaxbValidationEventHandler - [ERROR] : unexpected element (uri:"http://schemas.openxmlformats.org/markup-compatibility/2006", local:"AlternateContent"). Expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}delInstrText>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}footnoteReference>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}footnoteRef>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}endnoteRef>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sym>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}yearShort>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}endnoteReference>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}softHyphen>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}continuationSeparator>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}ptab>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}br>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}pgNum>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}noBreakHyphen>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}rPr>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}tab>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}commentReference>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}drawing>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}lastRenderedPageBreak>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}monthLong>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}pict>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}annotationRef>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}monthShort>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}instrText>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}cr>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}ruby>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}dayShort>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}separator>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}yearLong>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}t>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}delText>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}dayLong>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}object>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}fldChar>
20:05:22.996 [main] WARN org.docx4j.jaxb.JaxbValidationEventHandler - Column is 6617 at line number 2
20:05:23.003 [main] DEBUG org.docx4j.jaxb.JaxbValidationEventHandler - shouldContinue is set to false
java.lang.Throwable: null
at org.docx4j.jaxb.JaxbValidationEventHandler.handleEvent(JaxbValidationEventHandler.java:186)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:716)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:109)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.childElement(Loader.java:90)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StructureLoader.childElement(StructureLoader.java:237)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:556)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:538)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(InterningXmlVisitor.java:60)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.handleStartElement(StAXStreamConnector.java:231)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:165)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:400)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:371)
at org.docx4j.openpackaging.parts.JaxbXmlPartXPathAware.unmarshal(JaxbXmlPartXPathAware.java:445)
at org.docx4j.openpackaging.parts.JaxbXmlPartXPathAware.unmarshal(JaxbXmlPartXPathAware.java:346)
at org.docx4j.openpackaging.parts.JaxbXmlPart.getContents(JaxbXmlPart.java:176)
at org.docx4j.openpackaging.parts.JaxbXmlPart.getJaxbElement(JaxbXmlPart.java:198)
at org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart.getContent(MainDocumentPart.java:147)
at com.edm.infra.docx.DocxUtil.getAllElementFromObject(DocxUtil.java:149)
20:05:23.004 [main] INFO org.docx4j.openpackaging.parts.JaxbXmlPartXPathAware - encountered unexpected content in /word/document.xml; pre-processing
20:05:23.065 [main] DEBUG org.docx4j.utils.ResourceUtils - docx4j.jaxb.JaxbValidationEventHandler resolved to org/docx4j/jaxb/mc-preprocessor.xslt
20:05:23.065 [main] DEBUG org.docx4j.utils.ResourceUtils - Attempting to load: org/docx4j/jaxb/mc-preprocessor.xslt
20:05:23.516 [main] INFO org.docx4j.XmlUtils - Using org.apache.xalan.transformer.TransformerImpl
20:05:23.715 [main] WARN org.docx4j.utils.XSLTUtils - Found some mc:AlternateContent
20:05:23.717 [main] WARN org.docx4j.utils.XSLTUtils - Selecting w:pict
20:05:24.098 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 0
20:05:24.099 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 0
20:05:24.100 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 105
20:05:24.100 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 47
20:05:24.101 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 111


DocxUtil.java

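Code:
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBElement;
import org.docx4j.wml.ContentAccessor;

// Recursively collects every object of type toSearch that is reachable through getContent().
public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
    List<Object> result = new ArrayList<Object>();
    // Unwrap JAXBElement wrappers to get at the underlying object
    if (obj instanceof JAXBElement) obj = ((JAXBElement<?>) obj).getValue();
    if (obj.getClass().equals(toSearch)) {
        result.add(obj);
    } else if (obj instanceof ContentAccessor) {
        List<?> children = ((ContentAccessor) obj).getContent();  // line 149!!!
        for (Object child : children) {
            result.addAll(getAllElementFromObject(child, toSearch));
        }
    }
    return result;
}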

Re: docx4j JaxbValidationEventHandler logs Error

Posted: Fri Apr 06, 2018 9:01 am
by jason
admin2 wrote:
20:05:22.995 [main] WARN org.docx4j.jaxb.JaxbValidationEventHandler - [ERROR] : unexpected element (uri:"http://schemas.openxmlformats.org/markup-compatibility/2006", local:"AlternateContent"). Expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}delInstrText>,


This looks like it is normal mc:AlternateContent processing, in this case containing w:pict:

admin2 wrote:
20:05:23.004 [main] INFO org.docx4j.openpackaging.parts.JaxbXmlPartXPathAware - encountered unexpected content in /word/document.xml; pre-processing
20:05:23.065 [main] DEBUG org.docx4j.utils.ResourceUtils - docx4j.jaxb.JaxbValidationEventHandler resolved to org/docx4j/jaxb/mc-preprocessor.xslt
20:05:23.065 [main] DEBUG org.docx4j.utils.ResourceUtils - Attempting to load: org/docx4j/jaxb/mc-preprocessor.xslt
20:05:23.516 [main] INFO org.docx4j.XmlUtils - Using org.apache.xalan.transformer.TransformerImpl
20:05:23.715 [main] WARN org.docx4j.utils.XSLTUtils - Found some mc:AlternateContent
20:05:23.717 [main] WARN org.docx4j.utils.XSLTUtils - Selecting w:pict

Re: docx4j JaxbValidationEventHandler logs Error

Posted: Fri Apr 06, 2018 1:15 pm
by admin2
Thanks Jason.

#1
Indeed, document.xml does contain w:pict. Is there any way docx4j could handle w:pict?

#2
There is another question from my previous post: I couldn't get PPr or RPr objects directly from the main document part. Is that some kind of restriction, or bad coding on my side?
By the way, I can get an R object through the same method, 'R r = (R)DocxUtil.getAllElementFromObject(...)', and then RPr rpr = (RPr)r.getRPr() does return its value. Does that mean we simply can't get RPr objects from the main document part this way?

Code:
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), PPr.class).size());  // returns 0
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), RPr.class).size());  // returns 0
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), Text.class).size());
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), P.class).size());
        logger.info("size is {}", DocxUtil.getAllElementFromObject(docxTemplate.getMainDocumentPart(), R.class).size());


Console output:
20:05:24.098 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 0
20:05:24.099 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 0
20:05:24.100 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 105
20:05:24.100 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 47
20:05:24.101 [main] INFO com.edm.dev.TEMPLATE_GEN3 - size is 111

Re: docx4j JaxbValidationEventHandler logs Error

Posted: Fri Apr 06, 2018 9:38 pm
by jason
docx4j does handle w:pict. It generally doesn't preserve mc:AlternateContent (though it does in some particular cases); instead it generally selects the fallback: https://github.com/plutext/docx4j/blob/ ... essor.xslt So, assuming your w:pict is in the fallback, that's what will be retained.
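
To illustrate what "selecting the fallback" means, here is a minimal, self-contained sketch using only the JDK's DOM API. It is not docx4j's implementation (the log above shows docx4j doing this with its mc-preprocessor.xslt stylesheet); the class name FallbackSketch, the method selectFallback and the sample XML are hypothetical and exist only for this example. It also ignores edge cases such as an mc:AlternateContent nested inside an mc:Choice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FallbackSketch {
    static final String MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006";
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String WPS_NS = "http://schemas.microsoft.com/office/word/2010/wordprocessingShape";

    // Hypothetical sample run: mc:Choice holds a drawing, mc:Fallback holds w:pict.
    static final String SAMPLE =
          "<w:r xmlns:w='" + W_NS + "' xmlns:mc='" + MC_NS + "' xmlns:wps='" + WPS_NS + "'>"
        + "<mc:AlternateContent>"
        + "<mc:Choice Requires='wps'><w:drawing/></mc:Choice>"
        + "<mc:Fallback><w:pict/></mc:Fallback>"
        + "</mc:AlternateContent>"
        + "</w:r>";

    /** Replaces each mc:AlternateContent with the children of its first mc:Fallback. */
    static Document selectFallback(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        NodeList acs = doc.getElementsByTagNameNS(MC_NS, "AlternateContent");
        // The NodeList is live, so keep taking item(0) until none remain.
        while (acs.getLength() > 0) {
            Element ac = (Element) acs.item(0);
            Node parent = ac.getParentNode();
            NodeList fallbacks = ac.getElementsByTagNameNS(MC_NS, "Fallback");
            if (fallbacks.getLength() > 0) {
                Node fallback = fallbacks.item(0);
                // Move the fallback's children up, in order, to where the wrapper was.
                while (fallback.getFirstChild() != null) {
                    parent.insertBefore(fallback.getFirstChild(), ac);
                }
            }
            parent.removeChild(ac); // drops the wrapper together with mc:Choice
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = selectFallback(SAMPLE);
        // After pre-processing, the run contains w:pict directly.
        System.out.println(doc.getDocumentElement().getFirstChild().getNodeName()); // prints "w:pict"
    }
}
```

After this kind of pre-processing the run no longer contains the unexpected mc:AlternateContent element, which matches the "Selecting w:pict" line in the log.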

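Regarding your question #2, the reason is that PPr and RPr aren't in the respective content lists which your code relies on. Your method only walks getContent(), whereas PPr and RPr are held as separate properties of P and R, which is why r.getRPr() works. See https://github.com/plutext/docx4j/blob/ ... P.java#L82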
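
The point above can be sketched with simplified stand-in classes. Everything in this example (PropertyWalkSketch, findAll, sample, and the nested P, R, PPr, RPr, Text, Body and ContentAccessor types) is hypothetical and only mimics the shape described in this thread; these are not the real org.docx4j.wml classes. The assumption is simply that properties are reachable through getPPr()/getRPr() and never through getContent().

```java
import java.util.ArrayList;
import java.util.List;

public class PropertyWalkSketch {
    // Simplified stand-ins for the types named in the thread.
    // These are NOT the real org.docx4j.wml classes.
    interface ContentAccessor { List<Object> getContent(); }
    static class PPr {}
    static class RPr {}
    static class Text {}

    static class R implements ContentAccessor {
        RPr rPr; // held as a property, never added to the content list
        final List<Object> content = new ArrayList<>();
        public List<Object> getContent() { return content; }
        RPr getRPr() { return rPr; }
    }

    static class P implements ContentAccessor {
        PPr pPr; // held as a property, never added to the content list
        final List<Object> content = new ArrayList<>();
        public List<Object> getContent() { return content; }
        PPr getPPr() { return pPr; }
    }

    static class Body implements ContentAccessor {
        final List<Object> content = new ArrayList<>();
        public List<Object> getContent() { return content; }
    }

    /** Same recursion as in the thread, optionally also visiting the properties. */
    static List<Object> findAll(Object obj, Class<?> toSearch, boolean includeProperties) {
        List<Object> result = new ArrayList<>();
        if (obj == null) return result; // an unset property
        if (obj.getClass().equals(toSearch)) {
            result.add(obj);
            return result;
        }
        if (includeProperties) {
            if (obj instanceof P) result.addAll(findAll(((P) obj).getPPr(), toSearch, true));
            if (obj instanceof R) result.addAll(findAll(((R) obj).getRPr(), toSearch, true));
        }
        if (obj instanceof ContentAccessor) {
            for (Object child : ((ContentAccessor) obj).getContent()) {
                result.addAll(findAll(child, toSearch, includeProperties));
            }
        }
        return result;
    }

    /** One paragraph with properties, holding one run with properties and one text. */
    static Body sample() {
        R r = new R(); r.rPr = new RPr(); r.content.add(new Text());
        P p = new P(); p.pPr = new PPr(); p.content.add(r);
        Body body = new Body(); body.content.add(p);
        return body;
    }

    public static void main(String[] args) {
        Body body = sample();
        System.out.println(findAll(body, RPr.class, false).size());  // 0: content lists only
        System.out.println(findAll(body, RPr.class, true).size());   // 1: properties visited too
        System.out.println(findAll(body, PPr.class, true).size());   // 1
        System.out.println(findAll(body, Text.class, false).size()); // 1
    }
}
```

With the real docx4j types, the equivalent approach would be to collect the P and R objects first (which the existing method already does) and then call getPPr() and getRPr() on each of them.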