I am running into a problem when I read the contents of a header that contains Word drawings. I can read and modify the contents in code without trouble, but once the file is saved, Word is unable to open it. The error details Word gives are "The XML data is invalid according to the schema. Location: Part: /word/header3.xml, Line: 0, Column: 0". I am using docx4j 6.0.1 through Maven. I have found that this only seems to happen when MOXy is the JAXB implementation. The project I need this code for runs in WebLogic Server, which uses MOXy as its default JAXB implementation. I tried setting the Java system properties as defined at the end of this page to switch to the Glassfish RI provider, but that didn't seem to work. I should note that I have reproduced the issue both inside and outside of WebLogic.
This code should reproduce the issue using the problematic document (args[0] is the file to read from, and args[1] is the file to save to):
Code:
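WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(args[0]));
HeaderPart part = (HeaderPart) wordMLPackage.getParts().get(new PartName("/word/header3.xml"));
part.getContents(); // unmarshals the header part into an Hdr object
wordMLPackage.save(new File(args[1])); // Word cannot open the file written here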
After some experimentation, stepping through the getContents() and unmarshal() code, the problem seems to occur when the part is unmarshalled and the resulting Hdr object is set as the contents of the HeaderPart. I've attached the problematic file from both before and after running the above code. Do you have any idea what the issue could be? Let me know if you need more information.
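In case it helps with comparing the two attachments, here is a minimal sketch that pulls word/header3.xml out of each file using only the JDK (no docx4j, no JAXB) and lists the attributes on the root element, including its xmlns declarations. The class and method names (HeaderPartDump, readPartRoot, describeRoot) are made up for this illustration. Looking at namespace declarations first is only a guess at where the two files might differ, not a confirmed cause.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper, not part of docx4j.
public class HeaderPartDump {

    /** Parses one part of a .docx (a ZIP archive) and returns its root element. */
    static Element readPartRoot(String docxPath, String entryName) throws Exception {
        try (ZipFile zip = new ZipFile(docxPath)) {
            // ZIP entry names have no leading slash, unlike OPC part names.
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException("No such part: " + entryName);
            }
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            try (InputStream in = zip.getInputStream(entry)) {
                Document doc = dbf.newDocumentBuilder().parse(in);
                return doc.getDocumentElement();
            }
        }
    }

    /** Lists the root element's name and attributes (xmlns declarations included). */
    static String describeRoot(Element root) {
        StringBuilder sb = new StringBuilder();
        sb.append("root: {").append(root.getNamespaceURI()).append('}')
          .append(root.getLocalName()).append('\n');
        NamedNodeMap attrs = root.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            sb.append("  ").append(a.getNodeName())
              .append(" = ").append(a.getNodeValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass the "before" and "after" .docx files as arguments.
        for (String path : args) {
            System.out.println(path);
            System.out.print(describeRoot(readPartRoot(path, "word/header3.xml")));
        }
    }
}
```

Running it with the original and the saved file as arguments prints the root attributes of each header, so the two lists can be compared side by side.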