
"Invalid xml after template modification" 6.0.1

PostPosted: Wed Sep 19, 2018 7:23 am
by capocomico
Hi there,
I saw an earlier post in this forum from someone who got this warning message after creating a document from a template, so I tried the fix that worked there (upgrading from 3.3.5 to 6.0.1), but it didn't work for me. In my case I upgraded from 6.0.0 to 6.0.1.

After upgrading, I keep getting this warning when I open my Word document:

"The xml data is invalid according to the schema. Location: Part: /word/document.xml, Line:0, Column:0".
After that error message I get another message:
"Word found unreadable content. Do you want to recover the content of this document? If you trust the source of this document, click yes".


If I click yes, I can see my document filled with my data, but it opens as a new document: its name is Document1.docx instead of the one I set.

Does anyone know how to avoid this warning?

Thanks in advance

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Wed Sep 19, 2018 9:32 am
by jason
Please post the problematic document, or if it is sensitive, you can run it through https://github.com/plutext/docx4j/blob/ ... ingle.java

Also, please advise:

- what version of Java you are using

- are you using MOXy?

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Thu Sep 20, 2018 12:06 am
by capocomico
jason wrote:Please post the problematic document, or if it is sensitive, you can run it through https://github.com/plutext/docx4j/blob/ ... ingle.java

Also, please advise:

- what version of Java you are using

- are you using MOXy?


Thanks for your reply. I'm using java version "1.8.0_181", and I'm not using MOXy.

I tried to anonymize the document with the sample code, but it doesn't work. The result:

Exception in thread "main" java.lang.Error: Unresolved compilation problems:
PresentationMLPackage cannot be resolved to a type
SpreadsheetMLPackage cannot be resolved to a type
...
at org.docx4j.samples.AnonSingle.main(AnonSingle.java:23)

I got this result with both documents, my document and sample document.

Here is the result document.

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Thu Sep 20, 2018 9:49 am
by jason
capocomico wrote:Exception in thread "main" java.lang.Error: Unresolved compilation problems:
PresentationMLPackage cannot be resolved to a type
SpreadsheetMLPackage cannot be resolved to a type


Seems like your classpath is broken; these are in the docx4j jar.
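
A quick way to confirm this is to ask the class loader for the types named in the error. This is only a sketch: the class `ClasspathCheck` is made up for illustration, and the package `org.docx4j.openpackaging.packages` is assumed to be where these types live in the docx4j jar. Run it with the same classpath as the failing program.

```java
// Sketch: report whether the classes named in the error can be loaded at runtime.
final class ClasspathCheck {

    /** Returns true if the named class can be found by the current class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified names; adjust if your docx4j version differs.
        String[] names = {
            "org.docx4j.openpackaging.packages.WordprocessingMLPackage",
            "org.docx4j.openpackaging.packages.PresentationMLPackage",
            "org.docx4j.openpackaging.packages.SpreadsheetMLPackage"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

If any of these print MISSING, the docx4j jar (or the right version of it) is not on the runtime classpath. Note that "Unresolved compilation problems" is raised for code that did not compile in the IDE, so the project's build path needs the same check.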

The docx you attached gives errors when I unzip it with 7-zip; docProps\core.xml is empty. Not sure whether that's the only problem.

How did you create the docx and save it?
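
For anyone who wants to run the same check without 7-zip, here is a minimal sketch that uses only `java.util.zip` to list the parts of a docx that are empty. The class name `DocxZipCheck` and the method `emptyParts` are invented for this example; nothing here comes from docx4j.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class DocxZipCheck {

    /** Returns the names of all non-directory entries that contain no bytes. */
    static List<String> emptyParts(String docxPath) throws IOException {
        List<String> empty = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docxPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                // Read the content instead of trusting the size stored in the header.
                try (InputStream in = zip.getInputStream(entry)) {
                    if (in.read() == -1) {
                        empty.add(entry.getName());
                    }
                }
            }
        }
        return empty;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DocxZipCheck result.docx
        for (String name : emptyParts(args[0])) {
            System.out.println("empty part: " + name);
        }
    }
}
```

Running it against both the input template and the saved result would show whether docProps/core.xml is already empty before docx4j touches the file.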

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Thu Sep 20, 2018 11:44 pm
by capocomico
jason wrote:Seems like your classpath is broken; these are in the docx4j jar.

The docx you attached gives errors when I unzip it with 7-zip; docProps\core.xml is empty. Not sure whether that's the only problem.

How did you create the docx and save it?


The template is a docx document that the client gave me.
To load the template, I use:

WordprocessingMLPackage wordDoc = WordprocessingMLPackage.load(FileInputStream)

For text replacements I use:

variableReplace(java.util.Map<String, String>) for text in paragraphs and table titles.

and for the row content of the tables I use:
Code:
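import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.bind.JAXBElement;
import org.docx4j.XmlUtils;
import org.docx4j.wml.ContentAccessor;
import org.docx4j.wml.Tbl;
import org.docx4j.wml.Text;
import org.docx4j.wml.Tr;

    // Recursively collects every descendant of 'object' whose class is exactly 'searchParam'.
    private List<Object> getAllElementsFrom(Object object, Class<?> searchParam) {
        List<Object> result = new ArrayList<Object>();
        // Unwrap JAXBElement so the class comparison sees the real value.
        if (object instanceof JAXBElement)
            object = ((JAXBElement<?>) object).getValue();

        if (object.getClass().equals(searchParam))
            result.add(object);
        else if (object instanceof ContentAccessor) {
            List<?> children = ((ContentAccessor) object).getContent();
            for (Object child : children) {
                result.addAll(getAllElementsFrom(child, searchParam));
            }
        }
        return result;
    }

    // Copies the template row, swaps each placeholder text for its replacement,
    // and appends the copy to the end of the table.
    private void addRowToTable(Tbl templateTable, Tr templateRow, Map<String, String> replacements) {
        Tr workingRow = (Tr) XmlUtils.deepCopy(templateRow);
        List<?> textElements = getAllElementsFrom(workingRow, Text.class);
        for (Object object : textElements) {
            Text text = (Text) object;
            // Only text nodes whose whole value matches a key are replaced.
            String replacementValue = replacements.get(text.getValue());
            if (replacementValue != null)
                text.setValue(replacementValue);
        }
        templateTable.getContent().add(workingRow);
    }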


For saving the document:

wordDoc.save(new File("result.docx"));

I would probably need to use the .save(ByteArrayOutputStream) after too.

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Fri Sep 21, 2018 7:13 am
by jason
If you manually unzip the input docx, does that work ok?

What does its docProps\core.xml contain?

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Fri Sep 21, 2018 11:19 pm
by capocomico
jason wrote:If you manually unzip the input docx, does that work ok?

What does its docProps\core.xml contain?


I unzipped it with WinZip and got this in docProps/core.xml:

Code:
<?xml version="1.0" encoding="UTF-8" standalone="true"?>

-<cp:coreProperties xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties">

<dc:title/>

<dc:subject/>

<cp:keywords/>

<dc:description/>

<cp:revision>69</cp:revision>

<cp:lastPrinted>2018-01-25T16:31:00Z</cp:lastPrinted>

<dcterms:created xsi:type="dcterms:W3CDTF">2018-09-12T18:00:00Z</dcterms:created>

<dcterms:modified xsi:type="dcterms:W3CDTF">2018-09-19T12:35:00Z</dcterms:modified>

</cp:coreProperties>

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Sat Sep 22, 2018 8:02 am
by jason
<?xml version="1.0" encoding="UTF-8" standalone="true"?>

-<cp:coreProperties
 


That dash ('-') immediately before the cp:coreProperties element is definitely a problem!

If you delete the dash in-place in a zip file editor (I guess WinZip can do this?), then save, then try again...

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Tue Sep 25, 2018 5:03 am
by capocomico
jason wrote:
<?xml version="1.0" encoding="UTF-8" standalone="true"?>

-<cp:coreProperties
 


That dash ('-') immediately before the cp:coreProperties element is definitely a problem!

If you delete the dash in-place in a zip file editor (I guess WinZip can do this?), then save, then try again...


That dash only appears when I open the XML in the Edge browser. I can't edit docProps\core.xml from inside the zip file.
If I extract the .zip and open that file in a text editor (Notepad++ in this case), the dash (-) is not there.

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Tue Oct 02, 2018 10:45 am
by jason
Yeah, of course .. that's IE/Edge collapsible rendering of XML.

If cp:coreProperties looks ok, maybe I can help you if you can get me the anonymised document (you'll need to fix your build path for that - how are you adding docx4j to your classpath?)

Otherwise, I'd really need to see the input docx.
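
To take the browser out of the picture entirely, the part can be parsed straight from the package with the JDK's own XML parser. A rough sketch, assuming the part name `docProps/core.xml`; `CorePropsCheck` and `rootElementOf` are illustrative names, not docx4j API. A stray character before the root element would make `parse` throw a `SAXParseException`.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class CorePropsCheck {

    /** Parses one part of the package and returns the qualified name of its root element. */
    static String rootElementOf(String docxPath, String partName) throws Exception {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                throw new IllegalArgumentException("No such part: " + partName);
            }
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            try (InputStream in = zip.getInputStream(entry)) {
                // Throws org.xml.sax.SAXParseException if the part is not well-formed XML.
                Document doc = factory.newDocumentBuilder().parse(in);
                return doc.getDocumentElement().getNodeName();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java CorePropsCheck template.docx
        System.out.println(rootElementOf(args[0], "docProps/core.xml"));
    }
}
```

This only checks well-formedness of a single part. It does not validate against the schema, so the /word/document.xml error reported by Word could still have another cause.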

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Fri Oct 12, 2018 5:04 am
by capocomico
jason wrote:Yeah, of course .. that's IE/Edge collapsible rendering of XML.

If cp:coreProperties looks ok, maybe I can help you if you can get me the anonymised document (you'll need to fix your build path for that - how are you adding docx4j to your classpath?)

Otherwise, I'd really need to see the input docx.


Can I send it to you in private?

Re: "Invalid xml after template modification" 6.0.1

PostPosted: Fri Oct 12, 2018 7:24 am
by jason