Extensions in content type mapping are case sensitive

Posted: Thu Dec 16, 2010 9:25 pm
by jban
Hi,

Attached is a simple docx file created by Word 2010. docx4j 2.5.0 cannot parse it and throws this exception:

org.docx4j.openpackaging.exceptions.Docx4JException: Failed to add parts from relationships
at org.docx4j.openpackaging.io.LoadFromZipNG.addPartsFromRelationships(LoadFromZipNG.java:346)
at org.docx4j.openpackaging.io.LoadFromZipNG.process(LoadFromZipNG.java:229)
at org.docx4j.openpackaging.io.LoadFromZipNG.get(LoadFromZipNG.java:182)
(...)
Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Failed to add parts from relationships
at org.docx4j.openpackaging.io.LoadFromZipNG.addPartsFromRelationships(LoadFromZipNG.java:346)
at org.docx4j.openpackaging.io.LoadFromZipNG.getPart(LoadFromZipNG.java:436)
at org.docx4j.openpackaging.io.LoadFromZipNG.addPartsFromRelationships(LoadFromZipNG.java:344)
... 22 more
Caused by: java.lang.IllegalArgumentException: part
at org.docx4j.openpackaging.parts.relationships.RelationshipsPart.loadPart(RelationshipsPart.java:390)
at org.docx4j.openpackaging.io.LoadFromZipNG.getPart(LoadFromZipNG.java:420)
at org.docx4j.openpackaging.io.LoadFromZipNG.addPartsFromRelationships(LoadFromZipNG.java:344)
... 24 more


I have found why parsing fails: the [Content_Types].xml file contains a mapping for the extension "PNG", but the package also contains files with the extension "png" (word/media/image3.png). docx4j cannot determine the content type unless the capitalization of the extension matches the mapping file exactly. I think the mapping should be case insensitive, and I believe the specification says so.
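
To illustrate what I mean by a case-insensitive mapping, here is a minimal sketch. It is not docx4j code: the class `ContentTypeDefaults` and its methods `addDefault` and `contentTypeFor` are names I made up for this example. It normalizes the extension to lower case both when a mapping is registered and when a part name is looked up, so "PNG" and "png" resolve to the same entry.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not part of docx4j): maps file extensions to
 * content types, ignoring the case of the extension.
 */
final class ContentTypeDefaults {

    // Keys are always stored in lower case.
    private final Map<String, String> defaults = new HashMap<>();

    // Locale.ROOT avoids locale-specific surprises (e.g. the Turkish dotless i).
    private static String normalize(String extension) {
        return extension.toLowerCase(Locale.ROOT);
    }

    /** Registers a mapping such as ("PNG", "image/png"). */
    void addDefault(String extension, String contentType) {
        defaults.put(normalize(extension), contentType);
    }

    /**
     * Returns the content type for a part name such as
     * "/word/media/image3.png", or null if no mapping applies.
     */
    String contentTypeFor(String partName) {
        int slash = partName.lastIndexOf('/');
        int dot = partName.lastIndexOf('.');
        // No extension if there is no dot in the last path segment,
        // or if the name ends with a dot.
        if (dot <= slash || dot == partName.length() - 1) {
            return null;
        }
        return defaults.get(normalize(partName.substring(dot + 1)));
    }
}
```

With a mapping registered for "PNG", a lookup for word/media/image3.png would then find it, which is the behaviour I would expect for the attached file.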

I am not able to reproduce such a file with Word now, but Word can definitely produce one, and this is a document with just text and images. So I consider this quite a severe issue, and it would be nice to have it fixed soon ;-).

Best regards,
Jiri

Re: Extensions in content type mapping are case sensitive

Posted: Thu Dec 16, 2010 11:27 pm
by jason
Done; see http://dev.plutext.org/trac/docx4j/changeset/1355

Do you think the uppercase extensions could have been introduced into your Word docx by inserting an image which had an uppercase extension in the filesystem?

cheers .. Jason

Re: Extensions in content type mapping are case sensitive

Posted: Fri Dec 17, 2010 1:53 am
by jban
Thank you very much for the prompt fix!

I think I pasted the image from the clipboard... I tried to reproduce it by doing that again, and also by inserting an image with an uppercase extension, but it did not happen again.

Best regards,
Jiri