No direct way of setting limits on XML processing

Posted: Tue Jun 23, 2020 2:31 am
by elknar
There doesn't seem to be a way to limit DTD access or external entity resolution.

These measures are necessary to reduce the threat of XXE when processing docx files.

Could you advise whether I am simply missing something, or whether this functionality is indeed missing? If it is missing, would it be possible to add it, or at least to use safe defaults when processing XML files?

Re: No direct way of setting limits on XML processing

Posted: Tue Jun 23, 2020 7:56 am
by jason
Docx4j's settings should be good already.

Throughout the package org.docx4j.openpackaging, you'll see the pattern:

Code:
XMLInputFactory xif = XMLInputFactory.newInstance();
xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
xif.setProperty(XMLInputFactory.SUPPORT_DTD, false); // a DTD is merely ignored; its presence doesn't cause an exception
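
For readers who want to try the pattern outside docx4j, here is a minimal sketch that wraps it in a helper. It assumes the JDK's built-in StAX implementation is the one on the classpath. The class name HardenedStax and its methods are hypothetical and are not part of the docx4j API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not docx4j code: applies the two properties shown above.
final class HardenedStax {

    // Returns a factory that neither resolves external entities nor processes DTDs.
    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return xif;
    }

    // Reads the local name of the first start element, or null if there is none.
    static String rootElementName(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                newHardenedFactory().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

Calling HardenedStax.rootElementName on an ordinary document returns its root element name. How a reference to an entity declared only in the ignored DTD is reported may vary between StAX implementations, so that case is worth testing against the parser you actually ship.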


Also, from 2014 there is:

Notable Changes in Version 3.2.0
d150d9c7f6 Security fix Configure DocumentBuilderFactory to disallow doctype declaration etc. (reported by Sven Jacobi)
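
As an illustration of what "disallow doctype declaration" can look like on a DocumentBuilderFactory, here is a minimal sketch. It is not copied from that commit: the exact set of features docx4j enables is an assumption here, and the class name HardenedDom is hypothetical. The feature URIs are the ones recognised by the Xerces-based parser bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not docx4j code.
final class HardenedDom {

    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        // Reject any document that contains a DOCTYPE declaration.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth: also switch off external entity resolution.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Parses the string with the hardened factory; a DOCTYPE causes a SAXException.
    static Document parse(String xml) throws Exception {
        return newHardenedFactory()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

Unlike the XMLInputFactory pattern above, where the DTD is ignored, this configuration makes the parser fail on any DOCTYPE, so callers need to handle the resulting exception.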


If you do notice somewhere specific where this may still be an issue, please do let us know.

Re: No direct way of setting limits on XML processing

Posted: Tue Jun 23, 2020 5:37 pm
by elknar
Huh, that does seem correct. However, I still end up with an XXE when calling Docx4J.load(inputStream).

Re: No direct way of setting limits on XML processing

Posted: Wed Jul 01, 2020 3:37 am
by elknar
I was wrong. `load` is not the issue.

I've identified exactly where the XXE is happening. JaxbXmlPart.transform, which is available on MainDocumentPart, creates a StreamSource with a default SAXParser and passes that to XmlUtils.transform.

As is to be expected, the default SAXParser has no XXE protections.

I suggest creating a SAXSource with a properly initialized (including the anti-XXE features) SAXParser instead of the StreamSource.
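
A minimal sketch of that suggestion, using only standard JAXP classes. The class name HardenedSaxSource and its methods are hypothetical, and the identity transform merely stands in for whatever XmlUtils.transform does internally. The feature URIs are those supported by the JDK's bundled parser.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, not docx4j code.
final class HardenedSaxSource {

    // Builds a SAXSource whose XMLReader rejects DOCTYPEs and external entities,
    // instead of handing the transformer a bare StreamSource.
    static Source hardenedSource(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        return new SAXSource(reader, new InputSource(new StringReader(xml)));
    }

    // Identity transform used as a stand-in for the real stylesheet-driven transform.
    static String identityTransform(Source source) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Transformer transformer = tf.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }
}
```

Because the source carries its own pre-configured XMLReader, the transformer does not fall back to a default parser for the input document, which is the gap described above.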

Re: No direct way of setting limits on XML processing

Posted: Wed Jul 08, 2020 1:46 pm
by jason
Thanks for this. In reviewing, I too found that StreamSource was a gap in our XXE fixes. Corrected in various places by https://github.com/plutext/docx4j/commi ... e6bdf8230c

These fixes are in https://www.docx4java.org/docx4j/docx4j ... 200707.jar

Please let me know if you notice any issues; otherwise 8.2.1 will be released soon.

Re: No direct way of setting limits on XML processing

Posted: Thu Jul 16, 2020 9:21 pm
by jason
These fixes are in the just-released 8.2.1: announces/docx4j-8-2-1-released-t2930.html