
Namespaces in Custom XML Parts are discarded

Posted: Sat Dec 11, 2010 9:41 am
by amdonov
Using docx4j 2.6.0, namespaces on our attributes were being thrown away for custom XML parts by the FlatOpcXmlImporter. It uses XmlUtils.treeCopy to copy the custom XML data element into a new org.w3c.dom.Document. I'm not sure what this class is intended to do, but the DOM API directly supports this functionality (Document.importNode) and doesn't ditch the namespaces. I recommend the following patch. It fixes our problem.

Index: FlatOpcXmlImporter.java
===================================================================
--- FlatOpcXmlImporter.java (revision 1343)
+++ FlatOpcXmlImporter.java (working copy)
@@ -522,8 +522,8 @@
javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
org.w3c.dom.Document doc = dbf.newDocumentBuilder().newDocument();
- XmlUtils.treeCopy(el, doc);
-
+ org.w3c.dom.Node copy = doc.importNode(el, true);
+ doc.appendChild(copy);
data.setDocument(doc);

((org.docx4j.openpackaging.parts.CustomXmlDataStoragePart) part)
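
For readers following along, here is a minimal standalone sketch of the technique the patch relies on: Document.importNode with deep=true copies an element into another document and keeps its namespace URIs. The class name ImportNodeDemo, the helper copyIntoNewDocument, and the example.com namespaces are made up for illustration; this is not docx4j code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ImportNodeDemo {

    // Hypothetical helper: copies el (and its subtree) into a fresh,
    // namespace-aware Document using only the standard DOM API.
    static Document copyIntoNewDocument(Element el) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().newDocument();
        Node copy = doc.importNode(el, true); // deep copy, owned by doc
        doc.appendChild(copy);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document src = dbf.newDocumentBuilder().newDocument();

        // Made-up namespaces standing in for a custom XML part's content.
        Element root = src.createElementNS("http://example.com/data", "d:root");
        root.setAttributeNS("http://example.com/attr", "a:id", "42");
        src.appendChild(root);

        Element copied = copyIntoNewDocument(root).getDocumentElement();
        // Both the element and the attribute keep their namespace URIs.
        System.out.println(copied.getNamespaceURI());
        System.out.println(copied.getAttributeNS("http://example.com/attr", "id"));
    }
}
```

The imported node is a new node owned by the target document; the source element is left untouched.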

Re: Namespaces in Custom XML Parts are discarded

Posted: Sat Dec 11, 2010 10:59 pm
by jason
Thanks for this patch. I've applied it as http://dev.plutext.org/trac/docx4j/changeset/1345

To answer your question about the XmlUtils.treeCopy method: it exists because the classes which transform to PDF and HTML use Xalan, and Xalan produces nodes of type org.apache.xml.dtm.ref.DTMNodeProxy.

Unfortunately, trying to import a node of type org.apache.xml.dtm.ref.DTMNodeProxy causes org.w3c.dom.DOMException: NOT_SUPPORTED_ERR: The implementation does not support the requested type of object or operation.

Hence the workaround.

I don't now know why I was using this in the FlatOpcXmlImporter class ...

Re: Namespaces in Custom XML Parts are discarded

Posted: Wed Dec 15, 2010 7:51 am
by amdonov
Jason,

Thanks for applying the patch. It is a step in the right direction, but after some more thorough testing, we have uncovered a minor problem. Word doesn't like the xml namespace to be bound explicitly. At some point in the round trip docx -> package -> flatopc -> package -> docx, an xmlns:xml declaration is added to the custom xml root element. We had to use the following patch to throw it away. Sorry that I didn't uncover this before.

Index: FlatOpcXmlImporter.java
===================================================================
--- FlatOpcXmlImporter.java (revision 1353)
+++ FlatOpcXmlImporter.java (working copy)
@@ -504,8 +504,8 @@
javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
org.w3c.dom.Document doc = dbf.newDocumentBuilder().newDocument();
- //XmlUtils.treeCopy(el, doc);
org.w3c.dom.Node copy = doc.importNode(el, true);
+ copy.getAttributes().removeNamedItemNS("http://www.w3.org/2000/xmlns/","xml");
doc.appendChild(copy);
data.setDocument(doc);

@@ -523,8 +523,8 @@
javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
org.w3c.dom.Document doc = dbf.newDocumentBuilder().newDocument();
- //XmlUtils.treeCopy(el, doc);
org.w3c.dom.Node copy = doc.importNode(el, true);
+ copy.getAttributes().removeNamedItemNS("http://www.w3.org/2000/xmlns/","xml");
doc.appendChild(copy);
data.setDocument(doc);
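
One thing to watch with the patch above: NamedNodeMap.removeNamedItemNS throws a DOMException (NOT_FOUND_ERR) when the attribute is not there, so it would fail on a root element that never picked up the xmlns:xml declaration. A possible variation, sketched below under that assumption, is Element.removeAttributeNS, which the DOM spec defines as having no effect when the attribute is absent. The class name XmlnsXmlCleanup and the helper dropXmlPrefixDeclaration are hypothetical and not part of docx4j.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlnsXmlCleanup {

    // Namespace that all xmlns / xmlns:* declaration attributes live in.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Hypothetical helper: removes an xmlns:xml declaration from root if present.
    // Unlike getAttributes().removeNamedItemNS(...), this does nothing
    // (and does not throw) when the attribute is missing.
    static void dropXmlPrefixDeclaration(Element root) {
        root.removeAttributeNS(XMLNS_NS, "xml");
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().newDocument();

        // Made-up root element carrying the declaration Word objects to.
        Element root = doc.createElementNS("http://example.com/data", "d:root");
        root.setAttributeNS(XMLNS_NS, "xmlns:xml",
                "http://www.w3.org/XML/1998/namespace");
        doc.appendChild(root);

        dropXmlPrefixDeclaration(root);
        dropXmlPrefixDeclaration(root); // second call is a harmless no-op
        System.out.println(root.hasAttributeNS(XMLNS_NS, "xml"));
    }
}
```

In the importer, the result of doc.importNode(el, true) for an element is itself an Element, so casting copy to org.w3c.dom.Element before calling the helper should be safe.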

Re: Namespaces in Custom XML Parts are discarded

Posted: Thu Dec 16, 2010 11:00 pm
by jason