
Alternate to XmlUtils.treeCopy

Posted: Wed Mar 06, 2013 2:21 am
by rhaley
I was wondering how I might do what XmlUtils.treeCopy does by some other means. I have an issue where it drops all the namespaces from the content it copies.

The comments on the treeCopy method state that attributes aren't fully supported. Does this mean support is coming, or is it out of docx4j's control?

Thanks

Re: Alternate to XmlUtils.treeCopy

Posted: Wed Mar 06, 2013 2:24 am
by rhaley
I also lose the CDATA tag, although the content is still there?

Re: Alternate to XmlUtils.treeCopy

Posted: Wed Mar 06, 2013 6:42 am
by jason
treeCopy is used internally by docx4j, because some XML libraries don't support importNode (eg Xalan's org.apache.xml.dtm.ref.DTMNodeProxy).

Where you can use DOM's importNode, that's the standard way of doing things.
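
For illustration, here is a minimal sketch of the importNode approach using only the JDK's DOM API. The class name, the method name and the namespace URI urn:example:ism are made up for this example; they are not part of docx4j. It assumes the source nodes come from an ordinary namespace-aware DOM parser, so it will not help where the nodes are something like Xalan's DTMNodeProxy.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ImportNodeSketch {

    // Hypothetical namespace URI, used only for this illustration.
    static final String ISM_NS = "urn:example:ism";

    /** Parses xml, deep-copies its root element into a fresh Document, and returns the copy. */
    static Element importIntoNewDocument(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so prefixes are bound to namespace URIs
            DocumentBuilder db = dbf.newDocumentBuilder();

            Document source = db.parse(new InputSource(new StringReader(xml)));
            Document target = db.newDocument();

            // importNode(node, true) copies the whole subtree, including attributes.
            Node copy = target.importNode(source.getDocumentElement(), true);
            target.appendChild(copy);
            return (Element) copy;
        } catch (Exception e) {
            throw new IllegalStateException("Could not import node", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<SourceGroup xmlns:ism=\"" + ISM_NS + "\">"
                + "<EndNoteRef ism:producer=\"CNN\"><![CDATA[a < b]]></EndNoteRef>"
                + "</SourceGroup>";
        Element group = importIntoNewDocument(xml);
        Element ref = (Element) group.getFirstChild();

        // The namespaced attribute is still reachable by URI + local name.
        System.out.println(ref.getAttributeNS(ISM_NS, "producer"));
        // The CDATA section is copied as its own node type.
        System.out.println(ref.getFirstChild().getNodeType() == Node.CDATA_SECTION_NODE);
    }
}
```

Note that the factory is left at its default of non-coalescing, which is what keeps CDATA sections as separate nodes instead of merging them into text.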

Regarding namespaces, can you provide an example where they are handled incorrectly?

Regarding CDATA, please note that it and some other cases are not implemented:

//                case Node.CDATA_SECTION_NODE:
//                    writer.write("<![CDATA[" +
//                                 node.getNodeValue() + "]]>");
//                    break;
//
//                case Node.COMMENT_NODE:
//                    writer.write(indentLevel + "<!-- " +
//                                 node.getNodeValue() + " -->");
//                    writer.write(lineSeparator);
//                    break;
//
//                case Node.PROCESSING_INSTRUCTION_NODE:
//                    writer.write("<?" + node.getNodeName() +
//                                 " " + node.getNodeValue() +
//                                 "?>");
//                    writer.write(lineSeparator);
//                    break;
//
//                case Node.ENTITY_REFERENCE_NODE:
//                    writer.write("&" + node.getNodeName() + ";");
//                    break;
//
//                case Node.DOCUMENT_TYPE_NODE:
//                    DocumentType docType = (DocumentType)node;
//                    writer.write("<!DOCTYPE " + docType.getName());
//                    if (docType.getPublicId() != null)  {
//                        System.out.print(" PUBLIC \"" +
//                            docType.getPublicId() + "\" ");
//                    } else {
//                        writer.write(" SYSTEM ");
//                    }
//                    writer.write("\"" + docType.getSystemId() + "\">");
//                    writer.write(lineSeparator);
//                    break;
 

Re: Alternate to XmlUtils.treeCopy

Posted: Wed Mar 06, 2013 9:51 am
by rhaley
Let me give you some background; see: http://www.docx4java.org/forums/docx-java-f6/creating-xml-from-docx-t1347.html

I have made a copy of org.docx4j.convert.out.html and created org.docx4j.convert.out.xml, because we have a requirement to generate XML from the docx documents our users author. We use an overly complicated XML format which is out of my control, but I have rewritten some methods and added new ones in HtmlExporterNG2.java (now called XmlExporterNG2.java). I now create many different custom XML tags based on variously styled w:p tags (e.g. ListParagraph).

That being said, I have reproduced the following using the original code from the org.docx4j.convert.out.html package.

In the XSL I do something similar to this:

Code: Select all
<xsl:variable name="otherVariable" select="'testvalue'"/>
<xsl:variable name="endnotes" select="java:org.docx4j.convert.out.Converter.getEndnotes($conversionContext)" />
<!-- this gives me the text from the endnote part -->
<xsl:variable name="sourceRef" select="$endnotes//w:endnote[@w:id = $currentID]"/>
<xsl:value-of select="java:our.custom.class.TestClass.returnXML($otherVariable, $sourceRef)"/>


This would typically return: (Our custom method returns a w3c Node)
Code: Select all
<SourceGroup>
<SourceText>testvalue</SourceText>
<EndNoteRef ism:producer="CNN" ism:date="somedate">Some information about the source</EndNoteRef>
</SourceGroup>


As part of the <apply-templates/> on the <w:r> tag in docx2xhtml-core.xslt

This is called:
Code: Select all
           
<xsl:copy-of select="java:org.docx4j.convert.out.html.HtmlExporterNG2.createBlockForRPr(
              $conversionContext, $pStyleVal, $rPrNode, $childResults)" />


Which in turn calls XmlUtils.treeCopy: when I use HtmlExporterNG2.java in docx4j, the createBlock call does an XmlUtils.treeCopy and loses the "ism" namespace prefixes, which we need. The result is:
Code: Select all
<SourceGroup>
<SourceText>testvalue</SourceText>
<EndNoteRef producer="CNN" date="somedate">Some information about the source</EndNoteRef>
</SourceGroup>


When I output the above value directly as custom XML, outside of any template call, the namespace prefixes are visible in the transformed document.

I hope this is enough information.

Re: Alternate to XmlUtils.treeCopy

Posted: Wed Mar 06, 2013 11:38 am
by jason
rhaley wrote:
This would typically return: (Our custom method returns a w3c Node)
<SourceGroup>
<SourceText>testvalue</SourceText>
<EndNoteRef ism:producer="CNN" ism:date="somedate">Some information about the source</EndNoteRef>
</SourceGroup>


It needs to return something which includes a namespace declaration for namespace prefix ism (on the SourceGroup node or EndNoteRef node).

Please try that.
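
A rough sketch of what that could look like on the Java side, assuming the custom method builds its result with the standard DOM API. The class name, the method signature and the URI urn:example:ism are placeholders invented for this example (the real URI bound to ism is not given in this thread). The important parts are creating the attributes with setAttributeNS and adding an explicit xmlns:ism declaration on the SourceGroup element.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NamespacedNodeSketch {

    // Hypothetical URI; substitute the real namespace bound to the "ism" prefix.
    static final String ISM_NS = "urn:example:ism";

    /** Builds a SourceGroup element that carries its own declaration for the ism prefix. */
    static Element buildSourceGroup(String sourceText, String producer, String date, String note) {
        Document doc;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            doc = dbf.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }

        Element group = doc.createElementNS(null, "SourceGroup");
        // Explicit xmlns:ism declaration, so the prefix stays bound when the node is copied.
        group.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:ism", ISM_NS);
        doc.appendChild(group);

        Element text = doc.createElementNS(null, "SourceText");
        text.setTextContent(sourceText);
        group.appendChild(text);

        Element ref = doc.createElementNS(null, "EndNoteRef");
        // Namespace-aware attributes: URI plus qualified name, not just "ism:producer" as a plain name.
        ref.setAttributeNS(ISM_NS, "ism:producer", producer);
        ref.setAttributeNS(ISM_NS, "ism:date", date);
        ref.setTextContent(note);
        group.appendChild(ref);

        return group;
    }
}
```

Calling buildSourceGroup("testvalue", "CNN", "somedate", "Some information about the source") gives a node shaped like the one quoted above, with the declaration on SourceGroup. Whether treeCopy then keeps the prefixes is something to verify against your docx4j version.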