Exception loading .docx XML

Posted: Sat Mar 06, 2010 11:20 pm
by RithanyaLaxmi
Hi,

I have a .docx XML file (Customer.xml) that contains MS Word tags with content. When I try to load the XML using the docx4j API, I get the following exception:

org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from D:\Sample\Employee.xml
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:173)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:163)
at PDFGenerator.main(PDFGenerator.java:17)
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"document"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:616)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:244)


Here is my Java code:
String inputFilePath = System.getProperty("user.dir") + "/Customer.xml";
try {
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
            .load(new java.io.File(inputFilePath));
    wordMLPackage.setFontMapper(new IdentityPlusMapper());
    org.docx4j.convert.out.pdf.PdfConversion c =
            new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
    // Also save the intermediate XSL-FO to a file.
    ((org.docx4j.convert.out.pdf.viaXSLFO.Conversion) c)
            .setSaveFO(new java.io.File(ApplicationConstants.XML_FILE_PATH + ".fo"));
    OutputStream os = new java.io.FileOutputStream(
            ApplicationConstants.XML_FILE_PATH + ".pdf");
    c.output(os);
} catch (Exception e) {
    e.printStackTrace();
}

Is there anything I am doing wrong here? Why am I unable to load the XML? I also have the JARs in place. Please correct the code if needed and shed some light on it.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Mon Mar 08, 2010 3:07 pm
by RithanyaLaxmi
No replies guys? This is urgent, please do respond.

Re: Exception loading .docx XML

Posted: Tue Mar 09, 2010 1:44 pm
by jason
The stack trace refers to Employee.xml; the code refers to Customer.xml. Please post the relevant file (Employee.xml?).

What version of docx4j are you using? Is it the pre-compiled binary, or did you build it yourself?

cheers .. Jason

Re: Exception loading .docx XML

Posted: Wed Mar 10, 2010 4:41 pm
by RithanyaLaxmi
Thanks Jason. Employee.xml is quite large; here it is:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?><w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xml:space="preserve">
<w:styles><w:style w:type="table" w:styleId="MyTableStyle">
<w:name w:val="My Table Style"/>
<w:tblPr>
<w:tblBorders>
<w:top w:val="single"/>
<w:left w:val="single"/>
<w:bottom w:val="single"/>
<w:right w:val="single"/>
<w:insideH w:val="single"/>
<w:insideV w:val="single"/>
</w:tblBorders>
<w:tblCellMar>
<w:left w:w="108" w:type="dxa"/>
<w:right w:w="108" w:type="dxa"/>
</w:tblCellMar>
</w:tblPr>
</w:style>
<w:style w:type="paragraph" w:styleId="EmphasizedParagraph"><w:name w:val="Emphasized Paragraph"/><w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" w:cs="Arial"/><wx:font wx:val="Arial" /><w:b /><w:b-cs /><w:kern w:val="32" />
<w:sz w:val="32" /> <w:sz-cs w:val="32" /></w:rPr></w:style>
</w:styles>
<w:p><w:pPr><w:pStyle w:val="EmphasizedParagraph"/><w:jc w:val="center"/></w:pPr><w:r><w:t>Employee Details</w:t></w:r></w:p>
<w:body>
<w:tbl>
<w:tblPr>
<w:tblStyle w:val="MyTableStyle"/>
<w:jc w:val="center"/>
</w:tblPr>
<w:tr w:rsidR="00906ABD" w:rsidTr="00906ABD"> <w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpId</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00090C1D"> <w:proofErr w:type="spellStart"/><w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpName</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tc><w:tcPr>
<w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD"><w:proofErr w:type="spellStart"/><w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpAddress</w:t></w:r><w:proofErr w:type="spellEnd"/>
</w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD"><w:proofErr w:type="spellStart"/><w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpCity</w:t></w:r>
<w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD"><w:proofErr w:type="spellStart"/>
<w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpState</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:rPr><w:color w:val="FF0000" w:themeColor="background1"/><w:b/></w:rPr><w:t>EmpCountry</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc></w:tr>
<w:tr w:rsidR="00906ABD" w:rsidTr="00906ABD"><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
100
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Anantha
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
121 Gandhiji Street
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Bangalore
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Karnataka
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
India
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
</w:tr>
<w:tr w:rsidR="00906ABD" w:rsidTr="00906ABD"><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
200
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Boris
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
122 Nethaji Street
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Bangalore
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Karnataka
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
India
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
</w:tr>
<w:tr w:rsidR="00906ABD" w:rsidTr="00906ABD"><w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
300
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Chenthil
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
123 Rajaji Street
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Chennai
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
Tamilnadu
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
<w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr><w:p w:rsidR="00906ABD" w:rsidRDefault="00090C1D" w:rsidP="00906ABD">
<w:proofErr w:type="spellStart"/><w:r><w:t>
India
</w:t></w:r><w:proofErr w:type="spellEnd"/></w:p></w:tc><w:tcPr><w:tcW w:w="1596" w:type="dxa"/></w:tcPr>
</w:tr>
</w:tbl>
</w:body>
</w:wordDocument>

This will display the records in a table. I am using docx4j version 2.3. Does docx4j look for XML with a particular schema to generate the PDF? If not, why is the above XML not loaded? Please do respond.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Wed Mar 10, 2010 7:28 pm
by jason
RithanyaLaxmi wrote:Does docx4j look for XML with a particular schema to generate the PDF? If not, why is the above XML not loaded?


That's right, docx4j requires the OpenXML standard (2006) schema, and does not work with the http://schemas.microsoft.com/office/word/2003/wordml namespace you have supplied. (On a related note, docx4j v3 will work with Office 2010 XML)

Your best bet is to write an XSLT which transforms the 2003 XML to OpenXML. After that pre-processing, docx4j will be able to consume it.

Such an XSLT would be pretty straightforward (at least for most cases); you might even find one via Google. If so, let us know. I'd be happy to add this capability into docx4j if you write it and would like to contribute it.
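
As a rough illustration of the pre-processing step described above, here is a minimal Java sketch that runs an XSLT over the 2003 XML using the JDK's javax.xml.transform API. The stylesheet is an assumption made up for this example: it only moves element and attribute names from the 2003 wordml namespace into the 2006 main namespace. A real conversion would also have to deal with names and structure that differ between the two formats (for example, attributes such as w:h-ansi in the 2003 file) and wrap the result in a package docx4j can load, so treat it as a starting point only. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class WordMl2003Namespace {
    static final String NS_2003 = "http://schemas.microsoft.com/office/word/2003/wordml";
    static final String NS_2006 = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Illustrative stylesheet (an assumption, not a complete converter):
    // it re-creates every 2003-namespace element and attribute under the
    // 2006 namespace and copies everything else unchanged.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:old='" + NS_2003 + "'"
        + " xmlns:w='" + NS_2006 + "'"
        + " exclude-result-prefixes='old'>"
        + "<xsl:template match='old:*'>"
        + "<xsl:element name='w:{local-name()}' namespace='" + NS_2006 + "'>"
        + "<xsl:apply-templates select='@*|node()'/>"
        + "</xsl:element></xsl:template>"
        + "<xsl:template match='@old:*'>"
        + "<xsl:attribute name='w:{local-name()}' namespace='" + NS_2006 + "'>"
        + "<xsl:value-of select='.'/></xsl:attribute></xsl:template>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the result.
    static String convert(String xml2003) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml2003)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT pre-processing failed", e);
        }
    }
}
```

In practice the stylesheet would live in its own .xsl file and grow one template at a time as differences between the two vocabularies turn up.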

Re: Exception loading .docx XML

Posted: Thu Mar 11, 2010 11:56 pm
by RithanyaLaxmi
Thanks Jason. I have now changed to the OpenXML standard (2006) schema; here is the XML (Sample.xml):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
<pkg:xmlData><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships></pkg:xmlData></pkg:part><pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
<pkg:xmlData><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml"/>
<Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer3.xml"/>
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>
<Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml"/>
<Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer2.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml"/><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml" Target="../customXml/item1.xml"/>
<Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml"/>
<Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header3.xml"/>
<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml"/>
<Relationship Id="rId15" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/>
<Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header2.xml"/>
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml"/>
<Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml"/>
<Relationship Id="rId14" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml"/>
</Relationships></pkg:xmlData></pkg:part><pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"><pkg:xmlData>
<w:document
xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
<w:body><w:p><w:r><w:t>Proud to be an Indian</w:t></w:r></w:p></w:body>
</w:document></pkg:xmlData></pkg:part></pkg:package>

This is the Java code:

try {
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
            .load(new java.io.File("Sample.xml"));
    wordMLPackage.setFontMapper(new IdentityPlusMapper());
    org.docx4j.convert.out.pdf.PdfConversion c =
            new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
    // Also save the intermediate XSL-FO to a file.
    ((org.docx4j.convert.out.pdf.viaXSLFO.Conversion) c)
            .setSaveFO(new java.io.File(ApplicationConstants.XML_FILE_PATH + ".fo"));
    OutputStream os = new java.io.FileOutputStream(
            ApplicationConstants.XML_FILE_PATH + ".pdf");
    c.output(os);
    System.out.println("PDF Generated.");
} catch (Exception e) {
    e.printStackTrace();
}

I am getting the exception below:

org.docx4j.openpackaging.exceptions.Docx4JException: Failed to add parts from relationships
at org.docx4j.convert.in.FlatOpcXmlImporter.addPartsFromRelationships(FlatOpcXmlImporter.java:246)
at org.docx4j.convert.in.FlatOpcXmlImporter.get(FlatOpcXmlImporter.java:175)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:175)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:163)
at com.atlas.dm.dao.EmployeeDAOImpl.getSampleXML(EmployeeDAOImpl.java:203)
at com.atlas.dm.dao.EmployeeDAOImpl.getEmployeeDetails(EmployeeDAOImpl.java:47)
at com.atlas.dm.service.XMLGeneratorServiceImpl.getXMLString(XMLGeneratorServiceImpl.java:18)
at com.atlas.dm.service.XMLGeneratorServiceImpl.main(XMLGeneratorServiceImpl.java:23)
Caused by: java.lang.IllegalArgumentException: part
at org.docx4j.openpackaging.parts.relationships.RelationshipsPart.loadPart(RelationshipsPart.java:265)
at org.docx4j.convert.in.FlatOpcXmlImporter.getPart(FlatOpcXmlImporter.java:315)
at org.docx4j.convert.in.FlatOpcXmlImporter.addPartsFromRelationships(FlatOpcXmlImporter.java:244)
... 7 more

It seems that it is able to load the file properly, but I am not sure what "Failed to add parts from relationships" means. Please correct the above XML and Java code if necessary so that I can try to generate the PDF. Thanks in advance.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 9:31 am
by jason
Where are all the parts which are referenced:

Code:
<Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml"/>
<Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer3.xml"/>
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>
<Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml"/>
<Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer2.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml"/><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml" Target="../customXml/item1.xml"/>
<Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml"/>
<Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header3.xml"/>
<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml"/>
<Relationship Id="rId15" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/>
<Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header2.xml"/>
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml"/>
<Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml"/>
<Relationship Id="rId14" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml"/>


??

That is what the error is telling you!
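
To make that check concrete, here is a small sketch that lists the relationship targets in a Flat OPC file that have no matching pkg:part. It uses only the JDK's DOM parser. The class and method names are hypothetical, and this is a pre-check written for illustration, not what docx4j does internally. It assumes targets are plain relative paths and skips relationships marked as external.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FlatOpcRelChecker {
    static final String PKG = "http://schemas.microsoft.com/office/2006/xmlPackage";
    static final String RELS = "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns the resolved part names that are referenced but not present.
    static List<String> missingTargets(String flatOpcXml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(flatOpcXml)));
            NodeList parts = doc.getElementsByTagNameNS(PKG, "part");

            // Collect the names of all parts that actually exist in the file.
            Set<String> names = new HashSet<>();
            for (int i = 0; i < parts.getLength(); i++) {
                names.add(((Element) parts.item(i)).getAttributeNS(PKG, "name"));
            }

            List<String> missing = new ArrayList<>();
            for (int i = 0; i < parts.getLength(); i++) {
                Element part = (Element) parts.item(i);
                String name = part.getAttributeNS(PKG, "name");
                int idx = name.lastIndexOf("_rels/");
                if (idx < 0 || !name.endsWith(".rels")) {
                    continue; // not a relationships part
                }
                // "/word/_rels/document.xml.rels" describes a part in "/word/".
                URI base = URI.create(name.substring(0, idx));
                NodeList rels = part.getElementsByTagNameNS(RELS, "Relationship");
                for (int j = 0; j < rels.getLength(); j++) {
                    Element rel = (Element) rels.item(j);
                    if ("External".equals(rel.getAttribute("TargetMode"))) {
                        continue; // points outside the package
                    }
                    String target = base.resolve(rel.getAttribute("Target")).getPath();
                    if (!names.contains(target)) {
                        missing.add(target);
                    }
                }
            }
            return missing;
        } catch (Exception e) {
            throw new IllegalStateException("Could not inspect Flat OPC XML", e);
        }
    }
}
```

Run against a file like the Sample.xml above, a check of this kind would be expected to report the header, footer, styles and other targets that are listed in the relationships but never supplied as parts.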

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 5:55 pm
by RithanyaLaxmi
Thanks Jason. The thing is that I want a working example, so I have taken a sample XML; here it is:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"> <w:body><w:p><w:r><w:t>Hello World!!</w:t></w:r></w:p></w:body> </w:document>

I am using the same Java code, and I am again getting the following exception:

org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from D:\Anantha\DMP\Alternate_Directory_Generation\Sample.xml
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:173)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:163)
at com.atlas.dm.dao.EmployeeDAOImpl.getSampleXML(EmployeeDAOImpl.java:203)
at com.atlas.dm.dao.EmployeeDAOImpl.getEmployeeDetails(EmployeeDAOImpl.java:47)
at com.atlas.dm.service.XMLGeneratorServiceImpl.getXMLString(XMLGeneratorServiceImpl.java:18)
at com.atlas.dm.service.XMLGeneratorServiceImpl.main(XMLGeneratorServiceImpl.java:23)
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://schemas.openxmlformats.org/wordprocessingml/2006/main", local:"document"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>

Why can't I use the above XML, with these namespaces, when it opens fine in MS Word? What am I doing wrong here? Please correct the XML so that I can generate the PDF. If there is a sample that generates a PDF from XML, please provide it; that would be really helpful.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 6:15 pm
by jason
I'm not sure where you got your last 2 example documents from, but neither is any good.

docx4j works with:

* OpenXML standard docx files

* "Flat OPC" XML files

You can find examples of both in http://dev.plutext.org/trac/docx4j/brow ... ample-docs (Your latest example is neither, although it is interesting that Word accepts it)

Or you can create them in Word 2007 (or 2003 with the compatibility pack), saving as either .docx or .xml

Or you can create one with docx4j.

The sample http://dev.plutext.org/trac/docx4j/brow ... tePdf.java does what you are after.
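
For anyone unsure which kind of file they are holding, the following sketch tells the cases in this thread apart by looking at the first bytes and the root element. It is a hypothetical pre-check using only the JDK, not how docx4j itself decides what to do; the class name and the returned labels are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class InputKind {
    static final String PKG = "http://schemas.microsoft.com/office/2006/xmlPackage";
    static final String W2003 = "http://schemas.microsoft.com/office/word/2003/wordml";
    static final String W2006 = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns a human-readable label for the kind of input.
    static String classify(byte[] data) {
        // A .docx is a zip archive, and zip files start with the bytes 'P' 'K'.
        if (data.length >= 2 && data[0] == 'P' && data[1] == 'K') {
            return "docx (zip package)";
        }
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Element root = f.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(data))
                    .getDocumentElement();
            String ns = root.getNamespaceURI();
            String name = root.getLocalName();
            if (PKG.equals(ns) && "package".equals(name)) {
                return "Flat OPC XML";
            }
            if (W2003.equals(ns) && "wordDocument".equals(name)) {
                return "Word 2003 XML (needs converting first)";
            }
            if (W2006.equals(ns) && "document".equals(name)) {
                return "bare main document part (not a package)";
            }
            return "unknown XML";
        } catch (Exception e) {
            return "not XML";
        }
    }
}
```

Of the files posted in this thread, Employee.xml would fall into the Word 2003 case and the "Hello World!!" file into the bare document case, which matches the two "unexpected element" exceptions.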

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 7:15 pm
by RithanyaLaxmi
Thanks Jason. The problem I am having is that I am creating the .docx XML on my own rather than generating it from a .docx file, so doing the relationships part manually is tough. How can I get correct XML, with matching relationships and parts, to generate a PDF? I think you are the right person to answer these queries.
To give a brief idea of what I am doing:

I am using the MS Word API to generate a .docx that contains data fetched from the DB, to which I apply the respective styles, fonts, symbols, etc. If the data fetched from the DB is quite large, there is a problem displaying it in the .docx file. I found that MS Word 2007 internally writes some content through tags that may not be needed to display the data. So I am working out which MS Word tags are necessary when converting to an .xml file, so that I can avoid unnecessary tags and build only the tags needed to display the data. I am therefore planning to write my own .xml with only the MS Word tags that are needed, rather than generating an .xml from a .docx file.

My queries are:

1) Is it right that MS Word generates some tags that may not be needed when converting .docx to document.xml, and that this makes the file heavy? If so, which tags are they, so that I can avoid them when I write my own .xml file?
2) Please send links that explain the MS Word tags and their advantages, and which tags are needed and which are not.
3) Is my approach of writing a new .xml similar to document.xml (from the .docx conversion) worth pursuing, so that I can build the .xml with only the tags I need and improve the performance of the data display?
4) I want to know whether the "bleed" that is done through the Word API is supported in WordML; if so, what is the tag for it? I tested with vertAlign, but it didn't work. Similarly, are "WaterMark" and "Pagebreak" supported in WordML? Are there any links listing which features WordML supports compared with the Word API?


When I did some investigation on this, I found some points that need to be taken care of:

1) Unnecessary namespaces can be avoided; by default, XML generated through Word includes all the namespaces.

2) The <w:sectPr> element at the end, which contains the layout information, is created by default when the XML is generated through Word. It can be avoided.

Beyond these, I want to know which other elements make it heavy and what I need to take care of while generating the .xml (WordML) on my own, so that I can avoid the unnecessary elements (tags) and retain only the elements I need.

Please shed some light on this; I think you may have some experience with it. Please answer these queries and share your experience so that I can go forward. I will be really grateful if you can do that.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 7:48 pm
by jason
RithanyaLaxmi wrote:Thanks Jason. The problem I am having is that I am creating the .docx XML on my own rather than generating it from a .docx file, so doing the relationships part manually is tough. How can I get correct XML, with matching relationships and parts, to generate a PDF?


Assuming the header/footer and the rest of your document always stay the same, and all you want to do is insert the data into it, as a table say, one approach would be to save as XML from Word 2007 to get the XML file (this is Flat OPC XML). Then save as one String the Flat OPC XML up until your data table, and as another String the bit from there to the end.

Now you can create your entire XML file = string1 + datatable + string2.

But that approach doesn't need docx4j at all, at least until you decide you want PDF output.
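
The string1 + datatable + string2 idea could be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical, the empty w:tblPr and w:tblGrid elements are placeholders where a real table would normally carry its style and column widths, and the two template strings are assumed to have been cut from a Flat OPC file saved by Word.

```java
import java.util.List;

class FlatOpcTemplate {
    // Escapes the characters that are not allowed in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a bare WordprocessingML table with one w:tr per row
    // and one w:tc per cell value.
    static String table(List<String[]> rows) {
        StringBuilder sb = new StringBuilder("<w:tbl><w:tblPr/><w:tblGrid/>");
        for (String[] row : rows) {
            sb.append("<w:tr>");
            for (String cell : row) {
                sb.append("<w:tc><w:p><w:r><w:t>")
                  .append(escape(cell))
                  .append("</w:t></w:r></w:p></w:tc>");
            }
            sb.append("</w:tr>");
        }
        return sb.append("</w:tbl>").toString();
    }

    // string1 is everything up to the data table, string2 everything after it.
    static String assemble(String string1, List<String[]> rows, String string2) {
        return string1 + table(rows) + string2;
    }
}
```

Because the table is generated inside text that already declares the w prefix, the fragment does not need its own namespace declarations.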

The other approach is to use docx4j. You could use it to build the document from scratch (ie creating header parts etc, if you need them). Or, you could use docx4j to open an existing document, add in your data table, and save the document (possibly using a new name).

RithanyaLaxmi wrote: I am using the MS Word API to generate a .docx that contains data fetched from the DB, to which I apply the respective styles, fonts, symbols, etc. If the data fetched from the DB is quite large, there is a problem displaying it in the .docx file.


Problem in Word? How many pages is your document? Are you displaying the data in a table? If so, how many rows?

RithanyaLaxmi wrote:I found that MS Word 2007 internally writes some content through tags that may not be needed to display the data. So I am working out which MS Word tags are necessary when converting to an .xml file, so that I can avoid unnecessary tags and build only the tags needed to display the data. I am therefore planning to write my own .xml with only the MS Word tags that are needed, rather than generating an .xml from a .docx file.

My queries are:

1) Is it right that MS Word generates some tags that may not be needed when converting .docx to document.xml, and that this makes the file heavy? If so, which tags are they, so that I can avoid them when I write my own .xml file?
2) Please send links that explain the MS Word tags and their advantages, and which tags are needed and which are not.


Well certainly you can do without the rsid's and the proofing (grammar/spelling).

And Word does add a number of parts which aren't strictly necessary. But I'm not convinced that trying to get rid of tags will make much difference.
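
If you did want to drop the rsid attributes and proofing marks anyway, a sketch along these lines would do it with the JDK's DOM API. The class and method names are hypothetical, and it assumes the 2006 main namespace is in use. It only removes w:rsid* attributes and w:proofErr elements and leaves everything else alone.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class WordMlSlimmer {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Parses the XML, strips rsid attributes and proofErr elements, and serializes it again.
    static String strip(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            clean(doc.getDocumentElement());
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not strip WordprocessingML", e);
        }
    }

    static void clean(Element el) {
        // Walk the attributes backwards so removal does not disturb the index.
        NamedNodeMap attrs = el.getAttributes();
        for (int i = attrs.getLength() - 1; i >= 0; i--) {
            Attr a = (Attr) attrs.item(i);
            if (W.equals(a.getNamespaceURI()) && a.getLocalName().startsWith("rsid")) {
                el.removeAttributeNode(a);
            }
        }
        // Remember the next sibling before possibly removing the current child.
        Node child = el.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child instanceof Element) {
                Element c = (Element) child;
                if (W.equals(c.getNamespaceURI()) && "proofErr".equals(c.getLocalName())) {
                    el.removeChild(c);
                } else {
                    clean(c);
                }
            }
            child = next;
        }
    }
}
```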

As far as I know, the content of a docx is identical to the Flat OPC XML (on a semantic level that is - of course the former is zipped up etc).

It'd be interesting to know if there are some performance limitations with the Flat OPC XML format .. I'm not aware of any.

RithanyaLaxmi wrote:3) Is my approach of writing a new .xml similar to document.xml (from the .docx conversion) worth pursuing, so that I can build the .xml with only the tags I need and improve the performance of the data display?


See above.

RithanyaLaxmi wrote:4) I want to know whether the "bleed" that is done through the Word API is supported in WordML; if so, what is the tag for it? I tested with vertAlign, but it didn't work. Similarly, are "WaterMark" and "Pagebreak" supported in WordML? Are there any links listing which features WordML supports compared with the Word API?


What is bleed? WordML can represent anything that Word can put in a document. So, add bleed (whatever it is) to the document, save it, and open it in package explorer to see how it is represented. Watermark might be found in the header part? Pagebreaks are definitely supported.


RithanyaLaxmi wrote:When I did some investigation on this, I found some points that need to be taken care of:

1) Unnecessary namespaces can be avoided; by default, XML generated through Word includes all the namespaces.

2) The <w:sectPr> element at the end, which contains the layout information, is created by default when the XML is generated through Word. It can be avoided.



As I mentioned above, I think you are wasting your time trying to get rid of these things.

Re: Exception loading .docx XML

Posted: Fri Mar 12, 2010 9:12 pm
by RithanyaLaxmi
Thanks Jason, I definitely value the points you have mentioned. OK, the final conclusion as per your comments is:

1) Removing the unnecessary tags in the .docx XML won't improve the performance much; if at all, by about 5%. We can rule this out.
2) Is there any approach I can follow to make sure the performance improves when displaying a huge amount of data? As mentioned, I am not storing the data in a table or the like; I am fetching a huge amount of data from the DB and applying various styles to it, which makes it heavy and hampers the performance. I read an article that says we need to transform the data to another XML vocabulary, such as XHTML, to make it lighter.
3) But in my case, I am ultimately going to display the data in MS Word (.docx), so transforming the .docx XML to another XML vocabulary doesn't make much difference, because eventually I need to display the data in Word.
4) What would be your advice for improving the performance, or what is the best approach I can follow in this situation?

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Sat Mar 13, 2010 1:05 am
by jason
How much data are you trying to display? If you are displaying it as paragraphs of text, how many paragraphs?

When you say performance is bad, is that Word opening the document, or scrolling through it, or what?

What are the specs of the PC you are working on? Does it have enough RAM?

You should be able to open say 100 pages without any difficulty on most PCs. Even several thousand could be ok - the docx version of one of the Open XML specs is several thousand pages long.

Re: Exception loading .docx XML

Posted: Tue Mar 16, 2010 12:01 am
by RithanyaLaxmi
Thanks again Jason. The data I am displaying is huge: it is directory information with various styles, fonts, icons, symbols, etc. What is displayed depends on the functionality or feature; it comes to approximately 100 to 500 pages of directory details displayed in a PDF. When I try to generate a PDF from Word using the Word API, the performance is quite slow, and it takes some time to display the content in the PDF. Sometimes it hangs as well. I have a high-performance machine with the latest configuration, so bandwidth, machine configuration and RAM are not a problem. Many styles have been applied to the text, as each directory and subsection has a different font and style. I know that the styles are causing the problem; if we reduce the number of styles, it is OK. But as per the requirement, that is not possible for all the directory generation. Rather than reducing the styles, is there anything else we can do to improve the performance? As I said, hardware is not a problem; better hardware would probably improve the performance by only 10-15%. But I am looking for at least a 40-50% increase in performance. How should I go about this? I need your suggestions.

Thanks,
Rithu

Re: Exception loading .docx XML

Posted: Tue Mar 16, 2010 12:31 am
by jason
This has moved outside the scope of the sort of assistance I normally provide in this forum for docx4j.

I'm happy to look at your problem, but on a consulting basis. Please contact me by email if you want this.

cheers .. Jason