
Display HTML as Rich Text in Word - Error in importing HTML

Posted: Thu Sep 25, 2014 8:13 pm
by jignesh
Hi Jason,

I am trying to convert HTML text to rich text so that it can be displayed in a Word document. I am using the XHTMLImporter of docx4j 2.8.1 to do this.

I am getting a java.lang.NullPointerException. Everything works fine when my HTML text contains no <table>...</table>, but when I pass HTML with a full <table>...</table> I get this error. I tried to trace it by looking at the XHTMLImporter.nestedTableHierarchyFix and XHTMLImporter.traverse methods in the jar file.
What I found is that XHTMLImporter.traverse (line 459) passes null as 'Box parent', and nestedTableHierarchyFix (line 961) then checks

Code:
if (parent instanceof TableBox
            || parent.getElement().getNodeName().equals("table") )

where parent has already been passed as null. I believe this is where the NullPointerException is thrown.
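
To make the suspected failure concrete, here is a minimal stand-alone sketch. The Box and TableBox classes in it are hypothetical stand-ins written for this illustration, not the real docx4j / xhtmlrenderer classes, and the check is simplified to call getNodeName() directly on the box. It only shows that `null instanceof TableBox` evaluates to false, so the second half of the condition dereferences the null parent, and how a null guard would avoid that. Whether a guard like this is the right fix inside docx4j is an assumption.

```java
// Hypothetical stand-ins for illustration only; NOT the docx4j / xhtmlrenderer classes.
public class NullParentSketch {

    /** Minimal stand-in for a layout box that knows its element's node name. */
    static class Box {
        private final String nodeName;

        Box(String nodeName) {
            this.nodeName = nodeName;
        }

        String getNodeName() {
            return nodeName;
        }
    }

    /** Stand-in for a box that represents a table. */
    static class TableBox extends Box {
        TableBox() {
            super("table");
        }
    }

    /** Same shape as the quoted check: dereferences parent without a null test. */
    static boolean isTableUnguarded(Box parent) {
        // "null instanceof TableBox" is false, so the right-hand side runs and throws.
        return parent instanceof TableBox
                || parent.getNodeName().equals("table");
    }

    /** Same check with a null guard: a missing parent is treated as "not a table". */
    static boolean isTableGuarded(Box parent) {
        if (parent == null) {
            return false;
        }
        return parent instanceof TableBox
                || "table".equals(parent.getNodeName());
    }

    public static void main(String[] args) {
        System.out.println("guarded, null parent:  " + isTableGuarded(null));
        System.out.println("guarded, table parent: " + isTableGuarded(new TableBox()));
        try {
            isTableUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded, null parent: NullPointerException");
        }
    }
}
```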

Please help me resolve this exception.

Thanking you in anticipation.

This is the exception I am getting:
Code:
org.docx4j.org.xhtmlrenderer.load INFO:: Loaded document in ~26ms
org.docx4j.org.xhtmlrenderer.load INFO:: TIME: parse stylesheets  114ms
org.docx4j.org.xhtmlrenderer.match INFO:: media = print
org.docx4j.org.xhtmlrenderer.match INFO:: Matcher created with 119 selectors
java.lang.NullPointerException
at org.docx4j.convert.in.xhtml.XHTMLImporter.nestedTableHierarchyFix(XHTMLImporter.java:961)
at org.docx4j.convert.in.xhtml.XHTMLImporter.traverse(XHTMLImporter.java:544)
at org.docx4j.convert.in.xhtml.XHTMLImporter.traverse(XHTMLImporter.java:459)
at org.docx4j.convert.in.xhtml.XHTMLImporter.convert(XHTMLImporter.java:415)


This is the code I use to convert the HTML text:
Code:
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(htmlText, null, wordMLPackage));


Below is my htmlText:
Code:
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="24">
<p class="Normal-spaceabove"></p>
</td>
<td colspan="4" valign="top" width="274">
<p class="Normal-spaceabove">Text text text text</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p class="Normal-spaceabove">(a)</p>
</td>
<td colspan="4" valign="top" width="274">
<p class="Normal-spaceabove"><em>one </em>of the following</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p class="Normal-spaceabove">(i)</p>
</td>
<td colspan="4" valign="top" width="274">
<p class="Normal-spaceabove">(Post A Level Russian stream)</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p></p>
</td>
<td valign="top" width="58">
<p>RUS205</p>
</td>
<td valign="top" width="19">
<p>F5</p>
</td>
<td valign="top" width="173">
<p>Russian Language Skills I (Course A)</p>
</td>
<td valign="bottom" width="24">
<p align="right">10</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p></p>
</td>
<td valign="top" width="58">
<p>RUS206</p>
</td>
<td valign="top" width="19">
<p>F5</p>
</td>
<td valign="top" width="173">
<p>Russian Language Skills II (Course A)</p>
</td>
<td valign="bottom" width="24">
<p align="right">10</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p class="Normal-spaceabove">(ii)</p>
</td>
<td colspan="4" valign="top" width="274">
<p class="Normal-spaceabove">(Non A Level Russian stream)</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p></p>
</td>
<td valign="top" width="58">
<p>RUS207</p>
</td>
<td valign="top" width="19">
<p>F5</p>
</td>
<td valign="top" width="173">
<p>Russian Language Skills I (Course B)</p>
</td>
<td valign="bottom" width="24">
<p align="right">10</p>
</td>
</tr>
<tr>
<td valign="top" width="24">
<p></p>
</td>
<td valign="top" width="58">
<p>RUS208</p>
</td>
<td valign="top" width="19">
<p>F5</p>
</td>
<td valign="top" width="173">
<p>Russian Language Skills II (Course B)</p>
</td>
<td valign="bottom" width="24">
<p align="right">10</p>
</td>
</tr>
</tbody>
</table>
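
One thing that can be ruled out independently of docx4j is whether the markup parses as well-formed XML, on the assumption that the importer needs well-formed XHTML as input. The sketch below uses only the JDK's XML parser. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this illustration. It checks well-formedness only, and passing it says nothing about whether the import itself will succeed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of docx4j.
public class WellFormedCheck {

    /** Returns true if the given markup parses as well-formed XML. */
    static boolean isWellFormed(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xhtml)));
            return true;
        } catch (SAXException e) {
            // Parse failure: unclosed or mismatched tags, stray '&', and so on.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            // Not a property of the input; surface it as an environment problem.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = "<table><tbody><tr><td><p>RUS205</p></td></tr></tbody></table>";
        String broken = "<table><tr><td>RUS205</tr></table>"; // <td> is never closed
        System.out.println("ok:     " + isWellFormed(ok));
        System.out.println("broken: " + isWellFormed(broken));
    }
}
```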

Re: Display HTML as Rich Text in Word - Error in importing H

Posted: Thu Sep 25, 2014 10:12 pm
by jason
Can you upgrade to v3.2.0 of docx4j, with v3.2.0 or 3.2.1 of ImportXHTML?