
A problem in convert html to docx file

Posted: Mon Aug 19, 2013 1:51 pm
by liuwei
Hello everyone,
I am having some difficulty using docx4j. I want to convert an HTML file to docx, but the background color of the table cells in the HTML does not show up in the docx file. Can you tell me how to deal with it? Thanks a lot.

HTML code
Code:
<html>
   <table>
      
   <tr>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
      <td width="148" valign="top" style="word-break: break-all;" align="center">Test</td>
   </tr>
      
   <tr>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
      <td width="148" valign="top" align="center" style="background-color: #BBBBBB;"><br/></td>
   </tr>
   
   </table>
</html>


Java test code
Code:
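import java.io.File;

import org.apache.commons.io.FileUtils;
// commons-lang 2.x assumed, since unescapeHtml is called below
import org.apache.commons.lang.StringEscapeUtils;
import org.docx4j.XmlUtils;
import org.docx4j.convert.in.xhtml.XHTMLImporter;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.NumberingDefinitionsPart;

public class ConvertInXHTMLFile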
{
  public static void main(String[] args)
    throws Exception
  {
   

    String baseURL = "file:///C:/Users/jharrop/git/docx4j-ImportXHTML/requirement_2013-07-21/requirement_2013-07-21/";

    String stringFromFile = FileUtils.readFileToString(new File("C:/Data.html"), "UTF-8");

    String unescaped = stringFromFile;
    if (stringFromFile.contains("&lt;/")) {
      unescaped = StringEscapeUtils.unescapeHtml(stringFromFile);
    }
   
    // Case-insensitive, so the lowercase "word-break: break-all;" in the HTML is stripped too
    unescaped = unescaped.replaceAll("(?i)word-break: break-all;?", "").replaceAll("middle", "center");
    System.out.println("Unescaped: " + unescaped);

    XHTMLImporter.setHyperlinkStyle("Hyperlink");

    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();

    NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
    wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
    ndp.unmarshalDefaultNumbering();

    wordMLPackage.getMainDocumentPart().getContent().addAll(
      XHTMLImporter.convert(unescaped,baseURL, wordMLPackage));

    System.out.println(
      XmlUtils.marshaltoString(wordMLPackage.getMainDocumentPart().getJaxbElement(), true, true));

    wordMLPackage.save(new File("C:/OUT_from_XHTML.docx"));
  }
}
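
The importer expects well-formed XHTML, so one thing worth ruling out before the convert call is a markup problem in the input. A minimal sketch using only the JDK's XML parser; the class name XhtmlCheck and the method firstError are hypothetical, not part of docx4j, and named entities such as &nbsp; will be reported as errors because no DTD is loaded.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of docx4j.
class XhtmlCheck {

    // Returns null if the markup parses as XML, otherwise a short description of the first error.
    static String firstError(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xhtml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        String ok = "<html><table><tr><td style=\"background-color: #BBBBBB;\"><br/></td></tr></table></html>";
        String broken = "<html><table><tr><td>Test</tr></table></html>";
        System.out.println("ok: " + firstError(ok));
        System.out.println("broken: " + firstError(broken));
    }
}
```

In the test code above, a call such as XhtmlCheck.firstError(unescaped) could go just before XHTMLImporter.convert.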

Re: A problem in convert html to docx file

Posted: Mon Aug 19, 2013 9:07 pm
by jason
It will work if you use a current nightly build of the docx4j and docx4j XHTML import jars.
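
To confirm whether a given build carried the background color over, the saved docx can be inspected directly. In WordprocessingML a shaded cell has a w:shd element inside its w:tcPr, with the color in the w:fill attribute as hex digits without the leading "#". The sketch below uses only the JDK; the class name ShadingCheck is hypothetical, and it assumes the main document part is stored as word/document.xml inside the package.

```java
import java.io.File;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of docx4j.
class ShadingCheck {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Collects the w:fill value of every w:shd element that sits directly inside a w:tcPr.
    static List<String> cellFills(InputStream documentXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(documentXml);
            List<String> fills = new ArrayList<>();
            NodeList shadings = doc.getElementsByTagNameNS(W_NS, "shd");
            for (int i = 0; i < shadings.getLength(); i++) {
                Element shd = (Element) shadings.item(i);
                Node parent = shd.getParentNode();
                // Skip paragraph- and run-level shading; only cell properties are of interest here.
                if (parent != null && "tcPr".equals(parent.getLocalName())
                        && W_NS.equals(parent.getNamespaceURI())) {
                    fills.add(shd.getAttributeNS(W_NS, "fill"));
                }
            }
            return fills;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the output path used in the test code from the first post.
        String path = args.length > 0 ? args[0] : "C:/OUT_from_XHTML.docx";
        try (ZipFile docx = new ZipFile(new File(path))) {
            ZipEntry entry = docx.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalStateException("No word/document.xml in " + path);
            }
            try (InputStream in = docx.getInputStream(entry)) {
                System.out.println("Cell fills: " + cellFills(in));
            }
        }
    }
}
```

An empty list for the HTML in the first post would suggest the background color was dropped during import; a list of BBBBBB entries would suggest it was carried over.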

Re: A problem in convert html to docx file

Posted: Wed Aug 21, 2013 11:42 am
by liuwei
jason wrote: It will work if you use a current nightly build of docx4j and docx4j XHTML import jars

Thanks for your answer. I will try it later.