
Convert an HTML file into one column of a docx table

Posted: Thu May 15, 2014 7:55 am
by david.zhaowl
Hi Jason,

I have an application that uses docx4j. It pulls data from a database and stores it in a table inside a docx file. I just found out that one string I get from the database is not plain text but HTML, very simple HTML, something like "<html><body>....</body></html>". I guess it is stored this way to preserve the formatting. My question is whether I can use docx4j to convert this small piece of HTML into docx-formatted text and then put that formatted text into my existing docx table.

Looking forward to your reply. Thanks.

David Z.

Re: Convert an HTML file into one column of a docx table

Posted: Thu May 15, 2014 8:15 am
by jason
You can use https://github.com/plutext/docx4j-ImportXHTML to do this (provided you first ensure your HTML is well-formed XML); the jars you need to import XHTML may be found at

http://search.maven.org/#artifactdetail ... .0.1%7Cjar in dir:

or

http://www.docx4java.org/docx4j/archive ... portXHTML/

You can use it directly (see the samples), or you can make an altChunk of type XHTML, then invoke wordMLPackage.getMainDocumentPart().convertAltChunks();

If you use convertAltChunks, please use a post 3.1 docx4j nightly, such as http://www.docx4java.org/docx4j/docx4j- ... 140512.jar

Further posts on importing xhtml belong in the dedicated subforum, please.
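
The well-formedness requirement mentioned above can be checked before the string is handed to the importer. The following is only a minimal sketch using the JDK's built-in XML parser, not part of docx4j; the class name XhtmlCheck and the method isWellFormed are hypothetical names chosen for illustration. It also rejects any markup that carries a DOCTYPE declaration, which is an assumption made here so that database content cannot pull in external entities.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not a docx4j class): tests whether a string parses as XML.
final class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    // Assumption: markup containing a DOCTYPE is rejected too (see the feature below).
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so no external DTD or entity is loaded.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser from
            // printing its own message to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Parse failure: mismatched tags, unclosed elements, and so on.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser could not be set up", e);
        }
    }

    public static void main(String[] args) {
        // <br/> is closed, so this parses.
        System.out.println(isWellFormed("<html><body><p>Hello<br/></p></body></html>"));
        // <br> is never closed, so the </p> end tag does not match.
        System.out.println(isWellFormed("<html><body><p>Hello<br></p></body></html>"));
    }
}
```

If the check fails, the HTML has to be cleaned up (for example by closing void tags such as <br>) before it can be imported.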

Re: Convert an HTML file into one column of a docx table

Posted: Thu May 15, 2014 11:18 pm
by david.zhaowl
Thanks for your help, Jason. I guess it's also possible to convert it back, from docx to HTML, correct?

David Z.
