HTML Conversion

Posted: Tue May 20, 2014 11:54 pm
by newmanater
Hi All

We have an HTM file, similar in style to the file you get when you use Word's Save As option and select "Web Page".
It is my understanding that docx4j can convert XHTML into a docx. To get my HTM document into XHTML format, I'm using a tool called JTidy.
However, JTidy doesn't work with the HTM format that I am trying to convert, and as a result docx4j cannot convert the HTM file.

I'm wondering if anyone has some advice, or an alternative tool I can use to get the HTM into a format from which docx4j can successfully create a docx.

I can provide examples of anything you require.

Thank you.

Re: HTML Conversion

Posted: Fri Jun 13, 2014 9:55 pm
by barnaba_hunters
Hi newmanater,
I'm new to docx4j. At the moment I'm using the example in the ConvertOutHtml class to convert to HTML; it is then possible to convert the HTML result back to docx. I still have some trouble with styles, fonts, and elements such as headers and footers, but I hope to get an output docx similar to the input file. Good luck,
Mat
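
On the original question of getting Word-style HTM into a form a tidy tool can handle: one thing that might be worth trying is stripping the Word-specific markup before running JTidy. Files saved from Word as "Web Page" typically contain conditional comments and namespace-prefixed tags such as <o:p>. Whether these are what makes JTidy fail here is only a guess, since the thread does not show the error. The sketch below uses only the Java standard library; the class name WordHtmlPreClean and the method stripWordMarkup are made up for illustration, and regex-based cleaning is crude, so treat it as a starting point rather than a robust solution.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of docx4j or JTidy): removes some
// Word-specific markup from "Save As Web Page" HTML before tidying.
class WordHtmlPreClean {

    // Downlevel-hidden conditional comments: <!--[if gte mso 9]> ... <![endif]-->
    private static final Pattern HIDDEN_CONDITIONAL =
            Pattern.compile("(?is)<!--\\[if[^\\]]*\\]>.*?<!\\[endif\\]-->");

    // Downlevel-revealed markers: <![if !supportLists]> and <![endif]>
    // Only the markers are removed; the content between them is kept.
    private static final Pattern REVEALED_MARKER =
            Pattern.compile("(?i)<!\\[(?:if[^\\]]*|endif)\\]>");

    // Namespace-prefixed tags such as <o:p>, </o:p>, <v:shape ...>
    // Only the tags are removed; any text inside them is kept.
    private static final Pattern NAMESPACED_TAG =
            Pattern.compile("(?i)</?[a-z][a-z0-9]*:[a-z][a-z0-9]*[^>]*>");

    static String stripWordMarkup(String html) {
        String out = HIDDEN_CONDITIONAL.matcher(html).replaceAll("");
        out = REVEALED_MARKER.matcher(out).replaceAll("");
        out = NAMESPACED_TAG.matcher(out).replaceAll("");
        return out;
    }

    public static void main(String[] args) {
        // Small made-up sample resembling Word's HTML output.
        String sample = "<p class=MsoNormal>Hello<o:p></o:p></p>"
                + "<!--[if gte mso 9]><xml><w:WordDocument/></xml><![endif]-->";
        System.out.println(stripWordMarkup(sample));
    }
}
```

The cleaned string could then be fed to JTidy to produce XHTML, and that XHTML handed to docx4j's XHTML import. Styles that Word stored in the removed markup will be lost, so the resulting docx may not match the original formatting.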