Docx to HTML with custom tags

Posted: Fri Apr 14, 2017 6:09 pm
by thirub04
Can anyone tell me how to generate HTML from a docx file with custom tags like section, sub-section, and bullets?

Re: Docx to HTML with custom tags

Posted: Sat Apr 15, 2017 10:38 am
by jason
There's nothing in docx4j to make it trivially easy to use section, sub-section. (Bullets should be OK; please search/browse SdtToListSdtTagHandler for more.)

Docx4j can create HTML:

1. via XSLT (with Java extension functions); see https://github.com/plutext/docx4j/blob/ ... -core.xslt

or 2. using Java to traverse the document; see https://github.com/plutext/docx4j/blob/ ... rator.java

The XSLT approach is more fully featured, but somewhat slower (and a bit more complex since it uses both XSLT and Java).

To use section, sub-section, choose your preferred approach, then start modifying...

If you want to map, e.g., style heading 1 to section and heading 2 to sub-section, note that ListsToContentControls does a similar thing for bullets/numbering: https://github.com/plutext/docx4j/blob/ ... .java#L213
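To make the heading-to-section idea concrete, here is a minimal sketch in plain Java. It does not use docx4j at all: `Para` is a hypothetical stand-in for a paragraph you would read from the document, and the style IDs "Heading1" and "Heading2" are assumed to be the ones your docx uses. The `<sub-section>` tag name is taken from the question, not from any docx4j output. In a real converter this grouping logic would live wherever you chose to start modifying (the XSLT or the Java traversal).

```java
import java.util.Arrays;
import java.util.List;

public class HeadingSectionSketch {

    /** Hypothetical stand-in for a docx paragraph: its style id plus its text. */
    static final class Para {
        final String styleId;
        final String text;

        Para(String styleId, String text) {
            this.styleId = styleId;
            this.text = text;
        }
    }

    /**
     * Wraps content in <section> at each "Heading1" paragraph and in
     * <sub-section> at each "Heading2" paragraph (assumed style ids).
     */
    static String toSectionedHtml(List<Para> paras) {
        StringBuilder out = new StringBuilder();
        boolean inSection = false;
        boolean inSubSection = false;

        for (Para p : paras) {
            if ("Heading1".equals(p.styleId)) {
                // A new section closes any open sub-section and section first.
                if (inSubSection) {
                    out.append("</sub-section>");
                    inSubSection = false;
                }
                if (inSection) {
                    out.append("</section>");
                }
                out.append("<section><h1>").append(escape(p.text)).append("</h1>");
                inSection = true;
            } else if ("Heading2".equals(p.styleId)) {
                // A new sub-section closes only the previous sub-section.
                if (inSubSection) {
                    out.append("</sub-section>");
                }
                out.append("<sub-section><h2>").append(escape(p.text)).append("</h2>");
                inSubSection = true;
            } else {
                // Any other style is emitted as an ordinary paragraph.
                out.append("<p>").append(escape(p.text)).append("</p>");
            }
        }

        // Close whatever is still open at the end of the document.
        if (inSubSection) {
            out.append("</sub-section>");
        }
        if (inSection) {
            out.append("</section>");
        }
        return out.toString();
    }

    /** Minimal escaping of characters that are special in HTML text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        List<Para> doc = Arrays.asList(
                new Para("Heading1", "Intro"),
                new Para("Normal", "Hello"),
                new Para("Heading2", "Details"),
                new Para("Normal", "More"),
                new Para("Heading1", "Next"));
        System.out.println(toSectionedHtml(doc));
    }
}
```

The sketch keeps only two flags of state because it handles just two heading levels; deeper nesting would probably be easier with a stack of open elements.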

Re: Docx to HTML with custom tags

Posted: Mon Apr 17, 2017 3:18 pm
by thirub04
Thank you, Jason, for your reply. It helped a lot.

On what basis are the contents of the docx file classified?

Re: Docx to HTML with custom tags

Posted: Mon Apr 17, 2017 7:45 pm
by thirub04
How do we modify the HTML content generated by the following call?

Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);