
Xhtml to docx using XSLT

Posted: Sat Dec 24, 2022 9:30 pm
by mithilesh.jha
Hi @Jason,
Can I convert XHTML to DOCX using XSLT through the docx4j library? If so, could you please guide me on how to do it?

Re: Xhtml to docx using XSLT

Posted: Sun Dec 25, 2022 6:20 am
by jason
You could use XSLT to convert XHTML to Flat OPC XML, but generally you'd expect a better result using https://github.com/plutext/docx4j-Impor ... 4j/samples
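
To make the XSLT route concrete, here is a minimal sketch (not docx4j code) that applies a stylesheet to XHTML with the JDK's built-in javax.xml.transform and emits Flat OPC XML. The class name XhtmlToFlatOpc and the embedded stylesheet are hypothetical illustrations; the stylesheet only maps XHTML p elements to plain Word paragraphs and assumes the input has no DOCTYPE. A real stylesheet would need to handle tables, styles, and section properties.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XhtmlToFlatOpc {

    // Hypothetical minimal stylesheet: emits a Flat OPC package with a
    // relationships part and a main document part, one w:p per XHTML <p>.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:h='http://www.w3.org/1999/xhtml'",
        "    xmlns:pkg='http://schemas.microsoft.com/office/2006/xmlPackage'",
        "    xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'",
        "    exclude-result-prefixes='h'>",
        "  <xsl:output method='xml' indent='no'/>",
        "  <xsl:template match='/'>",
        "    <xsl:processing-instruction name='mso-application'>progid=\"Word.Document\"</xsl:processing-instruction>",
        "    <pkg:package>",
        "      <pkg:part pkg:name='/_rels/.rels'",
        "          pkg:contentType='application/vnd.openxmlformats-package.relationships+xml'>",
        "        <pkg:xmlData>",
        "          <Relationships xmlns='http://schemas.openxmlformats.org/package/2006/relationships'>",
        "            <Relationship Id='rId1' Target='word/document.xml'",
        "                Type='http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument'/>",
        "          </Relationships>",
        "        </pkg:xmlData>",
        "      </pkg:part>",
        "      <pkg:part pkg:name='/word/document.xml'",
        "          pkg:contentType='application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'>",
        "        <pkg:xmlData>",
        "          <w:document><w:body>",
        "            <xsl:apply-templates select='//h:p'/>",
        "          </w:body></w:document>",
        "        </pkg:xmlData>",
        "      </pkg:part>",
        "    </pkg:package>",
        "  </xsl:template>",
        "  <xsl:template match='h:p'>",
        "    <w:p><w:r><w:t><xsl:value-of select='.'/></w:t></w:r></w:p>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Runs the stylesheet over an XHTML string and returns the Flat OPC XML.
    public static String transform(String xhtml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml =
            "<html xmlns='http://www.w3.org/1999/xhtml'><body><p>Hello</p></body></html>";
        System.out.println(transform(xhtml));
    }
}
```

If the printed XML is saved with an .xml extension, Word should open it as a document, and from there it can be saved as .docx.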

Re: Xhtml to docx using XSLT

Posted: Mon Dec 26, 2022 2:58 pm
by mithilesh.jha
Currently, I am converting XHTML to PDF using XSLT via the Apache FOP library, with the page layout and margins defined in the XSLT. I want the DOCX output to match the PDF exactly, so I would like to reuse the current XSLT file to render the XHTML into DOCX. FYI, I am using Java.
For the XHTML to DOCX conversion, I am currently using docx4j's XHTMLImporterImpl, but there are some differences between the generated DOCX file and the PDF generated from the same XHTML via Apache FOP.
Differences:
1. The generated PDF has 2 pages, but the generated DOCX has 3 pages.
2. Table column widths and heights differ between the PDF and the DOCX.
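
One possible cause of a page-count mismatch like this (an assumption, not something diagnosed in this thread) is that the page size and margins from the XSL-FO layout are not carried over to the DOCX. WordprocessingML stores page size and margins (w:pgSz, w:pgMar) in twips, i.e. twentieths of a point, so FO lengths such as "2cm" or "1in" have to be converted first. The helper below is a hypothetical sketch of that arithmetic; the class and method names are made up for illustration, and only the in, pt, cm, and mm units are handled.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FoLengthToTwips {

    // Accepts a number followed by one of the units in, pt, cm, mm.
    private static final Pattern LENGTH =
        Pattern.compile("\\s*([0-9]*\\.?[0-9]+)\\s*(in|pt|cm|mm)\\s*");

    // Converts an XSL-FO length string to twips (1 in = 72 pt = 1440 twips).
    public static long toTwips(String foLength) {
        Matcher m = LENGTH.matcher(foLength.toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported length: " + foLength);
        }
        double value = Double.parseDouble(m.group(1));
        double twipsPerUnit;
        switch (m.group(2)) {
            case "in": twipsPerUnit = 1440.0; break;
            case "pt": twipsPerUnit = 20.0; break;
            case "cm": twipsPerUnit = 1440.0 / 2.54; break;
            default:   twipsPerUnit = 1440.0 / 25.4; break; // mm
        }
        return Math.round(value * twipsPerUnit);
    }

    public static void main(String[] args) {
        // Example values only; substitute the lengths from your own FO page master.
        System.out.println("page width  = " + toTwips("210mm"));
        System.out.println("left margin = " + toTwips("1in"));
    }
}
```

The resulting numbers are what would go into the section properties of the generated document, whether that is done through docx4j or directly in an XSLT that writes WordprocessingML.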

Re: Xhtml to docx using XSLT

Posted: Wed Dec 28, 2022 4:48 am
by jason
Have a look at https://xmlgraphics.apache.org/fop/2.8/output.html

You may be able to develop something from the RTF output, or by using the Area Tree or Intermediate Format output.

You know you can open XHTML files in Word, right? If that helps...

How much variety is there in your input XHTML? If not much, then hand-coding your XSLT may be straightforward.