
Convert mht altchunk to PDF

Posted: Fri Nov 09, 2018 6:41 am
by sdezso
Hi,

I created the attached docx with docx4j and want to convert it to PDF. With docx4j-export-FO I got the attached PDF. With the PDF Converter I got the error message below:
Code:
F|DEV0000000000000000000000000000000000000000000000000000319D0F9L53GUHEG4BISFQ7JJOE0000|2018-11-08T17:19:25+0100|Invalid assertion {ASSERT:putil\source\noox_import.c:258[Mon Sep  3 21:25:59 2018]}. Please include this token in an incident report.
W||2018-11-08T17:19:25+0100|Process 0ecf79f4-e372-11e8-a287-a3d4f4404b61 failed with exit code -1073740791. (Set PLUTEXT_VERBOSE=1 for crash dumps)


Please help me convert this docx to PDF.

Thanks!

MHT altChunks

Posted: Sat Nov 10, 2018 9:59 am
by jason
The problem with your document is that it contains altChunks of type mht (MHTML): https://en.wikipedia.org/wiki/MHTML

If I remove these (see attached docx), it works.
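To check whether a docx has such chunks before sending it to a converter, something like the following rough sketch could be used. It uses only the JDK (no docx4j calls). The class and method names are made up for illustration, and it assumes the main document part is word/document.xml, that its relationships live in word/_rels/document.xml.rels, and that the Relationship elements there are unprefixed.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of docx4j.
public class AltChunkFinder {

    // Assumption: the main document part is word/document.xml.
    private static final String MAIN_RELS = "word/_rels/document.xml.rels";

    /** Returns the Target of every altChunk (aFChunk) relationship of the main part. */
    public static List<String> findAltChunkTargets(InputStream docx) throws Exception {
        List<String> targets = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!MAIN_RELS.equals(entry.getName())) {
                    continue;
                }
                // The zip stream reports end-of-file at the end of this entry,
                // so the parser only sees the rels part.
                Document rels = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().parse(zip);
                NodeList list = rels.getElementsByTagName("Relationship");
                for (int i = 0; i < list.getLength(); i++) {
                    Element rel = (Element) list.item(i);
                    // altChunk relationships have a Type ending in "/aFChunk".
                    if (rel.getAttribute("Type").endsWith("/aFChunk")) {
                        targets.add(rel.getAttribute("Target"));
                    }
                }
                break; // the rels part has been handled; nothing else to scan
            }
        }
        return targets;
    }
}
```

Any returned target ending in .mht would be a chunk of the kind discussed here; an empty list means the main part has no altChunks.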

You need to convert the altChunks to real docx content before converting to PDF.

docx4j-ImportXHTML can do this for altChunks of type XHTML: https://github.com/plutext/docx4j-Impor ... s.java#L28

but it can't do it right now for chunks of type mht.

You could either avoid mht, or write some code to convert it to normal XHTML. That should be pretty straightforward, based on the code at https://stackoverflow.com/questions/323 ... es-in-java
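As a starting point for the "write some code" option, here is a minimal sketch that pulls the first text/html part out of an MHT file using only the JDK. It is not the code from the linked answer and not part of docx4j: the class and method names are hypothetical, and it assumes the MHT has been read into a String (e.g. as ISO-8859-1), that the part is encoded as quoted-printable, base64, or not at all, and that a missing charset means UTF-8. The result is whatever HTML the part contains, so it will probably still need tidying into well-formed XHTML before docx4j-ImportXHTML can use it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of docx4j.
public class MhtHtmlExtractor {

    private static final Pattern BOUNDARY =
            Pattern.compile("boundary=\"?([^\";\\r\\n]+)\"?", Pattern.CASE_INSENSITIVE);
    private static final Pattern HTML_TYPE = Pattern.compile("content-type:\\s*text/html");
    private static final Pattern ENCODING =
            Pattern.compile("content-transfer-encoding:\\s*([\\w-]+)");
    private static final Pattern CHARSET = Pattern.compile("charset=\"?([^\";\\s]+)");

    /** Returns the decoded first text/html part, or null if none is found. */
    public static String extractHtml(String mht) {
        Matcher b = BOUNDARY.matcher(mht);
        if (!b.find()) {
            return null;
        }
        // Each MIME part is introduced by "--" followed by the boundary string.
        String delimiter = "--" + b.group(1);
        for (String part : mht.split(Pattern.quote(delimiter))) {
            // Headers and body are separated by the first blank line.
            String[] headersAndBody =
                    part.replaceFirst("^\\r?\\n", "").split("\\r?\\n\\r?\\n", 2);
            if (headersAndBody.length < 2) {
                continue;
            }
            String headers = headersAndBody[0].toLowerCase(Locale.ROOT);
            if (!HTML_TYPE.matcher(headers).find()) {
                continue;
            }
            String body = headersAndBody[1].trim();
            Matcher enc = ENCODING.matcher(headers);
            String encoding = enc.find() ? enc.group(1) : "7bit";
            byte[] bytes;
            if (encoding.equals("base64")) {
                bytes = Base64.getMimeDecoder().decode(body);
            } else if (encoding.equals("quoted-printable")) {
                bytes = decodeQuotedPrintable(body);
            } else {
                return body; // assumption: 7bit/8bit bodies are usable as-is
            }
            Matcher cs = CHARSET.matcher(headers);
            // Assumption: fall back to UTF-8 when no charset is declared.
            Charset charset = cs.find() ? Charset.forName(cs.group(1)) : StandardCharsets.UTF_8;
            return new String(bytes, charset);
        }
        return null;
    }

    /** Decodes "=XX" escapes and drops soft line breaks ("=" at end of line). */
    static byte[] decodeQuotedPrintable(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '=') {
                out.write(c);
            } else if (s.startsWith("\r\n", i + 1)) {
                i += 2; // soft line break (CRLF)
            } else if (s.startsWith("\n", i + 1)) {
                i += 1; // soft line break (LF only)
            } else if (i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c); // stray "=" at the very end; keep it
            }
        }
        return out.toByteArray();
    }
}
```

Images and other related parts in the MHT are ignored here; if the HTML references them, they would have to be extracted and re-linked separately.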