
html to docx - images missing?

Posted: Thu Sep 06, 2018 11:29 pm
by carlos
Hi everyone,

I'm converting HTML into DOCX.
The conversion generates a docx file in a directory. I then try to read this docx file (which contains some images) with an InputStream in order to save it in my repository.
When I read the docx file back, the images are missing.
Is there any way to generate a docx file that keeps the images? It seems that the generated docx file only references the images: if you delete them from the directory where they are stored and then open the generated docx file, the images are missing.

This is my current code to convert HTML to DOCX:
RFonts rfonts = Context.getWmlObjectFactory().createRFonts();
rfonts.setAscii("Century Gothic");
XHTMLImporterImpl.addFontMapping("Century Gothic", rfonts);

WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
XHTMLImporter.setHyperlinkStyle("Hyperlink");
XHTMLImageHandler xHTMLImageHandler= new XHTMLImageHandlerDefault(XHTMLImporter);
XHTMLImporter.setXHTMLImageHandler(xHTMLImageHandler);

wordMLPackage.getMainDocumentPart().addAltChunk(AltChunkType.Xhtml, html.getBytes(Charset.forName("UTF-8")));
// System.out.println(XmlUtils.marshaltoString(wordMLPackage.getMainDocumentPart().getJaxbElement(), true, true));
Path path = Paths.get(System.getProperty("user.dir") + "/../../portlets/documentsstore").toRealPath();
File parentFile = new File(path+ "/OUT_from_XHTML.docx");


Docx4J.save(wordMLPackage, parentFile, Docx4J.FLAG_EXPORT_PREFER_XSL);
FileInputStream fileInputStream = new FileInputStream(parentFile.getAbsolutePath());
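
To check whether the saved file really embeds its images or only links to them, the docx can be inspected as a ZIP package. The following is only a rough sketch using java.util.zip: it assumes that embedded images end up as entries under word/media/, which is the usual location but not guaranteed. The class name DocxMediaCheck and the method listEmbeddedMedia are made up for this example and are not part of docx4j.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of docx4j.
public class DocxMediaCheck {

    // Returns the names of all file entries under word/media/ in the given docx.
    // Assumption: embedded images are stored under word/media/.
    public static List<String> listEmbeddedMedia(Path docx) throws IOException {
        List<String> media = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith("word/media/")) {
                    media.add(entry.getName());
                }
            }
        }
        return media;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used above if no path is passed.
        Path docx = Paths.get(args.length > 0 ? args[0] : "OUT_from_XHTML.docx");
        List<String> media = listEmbeddedMedia(docx);
        if (media.isEmpty()) {
            System.out.println("No entries under word/media/, so the images are probably not embedded.");
        } else {
            media.forEach(System.out::println);
        }
    }
}
```

If the list comes back empty for the generated file, that would be consistent with the images being referenced from outside the package rather than stored inside it.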

I would appreciate your help :)
Thanks in advance
Carlos