Bullets getting converted to #, docx to pdf

Posted: Mon Feb 04, 2019 6:29 pm
by monika_thakran
Hi Jason,

I am using the docx4j jar to replace variables in a docx document and then convert it to PDF.

Below is the code:
public File generateDocument(InputStream inputStream, Map<String, String> params, String fileName, String tempDir) {
        OutputStream os = null;
        byte[] bytes = null;
        File file;
        Map<String, String> fields;
        try {
            tempDir = correctDirectoryPath(tempDir);
            bytes = IOUtils.toByteArray(inputStream);
            file = new File(tempDir + fileName + this.formatter.format(new Date()) + ".pdf");
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new ByteArrayInputStream(bytes));
            VariablePrepare.prepare(wordMLPackage);
            MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
            if (params != null) {
                fields = new HashMap<>(params);
                documentPart.variableReplace((HashMap<String, String>) fields);
            }
            Mapper fontMapper = new IdentityPlusMapper();
            wordMLPackage.setFontMapper(fontMapper);
            List<SectionWrapper> sectionWrappers = replaceHeaders(params, wordMLPackage);
            FieldUpdater updater = new FieldUpdater(wordMLPackage);
            updater.update(true);
            FOSettings foSettings = Docx4J.createFOSettings();
            foSettings.setWmlPackage(wordMLPackage);
            if (!file.exists()) {
                file.createNewFile();
            }
            os = new FileOutputStream(file);
            Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
            System.out.println("Done!");
            return file;
        }
        catch (Exception e) {
            logger.error("Some error occurred while processing document", e);
            return null;
        }
        finally {
            try {
                if (os != null) {
                    os.close();
                }
                if (inputStream != null) {
                    inputStream.close();
                }
            }
            catch (IOException e) {
                logger.error("Could not close input/output stream", e);
            }
        }
    }


But when I run this, the bullets in the docx document are converted to the # symbol in the PDF.

Attached are the template and the output file.

I am using version 3.2.0 of the docx4j jar.

Please help me understand what I am missing!

Re: Bullets getting converted to #, docx to pdf

Posted: Mon Feb 04, 2019 6:33 pm
by monika_thakran
Attaching the files now, as they didn't get attached initially.

Re: Bullets getting converted to #, docx to pdf

Posted: Mon Feb 04, 2019 6:37 pm
by monika_thakran
output file

Re: Bullets getting converted to #, docx to pdf

Posted: Tue Feb 05, 2019 3:42 pm
by monika_thakran
Can someone help me with this issue? It is a blocker for us in our production environment.

Re: Bullets getting converted to #, docx to pdf

Posted: Fri Feb 08, 2019 6:18 pm
by jason
You'll need to install the correct font on your system, or map it to a suitable physical font which is present.

In this case, your bullet is in the Symbol font, so the easiest fix would be to install that font.

I expect export-FO warns you:

WARN org.docx4j.fonts.fop.util.FopConfigUtil .declareFonts line 123 - Document font Symbol is not mapped to a physical font!
WARN org.apache.fop.apps.FOUserAgent .processEvent line 94 - Glyph "" (0xf0b7) not available in font "Times-Roman".


Or you can of course use a different bullet char in your docx (i.e. one for which you have the font installed).

By the way, we don't provide commercial-grade support for export-FO. If you need commercial-grade PDF output, in terms of both quality and support, you may wish to consider our commercial PDF Converter: https://converter-eval.plutext.com/ though the same principle about having a suitable font installed applies! :-)

Re: Bullets getting converted to #, docx to pdf

Posted: Fri Aug 23, 2019 10:04 am
by ramjibbk
Hi

We are also facing the same issue. Bullets are converted to # when converting docx to PDF. Please let me know if you have already found a solution for this.

Re: Bullets getting converted to #, docx to pdf

Posted: Fri Aug 23, 2019 10:08 am
by ramjibbk
It is blocking our production release, so please let us know if you have any information.

Re: Bullets getting converted to #, docx to pdf

Posted: Sat Aug 24, 2019 9:26 pm
by jason
The diagnosis/solution is as stated: install/use a suitable font. If you don't have the actual font, you can map to a different font which also contains the glyph.
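
To find candidate fonts for such a mapping, the sketch below lists the fonts visible to the JVM that claim a glyph for the code point named in the FOP warning (0xf0b7). It uses only the standard java.awt API. The class and method names are hypothetical and not part of docx4j. Note that 0xf0b7 lies in the Unicode Private Use Area, which is why ordinary text fonts such as Times-Roman usually have no glyph for it. The set of fonts AWT reports may also differ from what docx4j/FOP discovers, so treat the result as a hint only.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of docx4j. */
class BulletGlyphCheck {

    /** Code point reported in the FOP warning for the Symbol-font bullet. */
    static final int SYMBOL_BULLET = 0xF0B7;

    /** True if the code point belongs to a Unicode Private Use Area. */
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    /** Names of the fonts visible to this JVM that report a glyph for codePoint. */
    static List<String> fontsWithGlyph(int codePoint) {
        List<String> names = new ArrayList<>();
        for (Font font : GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts()) {
            if (font.canDisplay(codePoint)) {
                names.add(font.getFontName());
            }
        }
        return names;
    }
}
```

Calling `BulletGlyphCheck.fontsWithGlyph(BulletGlyphCheck.SYMBOL_BULLET)` on the machine that runs the conversion gives a list of names to try. One of them could then be looked up with docx4j's `PhysicalFonts.get(...)` and, if the result is not null, registered on the existing `fontMapper` with `fontMapper.put("Symbol", ...)`. This assumes that the name AWT reports matches the name docx4j uses for the same font, which is not guaranteed.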

Locking this topic. If this answer doesn't help, feel free to create another topic, posting a short docx exhibiting the issue. Alternatively, if you have a support contract with Plutext, feel free to email.