Fonts not rendered in PDF output

Posted: Thu Jun 19, 2014 3:57 pm
by tariq47
Hi Jason,
First of all, let me thank you for this extremely useful library.
I am having a problem with fonts when a docx is converted to PDF, and also when it is converted to XHTML. The output does not pick up any of the document's fonts and falls back to a single font for everything. I am sure I am missing something, so I would appreciate it if you could take a look and guide me.
I am on a Windows 7 machine right now, but the code will ultimately be migrated to a Linux machine in production.
I am attaching the log output for the PDF conversion (zipped because of the maximum upload size), the code used for the conversion, the docx file, and the resulting PDF.

Code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.docx4j.Docx4J;
import org.docx4j.convert.out.FOSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

static void convertDocxToPdfViaDocx4j(String inputDocx, String destPdf) throws Exception {
      // Restrict physical font discovery to fonts matching this pattern
      String regex = null;
      regex=".*(calibri|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";
      PhysicalFonts.setRegex(regex);
      File docxFile = new File(inputDocx);
      WordprocessingMLPackage wordMLPackage = Docx4J.load(docxFile);
      // Set up font mapper
      Mapper fontMapper = new IdentityPlusMapper();
      //fontMapper.getFontMappings().putAll(PhysicalFonts.getPhysicalFonts());
      wordMLPackage.setFontMapper(fontMapper);
      FOSettings foSettings = Docx4J.createFOSettings();
      // Uncomment to save a dump of the intermediate FO
      //foSettings.setFoDumpFile(new java.io.File(destFilePath + ".fo"));
      foSettings.setWmlPackage(wordMLPackage);
      // Write the PDF; try-with-resources closes the stream even on failure
      try (OutputStream os = new FileOutputStream(destPdf)) {
         Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
      }
   }

Re: Fonts not rendered in PDF output

Posted: Tue Jun 24, 2014 10:25 pm
by jason
For general troubleshooting, first comment out the line:

Code:
regex=".*(calibri|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";


since it restricts the fonts being loaded.
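
To see why the pattern is restrictive, here is a minimal standalone sketch (plain java.util.regex, no docx4j). It assumes the pattern has to match a candidate font location as a whole string, which is an assumption about how the filter is applied, and the file locations in main are made up for illustration. FontRegexDemo and accepted are hypothetical names, not part of docx4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FontRegexDemo {
    // Pattern copied from the original post.
    static final String REGEX =
        ".*(calibri|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";

    // Returns the candidates that the pattern matches as a whole string.
    // ASSUMPTION: whole-string matching, as with String.matches.
    static List<String> accepted(List<String> candidates) {
        Pattern p = Pattern.compile(REGEX);
        List<String> out = new ArrayList<>();
        for (String c : candidates) {
            if (p.matcher(c).matches()) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical font file locations, for illustration only.
        List<String> files = Arrays.asList(
            "file:/C:/Windows/Fonts/tahoma.ttf",
            "file:/C:/Windows/Fonts/TAHOMA.TTF",
            "file:/C:/Windows/Fonts/consola.ttf");
        System.out.println(accepted(files));
    }
}
```

Under that assumption only the lower-case tahoma.ttf entry is accepted: the pattern is case-sensitive, and anything not named in the alternation is skipped entirely.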

However, in this case, I still see:

Code:
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'comic sans ms' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'courier new' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'georgia' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'lucida sans unicode' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'tahoma' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'times new roman' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'trebuchet ms' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'verdana' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'comic sans ms' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'comic sans ms' is not mapped to a physical font.
WARN org.docx4j.fonts.RunFontSelector .getPhysicalFont line 942 - Font 'comic sans ms' is not mapped to a physical font.


The problem is that your document contains font elements such as:

Code:
          <w:rFonts w:ascii="tahoma"/>
          <w:rFonts w:ascii="times new roman"/>
          <w:rFonts w:ascii="trebuchet ms"/>
          <w:rFonts w:ascii="verdana"/>


whereas these should be more like:

Code:
          <w:rFonts w:ascii="Tahoma" w:hAnsi="Tahoma" w:cs="Tahoma"/>
          <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
          <w:rFonts w:ascii="Trebuchet MS" w:hAnsi="Trebuchet MS"/>
          <w:rFonts w:ascii="Verdana" w:hAnsi="Verdana"/>


Moral of the story: don't guess or make up your font names (or did some other application create them?). Mimic what Word does.

That said, it looks like the physical font mappings might be case-sensitive. We ought to fix that, so I'm tracking this at https://github.com/plutext/docx4j/issues/120
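
If the font names come from another application and you can't fix them at the source, one possible stopgap is to normalize them to Word's spelling before converting. The following is only a rough sketch of the lookup part, in plain Java: FontNameNormalizer is a hypothetical helper, not a docx4j class, and the list of canonical names is something you would have to supply yourself. Applying the result to the w:rFonts elements in the document is not shown.

```java
import java.util.Map;
import java.util.TreeMap;

class FontNameNormalizer {
    // Canonical names indexed case-insensitively, so "tahoma" finds "Tahoma".
    private final Map<String, String> canonical =
        new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    FontNameNormalizer(String... names) {
        for (String n : names) {
            canonical.put(n, n);
        }
    }

    // Returns the canonical spelling, or the input unchanged if it is unknown.
    String normalize(String name) {
        String hit = canonical.get(name);
        return hit != null ? hit : name;
    }

    public static void main(String[] args) {
        // Names spelled as in the corrected w:rFonts examples above.
        FontNameNormalizer n = new FontNameNormalizer(
            "Tahoma", "Times New Roman", "Trebuchet MS", "Verdana");
        System.out.println(n.normalize("times new roman"));
        System.out.println(n.normalize("trebuchet ms"));
        System.out.println(n.normalize("No Such Font"));
    }
}
```

Unknown names are passed through untouched rather than guessed at, so a font that really is missing still shows up in the log warnings.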