
problem in docx to pdf conversion

Posted: Tue Feb 25, 2014 2:13 am
by francofabbri
Hi,
I'm looking for help with docx to PDF conversion.
I've attached the source file and the output of the conversion, which does not
match the expected result.

The code is listed below.

Any help or suggestions would be greatly appreciated.

TIA
Franco


Code:
import java.io.OutputStream;

import org.docx4j.Docx4J;
import org.docx4j.convert.out.FOSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.samples.AbstractSample;


public class ConvertOutPDF extends AbstractSample  {
   public static void main(String[] args)    throws Exception {

      inputfilepath = "C:/temp/test.docx";
      String regex = null;
      PhysicalFonts.setRegex(regex);

      WordprocessingMLPackage wordMLPackage;
      System.out.println("Loading file from " + inputfilepath);
      wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));

      Mapper fontMapper = new IdentityPlusMapper();
      wordMLPackage.setFontMapper(fontMapper);

      PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Arial Unicode MS");
      fontMapper.getFontMappings().put("Times New Roman", font);

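      FOSettings foSettings = Docx4J.createFOSettings();
      foSettings.setWmlPackage(wordMLPackage);
      String outputfilepath = inputfilepath + ".pdf";
      OutputStream os = new java.io.FileOutputStream(outputfilepath);
      // The snippet as posted stops before the conversion call; this line is
      // assumed here, following the docx4j ConvertOutPDF sample.
      Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
      os.close();
   }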
}

Re: problem in docx to pdf conversion

Posted: Tue Feb 25, 2014 2:19 am
by francofabbri
Hi, I just want to add that the conversion seems to fail
in header and footer processing.

cheers
ff

Re: problem in docx to pdf conversion

Posted: Wed Feb 26, 2014 6:05 pm
by jason
Hi Franco

There are at least 4 issues I notice:

1. header overlaps content
2. Arabic characters displayed as #
3. absolutely positioned image top right
4. top right image is of type EMF (which isn't supported)
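
For issue 4, one way to find out which images in a document are EMF is to look inside the docx itself, since a docx is a zip archive. The following is only a minimal sketch using the JDK's zip classes, not part of docx4j. The class name FindEmfImages and the method findEmfParts are made up for illustration, and it assumes the images sit under word/media/ with an .emf extension, which is what Word normally writes but is not guaranteed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of docx4j.
public class FindEmfImages {

    // Returns the names of all parts under word/media/ whose name ends in .emf.
    // Assumption: images are identified by location and file extension only.
    static List<String> findEmfParts(String docxPath) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docxPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                String lower = name.toLowerCase(Locale.ROOT);
                if (lower.startsWith("word/media/") && lower.endsWith(".emf")) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java FindEmfImages <file.docx>");
            return;
        }
        for (String name : findEmfParts(args[0])) {
            System.out.println("EMF image: " + name);
        }
    }
}
```

Any part it lists would be a candidate for replacing with a PNG or JPEG in Word before converting.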

I'll look into and fix issue 2 (possibly tomorrow), and I guess you could avoid EMF images, but faithfully reproducing your content in PDF is a little beyond the capability of our PDF output at present. May I suggest you try LibreOffice or OpenOffice, to see whether their PDF output is good enough for you?
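
If you want to try the LibreOffice route from Java, it can be run headless as an external process. This is a rough sketch, not a tested recipe: it assumes LibreOffice is installed and that its soffice executable is on the PATH (on Windows you would probably need the full path to soffice.exe). The class LibreOfficePdf and its methods are hypothetical names. The --headless, --convert-to and --outdir options are LibreOffice's; OpenOffice does not necessarily accept the same command line.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the LibreOffice command line.
public class LibreOfficePdf {

    // Builds the argument list for a headless docx-to-PDF conversion.
    static List<String> buildCommand(String sofficePath, File input, File outDir) {
        return Arrays.asList(
                sofficePath,
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.getPath(),
                input.getPath());
    }

    // Runs the conversion and returns the exit code of the soffice process.
    static int convert(String sofficePath, File input, File outDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(sofficePath, input, outDir))
                .inheritIO()   // show LibreOffice's own messages on the console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java LibreOfficePdf <file.docx> <output dir>");
            return;
        }
        // Assumption: "soffice" resolves via the PATH.
        int exit = convert("soffice", new File(args[0]), new File(args[1]));
        System.out.println("soffice exited with code " + exit);
    }
}
```

The PDF is written into the output directory under the input file's base name, so you can compare it side by side with the docx4j output.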