
Docx to Pdf But Chinese Character Show #,Help Please

Posted: Fri Apr 18, 2014 11:45 pm
by qingliang
Hi Jason,

I convert an HTML file to docx, and then convert the docx to PDF, but the Chinese characters in the PDF show up as #. I tried to fix it without success. Help, please!

Re: Docx to Pdf But Chinese Character Show #,Help Please

Posted: Fri Apr 18, 2014 11:47 pm
by qingliang
Sorry, this is my HTML file.

Re: Docx to Pdf But Chinese Character Show #,Help Please

Posted: Fri Apr 18, 2014 11:51 pm
by qingliang
My code:
--------------------------------------------------
import java.io.OutputStream;
import java.util.Date;

import org.docx4j.Docx4J;
import org.docx4j.convert.out.FOSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

Date startTime = new Date();

String inputfilepath = "e:/123news.docx";
String outputfilepath = "e:/PDF_" + new Date().getTime() + ".pdf";
String regex = null;
boolean saveFO = false; // set to true to also dump the intermediate XSL-FO

WordprocessingMLPackage wordMLPackage;

// Load .docx or Flat OPC .xml
System.out.println("Loading file from " + inputfilepath);

// try-with-resources closes the output stream even if the conversion fails
try (OutputStream os = new java.io.FileOutputStream(outputfilepath)) {
    wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));

    FOSettings foSettings = Docx4J.createFOSettings();

    // Maps each font named in the docx to a physical font found on this machine
    Mapper fontMapper = new IdentityPlusMapper();

    // URL ttcUrl = new URL("http://localhost/conf/fonts/SimSun.ttc");
    // PhysicalFonts.addPhysicalFont("宋体", ttcUrl);
    // PhysicalFont font
    // = PhysicalFonts.getPhysicalFonts().get("宋体");
    //
    // fontMapper.getFontMappings().put("宋体", font);
    wordMLPackage.setFontMapper(fontMapper);

    if (saveFO) {
        foSettings.setFoDumpFile(new java.io.File(inputfilepath + ".fo"));
    }
    foSettings.setWmlPackage(wordMLPackage);
    // Don't care what type of exporter you use
    // foSettings.setApacheFopConfiguration(apacheFopConfiguration)
    Docx4J.toFO(foSettings, os, Docx4J.FLAG_NONE);
} catch (Docx4JException e) {
    // Loading or conversion failed inside docx4j
    e.printStackTrace();
} catch (Exception e) {
    // Anything else, e.g. the output file could not be opened
    e.printStackTrace();
}

Re: Docx to Pdf But Chinese Character Show #,Help Please

Posted: Mon Apr 21, 2014 3:37 pm
by qingliang
I have fixed this issue. It was a font issue in the HTML to docx step.
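
The exact fix is not recorded in this thread. A "#" in the PDF typically means the font chosen for a run has no glyph for that character, so a first step is to find out which Chinese characters the source text contains and then check that the font assigned to them really covers them. The snippet below is only a rough sketch of that first step, using nothing but the JDK. The class name HanCheck and the method hanCharacters are made up for illustration and are not part of docx4j. The sample string uses \u escapes for two Chinese characters so that the source file stays plain ASCII.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper (not part of docx4j): lists the Han characters in a string.
public class HanCheck {

    /** Returns the distinct Han (Chinese) characters in text, in first-seen order. */
    public static Set<String> hanCharacters(String text) {
        Set<String> found = new LinkedHashSet<>();
        text.codePoints()
            // keep only code points whose Unicode script is Han
            .filter(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN)
            .forEach(cp -> found.add(new String(Character.toChars(cp))));
        return found;
    }

    public static void main(String[] args) {
        // U+65B0 and U+95FB are two Chinese characters, escaped to keep this file ASCII
        String sample = "News \u65b0\u95fb 2014";
        System.out.println("Han characters found: " + hanCharacters(sample).size());
    }
}
```

If this reports Han characters for text taken from the HTML, the font that ends up on those runs in the docx needs to be one that contains them (SimSun, for example), and that font has to be available to docx4j when the PDF is produced.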

Re: Docx to Pdf But Chinese Character Show #,Help Please

Posted: Fri Jul 25, 2014 6:54 pm
by aritz85
Hi,

I'm facing the same issue: Chinese characters show as '?' in the docx (I follow the same route as you, html->docx->pdf).

How did you solve this issue?

Thanks