how to convert docx to pdf?

Posted: Mon Oct 19, 2009 6:31 am
by eric
I want to convert a "docx" file to "pdf". I tried the docx4j sample, but it didn't work. Can anybody show me an example, or does anyone have other good suggestions? Thanks!

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 9:31 am
by jason
eric wrote: I want to convert a "docx" file to "pdf". I tried the docx4j sample, but it didn't work. Can anybody show me an example, or does anyone have other good suggestions?


Hi, what went wrong when you tried it? Basic documents should work, but it's quite possible your documents contain features we don't handle yet.

We have three broad ways of going from docx to PDF:

  • via HTML
  • via XSL FO
  • via iText

In the sample, you configure this by commenting out the 2 lines you don't want:

Code:
         org.docx4j.convert.out.pdf.PdfConversion c
//            = new org.docx4j.convert.out.pdf.viaHTML.Conversion(wordMLPackage);
            = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
//            = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);


The different methods have their own strengths and weaknesses:
Code:
         /*
          * .. viaHTML uses docX2HTML.xslt and xhtmlrenderer,
          *    and supports numbering, images,
          *    and tables, but the XSLT is pretty hard to understand
          *   
          * .. viaXSLFO uses docx2fo.xslt and FOP.  It is
          *    rudimentary right now, but does support
          *    headers/footers, images and fairly basic tables
          *    (but supporting merged cells)
          *   
          * .. viaIText - for developers who don't like xslt
          *    at all! Or want to use iText's features..
          *    Displays images, but as at 2009 03 19
          *    doesn't try to scale them.
          */


If you use the viaHTML approach, please be aware that we also have "next generation" HTML converters, which can be used in place of the XSLT one. As far as HTML output is concerned, these are where our attention is.

cheers .. Jason

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 9:50 am
by eric
Thanks for your reply!

org.docx4j.convert.out.pdf.PdfConversion c
// = new org.docx4j.convert.out.pdf.viaHTML.Conversion(wordMLPackage);
= new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
// = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);


After using the code above to convert the docx, I found that the class "PdfConversion" only seems to have the method "view". What I really want is to save the PDF to the local filesystem. How can I do that? Which methods can I use?

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 10:05 am
by jason
eric wrote: After using the code above to convert the docx, I found that the class "PdfConversion" only seems to have the method "view". What I really want is to save the PDF to the local filesystem. How can I do that? Which methods can I use?


If you want to use docx4j, you need to be prepared to spend a bit of time understanding the code.

The sample contains:

Code:
      if (save) {
         OutputStream os = new java.io.FileOutputStream(inputfilepath + ".pdf");
         c.output(os);
         System.out.println("Saved " + inputfilepath + ".pdf");
      } else {
         c.view();
      }


You can see from that snippet that PdfConversion has a method output(OutputStream os).

In fact that snippet shows you how to save to the local filesystem, which, if I'm understanding you correctly, is what you want to do.
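
One detail the sample snippet leaves out is closing the stream once the conversion has finished. The sketch below shows one way to do that with try-with-resources. It is only an illustration: StreamWriter is a hypothetical stand-in for anything that exposes output(OutputStream), such as the c.output(os) call above, so that the example runs without docx4j on the classpath.

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SaveSketch {

    /** Hypothetical stand-in for a converter with an output(OutputStream) method. */
    public interface StreamWriter {
        void output(OutputStream os) throws Exception;
    }

    /** Writes to the target file and closes the stream even if output() throws. */
    public static void save(StreamWriter writer, Path target) throws Exception {
        try (OutputStream os = Files.newOutputStream(target)) {
            writer.output(os);
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("sketch", ".pdf");
        // Placeholder bytes only; a real converter would write the PDF here.
        save(os -> os.write("placeholder".getBytes(StandardCharsets.UTF_8)), target);
        System.out.println("Saved " + target + " (" + Files.size(target) + " bytes)");
    }
}
```

With docx4j on the classpath, the lambda would presumably be replaced by something like os -> c.output(os), where c is the PdfConversion from the sample.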

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 10:27 am
by eric
Thanks, jason! The app now runs successfully, but something is still wrong:
Code:
public static void main(String[] args) throws Exception {
      WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
            .load(new java.io.File("f:/2.docx"));

      OutputStream os = new java.io.FileOutputStream("f:/2.pdf");

      org.docx4j.convert.out.pdf.PdfConversion c
//       = new org.docx4j.convert.out.pdf.viaHTML.Conversion(wordMLPackage);
//       = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
      = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);
      c.output(os);
   }

This is my code.

[Attachment: 1.JPG, captioned "my lib"]

These are the libraries in my project. When I run the app there is no error and the PDF is generated, but there is nothing in it!
Then I changed the conversion method from "org.docx4j.convert.out.pdf.PdfConversion c = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage)" to viaXSLFO, but some errors occur:
Code:
java.lang.NoClassDefFoundError: org/apache/batik/util/XMLResourceDescriptor
   at org.apache.fop.fo.extensions.svg.SVGElementMapping.initialize(SVGElementMapping.java:80)
   at org.apache.fop.fo.ElementMapping.getTable(ElementMapping.java:54)
   at org.apache.fop.fo.ElementMappingRegistry.addElementMapping(ElementMappingRegistry.java:118)
   at org.apache.fop.fo.ElementMappingRegistry.addElementMapping(ElementMappingRegistry.java:97)
   at org.apache.fop.fo.ElementMappingRegistry.setupDefaultMappings(ElementMappingRegistry.java:78)
   at org.apache.fop.fo.ElementMappingRegistry.<init>(ElementMappingRegistry.java:65)
   at org.apache.fop.apps.FopFactory.<init>(FopFactory.java:149)
   at org.apache.fop.apps.FopFactory.newInstance(FopFactory.java:172)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:192)
   at test.TestConvertPdf.main(TestConvertPdf.java:21)


Are some libraries missing from my project?
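
A NoClassDefFoundError like the one above generally means a class that was needed at runtime could not be loaded; here the trace names org/apache/batik/util/XMLResourceDescriptor, a class from Apache Batik, requested by FOP. A quick way to see which classes are missing is to probe for them by name. The following is a minimal sketch using only the JDK; the two class names are taken from the stack trace, and ClasspathCheck and isOnClasspath are names made up for this example.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be loaded, without running its static initializers. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in this thread.
        String[] required = {
            "org.apache.batik.util.XMLResourceDescriptor",
            "org.apache.fop.apps.FopFactory"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath as the failing application; any class reported as MISSING points at a jar that still needs to be added.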

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 10:36 am
by eric
Code:
7:45:41,437 ERROR XmlUtils:678 - java.lang.NoSuchMethodException: For extension function, could not find method static org.docx4j.convert.out.pdf.viaXSLFO.Conversion.hasDefaultHeader([ExpressionContext,] #UNKNOWN (org.docx4j.openpackaging.packages.WordprocessingMLPackage)).

I just found this error when running the application!

Re: how to convert docx to pdf?

Posted: Mon Oct 19, 2009 12:58 pm
by jason
That was using viaHTML, right, with the old (non-NG) DocX2Html.xslt?

I've fixed that now. But the sample code hasn't used that for a long time, so unless you specifically changed to that by setting useHtmlExporterNG to false .... what docx4j.jar are you using? You should use http://dev.plutext.org/docx4j/docx4j-2.2.2/docx4j.jar or the most recent nightly, which is currently http://dev.plutext.org/docx4j/docx4j-ni ... 091013.jar
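
To answer the "which docx4j.jar are you using?" question, it can help to ask the JVM where a class was actually loaded from. The sketch below uses only standard JDK calls; WhichJar and locationOf are names made up for this example, and the default class name is the one that appears in the error message earlier in the thread.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns the location a class was loaded from, or a note when none is recorded. */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source; likely a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.docx4j.openpackaging.packages.WordprocessingMLPackage";
        try {
            Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " loaded from " + locationOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the printed location is an older jar than expected, a stale copy is probably earlier on the classpath.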