PDF not generated for all merged files

Posted: Fri Apr 29, 2016 5:38 am
by sri_p
Hello all,
I was able to combine all my docx files using the altChunk method, which I found by googling. However, when I convert the merged document to PDF, only the first document is exported.
Any ideas? There are no errors in the log.

Thanks
Code:
   public static InputStream mergeDocx(final List<InputStream> streams) throws Docx4JException, IOException {

      WordprocessingMLPackage target = null;
      final File generated = File.createTempFile("generated", ".docx");

      int chunkId = 0;
      for (InputStream is : streams) {
         if (is == null) {
            continue;
         }
         if (target == null) {
            // Copy the first (master) document to the temp file and load it
            try (OutputStream os = new FileOutputStream(generated)) {
               os.write(IOUtils.toByteArray(is));
            }
            target = WordprocessingMLPackage.load(generated);
         } else {
            // Attach each remaining document as an altChunk (alternative format input part)
            insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(is), chunkId++);
         }
      }

      if (target == null) {
         return null;
      }
      // Note: the saved docx embeds the other documents as altChunks;
      // their content is not merged into the main document part itself
      target.save(generated);
      return new FileInputStream(generated);
   }
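
   // Embeds one docx as an altChunk in the main document part.
   // CONTENT_TYPE is a constant defined elsewhere in this class (not shown here).
   private static void insertDocx(MainDocumentPart main, byte[] bytes, int chunkId) {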
      try {
         AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(
               new PartName("/part" + chunkId + ".docx"));
         afiPart.setContentType(new ContentType(CONTENT_TYPE));
         afiPart.setBinaryData(bytes);
         Relationship altChunkRel = main.addTargetPart(afiPart);

         CTAltChunk chunk = Context.getWmlObjectFactory().createCTAltChunk();
         chunk.setId(altChunkRel.getId());

         main.addObject(chunk);         
      } catch (Exception e) {
         e.printStackTrace();
      }
   }

   private static void generatePdf(File inputFile, String outputFolderPath) throws Exception {

      // Font regex (optional)
      // Set regex if you want to restrict to some defined subset of fonts
      // Here we have to do this before calling createContent,
      // since that discovers fonts
      String regex = null;
      // Windows:
      // String
      // regex=".*(calibri|camb|cour|arial|symb|times|Times|zapf).*";
      // regex=".*(calibri|camb|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";
      // Mac
      // String
      // regex=".*(Courier New|Arial|Times New Roman|Comic
      // Sans|Georgia|Impact|Lucida Console|Lucida Sans Unicode|Palatino
      // Linotype|Tahoma|Trebuchet|Verdana|Symbol|Webdings|Wingdings|MS Sans
      // Serif|MS Serif).*";
      PhysicalFonts.setRegex(regex);

      // Document loading (required)
      WordprocessingMLPackage wordMLPackage;

      // Load .docx or Flat OPC .xml
      System.out.println("Loading file from " + inputFile.getName());
      wordMLPackage = WordprocessingMLPackage.load(inputFile);
      

      // Refresh the values of DOCPROPERTY fields
      FieldUpdater updater = new FieldUpdater(wordMLPackage);
      updater.update(true);

      String outputfilepath = outputFolderPath + "merged.pdf";

      // Since 3.3.0, Plutext's PDF Converter is used by default
      System.out.println("Using Plutext's PDF Converter; add docx4j-export-fo if you don't want that");

      // All methods write to an output stream; close it so the PDF is flushed to disk
      try (OutputStream os = new java.io.FileOutputStream(outputfilepath)) {
         Docx4J.toPDF(wordMLPackage, os);
      }

      System.out.println("Saved: " + outputfilepath);

   }


Re: PDF not generated for all merged files

Posted: Fri Apr 29, 2016 9:06 pm
by jason
With version 3.3.0, there are 2 ways to get PDF output.

Way #1, the default, is using Plutext's commercial PDF Converter. If you are doing it this way, it should work, since the converter handles altChunk. If you are doing it this way and it is not working, please send me your input docx with the altChunks (save it with Docx4J.save) so I can investigate.

Way #2 is via XSL FO, the same as in previous versions of docx4j. It will be used if docx4j-export-fo is on your classpath. This route doesn't process altChunks of type docx, which would explain why only the first document appears in the PDF. You could process the docx altChunks using MergeDocx, which is in the Enterprise Ed.
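
If it helps to confirm that the docx being converted still contains unprocessed altChunks, here is a minimal diagnostic sketch. It uses only java.util.zip rather than docx4j, and it assumes the main document part is stored at word/document.xml with the usual w: namespace prefix (both are typical but not guaranteed). The class name AltChunkCheck is made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of docx4j
public class AltChunkCheck {

    /**
     * Returns true if word/document.xml inside the given docx (zip) stream
     * contains a w:altChunk element. Assumes the conventional part name
     * and the "w" namespace prefix.
     */
    public static boolean hasAltChunk(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Read the whole main document part into memory
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String xml = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                return xml.contains("<w:altChunk");
            }
        }
        // No main document part found at the assumed location
        return false;
    }
}
```

Calling AltChunkCheck.hasAltChunk(new FileInputStream(generated)) on the merged file before conversion should return true for a document built with the altChunk approach above. In that case the XSL FO route will skip the embedded documents.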

Re: PDF not generated for all merged files

Posted: Sat Apr 30, 2016 12:55 am
by sri_p
Thanks Jason.
On a side note, I tried 3.3.0 version with FO but got method not found exceptions. FOP 2.1 jar was included in the dependencies.
I guess conversion code is still referring to FOP 1.1.

Re: PDF not generated for all merged files

Posted: Sat Apr 30, 2016 6:49 am
by jason
sri_p wrote: 3.3.0 version with FO but got method not found exceptions


What exceptions did you get?

If you're using maven, the export-FO dependency should pull in everything required.

If you are not using Maven, you need to add the same jars manually. The jars are these:

Code:
    <path id="docx4j-export-fo.classpath">

        <pathelement location="${m2Repository}/org/docx4j/docx4j-export-fo/3.3.0/docx4j-export-fo-3.3.0.jar"/>
        <pathelement location="${m2Repository}/org/plutext/jaxb-xslfo/1.0.1/jaxb-xslfo-1.0.1.jar"/>
       
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/fop/2.1/fop-2.1.jar"/>
         
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-svg-dom/1.7/batik-svg-dom-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-anim/1.7/batik-anim-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-css/1.7/batik-css-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-dom/1.7/batik-dom-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-parser/1.7/batik-parser-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-util/1.7/batik-util-1.7.jar"/>

        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-bridge/1.7/batik-bridge-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-script/1.7/batik-script-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-js/1.7/batik-js-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-xml/1.7/batik-xml-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-awt-util/1.7/batik-awt-util-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-gvt/1.7/batik-gvt-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-transcoder/1.7/batik-transcoder-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-svggen/1.7/batik-svggen-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-extension/1.7/batik-extension-1.7.jar"/>
        <pathelement location="${m2Repository}/org/apache/xmlgraphics/batik-ext/1.7/batik-ext-1.7.jar"/>
       
   </path>


They are in the 3.3.0 zip file, in the directory optional/export-fo.
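
Method-not-found errors often mean an older copy of a jar is being picked up ahead of the intended one. As a rough way to check, the sketch below reports where a class is actually loaded from. org.apache.fop.apps.FopFactory is a real FOP class; the helper class ClasspathCheck and its locate method are made up for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper for diagnosing classpath conflicts
public class ClasspathCheck {

    /**
     * Returns the location (usually a jar URL) the named class is loaded from,
     * a placeholder string for JDK classes, or null if the class cannot be loaded.
     */
    public static String locate(String className) {
        try {
            // Load without running static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            URL location = (src != null) ? src.getLocation() : null;
            return (location != null) ? location.toString() : "(JDK or unknown source)";
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Shows which FOP jar wins on the current classpath, if any
        System.out.println("FopFactory loaded from: " + locate("org.apache.fop.apps.FopFactory"));
    }
}
```

If the printed location is a FOP 1.x jar rather than fop-2.1.jar, an older FOP is shadowing the one listed above.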