
iText pdf conversion & appending PDFs

Posted: Sat Oct 08, 2011 1:47 am
by ovin08
I have tried a lot of libraries (PdfBox included) that claim to merge or concatenate PDFs, but all they do is add the second PDF as a new page after the first. I have attached 3 files that show what I am trying to do.
base.pdf is the iText Document that I have already started to generate dynamically. add.pdf is an example of the text I want to add, and result.pdf is what I want the final file to look like. The lengths of both base.pdf and add.pdf will vary, so they won't be the same for each document I process.

I have noticed that there is code in the docx4j-extras/PdfViaIText package that might be able to help me, but I'm not sure how to run it, and there are no examples that use it. If I simply drop docx4j-extras/PdfViaIText/org/docx4j/convert/out/pdf/viaIText/Conversion.java into the docx4j source, I get a compile error in the traverseBlockLevelContent method on line 191:
The method getEGContentBlockContent() is undefined for the type ContentAccessor

When I look through the SdtContentBlock file, I see the getEGContentBlockContent method, but the code appears to be looking for getEGContentBlockContent() on ContentAccessor instead. Is the return type of the getSdtContent() method in the SdtBlock file correct?
I think this Conversion.java file might help me along the way if I can get it running.

Thanks!

Re: docx to pdf conversion problem:Fop issue

Posted: Sat Oct 08, 2011 1:24 pm
by jason
ovin08 wrote: I have tried a lot of libraries (PdfBox included) that claim to merge or concatenate PDFs, but all they do is add the second PDF as a new page after the first. I have attached 3 files that show what I am trying to do.
base.pdf is the iText Document that I have already started to generate dynamically. add.pdf is an example of the text I want to add, and result.pdf is what I want the final file to look like. The lengths of both base.pdf and add.pdf will vary, so they won't be the same for each document I process.


So basically, you just want to append the contents of the second document flowing immediately on, without a page break?

No doubt you've googled "pdf remove page break"? The tricky part is putting the content into a single page object, with logic to start the next page once the current one is full. You could also try researching "reflow".
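
To make the page-filling idea concrete, here is a minimal sketch in plain Java with no PDF library involved. It assumes the content of both documents has already been broken into blocks of known height, which is the hard part with real PDFs; the class name ReflowSketch, the flow method, and all the heights are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "reflow": blocks from two documents are
// placed one after another, and a new page starts only when the current
// page has no room left for the next block.
final class ReflowSketch {

    // Returns one list of block heights per page, preserving block order.
    static List<List<Double>> flow(List<Double> blockHeights, double pageHeight) {
        List<List<Double>> pages = new ArrayList<>();
        List<Double> current = new ArrayList<>();
        double used = 0;
        for (double h : blockHeights) {
            // Break only when the block does not fit on a non-empty page.
            // A block taller than a whole page still gets a page to itself.
            if (!current.isEmpty() && used + h > pageHeight) {
                pages.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            current.add(h);
            used += h;
        }
        if (!current.isEmpty()) {
            pages.add(current);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Assumed block heights (in points) for the two source documents.
        List<Double> all = new ArrayList<>(Arrays.asList(300.0, 200.0)); // from base.pdf
        all.addAll(Arrays.asList(150.0, 400.0));                         // from add.pdf
        // The first block of add.pdf continues on the page base.pdf ended on.
        System.out.println(flow(all, 700.0));
    }
}
```

With these assumed heights the first block from add.pdf lands on the same page as the end of base.pdf, and only the last block spills onto a second page. A plain page-level concatenation would instead always start add.pdf on a fresh page.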

ovin08 wrote: I have noticed that there is code in the docx4j-extras/PdfViaIText package that might be able to help me, but I'm not sure how to run it, and there are no examples that use it. If I simply drop docx4j-extras/PdfViaIText/org/docx4j/convert/out/pdf/viaIText/Conversion.java into the docx4j source, I get a compile error in the traverseBlockLevelContent method on line 191.


From the CreatePdf sample:

    // As of docx4j 2.5.0, only viaXSLFO is supported.
    // The viaIText and viaHTML source code can be found in src/docx4j-extras directory

    org.docx4j.convert.out.pdf.PdfConversion c
    //      = new org.docx4j.convert.out.pdf.viaHTML.Conversion(wordMLPackage);
            = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
    //      = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);
 


http://www.docx4java.org/trac/docx4j/changeset/1682 makes the viaIText stuff compile against svn trunk tip. Note for other readers: In pom.xml, you'll need to uncomment the iText dependency.

As per the comment above, you're largely on your own with this (viaXSLFO is the standard docx4j way of doing PDF output), though if you decide to improve it, I'm happy to accept patches.

Keep us informed of your findings. Thanks, and good luck!