Plutext

Posted: **Wed Apr 07, 2010 10:31 pm**

I have tried export to PDF and it appears to work well for the most part, but there are a couple of things that I would like some help with.

Currently, field codes are not supported. I would like to look at contributing some code to support field codes in the export to PDF process. At the moment, when a field is encountered, it prints UNSUPPORTED in bold red letters. Can you give me some advice on where to start?

Also, images in PDF: I exported a docx which contained 2 images, but the resulting PDF did not have the images in it. Is this a bug, or are images not supported either?

Thank you for your help in advance.
Regards,
Stuart Ledwich

Posted: **Wed Apr 07, 2010 10:50 pm**

stuart.ledwich wrote: Currently, field codes are not supported. I would like to look at contributing some code to support field codes in the export to PDF process. At the moment, when a field is encountered, it prints UNSUPPORTED in bold red letters. Can you give me some advice on where to start?

You need a template which matches the field in http://dev.plutext.org/svn/docx4j/trunk ... cx2fo.xslt
(The red letters UNSUPPORTED are the default template)

You'll probably want to call a Java extension function to do the actual processing. See http://dev.plutext.org/svn/docx4j/trunk ... rsion.java about half way down for examples of these. An effective pattern is to feed the DOM node into the extension function, where it can be converted to a JAXB object and processed via docx4j. The extension can then return a DOM node, or a string as appropriate (probably a string for field resolution).
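As a rough illustration of that pattern (DOM node in, string out), here is a minimal sketch that uses only the JDK's DOM classes so it can be run on its own. The class name `FieldExtensionSketch` and the method `resolveField` are hypothetical and are not part of docx4j. A real extension would convert the node to a JAXB object and process it via docx4j, as described above, rather than reading attributes directly. The sketch assumes a simple field (`w:fldSimple`) whose instruction is in the `w:instr` attribute and whose cached result is in its child runs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FieldExtensionSketch {

    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /**
     * Hypothetical extension function: receives the DOM node of a simple
     * field and returns the text that should appear in the output.
     */
    static String resolveField(Node fieldNode) {
        if (!(fieldNode instanceof Element)) {
            return "";
        }
        Element field = (Element) fieldNode;
        String instr = field.getAttributeNS(W_NS, "instr").trim();
        if (instr.isEmpty()) {
            return "";
        }
        // The first token of the instruction is the field type, e.g. AUTHOR.
        String fieldType = instr.split("\\s+")[0].toUpperCase();
        if (fieldType.equals("QUOTE")) {
            // QUOTE "some text": the result is the quoted literal itself.
            int first = instr.indexOf('"');
            int last = instr.lastIndexOf('"');
            if (last > first) {
                return instr.substring(first + 1, last);
            }
        }
        // Fallback: use the cached result stored in the field's runs.
        return field.getTextContent().trim();
    }

    /** Helper for the demo: parses a namespace-aware XML fragment. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse field XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:fldSimple xmlns:w='" + W_NS + "' w:instr=' AUTHOR '>"
                   + "<w:r><w:t>Stuart Ledwich</w:t></w:r></w:fldSimple>";
        System.out.println(resolveField(parse(xml)));
    }
}
```

Returning a string keeps the template side trivial, since the XSLT only has to emit the value. Returning a DOM node would only be needed if the field result had to carry formatting.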

stuart.ledwich wrote: Also, images in PDF: I exported a docx which contained 2 images, but the resulting PDF did not have the images in it. Is this a bug, or are images not supported either?

Images are generally supported. What type of image is it? You could attach a docx with just the image in it to this thread, or email it to me (jason@plutext.org) and I'll take a look.

Posted: **Thu Apr 08, 2010 1:37 am**

Thanks for coming back to me so quickly. I have included a simple docx to try. I have exported it and the text appears, but no image. Thanks again.

Posted: **Thu Apr 08, 2010 2:42 am**

Just to flesh out my previous post.

I have also included the output PDF document.

Here is the code I used to convert the document:

```java
import java.io.File;
import java.io.OutputStream;
import java.util.Date;

import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

String inputfilepath = "/home/sledwich/temp/Nice blue hills";
Date now = new Date();

WordprocessingMLPackage wordMLPackage =
    WordprocessingMLPackage.load(new File(inputfilepath + ".docx"));

// Fonts identity mapping - best on Microsoft Windows
wordMLPackage.setFontMapper(new IdentityPlusMapper());

// Set up converter
org.docx4j.convert.out.pdf.PdfConversion c =
    new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);

// Write to output stream
OutputStream os = new java.io.FileOutputStream("/tmp/test.pdf");
c.output(os);
```

Posted: **Thu Apr 08, 2010 9:52 am**

Thanks for this report. Now fixed in SVN. http://dev.plutext.org/trac/docx4j/changeset/1118
That was a regression introduced post v2.3.0 when I refactored the image handling extensions.

Posted: **Thu Apr 08, 2010 11:10 am**

Thank you - that fixed the problem.

I'm going to start looking at the field codes tomorrow. Thanks again.

Posted: **Thu Apr 08, 2010 12:30 pm**

Great, it would be good to have field support in the exporters. (If you can make it work for PDF, I'll easily be able to add the same functionality to HTML.)

Posted: **Fri Apr 23, 2010 7:12 pm**

I have made a very small change to hide certain fields in the docx. This is largely because fields also put their content into the document body, so the field itself has very little function in the PDF output. This patch simply hides them.

It is contributed on the basis of the document at http://dev.plutext.org/docx4j/docx4j_In ... butor.docx

Another problem with Images
I also seem to have hit another problem with images. The difference seems to be the tag the images are stored in: you previously fixed an image problem for me on this thread, but this image appears in a w:pict tag, which does not appear to work. I have included an example that shows the problem.

Hope you can help. Thank you very much for all your assistance so far.

Posted: **Sat Apr 24, 2010 1:23 am**

Hello Stuart

Thanks for reporting the problem with E10 images.

http://dev.plutext.org/trac/docx4j/changeset/1122 fixes this. The key line is:

```java
String imgRelId = converter.imageData.getOtherAttributes().get(
    new QName("http://schemas.openxmlformats.org/officeDocument/2006/relationships", "id"));
// NB r:id is not given by getId()!
```
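To show why the namespace-qualified lookup matters, here is a small standalone sketch using only the JDK. It assumes the extra attributes are held in a `Map<QName, String>`, as the `getOtherAttributes().get(new QName(...))` call above suggests. The class name `RelIdLookupSketch` and the sample values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;

class RelIdLookupSketch {

    static final String R_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    /** Returns the r:id value from a map of extra attributes, or null if absent. */
    static String relationshipId(Map<QName, String> otherAttributes) {
        // QName equality compares namespace URI and local part; the prefix is ignored.
        return otherAttributes.get(new QName(R_NS, "id"));
    }

    public static void main(String[] args) {
        Map<QName, String> attrs = new HashMap<>();
        // An attribute stored with the "r" prefix, as it might come out of parsing.
        attrs.put(new QName(R_NS, "id", "r"), "rId5");
        attrs.put(new QName("title"), "Nice blue hills");

        // Qualified lookup finds the relationship id.
        System.out.println(relationshipId(attrs));
        // An unqualified "id" is a different attribute and is not in the map.
        System.out.println(attrs.get(new QName("id")));
    }
}
```

The unqualified `id` and the relationships-namespace `id` are distinct names, which is consistent with `getId()` not returning the `r:id` value.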

cheers .. Jason

Posted: **Thu May 13, 2010 11:41 pm**

Hi Jason,

I'm trying to convert this doc, which contains a picture in the header. Do you know if this is something that we can convert to PDF?

Posted: **Sat May 15, 2010 1:20 am**

Hi Stuart

In principle, it should work.

Where docx2fo.xslt says something like

```xml
<xsl:apply-templates select="java:org.docx4j.model.structure.HeaderFooterPolicy.getFirstHeader($wmlPackage)"/>
```

it is fetching the XML for the header, and applying the templates to it. This should include image related templates.

If it's not working, to find out where things are going wrong, set log4j logging for Conversion to DEBUG, or call its setSaveFO method before running the conversion.

That way you'll be able to see the intermediate XSL FO file, where you can look to see what it has produced in the header.

cheers .. Jason

Posted: **Tue May 18, 2010 1:31 am**

Jason,

Thanks for your reply. I tried it, but I seem to get an exception every time I try to convert to PDF. It's hitting the following exception:

```
8070 [main] ERROR docx4j.XmlUtils - java.lang.ClassCastException: org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart cannot be cast to org.docx4j.openpackaging.parts.WordprocessingML.BinaryPartAbstractImage ; Line#: 505; Column#: 18
javax.xml.transform.TransformerException: java.lang.ClassCastException: org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart cannot be cast to org.docx4j.openpackaging.parts.WordprocessingML.BinaryPartAbstractImage
```

The full exception output has been added as an attachment. Thanks.

Attachment: exception.txt.zip (full exception output, 1.01 KiB)

Posted: **Thu May 20, 2010 12:33 am**

Hi Stuart

The problem is that org.docx4j.model.images.WordXmlPictureE10.handleImageRel is assuming the image is a rel of the main document:

```java
BinaryPartAbstractImage part = (BinaryPartAbstractImage) wmlPackage.getMainDocumentPart()
        .getRelationshipsPart().getPart(rel);
```

whereas in this case it is a rel of the header.

We need a way to pass in the source part. This means passing a parameter through the templates, or better, defining a modelState to keep track of which part the XSLT is currently processing.
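The idea can be sketched with plain Java stand-ins. Everything below is hypothetical: `Part`, `ImagePart`, `ModelState`, and `resolveImage` are simplified illustrations and not the docx4j classes, and it is not how changesets 1127 and 1128 necessarily implemented it. The sketch assumes each part has its own relationship ids, so the same id can point at an image in the header and at something else (for example the styles part) in the main document.

```java
import java.util.HashMap;
import java.util.Map;

class SourcePartSketch {

    /** Hypothetical stand-in for a package part with its own relationships. */
    static class Part {
        final String name;
        final Map<String, Object> rels = new HashMap<>();
        Part(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for an image part. */
    static class ImagePart {
        final String target;
        ImagePart(String target) { this.target = target; }
    }

    /** Hypothetical state object: which part the XSLT is currently processing. */
    static class ModelState {
        private Part sourcePart;
        void setSourcePart(Part part) { this.sourcePart = part; }
        Part getSourcePart() { return sourcePart; }
    }

    /** Resolves the relationship against the current source part, not the main document. */
    static ImagePart resolveImage(ModelState state, String relId) {
        Part source = state.getSourcePart();
        Object target = source.rels.get(relId);
        if (!(target instanceof ImagePart)) {
            throw new IllegalStateException(
                relId + " in " + source.name + " does not point to an image");
        }
        return (ImagePart) target;
    }

    public static void main(String[] args) {
        Part mainDocument = new Part("document.xml");
        mainDocument.rels.put("rId1", "styles.xml");            // not an image
        Part header = new Part("header1.xml");
        header.rels.put("rId1", new ImagePart("media/image1.png"));

        ModelState state = new ModelState();
        state.setSourcePart(header);   // set before applying templates to the header
        System.out.println(resolveImage(state, "rId1").target);
    }
}
```

Keeping the current part in a state object avoids threading an extra parameter through every template that might reach an image.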

I've got a fair bit on my plate at the moment, so I'm not sure how soon I'll get to this, even though it is probably only an hour's work. I'll see if I can look at it before next week.

.. Jason

Posted: **Mon May 24, 2010 1:46 am**

Implemented in http://dev.plutext.org/trac/docx4j/changeset/1127
and 1128

Export to PDF
