convert docx to pdf

Posted: Mon Oct 19, 2009 12:59 pm
by eric
Code:
10:11:28,140 DEBUG PhysicalFonts:254 - Processing physical font: file:/c:/windows/fonts/estre.ttf
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.fop.fonts.EmbedFontInfo.getPanose()Lorg/foray/font/format/Panose;
   at org.docx4j.fonts.PhysicalFont.<init>(PhysicalFont.java:56)
   at org.docx4j.fonts.PhysicalFonts.addPhysicalFont(PhysicalFonts.java:263)
   at org.docx4j.fonts.PhysicalFonts.discoverPhysicalFonts(PhysicalFonts.java:117)
   at org.docx4j.fonts.IdentityPlusMapper.<clinit>(IdentityPlusMapper.java:72)
   at org.docx4j.openpackaging.packages.WordprocessingMLPackage.getFontMapper(WordprocessingMLPackage.java:333)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.declareFonts(Conversion.java:106)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:201)
   at test.TestConvertPdf.main(TestConvertPdf.java:37)


org.docx4j.convert.out.pdf.PdfConversion c
= new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);

I used this method to convert the format, and this problem occurred. Why?

Re: convert docx to pdf

Posted: Mon Oct 19, 2009 1:22 pm
by jason
You either don't have fop-patched-0.95.756436.jar on your path, or you have the standard fop.jar there as well.
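
A NoSuchMethodError like the one above usually means the EmbedFontInfo class found at runtime does not have getPanose(), which fits an unpatched fop.jar being picked up first. A minimal sketch for checking which jar the class comes from, using only the JDK; the class name WhichJar and the helper locate are made up for this illustration and are not part of docx4j or FOP:

```java
import java.io.File;
import java.security.CodeSource;
import java.util.regex.Pattern;

final class WhichJar {
    /** Returns the location a class was loaded from, or a short note if that is unknown. */
    public static String locate(String className) {
        try {
            // initialize=false: we only want to find the class, not run its static blocks
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace
        System.out.println("EmbedFontInfo comes from: "
                + locate("org.apache.fop.fonts.EmbedFontInfo"));
        // List every classpath entry whose name mentions fop, to spot duplicates
        String cp = System.getProperty("java.class.path", "");
        for (String entry : cp.split(Pattern.quote(File.pathSeparator))) {
            if (entry.toLowerCase().contains("fop")) {
                System.out.println("classpath entry: " + entry);
            }
        }
    }
}
```

Run it with the same classpath as the failing program. If two fop jars are listed, or the reported location is not the patched jar, that would match the explanation above.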

Re: convert docx to pdf

Posted: Tue Oct 20, 2009 4:24 am
by eric
I have converted the docx to PDF, but there are some problems:
1) The table format is wrong: there are four rows in the docx, but only two rows are left after conversion to PDF.
2) The page breaks are ignored: there are four pages in the docx, but only one page is left after conversion to PDF.
3) Chinese is not supported.

Re: convert docx to pdf

Posted: Tue Oct 20, 2009 7:47 am
by jason
Hi Eric

eric wrote:1) The table format is wrong: there are four rows in the docx, but only two rows are left after conversion to PDF.


Are you using viaHTML? Make sure you are using NG; alternatively try viaXSLFO.

eric wrote:2) The page breaks are ignored: there are four pages in the docx, but only one page is left after conversion to PDF.


Try viaXSLFO. As mentioned yesterday, you might find one of these alternatives is a better starting point for you.

eric wrote:3) Chinese is not supported.


Is it a font problem, or an RTL-type problem?

If you are using viaXSLFO, you could look at the intermediate XSL FO which is generated, and see whether that looks right or not.
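
One way to judge whether the intermediate XSL FO "looks right" for fonts is to list the font-family values it actually uses. The following is only a sketch using the JDK's DOM parser; FoFontReport and fontFamilies are names invented here, and it assumes you have already captured the FO as a string (for example from the debug log):

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FoFontReport {
    /** Collects the distinct font-family attribute values found anywhere in an XSL FO document. */
    public static Set<String> fontFamilies(String foXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(foXml)));
            Set<String> families = new TreeSet<>();
            NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("font-family")) {
                    families.add(e.getAttribute("font-family"));
                }
            }
            return families;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XSL FO", e);
        }
    }

    public static void main(String[] args) {
        // Tiny hand-written fragment, not real docx4j output
        String fo = "<fo:root xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
                + "<fo:block font-family=\"SimSun\">text</fo:block></fo:root>";
        System.out.println(fontFamilies(fo));
    }
}
```

If the font you expect for the Chinese text never appears in that list, the problem is probably in font mapping rather than in FOP's rendering.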

cheers .. Jason

Re: convert docx to pdf

Posted: Tue Oct 20, 2009 9:48 am
by eric
jason wrote:Are you using viaHTML? Make sure you are using NG; alternatively try viaXSLFO.


I am using viaXSLFO!

Now the problem is this: if I delete the static code in "IdentityPlusMapper", the conversion succeeds. If I don't remove it, it scans the Windows fonts and this error occurs:

Code:
06:50:23,421 DEBUG Conversion:368 - <w:pPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:ns8="http://schemas.openxmlformats.org/schemaLibrary/2006/main"/>

   at org.docx4j.model.properties.PropertyFactory.createProperties(PropertyFactory.java:59)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.createFoAttributes(Conversion.java:525)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.createBlockForPPr(Conversion.java:372)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:298)
   at org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:438)
   at org.apache.xalan.extensions.ExtensionsTable.extFunction(ExtensionsTable.java:220)
   at org.apache.xalan.transformer.TransformerImpl.extFunction(TransformerImpl.java:473)
   at org.apache.xpath.functions.FuncExtFunction.execute(FuncExtFunction.java:206)
   at org.apache.xpath.XPath.execute(XPath.java:335)
   at org.apache.xalan.templates.ElemCopyOf.execute(ElemCopyOf.java:132)
   at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:393)
   at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:176)
   at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
   at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1374)
   at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
   at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1374)
   at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
   at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1374)
   at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:393)
   at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:176)
   at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
   at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2281)
   at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1367)
   at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:709)
   at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
   at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
   at org.docx4j.XmlUtils.transform(XmlUtils.java:615)
   at org.docx4j.XmlUtils.transform(XmlUtils.java:539)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:262)
   at test.TestConvertPdf.main(TestConvertPdf.java:37)
java.lang.NullPointerException
06:50:23,437 ERROR Conversion:402 - java.lang.NullPointerException
06:50:23,437 DEBUG PropertyResolver:359 - in getEffectiveRPrv


jason wrote:Is it a font problem, or an RTL-type problem?

If you are using viaXSLFO, you could look at the intermediate XSL FO which is generated, and see whether that looks right or not.


Where can I see the "RTL type"?
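
For context on this question: "RTL" refers to right-to-left scripts such as Arabic or Hebrew. Chinese is written left to right, so for Chinese-only text a font problem is the more likely candidate. A small sketch, using only the JDK, for checking what kind of text a paragraph contains; ScriptCheck and its two helpers are names made up for this example:

```java
import java.text.Bidi;

final class ScriptCheck {
    /** True if the text contains right-to-left characters and so needs bidi handling. */
    public static boolean needsBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    /** True if the text contains at least one Han (Chinese) character. */
    public static boolean containsHan(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";            // two Chinese characters
        String hebrew = "\u05e9\u05dc\u05d5\u05dd"; // a Hebrew word
        System.out.println("Chinese needs bidi: " + needsBidi(chinese));
        System.out.println("Hebrew needs bidi: " + needsBidi(hebrew));
        System.out.println("Chinese contains Han: " + containsHan(chinese));
    }
}
```

If needsBidi is false for your text and containsHan is true, the next thing to check is whether the font chosen for those runs actually has Chinese glyphs.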

Re: convert docx to pdf

Posted: Tue Oct 20, 2009 12:20 pm
by jason
eric wrote:at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.createBlockForPPr(Conversion.java:372)


I've committed a fix for that NPE.

Are you able to compile the docx4j source code, or would you like me to upload a nightly?

Re: convert docx to pdf

Posted: Wed Oct 21, 2009 3:52 am
by eric
jason wrote:Are you able to compile the docx4j source code, or would you like me to upload a nightly?


I would like to download the nightly version. Thanks, Jason.

Re: convert docx to pdf

Posted: Wed Oct 21, 2009 6:28 am
by jason
eric wrote:I would like to download the nightly version.


Try http://dev.plutext.org/docx4j/docx4j-nightly-20091021.jar

cheers .. Jason

Re: convert docx to pdf

Posted: Wed Oct 21, 2009 1:50 pm
by eric
Hi Jason, I replaced my jar with the one above. It now runs successfully, but the formatting is wrong!
Before conversion:
1.JPG (51.86 KiB)

After conversion:
2.JPG (52.41 KiB)

I don't know why.

Re: convert docx to pdf

Posted: Thu Oct 22, 2009 2:09 am
by jason
You mean your Chinese characters have become '###'?

Does the problem still occur if you simplify the document by removing the image?

Re: convert docx to pdf

Posted: Thu Oct 22, 2009 3:23 am
by eric
jason wrote:You mean your Chinese characters have become '###'?


Not only have the Chinese characters become '###', the line breaks are also ignored and the font style has changed!

jason wrote:Does the problem still occur if you simplify the document by removing the image?


It is still the same after removing the image.

In fact, what I need is for the PDF to keep the same formatting as the docx, but the output does not look the same.

Re: convert docx to pdf

Posted: Thu Oct 22, 2009 5:59 am
by jason
There was a change in this area, to use org.docx4j.model.properties.PropertyFactory, and with it org.docx4j.model.properties.run.Font. But this was a refactoring, and should not have changed the behaviour.

(There is also org.docx4j.model.styles.StyleTree, but that's only used by HTML NG2)

I had thought that maybe the new jar isn't taking account of the default document font, which might explain how you are seeing a different font, but there hasn't been any change there for XSL FO.

So you will have to do some digging. I suggest you make a simple document with 2 paragraphs. The 1st paragraph just says "some text", with no changes to font. The 2nd paragraph is the same in the font you want to work with.

Look at the debug output for what it says about your fonts. I've uploaded a new nightly http://dev.plutext.org/docx4j/docx4j-ni ... 091022.jar which improves the font diagnostics, so you should be able to see what is happening.

Also useful if you are wondering what has changed between the jars you have tried: If you have logging set at debug level on org.docx4j.convert.out.pdf.viaXSLFO.Conversion, you should be able to see the intermediate XSLFO, and compare what you were getting before with what you are getting now.

Re: convert docx to pdf

Posted: Thu Oct 22, 2009 6:47 am
by dqkit
Eric might mean that the font style has been removed.
And I found that too, mostly, the space between fonts and lines was wrong!

Re: convert docx to pdf

Posted: Thu Oct 22, 2009 8:24 am
by jason
dqkit wrote:I found that too, mostly, the space between fonts and lines was wrong


It's a bit hard for me or anybody else to make any changes based on your comments, unless you can be a bit more specific.

At minimum:
  • what is the bit of OpenXML which is not interpreted, or interpreted incorrectly; attach XML for the relevant w:p (or w:tbl etc)
  • suggestion of CSS or XSL FO you expect.

Note that only a small subset of Word's run and paragraph properties are supported; for these please see org.docx4j.model.properties.PropertyFactory

That class is designed to make it easy for docx4j users to add support for more properties. So please do so, and contribute your additions back as a patch.

thanks .. Jason

Re: convert docx to pdf

Posted: Fri Oct 30, 2009 10:43 pm
by td16
Any idea why these exceptions occur? I have tables and images in my document.

Code:
org.docx4j.openpackaging.exceptions.Docx4JException: FOP issues
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:278)
   
Caused by: javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: "fo:table-cell" is missing child elements.
Required content model: marker* (%block;)+ (See position 27:38)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:270)
   ... 4 more

Caused by: org.apache.fop.fo.ValidationException: "fo:table-cell" is missing child elements.
Required content model: marker* (%block;)+ (See position 27:38)
   at org.apache.fop.events.ValidationExceptionFactory.createException(ValidationExceptionFactory.java:38)
   at org.apache.fop.events.EventExceptionManager.throwException(EventExceptionManager.java:54)
   at org.apache.fop.events.DefaultEventBroadcaster$1.invoke(DefaultEventBroadcaster.java:152)
   at $Proxy38.missingChildElement(Unknown Source)
   at org.apache.fop.fo.FONode.missingChildElementError(FONode.java:564)
   at org.apache.fop.fo.flow.table.TableCell.finalizeNode(TableCell.java:113)
   at org.apache.fop.fo.FONode.endOfNode(FONode.java:329)
   at org.apache.fop.fo.flow.table.TableCell.endOfNode(TableCell.java:105)
   at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.endElement(FOTreeBuilder.java:348)
   at org.apache.fop.fo.FOTreeBuilder.endElement(FOTreeBuilder.java:177)
   at org.apache.xalan.transformer.TransformerIdentityImpl.endElement(TransformerIdentityImpl.java:1101)
   at org.apache.xerces.parsers.SAXParser.endElement(SAXParser.java:1403)
   at org.apache.xerces.validators.common.XMLValidator.callEndElement(XMLValidator.java:1550)
   at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1149)
   at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
   at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1098)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
   ... 5 more

Re: convert docx to pdf

Posted: Sat Oct 31, 2009 12:08 am
by jason
Please attach a docx exhibiting the problem.

Tables should work well (left to right anyway); with current svn or nightly build, the borders/shading etc should also work.
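
For reference, the validation message quoted above says that an fo:table-cell must contain at least one block-level child, so the generated FO presumably had a cell with no content. While waiting for a proper fix, one possible workaround is to patch the intermediate FO before handing it to FOP. This is only a sketch with the JDK's DOM API, not how docx4j handles it; EmptyCellFixer and its methods are invented for this illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class EmptyCellFixer {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Parses an XSL FO string into a namespace-aware DOM document. */
    public static Document parse(String foXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(foXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XSL FO", e);
        }
    }

    /**
     * Appends an empty fo:block to every fo:table-cell that has no element children.
     * Simplification: a cell holding only fo:marker children would still be invalid.
     * Returns the number of cells patched.
     */
    public static int patchEmptyCells(Document fo) {
        NodeList cells = fo.getElementsByTagNameNS(FO_NS, "table-cell");
        int patched = 0;
        for (int i = 0; i < cells.getLength(); i++) {
            Element cell = (Element) cells.item(i);
            if (!hasElementChild(cell)) {
                String prefix = cell.getPrefix();
                String qName = (prefix == null) ? "block" : prefix + ":block";
                cell.appendChild(fo.createElementNS(FO_NS, qName));
                patched++;
            }
        }
        return patched;
    }

    private static boolean hasElementChild(Element e) {
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }
}
```

Attaching the docx is still the better route, since an empty cell in the FO may point to content in the document that the conversion dropped.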

Re: convert docx to pdf

Posted: Mon Nov 09, 2009 9:25 pm
by td16
I am using the latest nightly build (20091106). Getting a different exception now...

Code:
org.docx4j.openpackaging.exceptions.Docx4JException: FOP issues
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:293)
   at java.lang.Thread.run(Unknown Source)
Caused by: javax.xml.transform.TransformerException: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:285)
   ... 4 more
Caused by: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1111)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
   ... 5 more
org.docx4j.openpackaging.exceptions.Docx4JException: FOP issues
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:293)
   at java.lang.Thread.run(Unknown Source)
org.docx4j.openpackaging.exceptions.Docx4JException: FOP issues
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:293)
   at java.lang.Thread.run(Unknown Source)
Caused by: javax.xml.transform.TransformerException: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
Caused by: javax.xml.transform.TransformerException: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:285)
   at org.docx4j.convert.out.pdf.viaXSLFO.Conversion.output(Conversion.java:285)
   ... 4 more
Caused by: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1111)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
   ... 5 more
   ... 4 more
Caused by: org.apache.fop.fo.pagination.PageProductionException: Subsequences exhausted in page-sequence-master "twoside", cannot recover. (See position 14:48)
   at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1111)
   at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
   ... 5 more

Re: convert docx to pdf

Posted: Mon Nov 09, 2009 11:13 pm
by jason
I've not seen that error.

Can you please attach a document which exhibits the problem?