Time to convert docx to pdf

PostPosted: Tue Dec 28, 2010 7:56 am
by driguh
Good evening. I am developing an application and am using docx4j to convert my documents. In this case, the conversion is lasting about 4.5 minutes for each document, and of course, I'm not satisfied with that. I am using via XSLFO, and I wonder if other modes are faster. Do you know?

I would also like help using viaHTML, for example. Although I checked out the docx4j project correctly, I do not know how to use the classes in the docx4j-extras folder. What is the procedure for combining them with the rest of the project?

Thank you in advance for your attention.

Re: Time to convert docx to pdf

PostPosted: Tue Dec 28, 2010 8:12 am
by jason
driguh wrote:the conversion is lasting about 4.5 minutes for each document, and of course, I'm not satisfied with that. I am using via XSLFO


I suspect something is wrong with your environment. Unless you are using a very old CPU, chances are you need to give the JVM more memory.

There was a previous post where someone had experienced something similar (on their laptop, IIRC, while things were fine in their deployment environment).

I suggest you sort this out before moving to one of the other methods for generating PDF.
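
One way to confirm that the JVM actually received the extra memory is to print the heap limits from inside the same JVM that runs the conversion. The sketch below uses only the standard `Runtime` API; the class name `HeapCheck` is made up for illustration.

```java
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // maxMemory: upper bound the heap may grow to (governed by -Xmx)
        System.out.println("max heap (MB):   " + maxHeapMb());
        // totalMemory: heap currently reserved by the JVM
        System.out.println("total heap (MB): " + rt.totalMemory() / mb);
        // freeMemory: unused part of the currently reserved heap
        System.out.println("free heap (MB):  " + rt.freeMemory() / mb);
    }
}
```

Run it with the same options as the application, for example `java -Xmx1024m HeapCheck`. The reported maximum can be somewhat lower than the `-Xmx` value, depending on the garbage collector.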

Re: Time to convert docx to pdf

PostPosted: Wed Dec 29, 2010 6:08 am
by driguh
I've done some tests here and increased the memory available to the JVM to 1 GB. Now the conversion "only" takes two minutes.

This conversion is performed on a 2 MB docx file consisting of text and tables.

The document has 750 pages and 111,900 words. My biggest problem is a chapter of nearly 300 pages that consists of a single large table with hundreds of rows. For this chapter alone, the viaXSLFO conversion takes 82 seconds.

Given these parameters, is the total time normal? Is there any other way to do this conversion in less time?

Thank you for your attention.

Re: Time to convert docx to pdf

PostPosted: Wed Dec 29, 2010 8:10 am
by driguh
I made a graph of memory usage:

Re: Time to convert docx to pdf

PostPosted: Wed Dec 29, 2010 9:39 am
by jason
driguh wrote:This conversion is performed on a 2 MB docx file consisting of text and tables.

The document has 750 pages and 111,900 words. My biggest problem is a chapter of nearly 300 pages that consists of a single large table with hundreds of rows. For this chapter alone, the viaXSLFO conversion takes 82 seconds.

Given these parameters, is the total time normal? Is there any other way to do this conversion in less time?


I haven't converted documents of that length myself, but I suspect that is not abnormal :(

Conversion via XSL FO is in 2 steps. The first step creates an XSL FO file, and then in the second step, FOP is used to convert that to PDF.

It would be useful to know how long the second step takes.

Calling org.docx4j.convert.out.pdf.viaXSLFO.Conversion.setSaveFO(File save) will save the XSL FO file for you.

You can then use FOP on that, quite independent of docx4j, and tell us how long that takes.

It would also be useful to see how long FOP takes on just the 300 page table.
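
A rough way to get these numbers is to wrap each step in a wall-clock timer. The following is a minimal sketch using only the Java standard library; `StepTimer` and `Step` are hypothetical names, and the two `Thread.sleep` calls in `main` are placeholders for the real work (the docx4j step that writes the XSL FO file and the separate FOP run), not docx4j or FOP API calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StepTimer {

    /** A unit of work that may throw, e.g. one conversion step. */
    public interface Step {
        void run() throws Exception;
    }

    // Insertion-ordered so the report lists steps in the order they ran.
    private final Map<String, Long> elapsedMs = new LinkedHashMap<>();

    /** Runs the step and records its wall-clock duration under the label. */
    public void time(String label, Step step) {
        long start = System.nanoTime();
        try {
            step.run();
        } catch (Exception e) {
            throw new RuntimeException("Step failed: " + label, e);
        } finally {
            // Recorded even when the step fails, so partial timings survive.
            elapsedMs.put(label, (System.nanoTime() - start) / 1_000_000L);
        }
    }

    /** Elapsed milliseconds for a previously timed step. */
    public long millis(String label) {
        Long ms = elapsedMs.get(label);
        if (ms == null) {
            throw new IllegalArgumentException("No such step: " + label);
        }
        return ms;
    }

    /** One "label: N ms" line per timed step. */
    public String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : elapsedMs.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append(" ms\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Placeholder for step 1: docx4j producing the XSL FO file.
        timer.time("docx -> XSL FO", () -> Thread.sleep(50));
        // Placeholder for step 2: FOP rendering the saved FO file to PDF.
        timer.time("XSL FO -> PDF (FOP)", () -> Thread.sleep(100));
        System.out.print(timer.report());
    }
}
```

Timing the first run and a second run separately can also help, since one-off startup costs would otherwise be counted against the conversion itself.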

Back in docx4j, you can also turn off log4j debugging to speed things up.

As you are aware, there are the two alternative methods of creating a PDF. I have no idea whether they will be faster or slower, or whether their output quality will be satisfactory for you.

You can try them by adding their sources to your build so they get compiled, along with their dependencies (noted, I think, in the pom). Hopefully that will suffice. If it doesn't, download an earlier version of docx4j in which they are part of the main source tree (they were moved for v2.5, so try v2.4).

Please let us know what you find.

Re: Time to convert docx to pdf

PostPosted: Tue Nov 08, 2011 4:34 am
by wiram
Hi driguh,

Please share your docx4j source code.
I want to see how your code speeds up the conversion :roll:

Re: Time to convert docx to pdf

PostPosted: Tue Nov 08, 2011 11:12 am
by lucasfgc
He hasn't sped it up yet...

He is still trying to!

Re: Time to convert docx to pdf

PostPosted: Tue Nov 08, 2011 5:26 pm
by wiram
Hi lucasfgc & jason,

Please give me an effective suggestion for speeding this up...
I have increased the JVM heap memory, but it had no effect.
I am trying to convert a 3-page docx containing only text and one table to PDF, but it takes 1 min :? :cry: :cry:


--
Regards,
Wiram Rathod