Plutext

Posted: **Thu Nov 03, 2011 3:39 am**

Hi All,

When I try to convert a docx file with hundreds of pages, it takes a very long time. How can I improve the performance of this conversion?
Thanks,
Ravi

Posted: **Thu Nov 03, 2011 9:16 am**

Convert what to what?

Where is the performance issue? Is it on load, or elsewhere?

Posted: **Fri Nov 04, 2011 6:12 pm**

Hi,

Whenever we try to convert docx files with thousands of pages to PDF, it takes a very long time. How can I improve the performance of the application?

Posted: **Sat Nov 05, 2011 1:19 am**

Please read Jason's post carefully and answer it.

What kind of docx are you converting?
Text? Tables? Figures?

How long is "so much time" for you?

Give us more information...

Posted: **Sat Nov 05, 2011 1:53 am**

Sorry, I wasn't able to understand Jason's question.

My docx file contains everything: text, tables, images, etc. We convert this file to FO and then generate our PDF file from that. This conversion takes a long time, so how can I reduce it?

Posted: **Sun Nov 06, 2011 2:47 pm**

When Ravi first posted, the subject didn't say "PDF". At least that much is now clear :-)

I have done some performance testing on the PDF output before. My testing was with multiple threads (1 document per thread), and my stats count the total number of pages across *all* threads.

On the fastest hardware I have, I got 40-60 pages per second. With slower hardware, you might get 2-15 pages per second.

What CPU / RAM are you using? Have you optimised your JVM memory with -Xmx etc.?
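If you're not sure what heap limit your JVM is actually running with, a quick check like the one below can tell you. This is only a minimal sketch; the class name `HeapCheck` is made up for illustration. Run it with the same -Xmx setting you use for your application.

```java
final class HeapCheck {
    /** Returns the maximum heap this JVM will try to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Reflects the effective limit, e.g. whatever -Xmx was passed on the command line.
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```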

Note that I didn't get 40-60 pages per second from a single thread - I don't have that number. These are just indicative figures; they'd vary based on what is in your document.
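For anyone wanting to take a similar measurement, here is a rough sketch of that kind of test: one document per thread, with pages counted across all threads. It is not the harness I used. Each `Callable` is a hypothetical stand-in for "convert one document and return its page count", and the sleeps in `main` just simulate work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThroughputTest {
    /** Runs one job per thread and returns total pages per second across all threads. */
    static double pagesPerSecond(List<Callable<Integer>> jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, jobs.size()));
        try {
            long start = System.nanoTime();
            List<Future<Integer>> results = pool.invokeAll(jobs); // blocks until all jobs finish
            long totalPages = 0;
            for (Future<Integer> result : results) {
                totalPages += result.get();
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return totalPages / seconds;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Conversion job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            // Placeholder: replace with a real conversion that returns the page count.
            jobs.add(() -> { Thread.sleep(200); return 100; });
        }
        System.out.println("Pages per second (all threads): " + pagesPerSecond(jobs));
    }
}
```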

Speaking generally, the PDF output process has 2 steps:
(1) creating the XSL FO (which docx4j does, not FOP),
(2) creating the PDF from the XSL FO (which FOP does).

You need to work out which of these 2 steps is taking the most time, and determine whether that step can be sped up or whether another approach is needed.
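One simple way to find out is to time each step separately. The sketch below assumes you can call the two steps as separate pieces of code. The `timed` helper and the placeholder lambdas are made up for illustration; swap in your real docx-to-FO and FO-to-PDF calls.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

final class StageTimer {
    /** Runs one stage, prints how long it took, and returns the stage's result. */
    static <T> T timed(String label, Supplier<T> stage) {
        long start = System.nanoTime();
        T result = stage.get();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + ": " + millis + " ms");
        return result;
    }

    public static void main(String[] args) {
        // Placeholder for step 1: your code that produces the XSL FO from the docx.
        String fo = timed("Step 1 (docx -> XSL FO)", () -> "<fo:root/>");
        // Placeholder for step 2: your code that renders the XSL FO to PDF.
        byte[] pdf = timed("Step 2 (XSL FO -> PDF)", () -> fo.getBytes(StandardCharsets.UTF_8));
        System.out.println("Output size: " + pdf.length + " bytes");
    }
}
```

Run it a few times on the same document and ignore the first run, since JVM warm-up can distort the numbers.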

If the problem is step 2, then see the FOP mailing list where there have been some recent posts about performance. If it came to it, you could try another FO processor.

A thought from left field: does your output have page numbers and cross-references? If not, you may be able to split it into, say, 4 chunks, process each concurrently, and then join the resulting 4 PDFs into a single one again.
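The split / process concurrently / join idea could be structured roughly like this. It is a sketch only: `split` and `convertAll` are hypothetical helpers, the "content" is just a list of integers, and the converter is a placeholder. Actually splitting a docx and merging the resulting PDFs needs real code (and a PDF library for the join), which isn't shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ChunkedConversion {
    /** Splits items into at most `parts` contiguous chunks, preserving order. */
    static <T> List<List<T>> split(List<T> items, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int size = (items.size() + parts - 1) / parts; // ceiling division
        for (int i = 0; i < items.size(); i += size) {
            chunks.add(new ArrayList<>(items.subList(i, Math.min(i + size, items.size()))));
        }
        return chunks;
    }

    /** Converts each chunk on its own thread; results come back in chunk order. */
    static <T, R> List<R> convertAll(List<List<T>> chunks, Function<List<T>, R> converter) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (List<T> chunk : chunks) {
                Callable<R> task = () -> converter.apply(chunk);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // waiting in submission order keeps chunks ordered
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Chunk conversion failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> sections = new ArrayList<>();
        for (int i = 1; i <= 10; i++) sections.add(i);
        // Placeholder converter: a real one would turn each chunk into a PDF.
        List<String> parts = convertAll(split(sections, 4), chunk -> "PDF for sections " + chunk);
        parts.forEach(System.out::println); // the join step would merge these in order
    }
}
```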

PDF output performance for long documents

Re: Facing performance issue

Re: PDF output performance for long documents