Any limitation on Mergedocx document size

Posted: Wed Jun 21, 2017 3:45 pm
by shlgw1
We are using the enterprise version 3.3.0.8 to merge several thousand Word documents into a single one.

When the merged document is opened in Microsoft Word, it reports an "Unspecified Error" at /word/styles.xml, line 61087, column 0, and the document cannot be loaded properly.

The merged document is 40 MB in size. Is there any limit on the number of documents that can be merged using Mergedocx?

Thank you.

Re: Any limitation on Mergedocx document size

Posted: Wed Jun 21, 2017 8:39 pm
by jason
The maximum number of style definitions Word 2007 and later support is 4079: https://support.microsoft.com/en-us/hel ... ns-in-word

You are probably hitting that limit.

Be sure to use StyleHandler.USE_EARLIER to create fewer styles, if that is appropriate for your documents.

Prior to v3.3.0.11, there are circumstances where tables can cause unnecessary extra styles to be created. v3.3.0.11 avoids creating those extra styles; please use it with StyleHandler.USE_EARLIER.

Use the link https://www.plutext.com/dn/downloads/14 ... -trial.zip to fill in a form to get the 3.3.0.11 trial, or see https://www.plutext.com/m/index.php/products

If you are a licensed user, please contact support@plutext.com for an update. If you continue to have issues, please email support@plutext.com
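
One quick way to see whether a merged file is near the style limit is to count the w:style elements in its word/styles.xml part. The following is a minimal sketch using only the Java standard library; it is not part of MergeDocx. The class name StyleCounter is made up for illustration, and it assumes that each w:style element counts as one style definition toward the limit.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of MergeDocx.
class StyleCounter {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final int WORD_STYLE_LIMIT = 4079; // figure quoted in this thread

    // Counts w:style elements in a styles.xml stream.
    static int countStyles(InputStream stylesXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(stylesXml);
            return doc.getElementsByTagNameNS(W_NS, "style").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse styles.xml", e);
        }
    }

    // Opens a .docx (a zip archive) and counts the styles in word/styles.xml.
    static int countStylesInDocx(String docxPath) {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/styles.xml");
            if (entry == null) {
                return 0; // no styles part in this package
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return countStyles(in);
            }
        } catch (java.io.IOException e) {
            throw new IllegalStateException("Could not read " + docxPath, e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java StyleCounter.java merged.docx");
            return;
        }
        int n = countStylesInDocx(args[0]);
        System.out.println(n + " style definitions (limit " + WORD_STYLE_LIMIT + ")");
    }
}
```

Running it against the output after every few hundred inputs would show roughly how fast the style count grows for a given StyleHandler setting.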

Re: Any limitation on Mergedocx document size

Posted: Mon Jun 26, 2017 3:27 pm
by shlgw1
We will test the latest version. However, we found that when we merge with USE_EARLIER, the formatting of the subsequent documents is distorted. RENAME_RETAIN does not have this problem.

Please find attached two samples of the merged document for reference: one uses USE_EARLIER and the other uses RENAME_RETAIN.

style.RENAME_RETAIN.docx
No distortion, one letter per page
(55.79 KiB) Downloaded 404 times


style.USE_EARLIER.docx
The format is distorted: a letter runs past one page.
(54.63 KiB) Downloaded 355 times


The contents of the documents have been anonymized.

Re: Any limitation on Mergedocx document size

Posted: Tue Jun 27, 2017 7:09 pm
by jason
I expect this is related to what is described in the user manual on pages 24-25. Could you please attach your input docx files (anonymized is fine)?

I made a 1 page docx from your attached docx, and both USE_EARLIER and RENAME_RETAIN worked fine for me (using the latest version, and DocumentBuilderIncremental).

Could you please try the latest version and let me know if USE_EARLIER works for you without distortion?

Re: Any limitation on Mergedocx document size

Posted: Thu Jul 20, 2017 7:03 pm
by shlgw1
We have tried using DocumentBuilderIncremental with USE_EARLIER in the latest version, Plutext-Enterprise-3.3.0.11.jar.

If the English letter comes first and the Chinese letter second, the merged document is correct.

However, if the Chinese letter comes first and the English letter second, the English letter is distorted (some text falls onto the next page).

Re: Any limitation on Mergedocx document size

Posted: Fri Jul 21, 2017 1:47 pm
by jason
USE_EARLIER means re-use the styles in the first document.

In this case, you are re-using the Normal style.

In the Chinese document only, your Normal style has: <w:spacing w:line="360" w:lineRule="atLeast"/>

So when that document is first, that line spacing is also applied to paragraphs in the English document which use Normal style.

So the results are as expected.

In summary, if you want to use USE_EARLIER without distortion, you need to ensure that any styles defined in an earlier document but used in a later one have the same definition. If this isn't true, you should be using RENAME_RETAIN!
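
The condition in the summary above can be checked before merging. The sketch below is a conceptual model only, not MergeDocx's implementation or API: it treats each document's styles as a map from style ID to the style's XML, and reports the IDs whose definitions differ. The class and method names are hypothetical, and the English Normal definition is a placeholder because it is not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pre-merge check, not part of MergeDocx.
class StyleConflictCheck {

    // Returns the style IDs present in both documents with different definitions.
    static Set<String> conflictingStyles(Map<String, String> earlier,
                                         Map<String, String> later) {
        Set<String> conflicts = new TreeSet<>();
        for (Map.Entry<String, String> style : later.entrySet()) {
            String earlierDefinition = earlier.get(style.getKey());
            if (earlierDefinition != null
                    && !earlierDefinition.equals(style.getValue())) {
                conflicts.add(style.getKey());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, String> chinese = new HashMap<>();
        chinese.put("Normal", "<w:spacing w:line=\"360\" w:lineRule=\"atLeast\"/>");

        Map<String, String> english = new HashMap<>();
        english.put("Normal", ""); // placeholder: no such spacing element

        // Chinese document first, English document second.
        System.out.println(conflictingStyles(chinese, english)); // prints [Normal]
    }
}
```

If the set is empty for every pair of inputs, USE_EARLIER should be safe under this model; otherwise RENAME_RETAIN is the better fit.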

Note that your documents also use different rPrDefault values.

English:

    <w:rPrDefault>
      <w:rPr>
        <w:rFonts w:asciiTheme="minorHAnsi" w:hAnsiTheme="minorHAnsi" w:eastAsiaTheme="minorEastAsia" w:cstheme="minorBidi"/>
        <w:kern w:val="2"/>
        <w:sz w:val="24"/>
        <w:szCs w:val="22"/>
        <w:lang w:val="en-US" w:eastAsia="zh-TW" w:bidi="ar-SA"/>
      </w:rPr>
    </w:rPrDefault>
 


Chinese:

    <w:rPrDefault>
      <w:rPr>
        <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:eastAsia="新細明體" w:cs="Times New Roman"/>
        <w:sz w:val="20"/>
        <w:lang w:val="en-US" w:eastAsia="zh-TW" w:bidi="ar-SA"/>
      </w:rPr>
    </w:rPrDefault>
 


Note the different font size (12pt versus 10pt; the sz values are in half-points).
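
The half-point arithmetic can be checked in a few lines of Java. This is a trivial sketch; the class and method names are made up for illustration.

```java
// Hypothetical helper: converts a w:sz value (half-points) to points.
class HalfPoints {
    static double toPoints(int sz) {
        return sz / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(toPoints(24)); // English default: prints 12.0
        System.out.println(toPoints(20)); // Chinese default: prints 10.0
    }
}
```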

See "Document Defaults" in the User Manual for more about this, but basically, in the USE_EARLIER case, only the first set of values can be taken into account. Where these aren't overridden by style settings or ad hoc formatting, you'll see an effect in the output document.
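
To make the last point concrete, here is a simplified model of how a run's font size is resolved. It assumes just three layers (ad hoc formatting, then the style, then the document defaults) and ignores style inheritance and everything else Word does. The class and method names are hypothetical; the default values are the sz values from the two rPrDefault blocks above.

```java
// Simplified, hypothetical model of font size resolution; not Word's full algorithm.
class EffectiveFontSize {

    // Returns the sz value (half-points) that applies to a run.
    // A null argument means that layer does not set a size.
    static int effectiveSz(Integer adHocSz, Integer styleSz, int docDefaultSz) {
        if (adHocSz != null) {
            return adHocSz;   // ad hoc formatting wins
        }
        if (styleSz != null) {
            return styleSz;   // then the paragraph or character style
        }
        return docDefaultSz;  // otherwise the document defaults apply
    }

    public static void main(String[] args) {
        int englishDefault = 24; // English rPrDefault
        int chineseDefault = 20; // Chinese rPrDefault

        // English run with no size set by style or ad hoc formatting:
        System.out.println(effectiveSz(null, null, englishDefault)); // English first: 24
        System.out.println(effectiveSz(null, null, chineseDefault)); // Chinese first: 20

        // The same run is unaffected if its style sets the size explicitly:
        System.out.println(effectiveSz(null, 24, chineseDefault));   // 24
    }
}
```

Under this model, only runs that rely on the document defaults change size when the Chinese document comes first with USE_EARLIER.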