ToC update doesn't work

Posted: Fri Nov 18, 2016 2:02 am
by klausT
Hi Jason,

I have added a Content Control around the ToC in the Word template.

But when executing the following code (Docx4j version 3.3.1):

Code:
TocGenerator tocGenerator = new TocGenerator(wordMLPackage);
tocGenerator.updateToc(false, true);


I get this error:
org.docx4j.toc.TocException: No ToC content control found
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:494)


Is this related to the fact that we don't use the professional version of Docx4J?
But if so, I would expect a "better" error message.
Or does this have another root cause? Perhaps the ToC in Word needs a binding?

Thanks,
Klaus

Re: ToC update doesn't work

Posted: Fri Nov 18, 2016 6:16 pm
by jason
This should work. You don't need the Enterprise Ed for it.

Assuming the ToC content control is of the expected format (which from your image appears to be the case), perhaps the OpenDoPE RemovalHandler is running and removing the ToC content control before the updateToc operation?

In that xslt, we have:

Code:
<!-- @since 3.3.0 -->
<xsl:when test="$all">
  <xsl:apply-templates select="w:sdtContent/node()" />
</xsl:when>


In the Docx4J facade we have:

Code:
protected static void removeSDTs(WordprocessingMLPackage wmlPackage) throws Docx4JException {
    RemovalHandler removalHandler = new RemovalHandler();
    // Main document part first
    removalHandler.removeSDTs(wmlPackage.getMainDocumentPart(), RemovalHandler.Quantifier.ALL, (String[]) null);
    // Then every header and footer part
    for (Part part : wmlPackage.getParts().getParts().values()) {
        if (part instanceof HeaderPart) {
            removalHandler.removeSDTs((HeaderPart) part, RemovalHandler.Quantifier.ALL, (String[]) null);
        } else if (part instanceof FooterPart) {
            removalHandler.removeSDTs((FooterPart) part, RemovalHandler.Quantifier.ALL, (String[]) null);
        }
    }
}


You could try:

1. All OpenDoPE steps except FLAG_BIND_REMOVE_SDT
2. updateToc
3. FLAG_BIND_REMOVE_SDT

Still, it would be reasonable to want to keep the ToC content control in the output docx, right?
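
To see whether the ToC content control is still present at a given point, one option is to inspect word/document.xml directly. The sketch below is a minimal check, not docx4j code: TocSdtCheck and its methods are hypothetical names. It assumes the ToC content control is marked the way Word marks its built-in ToC building block (a w:docPartGallery with the value "Table of Contents" inside a w:sdt). Whether docx4j's own lookup applies exactly this test is an assumption.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical diagnostic helper; not part of docx4j.
class TocSdtCheck {

    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns true if the XML contains a w:sdt holding a
    // w:docPartGallery whose w:val is "Table of Contents".
    static boolean hasTocContentControl(String documentXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(documentXml)));
            NodeList galleries = doc.getElementsByTagNameNS(W_NS, "docPartGallery");
            for (int i = 0; i < galleries.getLength(); i++) {
                Element gallery = (Element) galleries.item(i);
                boolean isToc = "Table of Contents".equals(gallery.getAttributeNS(W_NS, "val"));
                if (isToc && hasSdtAncestor(gallery)) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
    }

    // Walks up the tree looking for an enclosing w:sdt element.
    private static boolean hasSdtAncestor(Node node) {
        for (Node p = node.getParentNode(); p != null; p = p.getParentNode()) {
            if (W_NS.equals(p.getNamespaceURI()) && "sdt".equals(p.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java TocSdtCheck yourdocx.docx");
            return;
        }
        // A docx is a zip archive; the main document part is word/document.xml.
        try (ZipFile docx = new ZipFile(args[0])) {
            ZipEntry entry = docx.getEntry("word/document.xml");
            if (entry == null) {
                System.err.println("word/document.xml not found");
                return;
            }
            String xml = new String(docx.getInputStream(entry).readAllBytes(), StandardCharsets.UTF_8);
            System.out.println("ToC content control present: " + hasTocContentControl(xml));
        }
    }
}
```

Running this against the docx saved just before the updateToc call shows whether an earlier step has already stripped the content control.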

Re: ToC update doesn't work

Posted: Sat Nov 19, 2016 2:14 am
by klausT
Thanks Jason for the various hints.
It was indeed the RemovalHandler which caused this issue.
Now the ToC is updated. That's great!

But...
The formatting looks different than in the template. Please see the attached screenshots.
The template is defined without dots but the resulting document has dots.
Technically, we are just calling
Code:
tocGenerator.updateToc(false)


Which brings me to the next question:
I would like to generate the ToC with page numbers. But when calling
Code:
tocGenerator.updateToc(true)

we get
org.docx4j.toc.TocException: Error in toc web service at http://converter-eval.plutext.com:80/v1 ... 00/convert
HTTP response: 502 Bad Gateway
at org.docx4j.toc.TocGenerator.getPageNumbersMapViaService(TocGenerator.java:604)
at org.docx4j.toc.TocGenerator.getPageNumbersMap(TocGenerator.java:560)
at org.docx4j.toc.TocGenerator.populateToc(TocGenerator.java:414)
at org.docx4j.toc.TocGenerator.generateToc(TocGenerator.java:319)
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:516)
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:465)


If I get it right, this WebService is normally called for PDF generation.
So, I am a little surprised to see this when generating a Word document.

Cheers,
Klaus

Re: ToC update doesn't work

Posted: Sat Nov 19, 2016 9:23 am
by jason
Hi Klaus

klausT wrote:The template is defined without dots but the resulting document has dots.


I will reply separately regarding the dots.

klausT wrote:org.docx4j.toc.TocException: Error in toc web service at http://converter-eval.plutext.com:80/v1 ... 00/convert
HTTP response: 502 Bad Gateway


The test/demo server converter-eval.plutext.com was down; that is now fixed, so it should work now.

To generate page numbers, we need to figure out where the page breaks fall, and for that you need a page layout model. Docx4j itself does not have one. Instead, it uses the page layout calculated by the PDF output process.

As you may be aware, there are two main ways of getting PDF output: the older, cheap and cheerful XSL FO based output, and Plutext's commercial PDF Converter.

Page numbering via Plutext's commercial PDF Converter is much faster and more accurate.

But you can use the XSL FO based approach if you wish, just by adding the relevant jar.

If you do decide to use the commercial PDF Converter, please install your own instance. converter-eval.plutext.com is only intended to give interested people an easy way to try some of their own documents, so they can get a sense of whether it is likely to be worth their while to install their own instance.

You can download a package for your OS from http://converter-eval.plutext.com/

Hope that makes sense.

cheers .. Jason

Re: ToC update doesn't work

Posted: Tue Nov 22, 2016 8:08 am
by klausT
Hi Jason,

Thanks for the detailed explanation.
All this makes perfect sense.

Unfortunately, we still get the message that converter-eval.plutext.com is down.
Could you please check one more time?

Cheers...Klaus

Re: ToC update doesn't work

Posted: Tue Nov 22, 2016 2:22 pm
by jason
Hi Klaus

Works for me. Could it be that you have a firewall which is getting in the way?

From the same host you are running your Java code on, please try using curl:

Code:
curl -v -X POST --data-binary @yourdocx.docx -o out.pdf http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert

Don't forget the '@' character!

Windows users may need to download/install curl first. Beware that in PowerShell, curl is an alias for Invoke-WebRequest, so to use curl itself, first ensure that it is installed, then include the .exe explicitly:

curl.exe -v --data-binary @yourdocx.docx -o out.pdf http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert

Alternatively, in PowerShell, you can use Invoke-WebRequest (or Invoke-RestMethod):

Invoke-WebRequest -Method Post -InFile yourdocx.docx -Uri http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert -ContentType 'application/octet-stream' -OutFile out.pdf
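
Since a firewall or proxy may treat the JVM differently from curl, it can also help to run the same POST from Java. The following is a minimal sketch using the JDK's own java.net.http client (Java 11 or later), not docx4j's converter client. ConverterPing and buildRequest are hypothetical names, the endpoint is copied from the curl command above, and the content type follows the PowerShell example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;

// Hypothetical connectivity check; not part of docx4j.
class ConverterPing {

    // Endpoint taken from the curl command in this thread.
    static final String ENDPOINT =
            "http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert";

    // Builds a POST carrying the raw docx bytes, like curl --data-binary @file.
    static HttpRequest buildRequest(String endpoint, byte[] docxBytes) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofByteArray(docxBytes))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java ConverterPing yourdocx.docx");
            return;
        }
        byte[] docx = Files.readAllBytes(Paths.get(args[0]));
        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                .send(buildRequest(ENDPOINT, docx), HttpResponse.BodyHandlers.ofByteArray());
        System.out.println("HTTP status: " + response.statusCode());
        if (response.statusCode() == 200) {
            // Save the converted PDF next to where the program is run.
            Files.write(Paths.get("out.pdf"), response.body());
        }
    }
}
```

If curl succeeds from the same host but this fails, JVM proxy settings are a likely place to look.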


The beta now has a bit more logging of what is going on; enable with:

Code:
<logger name="org.docx4j.services.client">
  <level value="debug"/>
</logger>
<logger name="org.docx4j.toc">
  <level value="debug"/>
</logger>

Re: ToC update doesn't work

Posted: Thu Nov 24, 2016 12:56 pm
by jason
klausT wrote:The formatting looks different than in the template. Please see the attached screenshots.
The template is defined without dots but the resulting document has dots.


Should work better following https://github.com/plutext/docx4j/commi ... 3b39a13b04

You can try it using http://www.docx4java.org/docx4j/docx4j- ... 161124.jar