Plutext

Posted: **Fri Nov 18, 2016 2:02 am**

Hi Jason,

I have added a Content Control around the Toc in the Word template.

But when executing the following code (Docx4j version 3.3.1)

    TocGenerator tocGenerator = new TocGenerator(wordMLPackage);
    tocGenerator.updateToc(false, true);

I get this error:

org.docx4j.toc.TocException: No ToC content control found
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:494)

Is this related to the fact that we don't use the professional version of Docx4J?
But if so, I would expect a "better" error message.
Or does this have another root cause? Perhaps the Toc in Word needs a binding?

Thanks,
Klaus
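
One way to narrow down a "No ToC content control found" error is to check whether the docx still contains the control at the point where updateToc runs. The sketch below uses only the JDK (no docx4j) and counts the marker that Word normally writes for a ToC content control, namely a w:docPartGallery element with w:val="Table of Contents". Whether TocGenerator tests for exactly this marker is an assumption; the class and method names are made up for illustration.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TocSdtCheck {

    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts w:docPartGallery elements whose w:val is "Table of Contents". */
    static int countTocControls(InputStream documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(documentXml);
            NodeList galleries = doc.getElementsByTagNameNS(W_NS, "docPartGallery");
            int count = 0;
            for (int i = 0; i < galleries.getLength(); i++) {
                Element gallery = (Element) galleries.item(i);
                if ("Table of Contents".equals(gallery.getAttributeNS(W_NS, "val"))) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document.xml", e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java TocSdtCheck file.docx");
            return;
        }
        // A docx is a zip archive; the main story lives in word/document.xml.
        try (ZipFile docx = new ZipFile(args[0])) {
            ZipEntry entry = docx.getEntry("word/document.xml");
            if (entry == null) {
                System.out.println("no word/document.xml in " + args[0]);
                return;
            }
            try (InputStream in = docx.getInputStream(entry)) {
                System.out.println("ToC content controls: " + countTocControls(in));
            }
        }
    }
}
```

Saving the package to disk just before the updateToc call and running this on the saved file shows whether the control was already gone by then; a count of 0 points at an earlier processing step rather than at TocGenerator.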

Posted: **Fri Nov 18, 2016 6:16 pm**

This should work. You don't need the Enterprise Ed for it.

Assuming the ToC content control is of the expected format (which from your image appears to be the case), maybe the OpenDoPE RemovalHandler is running and removing the ToC content control before the updateToc operation?

In that xslt, we have:

    <!-- @since 3.3.0 -->
    <xsl:when test="$all">
      <xsl:apply-templates select="w:sdtContent/node()" />
    </xsl:when>

In the Docx4J facade we have:

    protected static void removeSDTs(WordprocessingMLPackage wmlPackage) throws Docx4JException {

        RemovalHandler removalHandler;

        removalHandler = new RemovalHandler();

        removalHandler.removeSDTs(wmlPackage.getMainDocumentPart(), RemovalHandler.Quantifier.ALL, (String[]) null);
        for (Part part : wmlPackage.getParts().getParts().values()) {
            if (part instanceof HeaderPart) {
                removalHandler.removeSDTs((HeaderPart) part, RemovalHandler.Quantifier.ALL, (String[]) null);
            } else if (part instanceof FooterPart) {
                removalHandler.removeSDTs((FooterPart) part, RemovalHandler.Quantifier.ALL, (String[]) null);
            }
        }
    }

You could try:

1. All OpenDoPE steps except FLAG_BIND_REMOVE_SDT
2. updateToc
3. FLAG_BIND_REMOVE_SDT
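
The three steps above amount to splitting one set of bind flags into two passes, with the ToC update in between. Here is a minimal sketch of that split using plain bit masks. The constants are local stand-ins whose names merely echo the facade flags; their values, the class name and the method names are illustrative assumptions, and `deferred` should be whichever flag triggers the content control removal in your docx4j version.

```java
public class BindPasses {

    // Stand-in bit flags. Names echo the docx4j facade constants,
    // but the values are illustrative assumptions, not copied from docx4j.
    static final int INSERT_XML = 1;
    static final int BIND_XML   = 2;
    static final int REMOVE_SDT = 4;
    static final int REMOVE_XML = 8;

    /** Flags to run before the ToC update: the requested set minus the deferred steps. */
    static int beforeToc(int requested, int deferred) {
        return requested & ~deferred;
    }

    /** Flags to run after the ToC update: only the deferred steps that were requested. */
    static int afterToc(int requested, int deferred) {
        return requested & deferred;
    }

    public static void main(String[] args) {
        int requested = INSERT_XML | BIND_XML | REMOVE_SDT | REMOVE_XML;
        // Assumption: this is the step that strips content controls.
        int deferred = REMOVE_SDT;

        System.out.println("pass 1 flags: " + beforeToc(requested, deferred));
        // ... the ToC update would run here, while the content control still exists ...
        System.out.println("pass 2 flags: " + afterToc(requested, deferred));
    }
}
```

In a real pipeline each pass would be a separate call to the bind step with the computed flags, so the ToC content control is still present when updateToc looks for it.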

Still, it would be reasonable to want to keep the ToC content control in the output docx, right?

Posted: **Sat Nov 19, 2016 2:14 am**

Thanks Jason for the various hints.
It was indeed the RemovalHandler which caused this issue.
Now the ToC is updated. That's great!

But...
The formatting looks different than in the template. Please see the attached screenshots.
The template is defined without dots but the resulting document has dots.
Technically, we are just calling

    tocGenerator.updateToc(false)

Which brings me to the next question:
I would like to generate the ToC with page numbers. But when calling

    tocGenerator.updateToc(true)

we get

org.docx4j.toc.TocException: Error in toc web service at http://converter-eval.plutext.com:80/v1 ... 00/convert
HTTP response: 502 Bad Gateway
at org.docx4j.toc.TocGenerator.getPageNumbersMapViaService(TocGenerator.java:604)
at org.docx4j.toc.TocGenerator.getPageNumbersMap(TocGenerator.java:560)
at org.docx4j.toc.TocGenerator.populateToc(TocGenerator.java:414)
at org.docx4j.toc.TocGenerator.generateToc(TocGenerator.java:319)
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:516)
at org.docx4j.toc.TocGenerator.updateToc(TocGenerator.java:465)

If I get it right, this WebService is normally called for PDF generation.
So, I am a little surprised to see this when generating a Word document.

Cheers,
Klaus

Posted: **Sat Nov 19, 2016 9:23 am**

Hi Klaus

klausT wrote: The template is defined without dots but the resulting document has dots.

I will reply separately regarding the dots.

klausT wrote: org.docx4j.toc.TocException: Error in toc web service at http://converter-eval.plutext.com:80/v1 ... 00/convert
HTTP response: 502 Bad Gateway

The test/demo server converter-eval.plutext.com was down; that is now fixed, so it should work now.

To generate page numbers, we need to figure out where the page breaks fall, and for that you need a page layout model. Docx4j itself does not have one. Instead, it takes the page breaks from the layout that the PDF output process computes, which is why a PDF conversion is involved even though you are producing a Word document.

As you may be aware, there are two main ways of getting PDF output: the older cheap & cheerful XSL FO based output, and Plutext's commercial PDF Converter.

Generating the page numbers via Plutext's commercial PDF Converter is much faster, and the results are more accurate.

But you can use the XSL FO based approach if you wish, just by adding the relevant jar.

If you do decide to use the commercial PDF Converter, please install your own instance. converter-eval.plutext.com is only intended to give interested people an easy way to try some of their own documents, so they can get a sense of whether it is likely to be worth their while to install their own instance.

You can download a package for your OS from http://converter-eval.plutext.com/

Hope that makes sense.

cheers .. Jason

Posted: **Tue Nov 22, 2016 8:08 am**

Hi Jason,

Thanks for the detailed explanation.
All this makes perfect sense.

Unfortunately, we still get the message that converter-eval.plutext.com is down.
Could you please check one more time?

Cheers...Klaus

Posted: **Tue Nov 22, 2016 2:22 pm**

Hi Klaus

Works for me. Could it be that you have a firewall which is getting in the way?

From the same host you are running your Java code on, please try using curl:

    curl -v -X POST --data-binary @yourdocx.docx -o out.pdf http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert

Don't forget the '@' character! Windows users may need to download/install curl first.

Beware that in PowerShell, curl is an alias for Invoke-WebRequest, so to use curl itself, first ensure that it is installed, then include the .exe explicitly:

    curl.exe -v --data-binary @yourdocx.docx -o out.pdf http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert

Alternatively, in PowerShell, you can use Invoke-WebRequest (or Invoke-RestMethod):

    Invoke-WebRequest -Method Post -InFile yourdocx.docx -Uri http://converter-eval.plutext.com:80/v1/00000000-0000-0000-0000-000000000000/convert -ContentType 'application/octet-stream' -OutFile out.pdf
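
If installing curl on the host is awkward, the same probe can be run from the JVM itself, which also exercises any proxy or firewall settings the Java process is subject to. This is a rough sketch using only java.net; the class and method names are made up for illustration, and the content type is the one from the PowerShell variant above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class ConverterProbe {

    // All-zero key taken from the curl command above.
    static final String EVAL_KEY = "00000000-0000-0000-0000-000000000000";

    /** Builds the convert URL for a converter base such as "http://host:80/v1". */
    static String convertUrl(String base, String key) {
        String trimmed = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        return trimmed + "/" + key + "/convert";
    }

    /** POSTs the docx bytes, saves the body on HTTP 200, and returns the status code. */
    static int post(String url, Path docx, Path pdfOut) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        try (OutputStream out = conn.getOutputStream()) {
            Files.copy(docx, out); // raw file bytes, like curl --data-binary
        }
        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_OK) {
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, pdfOut, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java ConverterProbe yourdocx.docx");
            return;
        }
        String url = convertUrl("http://converter-eval.plutext.com:80/v1", EVAL_KEY);
        int status = post(url, Paths.get(args[0]), Paths.get("out.pdf"));
        System.out.println("HTTP " + status + " from " + url);
    }
}
```

A 502 here, from the same machine that runs the docx4j code, would suggest the problem sits between that host and the converter rather than in TocGenerator.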

The beta now has a bit more logging of what is going on; enable with:

    <logger name="org.docx4j.services.client">
      <level value="debug"/>
    </logger>
    <logger name="org.docx4j.toc">
      <level value="debug"/>
    </logger>

Posted: **Thu Nov 24, 2016 12:56 pm**

klausT wrote: The formatting looks different than in the template. Please see the attached screenshots.
The template is defined without dots but the resulting document has dots.

Should work better following https://github.com/plutext/docx4j/commi ... 3b39a13b04

You can try it using http://www.docx4java.org/docx4j/docx4j- ... 161124.jar

ToC update doesn't work
