Adding TOC support in PDF output

Posted: Thu Oct 18, 2012 6:21 am
by mgrela

I recently evaluated docx4j for a scenario that required generating both a docx and a PDF from the same content, and I discovered that the TOC recipe found in the examples that come with the source doesn't work when making a PDF. I began to wonder: how hard would it be to implement a subset of the field processing rules available for docx so that you could make a TOC? Is there any work in progress for this feature? Where should I start if I wanted to do it?

Re: Adding TOC support in PDF output

Posted: Thu Oct 18, 2012 7:05 am
by jason
There is no work in progress on this front that I am aware of.

From docx-java-f6/table-of-contents-t187.html you can see there are two scenarios. One is to interpret w:instrText such as

<w:instrText xml:space="preserve"> TOC \o "1-3" \h \z \u </w:instrText>

and generate a TOC from that. It wouldn't be particularly hard to add support for some subset of TOC options.
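To give a feel for the first scenario, here is a minimal sketch of pulling the switches out of the instruction text. It is plain Java with no docx4j calls; the class `TocInstruction` and its methods `parse`, `has` and `outlineLevels` are hypothetical names made up for illustration, and the 1-9 default range used when `\o` carries no argument is an assumption rather than something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of docx4j.
class TocInstruction {
    // A backslash, one switch letter, then an optional quoted argument.
    private static final Pattern SWITCH =
            Pattern.compile("\\\\(\\w)(?:\\s+\"([^\"]*)\")?");

    // Switch letter -> argument ("" when the switch has none), in source order.
    final Map<String, String> switches = new LinkedHashMap<>();

    static TocInstruction parse(String instrText) {
        String text = instrText.trim();
        if (text.length() < 3 || !text.substring(0, 3).equalsIgnoreCase("TOC")) {
            throw new IllegalArgumentException("Not a TOC field: " + instrText);
        }
        TocInstruction toc = new TocInstruction();
        Matcher m = SWITCH.matcher(text);
        while (m.find()) {
            String arg = m.group(2);
            toc.switches.put(m.group(1), arg == null ? "" : arg);
        }
        return toc;
    }

    boolean has(String letter) {
        return switches.containsKey(letter);
    }

    // Heading range from \o as {first, last}; expects the form "first-last".
    // Assumption: fall back to levels 1-9 when \o is absent or has no argument.
    int[] outlineLevels() {
        String range = switches.getOrDefault("o", "");
        if (range.isEmpty()) {
            return new int[] {1, 9};
        }
        String[] parts = range.split("-");
        return new int[] {
            Integer.parseInt(parts[0].trim()),
            Integer.parseInt(parts[1].trim())
        };
    }
}
```

Calling `TocInstruction.parse(...)` on the instruction text shown above and then `outlineLevels()` would give the heading range to collect; the remaining switches (`\h`, `\z`, `\u`) can be checked with `has(...)` and supported or ignored one at a time.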

The other would be to interpret a complete TOC generated by Word (which my immediate inclination would be to avoid, but which might be better if you are trying to copy exactly what Word does).

The approach I'd take is to add/modify a template in docx2fo.xslt which matches a TOC field, and either processes that in a different mode, or hands processing off to an extension function. I guess if a style inherits from a heading style, it also needs to be included in the TOC, which suggests an extension function.
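The style inheritance check is the part that is awkward in pure XSLT, so here is a rough sketch of the logic an extension function might use. It is plain Java over a simple map and does not use docx4j's style classes; `HeadingResolver` and `headingLevel` are hypothetical names, and the sketch assumes the built-in heading styles have the IDs `Heading1` to `Heading9` and that the style-to-parent map has already been read from the document's style definitions.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of docx4j.
class HeadingResolver {
    // Style id -> id of the style it is based on (absent for root styles).
    private final Map<String, String> basedOn;

    HeadingResolver(Map<String, String> basedOn) {
        this.basedOn = basedOn;
    }

    // Returns 1..9 if the style is, or inherits from, a built-in heading
    // style (assumed ids "Heading1".."Heading9"); returns 0 otherwise.
    int headingLevel(String styleId) {
        Set<String> seen = new HashSet<>(); // guards against cyclic chains
        String current = styleId;
        while (current != null && seen.add(current)) {
            if (current.matches("Heading[1-9]")) {
                return current.charAt("Heading".length()) - '0';
            }
            current = basedOn.get(current);
        }
        return 0;
    }
}
```

A paragraph would then go into the TOC when `headingLevel` of its style falls inside the range requested by the `\o` switch.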

There is some code in package org.docx4j.model.fields which you may want to look at.

Re: Adding TOC support in PDF output

Posted: Fri Oct 19, 2012 5:44 am
by mgrela
Thanks for this information. I will check out the docx2fo.xslt.