Add TOC to imported HTML-> Word document

Posted: Tue Aug 12, 2014 3:21 pm
by coolrb
Hello, I am using docx4j to convert HTML to a Word document. The problem I am facing is that I want to add a TOC on the second page. Here is my code:
Code:
documentPart.getContents().getBody().getContent().add(2, getTocParaUsingDocx4j());

However, this does not put the TOC on the second page.

Re: Add TOC to imported HTML-> Word document

Posted: Tue Aug 12, 2014 7:58 pm
by jason
What is getTocParaUsingDocx4j() ?!?

Re: Add TOC to imported HTML-> Word document

Posted: Tue Aug 12, 2014 8:03 pm
by coolrb
jason wrote: What is getTocParaUsingDocx4j() ?!?

It returns a P object; please see the code below.
Code:
public P getTocParaUsingDocx4j() {
    // P, R, Text, FldChar and STFldCharType come from org.docx4j.wml
    P paragraphForTOC = objectFactory.createP();

    // Run 1: start of the complex field, flagged dirty so its result is treated as stale
    R r = objectFactory.createR();
    FldChar fldchar = objectFactory.createFldChar();
    fldchar.setFldCharType(STFldCharType.BEGIN);
    fldchar.setDirty(true);
    r.getContent().add(getWrappedFldChar(fldchar));
    paragraphForTOC.getContent().add(r);

    // Run 2: the field instruction, held in its own run (r1, not r)
    R r1 = objectFactory.createR();
    Text txt = new Text();
    txt.setSpace("preserve");
    txt.setValue("TOC \\o \"1-3\" \\h \\z \\u \\h");
    r1.getContent().add(objectFactory.createRInstrText(txt));
    paragraphForTOC.getContent().add(r1);

    // Run 3: end of the field
    FldChar fldcharend = objectFactory.createFldChar();
    fldcharend.setFldCharType(STFldCharType.END);
    R r2 = objectFactory.createR();
    r2.getContent().add(getWrappedFldChar(fldcharend));
    paragraphForTOC.getContent().add(r2);

    return paragraphForTOC;
}
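
For reference, the method above is meant to produce a complex field: one run with a fldChar "begin", one run with the instrText holding the field code, and one run with a fldChar "end". The following is only a sketch of that markup, built as a plain string with the JDK and parsed to confirm it is well-formed. The class and method names (TocFieldXml, tocParagraph, countRuns) are made up for illustration and are not part of docx4j.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only (not a docx4j class).
public class TocFieldXml {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds one paragraph holding a complex field: begin, instruction, end.
    static String tocParagraph(String instruction) {
        // Escape the characters that are special in XML element text.
        String escaped = instruction.replace("&", "&amp;")
                                    .replace("<", "&lt;")
                                    .replace(">", "&gt;");
        return "<w:p xmlns:w=\"" + W_NS + "\">"
            + "<w:r><w:fldChar w:fldCharType=\"begin\" w:dirty=\"true\"/></w:r>"
            + "<w:r><w:instrText xml:space=\"preserve\">" + escaped + "</w:instrText></w:r>"
            + "<w:r><w:fldChar w:fldCharType=\"end\"/></w:r>"
            + "</w:p>";
    }

    // Parses the markup and counts w:r elements; throws if it is not well-formed.
    static int countRuns(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(W_NS, "r").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Markup is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = tocParagraph("TOC \\o \"1-3\" \\h \\z \\u");
        System.out.println(xml);
        System.out.println("runs: " + countRuns(xml));
    }
}
```

When Word writes a TOC itself, it also puts a fldChar "separate" and the cached result before the "end" run; that part is left out here, as it is in the method above.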

Re: Add TOC to imported HTML-> Word document

Posted: Mon Aug 18, 2014 4:54 pm
by coolrb
I managed to generate the TOC on the right page.
I first added a div element, with a style, to my HTML page:
Code:
<div id="toc">
  <p>Table Of Contents</p>
</div>
#toc {-fs-page-sequence: start; page-break-after: always; page-break-before: always;}

And then in my Java code:
Code:
String xpath = "//w:r[w:t[contains(text(),'Table Of Contents')]]";
List<Object> list = documentPart.getJAXBNodesViaXPath(xpath, true);
for (Object obj : list) {
    R r = (R) obj;
    P p = (P) r.getParent();
    p.getContent().add(getTocParaUsingDocx4j());
}

Although this solved my problem, when I open the Word document I get a dialog with the following message:
Code:
This document contains fields that may refer ....

Is there any way to avoid this dialog?
I also tried Plutext's Enterprise version. The TOC was generated without any issue, but when I refresh it manually in Word, it is replaced by the following text:
Code:
Table Of Contents
Word did not find any entries for your table of contents.
In your document, select the words to include in the table of contents, and then on the Home tab, under Styles, click a heading style. Repeat for each heading that you want to include, and then insert the table of contents in your document. To manually create a table of contents, on the Document Elements tab, under Table of Contents, point to a style and then click the down arrow button. Click one of the styles under Manual Table of Contents, and then type the entries manually.

I am still unsure whether I should keep trying Plutext's solution or look for an alternative. Any suggestions?

Re: Add TOC to imported HTML-> Word document

Posted: Mon Aug 18, 2014 7:28 pm
by jason
The TOC code in the Enterprise Ed can generate a new ToC, or update an existing one. To update an existing one, it must be in a ToC content control (which is how modern versions of Word do it).

Since your getTocParaUsingDocx4j method doesn't add the ToC content control, the Enterprise ToC update code won't have done anything.
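
To illustrate what a ToC content control looks like in the document XML: when Word inserts a table of contents, it wraps the TOC paragraphs in a w:sdt whose properties carry a docPartObj with the docPartGallery value "Table of Contents". The sketch below builds that wrapper as a plain string using only the JDK. The class and method names (TocSdtXml, wrapInTocSdt, galleryOf) are hypothetical, and nothing here is taken from the Enterprise Edition code, so treat it as a picture of the structure rather than of what that code checks for.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only (not a docx4j or Plutext class).
public class TocSdtXml {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Wraps already-built TOC paragraphs (using the w: prefix) in a content control.
    static String wrapInTocSdt(String tocParagraphsXml) {
        return "<w:sdt xmlns:w=\"" + W_NS + "\">"
            + "<w:sdtPr><w:docPartObj>"
            + "<w:docPartGallery w:val=\"Table of Contents\"/>"
            + "<w:docPartUnique/>"
            + "</w:docPartObj></w:sdtPr>"
            + "<w:sdtContent>" + tocParagraphsXml + "</w:sdtContent>"
            + "</w:sdt>";
    }

    // Parses the wrapper and returns the gallery name it declares.
    static String galleryOf(String sdtXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(sdtXml)))
                                  .getDocumentElement();
            Element gallery =
                (Element) root.getElementsByTagNameNS(W_NS, "docPartGallery").item(0);
            return gallery.getAttributeNS(W_NS, "val");
        } catch (Exception e) {
            throw new IllegalStateException("Markup is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String field = "<w:p>"
            + "<w:r><w:fldChar w:fldCharType=\"begin\"/></w:r>"
            + "<w:r><w:instrText xml:space=\"preserve\">TOC \\o \"1-3\" \\h \\z \\u</w:instrText></w:r>"
            + "<w:r><w:fldChar w:fldCharType=\"end\"/></w:r>"
            + "</w:p>";
        System.out.println(galleryOf(wrapInTocSdt(field)));
    }
}
```

A bare field paragraph like the one returned by getTocParaUsingDocx4j has no such wrapper around it.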

You could test it by telling it to insert a new ToC. By the way, there have been various improvements to the ToC code recently, so I suggest you use 3.1.0.4 or 3.1.0.5.

Re: Add TOC to imported HTML-> Word document

Posted: Mon Aug 18, 2014 7:58 pm
by coolrb
jason wrote: Since your getTocParaUsingDocx4j method doesn't add the ToC content control, the Enterprise ToC update code won't have done anything.

Here is the code I use to generate the TOC with the Plutext Enterprise version:
Code:
Toc.setTocHeadingText("Table Of Contents");
        TocGenerator.generateToc(wordMLPackage, 5,"TOC \\o \"3-3\" \\h \\z \\u \\h \"Title,1,Heading 1,1,Heading 2,3,Appendix 1,1,Appendix 2,2,Unnumbered Heading,1,h1,1,h2,3\" ", false);

Manually updating the TOC in Word after generation gives the error. Any suggestions?

Re: Add TOC to imported HTML-> Word document

Posted: Mon Aug 18, 2014 11:00 pm
by jason
The first thing to do is check that Word is happy with your field code, by inserting that TOC field into your docx in Word (Ctrl-F9).
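
Alongside the check in Word, a quick local sanity check is to list the switches in the field instruction and see which ones occur more than once. The following is a rough sketch using only the JDK. The tokenizing rule (a backslash plus one letter, optionally followed by a quoted argument) is a simplifying assumption and not Word's actual field parser, and the class name TocSwitchCheck is hypothetical. A repeated switch is not necessarily an error; the sketch only makes it visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public class TocSwitchCheck {
    // Assumed rule: backslash + one letter, optionally followed by a quoted argument.
    private static final Pattern SWITCH =
        Pattern.compile("\\\\([A-Za-z])(?:\\s+\"([^\"]*)\")?");

    // Maps each switch letter to the arguments seen for it ("" when it has none).
    static Map<String, List<String>> switches(String instruction) {
        Map<String, List<String>> found = new LinkedHashMap<>();
        Matcher m = SWITCH.matcher(instruction);
        while (m.find()) {
            String arg = m.group(2) == null ? "" : m.group(2);
            found.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(arg);
        }
        return found;
    }

    // Returns the switch letters that appear more than once, in order of first use.
    static List<String> repeated(String instruction) {
        List<String> dup = new ArrayList<>();
        switches(instruction).forEach((name, args) -> {
            if (args.size() > 1) {
                dup.add(name);
            }
        });
        return dup;
    }

    public static void main(String[] args) {
        String code = "TOC \\o \"1-3\" \\h \\z \\u \\h";
        System.out.println(switches(code).keySet()); // prints [o, h, z, u]
        System.out.println(repeated(code));          // prints [h]
    }
}
```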

If the field code is OK (and I just checked: it seems OK), then it's time to look into what the Enterprise jar is doing.

What version are you using? You can tell from the name of the jar file.

If it isn't 3.1.0.5, please try that, from http://www.plutext.com/dn/downloads/140 ... -trial.zip

If it is 3.1.0.5, the easiest way for me to help you would be for you to attach your input docx, or email it to support@plutext.com

Re: Add TOC to imported HTML-> Word document

Posted: Wed Aug 20, 2014 8:24 pm
by coolrb
jason wrote: If it isn't 3.1.0.5, please try that, from http://www.plutext.com/dn/downloads/140 ... -trial.zip

I am using 3.1.0.4. I tried to download the 3.1.0.5 trial version, but the download did not succeed. Last time, I got a direct link from the support team. Should I ask you to send me a direct link?

Re: Add TOC to imported HTML-> Word document

Posted: Wed Aug 20, 2014 10:46 pm
by jason
If you can send your docx to support@plutext.com, I suggest you do that. Otherwise, yeah, you can just ask for a direct link.