Using XHTML file as header

Posted: Wed May 24, 2017 8:42 pm
by ludgea
I'm using 2 xhtml files for my docx file.

One is the body of the docx and the other one is used by the header.

1st file is named c.xhtml and 2nd one is named cc.xhtml.

To test this, I use a simple XHTML file:
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net" />
<title></title>
</head>
<body>
<p>Hello world</p>
</body>
</html>


Unfortunately, it seems docx4j takes the body file and uses it in the header too (as you can see in the attached docx).

Here's the code I'm using:
Code:
        String inputfilepath = "Offers/" + param.getKey1() + "_" + param.getKey2() + "/c.xhtml";
        String inputfilepath2 = "Offers/" + param.getKey1() + "_" + param.getKey2() + "/cc.xhtml";


        // Create an empty docx package
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        ObjectFactory objectFactory = Context.getWmlObjectFactory();


        NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
        wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
        ndp.unmarshalDefaultNumbering();

        XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
        xHTMLImporter.setHyperlinkStyle("Hyperlink");

        wordMLPackage.getMainDocumentPart().getContent().addAll(xHTMLImporter.convert(new File(inputfilepath), null));

        //Header Part start
        HeaderPart headerPart = new HeaderPart();
        Relationship rel = wordMLPackage.getMainDocumentPart().addTargetPart(headerPart);

        Hdr hdr = Context.getWmlObjectFactory().createHdr();
        hdr.getContent().addAll(xHTMLImporter.convert(new File(inputfilepath2), null));
        wordMLPackage.getDocumentModel().getSections().get(0).getHeaderFooterPolicy().getFirstHeader();
        headerPart.setJaxbElement(hdr);

        List<SectionWrapper> sections = wordMLPackage.getDocumentModel().getSections();

        SectPr sectPr = sections.get(sections.size() - 1).getSectPr();
        // There is always a section wrapper, but it might not contain a sectPr

        if (sectPr == null) {
            sectPr = objectFactory.createSectPr();
            wordMLPackage.getMainDocumentPart().addObject(sectPr);
            sections.get(sections.size() - 1).setSectPr(sectPr);
        }

        HeaderReference headerReference = objectFactory.createHeaderReference();
        headerReference.setId(rel.getId());
        headerReference.setType(HdrFtrRef.DEFAULT);
        sectPr.getEGHdrFtrReferences().add(headerReference);


EDIT:
I was wrong to say the 2nd xhtml wasn't used.
It seems both xhtml files end up in the header. So I guess the first one is added as the first part of the header.

Re: Using XHTML file as header

Posted: Wed May 24, 2017 11:40 pm
by jason
It seems there are two possibilities:

1. inputfilepath2 doesn't contain what you think it contains, or

2. xHTMLImporter.convert is remembering state from the last time it was invoked.

I guess the latter is possible (though nobody has complained about it before, and it's been a long time since I've looked at it, so I have no specific recollection).

You could try xHTMLImporter.setRenderer(new DocxRenderer()) before you invoke convert the second time. Does this make a difference?

Alternatively/better, we need a simple test case to demonstrate the issue (easily put together, but it's getting late here now).
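
To rule out possibility 1 quickly, something like the following could be run against the two input files before the conversion. It is only a sketch using the Java standard library; the names InputCheck and sameContent are made up for illustration and are not part of docx4j.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of docx4j): tells whether two input files are byte-identical.
class InputCheck {

    // Returns true when both files contain exactly the same bytes.
    static boolean sameContent(Path first, Path second) throws IOException {
        return Arrays.equals(Files.readAllBytes(first), Files.readAllBytes(second));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java InputCheck <body file> <header file>");
            return;
        }
        // Expected arguments: the path to c.xhtml and the path to cc.xhtml.
        boolean identical = sameContent(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(identical
                ? "Inputs are identical, so possibility 1 cannot be ruled out"
                : "Inputs differ, so the importer is the more likely suspect");
    }
}
```

If the files differ and the header still shows the body text, that points towards possibility 2.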

Re: Using XHTML file as header

Posted: Wed May 24, 2017 11:56 pm
by ludgea
If you select the whole header (CTRL+A) and copy and paste it, you will see "Hello World" at the end, so I think possibility 1 can be discarded and it is indeed possibility 2 (although, as you said, that is rather surprising).

I tried adding the code as shown below:
Code:
        xHTMLImporter.setRenderer(new DocxRenderer());
        Hdr hdr = Context.getWmlObjectFactory().createHdr();
        hdr.getContent().addAll(xHTMLImporter.convert(new File(inputfilepath2), null));
        wordMLPackage.getDocumentModel().getSections().get(0).getHeaderFooterPolicy().getFirstHeader();
        headerPart.setJaxbElement(hdr);


But the problem remains.

I'm going to set up a simpler test with lighter files.
You will be able to see it tomorrow.

Thank you again :)

EDIT:
So, for the demonstration, I simplified the xhtml input files.
Here is the code for each of them.

c.xhtml
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net" />
<title></title>
</head>
<body>
<p>Hello world</p>
</body>
</html>


cc.xhtml
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net" />
<title></title>
</head>
<body>
<p>Hello world 2</p>
</body>
</html>


Here's the code I use to create my docx file:
Code:
        String inputfilepath = "Offers/" + param.getKey1() + "_" + param.getKey2() + "/c.xhtml";
        String inputfilepath2 = "Offers/" + param.getKey1() + "_" + param.getKey2() + "/cc.xhtml";

        // Create an empty docx package
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        ObjectFactory objectFactory = Context.getWmlObjectFactory();

        NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
        wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
        ndp.unmarshalDefaultNumbering();

        XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
        xHTMLImporter.setHyperlinkStyle("Hyperlink");

        wordMLPackage.getMainDocumentPart().getContent().addAll(xHTMLImporter.convert(new File(inputfilepath), null));

        //Header Part start
        HeaderPart headerPart = new HeaderPart();
        Relationship rel = wordMLPackage.getMainDocumentPart().addTargetPart(headerPart);

        xHTMLImporter.setRenderer(new DocxRenderer());
        Hdr hdr = Context.getWmlObjectFactory().createHdr();
        hdr.getContent().addAll(xHTMLImporter.convert(new File(inputfilepath2), null));
        wordMLPackage.getDocumentModel().getSections().get(0).getHeaderFooterPolicy().getFirstHeader();
        headerPart.setJaxbElement(hdr);

        List<SectionWrapper> sections = wordMLPackage.getDocumentModel().getSections();

        SectPr sectPr = sections.get(sections.size() - 1).getSectPr();
        // There is always a section wrapper, but it might not contain a sectPr

        if (sectPr == null) {
            sectPr = objectFactory.createSectPr();
            wordMLPackage.getMainDocumentPart().addObject(sectPr);
            sections.get(sections.size() - 1).setSectPr(sectPr);
        }

        HeaderReference headerReference = objectFactory.createHeaderReference();
        headerReference.setId(rel.getId());
        headerReference.setType(HdrFtrRef.DEFAULT);
        sectPr.getEGHdrFtrReferences().add(headerReference);

        // Saving file
        wordMLPackage.save(new java.io.File("Offers/" + param.getKey1() + "_" + param.getKey2() + "/html_output.docx"));


Finally, you will find attached the generated docx with the result inside.

Hope it will help.

Re: Using XHTML file as header

Posted: Thu May 25, 2017 12:16 am
by ludgea
I found a solution to the problem.

Your answer was on the right path though.

I just initialized a new XHTMLImporterImpl.

So the code now looks like this:
Code:
        // Use a fresh importer for the header instead of re-using the one that converted the body
        XHTMLImporterImpl xHTMLImporter2 = new XHTMLImporterImpl(wordMLPackage);
        xHTMLImporter2.setHyperlinkStyle("Hyperlink");
        Hdr hdr = Context.getWmlObjectFactory().createHdr();
        hdr.getContent().addAll(xHTMLImporter2.convert(new File(inputfilepath2), null));
        wordMLPackage.getDocumentModel().getSections().get(0).getHeaderFooterPolicy().getFirstHeader();
        headerPart.setJaxbElement(hdr);


And as always, the resulting file is attached.

Thank you again for your help :D

Re: Using XHTML file as header

Posted: Thu May 25, 2017 3:01 pm
by jason
Yeah, looking at the code this morning, that's the right way to do it, since the constructor does:

Code:
imports = Context.getWmlObjectFactory().createBody();


and that's the only way to initialise that object. In summary, as you've discovered, the XHTMLImporterImpl object should not be re-used.
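
For anyone who finds this later, the effect can be illustrated with a small stand-alone model. This is not docx4j code: AccumulatingImporter and ImporterReuseDemo are hypothetical classes, and the model assumes that convert() appends to the container created in the constructor and returns its contents, which seems to match what was observed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an importer that owns a single result container.
class AccumulatingImporter {

    // Created once, in the constructor, like the `imports` body mentioned above.
    private final List<String> imports = new ArrayList<>();

    // Appends the converted content and returns the shared container (assumed behaviour).
    List<String> convert(String paragraph) {
        imports.add(paragraph);
        return imports;
    }
}

class ImporterReuseDemo {

    // One importer for body and header: the header also receives the body text.
    static List<String> headerWithSharedImporter() {
        AccumulatingImporter importer = new AccumulatingImporter();
        importer.convert("Hello world");                            // body conversion
        return new ArrayList<>(importer.convert("Hello world 2"));  // header conversion
    }

    // One importer per conversion: the header only receives its own text.
    static List<String> headerWithFreshImporter() {
        new AccumulatingImporter().convert("Hello world");          // body conversion
        return new ArrayList<>(new AccumulatingImporter().convert("Hello world 2"));
    }

    public static void main(String[] args) {
        System.out.println(headerWithSharedImporter()); // prints [Hello world, Hello world 2]
        System.out.println(headerWithFreshImporter());  // prints [Hello world 2]
    }
}
```

In this model, creating a new importer per conversion is the only way to get an empty container, which is the same reason a second XHTMLImporterImpl fixed the header.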