Inserting HTML in a table cell

Posted: Mon Jul 30, 2012 6:43 pm
by Empirica
Hi everyone,

I noticed some interesting behaviour when adding HTML to a table cell (via TraversalUtil or XPath). More concretely, I replace a placeholder with formatted text coming from a DB. If the placeholder is the only <w:p> in a table cell, the code runs fine, but the resulting document won't open (an error message says that the document structure is corrupted). However, if I add an empty paragraph to the template right before the placeholder text (so the table cell now contains two paragraphs), the code runs, the document opens, and I get the expected result, except that there is now an additional paragraph that I have to delete manually.
My conclusion is that there seems to be a problem when replacing a paragraph in a table cell with formatted text (HTML) if that paragraph has no sibling paragraph in the same cell.

Has anyone run into the same problem and maybe found a solution or a workaround?

Please find my code below:

Code:
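   //Imports this method relies on
   import java.util.Properties;
   import org.docx4j.jaxb.Context;
   import org.docx4j.openpackaging.contenttype.ContentType;
   import org.docx4j.openpackaging.parts.PartName;
   import org.docx4j.openpackaging.parts.WordprocessingML.AlternativeFormatInputPart;
   import org.docx4j.relationships.Relationship;
   import org.docx4j.wml.CTAltChunk;
   import org.docx4j.wml.P;
   import org.docx4j.wml.Tc;

   //Replaces a placeholder paragraph with HTML (as an altChunk).
   //This method is called whenever an HTML placeholder is found while traversing the template document.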
   private void replaceHTML(P p, String placeholderValue, Properties templateProperties){
      
      //indexOf returns -1 if p is not a direct child of the body, e.g. when it is inside a table cell
      int index = givenPackage.getMainDocumentPart().getContent().indexOf(p);
      
      try{            
   
         //Retrieves a formatted text from a database
         placeholderValue = replacePlaceholdersByValue(placeholderValue, templateProperties);
         
         //Create HTML            
         String html = "<html>" + placeholderValue + "</html>";
         logger.info("Trying to create an html part.");
         //CAUTION: each HTML part needs a unique part name
         AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/hw" + htmlCounter + ".html"));
         htmlCounter++;

         //Parse Content
         logger.info("Get the Bytes and set the Content type of the html part.");

         afiPart.setBinaryData(html.getBytes("UTF-8"));
         afiPart.setContentType(new ContentType("text/html"));

         logger.info("adding the Target Path...");
         Relationship altChunkRel = givenPackage.getMainDocumentPart().addTargetPart(afiPart);


         //Add HTML to document
         logger.info("Adding HTML to the document..");
         CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
         ac.setId(altChunkRel.getId());                                          
         
         //Paragraph in a table cell
         if (p.getParent() instanceof org.docx4j.wml.Tc){
            Tc parent = (Tc) p.getParent();            
            int subIndex = parent.getContent().indexOf(p);
            
            parent.getContent().set(subIndex,ac);
         
         //Regular paragraph
         }else if (p.getParent() instanceof org.docx4j.wml.Body){
            logger.debug("Replacing paragraph with HTML part at index " + index);
            givenPackage.getMainDocumentPart().getContent().set(index,ac);
         }else{
            logger.error("Parent object class could not be processed");
         }
                        
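      }catch(Exception e){
         //Log through the same logger as the rest of the method instead of printing to stderr
         logger.error("Could not replace the placeholder with HTML", e);
      }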
   }
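
A possible way to automate the manual workaround described above, as a minimal sketch under stated assumptions: the helper works on a plain List<Object>, which is the type Tc.getContent() returns in docx4j. It swaps the placeholder for the replacement and, if no paragraph is left in the list afterwards, puts an empty paragraph in front of the replacement, mirroring the template with the extra empty paragraph. The class name CellContentSketch, the method replaceKeepingParagraph and its parameters are hypothetical and not part of docx4j; the stand-in types in main exist only so the sketch runs without docx4j on the classpath. That Word accepts such a cell has only been observed for the manually edited template.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of docx4j.
class CellContentSketch {

    /**
     * Replaces target with replacement in a content list and, if the list
     * then contains no paragraph at all, inserts an empty paragraph directly
     * before the replacement.
     *
     * @return false if target is not in the list (nothing is changed)
     */
    static boolean replaceKeepingParagraph(List<Object> content,
                                           Object target,
                                           Object replacement,
                                           Predicate<Object> isParagraph,
                                           Supplier<Object> emptyParagraph) {
        int i = content.indexOf(target);
        if (i < 0) {
            return false;
        }
        content.set(i, replacement);
        // Assumption taken from the observation above: the cell needs a sibling paragraph.
        if (content.stream().noneMatch(isParagraph)) {
            content.add(i, emptyParagraph.get());
        }
        return true;
    }

    public static void main(String[] args) {
        // String stands in for org.docx4j.wml.P, Integer for CTAltChunk.
        List<Object> cell = new ArrayList<>();
        Object placeholder = "placeholder paragraph";
        cell.add(placeholder);
        replaceKeepingParagraph(cell, placeholder, Integer.valueOf(1),
                o -> o instanceof String, () -> "");
        System.out.println(cell);
    }
}
```

In replaceHTML, the table-cell branch could then call replaceKeepingParagraph(parent.getContent(), p, ac, o -> o instanceof P, () -> Context.getWmlObjectFactory().createP()) instead of parent.getContent().set(subIndex, ac). The empty paragraph still appears in the output, but the template no longer has to carry it.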

Re: Inserting HTML in a table cell

Posted: Mon Jul 30, 2012 11:27 pm
by jason
So Word doesn't like a w:tc which contains just an altChunk? Which version of Word is this, 2007 or 2010?

As an alternative you could have docx4j convert XHTML to docx content. See the ConvertInXHTML* examples.

docx4j can convert an altChunk for you. See AltChunkXHTMLRoundTrip, which contains:

Code:
    public static void main(String[] args) throws Exception {

        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        MainDocumentPart mdp = wordMLPackage.getMainDocumentPart();

        mdp.addParagraphOfText("Paragraph 1");

        // Add the XHTML altChunk
        String xhtml = "<html><head><title>Import me</title></head><body><p>Hello World!</p></body></html>";
        mdp.addAltChunk(AltChunkType.Xhtml, xhtml.getBytes());

        mdp.addParagraphOfText("Paragraph 3");

        // Round trip
        WordprocessingMLPackage pkgOut = mdp.convertAltChunks();

        // Display result
        System.out.println(
                XmlUtils.marshaltoString(pkgOut.getMainDocumentPart().getJaxbElement(), true, true));

    }
 

Re: Inserting HTML in a table cell

Posted: Wed Aug 01, 2012 9:47 pm
by Empirica
Thanks jason for your quick reply.

We use Word 2007.

I had a look at the ConvertInXHTML examples and was impressed by the new XHTMLImporter class and its features. I tried the ConvertInXHTMLFragment.java sample, as this is basically what we intend to implement, and it ran smoothly. Encouraged by that, I replaced the XHTML string with something simple, an unordered list (i.e. "<ul><li>Bullet1</li></ul>"), but this time the sample threw an exception (see the stack trace below). Could it be a problem with processing unordered lists, or am I missing something?

Code:
Exception in thread "main" java.lang.NullPointerException
   at org.docx4j.convert.in.xhtml.ListHelper.getUnorderedList(ListHelper.java:27)
   at org.docx4j.convert.in.xhtml.XHTMLImporter.addNumbering(XHTMLImporter.java:852)
   at org.docx4j.convert.in.xhtml.XHTMLImporter.traverse(XHTMLImporter.java:745)
   at org.docx4j.convert.in.xhtml.XHTMLImporter.traverse(XHTMLImporter.java:761)
   at org.docx4j.convert.in.xhtml.XHTMLImporter.traverse(XHTMLImporter.java:761)
   at org.docx4j.convert.in.xhtml.XHTMLImporter.convert(XHTMLImporter.java:385)
   at com.empirica.cosys.database.testing.TestXHTMLImporter.main(TestXHTMLImporter.java:46)


Testing the convertAltChunks() method, as you suggested, produced the same result as described in my initial post.

Re: Inserting HTML in a table cell

Posted: Thu Aug 02, 2012 9:25 pm
by jason
That problem will go away if you use the nightly build http://www.docx4java.org/docx4j/docx4j- ... 120608.jar or build from source.