
How to Find the Table Containing a Given Paragraph

Posted: Sun Sep 20, 2015 12:38 am
by wbansa
Hi,

I am trying to implement getEffectivePPr(P p) in org.docx4j.model.PropertyResolver so that it includes the table style properties, which are still marked TODO in the existing code. To do so, I need to find the table cell, table row, and table among the ancestors of my paragraph. However, the parent of the table cell is a JAXBElement, which doesn't implement the Child interface, so it is not possible to walk up from there to the table row.

A possible solution seems to be to use a modified org.docx4j.wml.Tr with
Code:
@XmlElementRef(name = "tc", namespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main", type = Tc.class)
Could somebody who has mastered the mysteries of JAXB and docx4j please judge whether this is a real solution, or whether it will create difficulties elsewhere?

Wolfgang

The code below demonstrates the problem. (It should run in Eclipse when supplied with the docx4j libraries.)
Code:
package de.bansa.docx4j.test;

import java.util.List;

import javax.xml.bind.JAXBException;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.wml.Document;
import org.docx4j.wml.P;
import org.docx4j.wml.Tbl;
import org.docx4j.wml.Tc;
import org.docx4j.wml.Tr;
import org.jvnet.jaxb2_commons.ppp.Child;

public class TableProblem {

   @SuppressWarnings("deprecation")
   public static void main(String[] args) throws Docx4JException, JAXBException {
      String docString =
         " <w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">    <w:body>"+
         "      <w:tbl>"+
         "         <w:tblPr>"+
         "            <w:tblStyle w:val=\"FormatvorlageT1\"/>"+
         "            <w:tblW w:w=\"0\" w:type=\"auto\"/>"+
         "            <w:tblLook w:firstRow=\"1\" w:lastRow=\"0\" w:firstColumn=\"1\" w:lastColumn=\"0\" w:noHBand=\"0\" w:noVBand=\"1\" w:val=\"04A0\"/>"+
         "         </w:tblPr>"+
         "         <w:tblGrid>   <w:gridCol w:w=\"9212\"/></w:tblGrid>"+
         "         <w:tr w:rsidTr=\"009E5619\">"+
         "            <w:trPr> <w:cnfStyle w:val=\"100000000000\"/> </w:trPr>"+
         "            <w:tc>"+
         "               <w:tcPr><w:tcW w:w=\"9212\" w:type=\"dxa\"/></w:tcPr>"+
         "               <w:p>"+
         "                  <w:pPr> <w:spacing w:before=\"360\"/> <w:jc w:val=\"center\"/></w:pPr>"+
         "                  <w:r>"+
         "                     <w:t>Text of myParagraph</w:t>                  </w:r>"+
         "               </w:p>"+
         "            </w:tc>"+
         "         </w:tr>"+
         "      </w:tbl>"+
         "      <w:sectPr>"+
         "         <w:pgSz w:w=\"11906\" w:h=\"16838\"/>"+
         "         <w:pgMar w:top=\"1417\" w:right=\"1417\" w:bottom=\"1134\" w:left=\"1417\" w:header=\"708\" w:footer=\"708\" w:gutter=\"0\"/>"+
         "         <w:cols w:space=\"708\"/>"+
         "         <w:docGrid w:linePitch=\"360\"/>"+
         "      </w:sectPr>"+
         "   </w:body>"+
         "</w:document>";
      
      Document doc = (Document) org.docx4j.XmlUtils.unmarshalString(docString);
      System.out.println(org.docx4j.XmlUtils.marshaltoString(doc));

      // Find my paragraph (should be done using XPATH, which I didn't get to work here)
      List <Object> docContent = doc.getContent();
      Tbl tbl = (Tbl) XmlUtils.unwrap(docContent.get(0));
      List <Object> tblContent = tbl.getContent();
      Tr tr = (Tr) tblContent.get(0);
      List<Object> trContent = tr.getContent();
      Tc tc = (Tc) XmlUtils.unwrap(trContent.get(0));
      List<Object> tcContent = tc.getContent();
      P myParagraph = (P) tcContent.get(0);
      
      // Now I have the paragraph in question and try to find the containing table and its TblPr
      System.out.println("Paragraph text: " + myParagraph.toString());
      // get containing Tc
      Child expectedTc = (Child) myParagraph.getParent();
      System.out.println("Parent of myParagraph is a " + expectedTc.getClass().getName() );
      
      // get containing Tr
      // fails unless I use a modified Tr.java:
      //    @XmlElementRef(name = "tc", namespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main", type = Tc.class),
      // instead of
      //    @XmlElementRef(name = "tc", namespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main", type = JAXBElement.class),
      Object expectedTr =  expectedTc.getParent();
      System.out.println("Parent of parent of myParagraph is a " + expectedTr.getClass().getName() );
      if (!(expectedTr instanceof Child)){
         System.out.println("   which doesn't implement getParent()");
         System.exit(1);
      }      
      
      // containing table
      Tbl tableOfMyParagraph = (Tbl) ((Child) expectedTr).getParent();
      System.out.println("Table of myParagraph is a " + tableOfMyParagraph.getClass().getName() );
      
      System.out.println("Its TblPr:\n" + org.docx4j.XmlUtils.marshaltoString(tableOfMyParagraph.getTblPr()));
   }

}

Re: How to Find the Table Containing a Given Paragraph

Posted: Mon Sep 21, 2015 1:10 pm
by jason
When invoked, TraversalUtil can correct the parent/child relationships:

https://github.com/plutext/docx4j/blob/ ... .java#L155
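
To illustrate the idea of repairing parent pointers during a top-down walk, here is a minimal, library-free sketch. It does not use docx4j: Node, Wrapper, repairParents and findAncestor are hypothetical stand-ins (Wrapper plays the role of a JAXBElement, Node the role of an object implementing Child), so treat it as a model of the technique rather than as what TraversalUtil actually does.

```java
import java.util.ArrayList;
import java.util.List;

class ParentRepairSketch {

    /** Stand-in for a content object with a parent pointer (hypothetical, not a docx4j class). */
    static class Node {
        final String name;
        final List<Object> content = new ArrayList<>();
        Object parent;

        Node(String name) { this.name = name; }
    }

    /** Stand-in for a JAXBElement-style wrapper: it holds a value but offers no way up. */
    static class Wrapper {
        final Node value;

        Wrapper(Node value) { this.value = value; }
    }

    /** Returns the wrapped node if o is a Wrapper, otherwise o itself. */
    static Node unwrap(Object o) {
        return (o instanceof Wrapper) ? ((Wrapper) o).value : (Node) o;
    }

    /** Walks top-down and points every child at its real container, skipping wrappers. */
    static void repairParents(Node parent) {
        for (Object o : parent.content) {
            Node child = unwrap(o);
            child.parent = parent;
            repairParents(child);
        }
    }

    /** Climbs parent pointers until a node with the given name is found; null if the chain breaks. */
    static Node findAncestor(Node start, String name) {
        Object current = start.parent;
        while (current instanceof Node) {
            Node n = (Node) current;
            if (n.name.equals(name)) {
                return n;
            }
            current = n.parent;
        }
        return null;
    }

    /** Builds tbl > tr > (wrapper) > tc > p with the cell's parent set to the wrapper. Returns {tbl, p}. */
    static Node[] buildSample() {
        Node tbl = new Node("tbl");
        Node tr = new Node("tr");
        Node tc = new Node("tc");
        Node p = new Node("p");
        Wrapper wrappedTc = new Wrapper(tc);
        tbl.content.add(tr);
        tr.parent = tbl;
        tr.content.add(wrappedTc);
        tc.parent = wrappedTc; // the broken link: the parent is the wrapper, not the row
        tc.content.add(p);
        p.parent = tc;
        return new Node[] { tbl, p };
    }

    public static void main(String[] args) {
        Node[] sample = buildSample();
        Node tbl = sample[0];
        Node p = sample[1];
        System.out.println("Table reachable before repair: " + (findAncestor(p, "tbl") == tbl));
        repairParents(tbl);
        System.out.println("Table reachable after repair: " + (findAncestor(p, "tbl") == tbl));
    }
}
```

In this model the upward walk stops at the wrapper before the repair and reaches the table afterwards, which mirrors the symptom in the first post.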

You could also use TraversalUtil instead of XPath to find the paragraph, and you could write your own implementation of the callback that captures the table, tr, and tc ancestors along the way.
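
A callback that records ancestors could look roughly like the sketch below. It is deliberately library-free: Element, Wrapped, Visitor, walk and ancestorNames are hypothetical names, not the docx4j TraversalUtil API, and Wrapped again stands in for a JAXBElement. The point is only the technique: keep a stack of the containers entered so far, unwrap wrappers, and hand the current stack to the callback.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class AncestorCaptureSketch {

    /** Stand-in for a content object such as a table, row, cell or paragraph (hypothetical). */
    static class Element {
        final String name;
        final List<Object> children = new ArrayList<>();

        Element(String name) { this.name = name; }
    }

    /** Stand-in for a JAXBElement-style wrapper around an Element. */
    static class Wrapped {
        final Element value;

        Wrapped(Element value) { this.value = value; }
    }

    /** Callback that receives each element with its ancestors, outermost first. */
    interface Visitor {
        void visit(Element element, List<Element> ancestors);
    }

    /** Depth-first walk that maintains the path of containers entered so far. */
    static void walk(Element element, Deque<Element> path, Visitor visitor) {
        visitor.visit(element, new ArrayList<>(path)); // copy, so the callback may keep it
        path.addLast(element);
        for (Object o : element.children) {
            Element child = (o instanceof Wrapped) ? ((Wrapped) o).value : (Element) o;
            walk(child, path, visitor);
        }
        path.removeLast();
    }

    /** Returns the names of the target's ancestors, or an empty list if it is the root or absent. */
    static List<String> ancestorNames(Element root, Element target) {
        List<String> names = new ArrayList<>();
        walk(root, new ArrayDeque<>(), (element, ancestors) -> {
            if (element == target) {
                for (Element a : ancestors) {
                    names.add(a.name);
                }
            }
        });
        return names;
    }

    /** Builds body > tbl > tr > (wrapped) tc > p. Returns {body, p}. */
    static Element[] buildSample() {
        Element body = new Element("body");
        Element tbl = new Element("tbl");
        Element tr = new Element("tr");
        Element tc = new Element("tc");
        Element p = new Element("p");
        body.children.add(tbl);
        tbl.children.add(tr);
        tr.children.add(new Wrapped(tc));
        tc.children.add(p);
        return new Element[] { body, p };
    }

    public static void main(String[] args) {
        Element[] sample = buildSample();
        System.out.println(ancestorNames(sample[0], sample[1])); // prints [body, tbl, tr, tc]
    }
}
```

Because the ancestors come from the walk itself, this approach does not depend on getParent() at all, so the wrapper around the cell no longer matters.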

P.S. There is code for getCellPStyle in the commercial Enterprise Edition, so the nuances in this area are understood.

wbansa wrote: A possible solution seems to be to use a modified org.docx4j.wml.Tr with

@XmlElementRef(name = "tc", namespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main", type = Tc.class)
Can somebody who masters the mysteries of JAXB and docx4j please judge, whether this is a real solution or will create difficulties elsewhere.


That change should be OK; in fact, it matches the annotation on the Tc class:

@XmlRootElement(name = "tc")
public class Tc
 