JAXBElements and BinaryData

Posted: Fri Apr 05, 2013 11:12 pm
by fachingw
Hello,

By using TraversalUtil I can get the logical structure of my docx file, e.g.:
...
org.docx4j.wml.P
org.docx4j.wml.R
org.docx4j.wml.CTObject
org.docx4j.vml.CTShape
org.docx4j.vml.CTImageData
org.docx4j.vml.officedrawing.CTOLEObject
...

But how can I get the actual binary data for CTImageData and CTOLEObject? I can get all MetafileWmfPart, MetafileEmfPart and OleObjectBinaryPart instances with getParts, but then I don't know which Part belongs to which CTxxx, so the logical structure is lost.
Is there a way to get the Part for a JAXBElement, or vice versa? I want to extract either the OLEObject or the image, but not both.

Thanks
Winfried

Re: JAXBElements and BinaryData

Posted: Fri Apr 05, 2013 11:30 pm
by jason
Step 1: From the element (CTImageData or CTOLEObject), you get the "relationship ID".

Step 2: Then you get the relationships part for the part containing that element, and look up the relationship by the relationship ID from step 1.

Step 3: The relationship has a target; the target is the part you want.

For an example of this in action, see org.docx4j.model.images.WordXmlPictureE10.createWordXmlPictureFromE10
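
The three steps above can be modelled in plain Java without docx4j on the classpath. This is only a sketch of the lookup mechanism: `ElementStub`, `RelationshipsStub` and `BinaryPartStub` are hypothetical stand-ins (not docx4j classes), and the relationship IDs and part names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class RelationshipLookupSketch {

    /** Hypothetical stand-in for a binary part (e.g. an OLE object or an EMF image). */
    static final class BinaryPartStub {
        final String name;
        final byte[] bytes;

        BinaryPartStub(String name, byte[] bytes) {
            this.name = name;
            this.bytes = bytes;
        }
    }

    /** Hypothetical stand-in for the relationships part of the containing part. */
    static final class RelationshipsStub {
        private final Map<String, BinaryPartStub> targetsById = new HashMap<>();

        void add(String relId, BinaryPartStub target) {
            targetsById.put(relId, target);
        }

        // Steps 2 and 3: find the relationship by its ID and return its target.
        BinaryPartStub getTarget(String relId) {
            return targetsById.get(relId);
        }
    }

    /** Hypothetical stand-in for an element that carries a relationship ID. */
    static final class ElementStub {
        final String relId;

        ElementStub(String relId) {
            this.relId = relId;
        }
    }

    // Step 1 reads the relationship ID from the element; steps 2 and 3 resolve it.
    // Returns null when the element has no ID or the ID is unknown.
    static BinaryPartStub resolve(ElementStub element, RelationshipsStub rels) {
        if (element == null || element.relId == null) {
            return null;
        }
        return rels.getTarget(element.relId);
    }

    public static void main(String[] args) {
        RelationshipsStub rels = new RelationshipsStub();
        // Illustrative IDs and part names only.
        rels.add("rId5", new BinaryPartStub("/word/media/image1.emf", new byte[] {1, 2, 3}));
        rels.add("rId6", new BinaryPartStub("/word/embeddings/oleObject1.bin", new byte[] {4, 5}));

        BinaryPartStub image = resolve(new ElementStub("rId5"), rels);
        BinaryPartStub ole = resolve(new ElementStub("rId6"), rels);
        System.out.println(image.name + " (" + image.bytes.length + " bytes)");
        System.out.println(ole.name + " (" + ole.bytes.length + " bytes)");
    }
}
```

Because each element resolves to its own part, the pairing between an image and its OLE object in the document tree is kept.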

hope this helps

Re: JAXBElements and BinaryData

Posted: Sat Apr 06, 2013 12:00 am
by fachingw
Hello Jason,

I tried your suggestion, but CTImageData getRelid() gives me null, and in CTOLEObject there is no getRelid(), only a getShapeID():

Code:
...
  org.docx4j.wml.P
    org.docx4j.wml.R
      org.docx4j.wml.CTObject
        org.docx4j.vml.CTShape
          org.docx4j.vml.CTImageData
          Id: rId7 RelID=null
        org.docx4j.vml.officedrawing.CTOLEObject
        Id: rId8 ShapeID=_x0000_i1027
...


This is my TraversalUtil.Callback:

Code:
   class MyAnyCallback implements TraversalUtil.Callback
   {

     String indent = "";
     
     public List<Object> apply(Object o)
     {

        String cl=o.getClass().getName();
        System.out.println(indent + cl);
        if (o instanceof org.docx4j.vml.CTImageData)
        {
           org.docx4j.vml.CTImageData x=(org.docx4j.vml.CTImageData)o;
           System.out.println(indent+"Id: "+x.getId()+" RelID="+x.getRelid());
        }
        if (o instanceof org.docx4j.vml.officedrawing.CTOLEObject)
        {
           org.docx4j.vml.officedrawing.CTOLEObject x=(org.docx4j.vml.officedrawing.CTOLEObject)o;
           System.out.println(indent+"Id: "+x.getId()+" ShapeID="+x.getShapeID());
        }
        return null;
     }

     public boolean shouldTraverse(Object o)
     {
        return true;
     }

     // Depth first
     public void walkJAXBElements(Object parent)
     {

        indent += "  ";

        List children = getChildren(parent);
        if (children != null) {

           for (Object o : children) {

              // if it's wrapped in javax.xml.bind.JAXBElement, get its
              // value
              o = XmlUtils.unwrap(o);

              this.apply(o);

              if (this.shouldTraverse(o)) {
                 walkJAXBElements(o);
              }

           }
        }

        indent = indent.substring(0, indent.length() - 2);
     }

     public List<Object> getChildren(Object o)
     {
        return TraversalUtil.getChildrenImpl(o);
     }   
   }


Thanks
Winfried

Re: JAXBElements and BinaryData

Posted: Sat Apr 06, 2013 7:12 am
by jason
Please post a short sample docx or the contents of your document.xml

Re: JAXBElements and BinaryData

Posted: Tue Apr 09, 2013 1:38 am
by fachingw
Hello Jason,

I created a sample docx file using Office and inserted two OLE objects using copy/paste from Accelrys-Draw. With getParts() I get 2 OleObjectBinaryPart and 2 MetafileEmfPart, but I don't know which EMF belongs to which OLE object. With TraversalUtil I get the hierarchy of JAXB elements and know which CTImageData belongs to which CTOLEObject. So now I just need the connections CTImageData->MetafileEmfPart and CTOLEObject->OleObjectBinaryPart. CTImageData has relid=null, and CTOLEObject has no relid, only a shapeid?

Code:
  org.docx4j.wml.P
    org.docx4j.wml.ProofErr
    org.docx4j.wml.R
      org.docx4j.wml.Text
    org.docx4j.wml.ProofErr
    org.docx4j.wml.R
      org.docx4j.wml.Text
  org.docx4j.wml.P
    org.docx4j.wml.CTBookmark
    org.docx4j.wml.CTMarkupRange
  org.docx4j.wml.P
  org.docx4j.wml.P
    org.docx4j.wml.R
      org.docx4j.wml.Text
  org.docx4j.wml.P
    org.docx4j.wml.R
      org.docx4j.wml.CTObject
        org.docx4j.vml.CTShapetype
          org.docx4j.vml.CTStroke
          org.docx4j.vml.CTFormulas
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
            org.docx4j.vml.CTF
          org.docx4j.vml.CTPath
          org.docx4j.vml.officedrawing.CTLock
        org.docx4j.vml.CTShape
        Id: null
          org.docx4j.vml.CTImageData
          Id: rId5 RelID=null
        org.docx4j.vml.officedrawing.CTOLEObject
        Id: rId6 ShapeID=_x0000_i1032
  org.docx4j.wml.P
  org.docx4j.wml.P
    org.docx4j.wml.R
      org.docx4j.wml.Text
  org.docx4j.wml.P
    org.docx4j.wml.R
      org.docx4j.wml.CTObject
        org.docx4j.vml.CTShape
        Id: null
          org.docx4j.vml.CTImageData
          Id: rId7 RelID=null
        org.docx4j.vml.officedrawing.CTOLEObject
        Id: rId8 ShapeID=_x0000_i1030


Thanks
Winfried

Re: JAXBElements and BinaryData

Posted: Tue Apr 09, 2013 8:19 am
by jason
Code:
package org.docx4j.samples;

import java.util.ArrayList;
import java.util.List;

import org.docx4j.TraversalUtil;
import org.docx4j.XmlUtils;
import org.docx4j.TraversalUtil.CallbackImpl;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.Part;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.vml.CTImageData;
import org.docx4j.vml.CTShape;
import org.docx4j.vml.officedrawing.CTOLEObject;
import org.docx4j.wml.CTObject;

public class OLEObjectFinder {

       
        /**
         * Example of how to find OLEObjects and their related parts
         *
        <w:object w:dxaOrig="1087" w:dyaOrig="1074">
          <v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
            <v:stroke joinstyle="miter"/>
            <v:formulas>
              <v:f eqn="if lineDrawn pixelLineWidth 0"/>
              <v:f eqn="sum @0 1 0"/>
              <v:f eqn="sum 0 0 @1"/>
              <v:f eqn="prod @2 1 2"/>
              <v:f eqn="prod @3 21600 pixelWidth"/>
              <v:f eqn="prod @3 21600 pixelHeight"/>
              <v:f eqn="sum @0 0 1"/>
              <v:f eqn="prod @6 1 2"/>
              <v:f eqn="prod @7 21600 pixelWidth"/>
              <v:f eqn="sum @8 21600 0"/>
              <v:f eqn="prod @7 21600 pixelHeight"/>
              <v:f eqn="sum @10 21600 0"/>
            </v:formulas>
            <v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
            <o:lock v:ext="edit" aspectratio="t"/>
          </v:shapetype>
          <v:shape id="_x0000_i1032" type="#_x0000_t75" style="width:54.35pt;height:53.65pt" o:ole="">
            <v:imagedata r:id="rId5" o:title=""/>
          </v:shape>
          <o:OLEObject Type="Embed" ProgID="MDLDrawOLE.MDLDrawObject.1" ShapeID="_x0000_i1032" DrawAspect="Content" ObjectID="_1426939992" r:id="rId6">
            <o:FieldCodes>\s</o:FieldCodes>
          </o:OLEObject>
        </w:object>    
       
         or
         
        <w:object w:dxaOrig="1074" w:dyaOrig="992">
          <v:shape id="_x0000_i1030" type="#_x0000_t75" style="width:53.65pt;height:49.6pt" o:ole="">
            <v:imagedata r:id="rId7" o:title=""/>
          </v:shape>
          <o:OLEObject Type="Embed" ProgID="MDLDrawOLE.MDLDrawObject.1" ShapeID="_x0000_i1030" DrawAspect="Content" ObjectID="_1426939993" r:id="rId8">
            <o:FieldCodes>\s</o:FieldCodes>
          </o:OLEObject>
        </w:object>
         
         */

        public static void main(String[] args) throws Exception {

                String inputfilepath = System.getProperty("user.dir") + "/OleObject.docx";
                               
                WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));         
                MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
                               
                Finder finder = new Finder(CTObject.class); // <----- change this to suit
                new TraversalUtil(documentPart.getContent(), finder);
               
                System.out.println("got " + finder.results.size() + " of type " +  finder.typeToFind.getName() );
               
                for (Object o : finder.results) {
                                               
                        CTObject object = (CTObject) XmlUtils.unwrap(o);
                       
                        CTOLEObject oleObject = getOLEObject(object);
                        if (oleObject==null) {
                                System.out.println("No OLEObject in this Object");                     
                        } else {
                                Part target = documentPart.getRelationshipsPart().getPart(oleObject.getId());
                                System.out.println(target.getClass().getName());
                                // expect org.docx4j.openpackaging.parts.WordprocessingML.OleObjectBinaryPart
                        }
                       
                        CTShape shape = getShape(object);
                        if (shape==null) {
                                System.out.println("No Shape in this Object");                 
                        } else {
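                                // Search the shape's children instead of assuming
                                // that imagedata is always the first one
                                CTImageData imageData = null;
                                for (Object child : shape.getPathOrFormulasOrHandles()) {
                                        Object unwrapped = XmlUtils.unwrap(child);
                                        if (unwrapped instanceof CTImageData) {
                                                imageData = (CTImageData) unwrapped;
                                                break;
                                        }
                                }
                                if (imageData==null) {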
                                        System.out.println("No imagedata in this Object");                                                             
                                } else {
                                        Part target = documentPart.getRelationshipsPart().getPart(imageData.getId());
                                        System.out.println(target.getClass().getName());
                                        // expect org.docx4j.openpackaging.parts.WordprocessingML.MetafileEmfPart
                                }
                        }
                }
                                                               
        }
       
          public static class Finder extends CallbackImpl {
                 
                  protected Class<?> typeToFind;
                 
                  protected Finder(Class<?> typeToFind) {
                          this.typeToFind = typeToFind;
                  }
                       
                        public List<Object> results = new ArrayList<Object>();
                       
                        @Override
                        public List<Object> apply(Object o) {
                               
                                // Adapt as required
                                if (o.getClass().equals(typeToFind)) {
                                        results.add(o);
                                }
                                return null;
                        }
          }
         
          private static CTOLEObject getOLEObject(CTObject object) {
                 
                  for (Object o : object.getAnyAndAny()) {
                          Object o2 = XmlUtils.unwrap(o);
                          if(o2 instanceof CTOLEObject) return (CTOLEObject)o2;                  
                  }
                  return null;
          }
         
          private static CTShape getShape(CTObject object) {
                 
                  for (Object o : object.getAnyAndAny()) {
                          Object o2 = XmlUtils.unwrap(o);
                          if(o2 instanceof CTShape) return (CTShape)o2;                  
                  }
                  return null;
          }
               
}
 