
How to create an oleObject.bin (for video or pdf)

Posted: Wed Apr 29, 2009 10:42 pm
by btjandra
Hi Jason,

I am trying to embed a video/package in my docx file, and I wonder how the oleObject.bin gets created. Does docx4j support this? If so, how do I generate the oleObject.bin file?
Please advise!

Thanks!
Betty

Re: How to create an oleObject.bin

Posted: Thu Apr 30, 2009 1:29 am
by jason
Hi Betty

Yes, we have org.docx4j.openpackaging.parts.WordprocessingML.OleObjectBinaryPart.

I have used that to extract an OLE object. As you'll see, it uses POI FS to do the work.

Here is some sample code:

Code:
         // 1. Load the Package
         WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
         
         // 2. Fetch the OLE document part
         OleObjectBinaryPart olePart = (OleObjectBinaryPart)wordMLPackage.getParts().get(new PartName("/word/embeddings/oleObject1.bin") );         
         
         // 3. Write it
         File f=new File("/home/dev/ole.bin");            
         FileOutputStream out=new FileOutputStream(f);
         olePart.writeDataToOutputStream(out);
         out.close();


When I had a quick look at creating an OLE object (from a PDF) a few months ago, it didn't work. It seems to be a bit of a black art. See http://www.nabble.com/Can-POIFS-convert ... 68081.html

This is what I tried:

Code:
         // The object we embed
         java.io.InputStream is = new java.io.FileInputStream("/home/dev/testing/fig1.pdf" );
         OleObjectBinaryPart olePart = new OleObjectBinaryPart();         
         olePart.setBinaryData(is);
         Relationship relOleObject = wordMLPackage.getMainDocumentPart().addTargetPart(olePart);
         
         // The image the user sees, that they click on to open the object
         Relationship relImage = wordMLPackage.getMainDocumentPart().addTargetPart(imagePart); // imagePart defined outside this snippet

         // Contains ${ImageId}, ${OLEShapeID}, ${OLEObjectID}, ${OLEObjectRid}
            String ml = "<w:p xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\" xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\" xmlns:wp=\"http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing\" xmlns:v=\"urn:schemas-microsoft-com:vml\" xmlns:o=\"urn:schemas-microsoft-com:office:office\"><w:r><w:object w:dxaOrig=\"11881\" w:dyaOrig=\"9181\"><v:shapetype id=\"_x0000_t75\" coordsize=\"21600,21600\" o:spt=\"75\" o:preferrelative=\"t\" path=\"m@4@5l@4@11@9@11@9@5xe\" filled=\"f\" stroked=\"f\"><v:stroke joinstyle=\"miter\"/><v:formulas><v:f eqn=\"if lineDrawn pixelLineWidth 0\"/><v:f eqn=\"sum @0 1 0\"/><v:f eqn=\"sum 0 0 @1\"/><v:f eqn=\"prod @2 1 2\"/><v:f eqn=\"prod @3 21600 pixelWidth\"/><v:f eqn=\"prod @3 21600 pixelHeight\"/><v:f eqn=\"sum @0 0 1\"/><v:f eqn=\"prod @6 1 2\"/><v:f eqn=\"prod @7 21600 pixelWidth\"/><v:f eqn=\"sum @8 21600 0\"/><v:f eqn=\"prod @7 21600 pixelHeight\"/><v:f eqn=\"sum @10 21600 0\"/></v:formulas><v:path o:extrusionok=\"f\" gradientshapeok=\"t\" o:connecttype=\"rect\"/><o:lock v:ext=\"edit\" aspectratio=\"t\"/></v:shapetype><v:shape id=\"_x0000_i1025\" type=\"#_x0000_t75\" style=\"width:594pt;height:459pt\" o:ole=\"\"><v:imagedata r:id=\"${ImageId}\" o:title=\"\"/></v:shape><o:OLEObject Type=\"Embed\" ProgID=\"AcroExch.Document.7\" ShapeID=\"${OLEShapeID}\" DrawAspect=\"Content\" ObjectID=\"${OLEObjectID}\" r:id=\"${OLEObjectRid}\"/></w:object></w:r></w:p>";
           
            java.util.HashMap<String, String>mappings = new java.util.HashMap<String, String>();           
            mappings.put("ImageId", relImage.getId()  ); 
            mappings.put("OLEShapeID", "_x0000_i1025"  );
            mappings.put("OLEObjectID", "_1291469606"  );
            mappings.put("OLEObjectRid", relOleObject.getId()  );

            wordMLPackage.getMainDocumentPart().addObject(
                  org.docx4j.XmlUtils.unmarshallFromTemplate(ml, mappings ) );
           
            ContentTypeManager ctm = wordMLPackage.getContentTypeManager();
           
            // There was an override; try adding a default instead!
            ctm.addDefaultContentType("bin",
                  org.docx4j.openpackaging.contenttype.ContentTypes.OFFICEDOCUMENT_OLE_OBJECT);
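
To make the `${...}` placeholders in the template above a bit more concrete, here is a stand-alone sketch of just the substitution step, using only the JDK. It is not docx4j's implementation of unmarshallFromTemplate (which also unmarshals the result); the class name TemplateSketch, the choice to leave unknown keys untouched, and the relationship ids are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone illustration only; this is NOT docx4j's implementation.
final class TemplateSketch {

    // Matches ${key} and captures the key.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replace each ${key} with mappings.get(key); unknown keys are left untouched
    // (an assumption of this sketch, chosen so missing mappings are easy to spot).
    static String substitute(String template, Map<String, String> mappings) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = mappings.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("ImageId", "rId5");       // hypothetical relationship ids
        mappings.put("OLEObjectRid", "rId4");
        String ml = "<v:imagedata r:id=\"${ImageId}\"/><o:OLEObject r:id=\"${OLEObjectRid}\"/>";
        System.out.println(substitute(ml, mappings));
    }
}
```

Printing the substituted string before unmarshalling is a cheap way to check that every id landed where you expected.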


But you might have better results with your video.

Do a Google search for MS-OLEDS; it explains how an OLE object is supposed to be structured.

Let us know how you go! Thanks .. Jason

Re: How to create an oleObject.bin

Posted: Thu Apr 30, 2009 2:47 am
by btjandra
Hi Jason,

I am able to create the oleObject.bin file now. When I open my docx file in Microsoft Word, I see the embedded icon, but I cannot double-click it to bring up the video. So I set the video aside and tried embedding another Word document instead; double-clicking still does not open the embedded document. I'm not sure what I did wrong. Here's my code:


public Part createOleBinPart(String filename) throws Exception
{
// PartName partName = new PartName("embeddings/oleobject1.bin");
Part olePart = new OleObjectBinaryPart();

FileInputStream fi = new FileInputStream("d:\\temp\\mydoc.docx");
((BinaryPart)olePart).setBinaryData(fi);
// ((OleObjectBinaryPart)olePart).initPOIFSFileSystem();
// ((OleObjectBinaryPart)olePart).writePOIFSFileSystem();

return olePart;
}


private CTFormulas createCommonFormulas()
{
CTFormulas formulas = msVmlOf.createCTFormulas();
CTF cctf = msVmlOf.createCTF();
cctf.setEqn("if lineDrawn pixelLineWidth 0");
formulas.getF().add(cctf);

CTF ctf = msVmlOf.createCTF();
ctf.setEqn("sum @0 1 0");
formulas.getF().add(ctf);

CTF ctf1 = msVmlOf.createCTF();
ctf1.setEqn("sum 0 0 @1");
formulas.getF().add(ctf1);

CTF ctf3 = msVmlOf.createCTF();
ctf3.setEqn("prod @2 1 2");
formulas.getF().add(ctf3);

CTF ctf4 = msVmlOf.createCTF();
ctf4.setEqn("prod @3 21600 pixelWidth");
formulas.getF().add(ctf4);

CTF ctf5 = msVmlOf.createCTF();
ctf5.setEqn("prod @3 21600 pixelHeight");
formulas.getF().add(ctf5);

CTF ctf6 = msVmlOf.createCTF();
ctf6.setEqn("sum @0 0 1");
formulas.getF().add(ctf6);

CTF ctf7 = msVmlOf.createCTF();
ctf7.setEqn("prod @6 1 2");
formulas.getF().add(ctf7);

CTF ctf8 = msVmlOf.createCTF();
ctf8.setEqn("prod @7 21600 pixelWidth");
formulas.getF().add(ctf8);

CTF ctf9 = msVmlOf.createCTF();
ctf9.setEqn("sum @8 21600 0");
formulas.getF().add(ctf9);

CTF ctf10 = msVmlOf.createCTF();
ctf10.setEqn("prod @7 21600 pixelHeight");
formulas.getF().add(ctf10);

CTF ctf11 = msVmlOf.createCTF();
ctf11.setEqn("sum @10 21600 0");
formulas.getF().add(ctf11);

return formulas;
}
private CTObject createOle(String oleID, String imgID)
{
CTObject ctObj = wmlOf.createCTObject();
CTShapetype shapeType = msVmlOf.createCTShapetype();
shapeType.setIdAttr("_X00016");
shapeType.setCoordsize("21600,21600");
shapeType.setSpt(Float.valueOf("75"));
shapeType.setPreferrelative(com.microsoft.schemas.office.office.STTrueFalse.TRUE);
shapeType.setFilled(STTrueFalse.FALSE);
shapeType.setStroked(STTrueFalse.FALSE);
shapeType.setPath("m@4@5l@4@11@9@11@9@5xe");
CTStroke stroke = msVmlOf.createCTStroke();
stroke.setJoinstyle(STStrokeJoinStyle.MITER);
javax.xml.namespace.QName qStroke = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "stroke");
JAXBElement jeStroke = this.getJAXBElement(qStroke, CTStroke.class, stroke);
shapeType.getEGShapeElements().add(jeStroke);

CTFormulas formulas = this.createCommonFormulas();
javax.xml.namespace.QName qname = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "formulas");
JAXBElement jeFormulas = this.getJAXBElement(qname, CTFormulas.class, formulas);
shapeType.getEGShapeElements().add(jeFormulas);
CTPath path = msVmlOf.createCTPath();
path.setExtrusionok(com.microsoft.schemas.office.office.STTrueFalse.FALSE);
path.setGradientshapeok(STTrueFalse.T);
path.setConnecttype(STConnectType.RECT);
javax.xml.namespace.QName qname1 = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "path");
JAXBElement jePath = this.getJAXBElement(qname1, CTPath.class, path);
shapeType.getEGShapeElements().add(jePath);


CTLock lock = msOfficeOf.createCTLock();
lock.setExt(STExt.EDIT);
lock.setAspectratio(com.microsoft.schemas.office.office.STTrueFalse.T);
javax.xml.namespace.QName qname3 = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:office:office", "lock");
JAXBElement jeLock = this.getJAXBElement(qname3, CTLock.class, lock);
shapeType.getEGShapeElements().add(jeLock);

CTShape shape = msVmlOf.createCTShape();
shape.setIdAttr("_x0000_i1025");
// shape.setAlt("Microsoft Office Signature Line...");
// shape.setSpid("_x0000_s2555");
shape.setType("#_x0000_t136");
shape.setStyle("width:52.5pt;height:40.5pt");
shape.setOle("");
CTImageData imageData = msVmlOf.createCTImageData();
imageData.setId(imgID);
imageData.setTitle("");
javax.xml.namespace.QName qnameImg = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "imagedata");
JAXBElement imgJe = this.getJAXBElement(qnameImg, CTImageData.class, imageData);

shape.getPathOrFormulasOrHandles().add(imgJe);
// shape.getPathOrFormulasOrHandles().add(jeTextPath);

javax.xml.namespace.QName qname7 = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "shapetype");
JAXBElement jeShapeType = this.getJAXBElement(qname7, CTShapetype.class, shapeType);

ctObj.getAnyAndAny().add(jeShapeType);

javax.xml.namespace.QName qname8 = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:vml", "shape");
JAXBElement jeShape = this.getJAXBElement(qname8, CTShape.class, shape);

ctObj.getAnyAndAny().add(jeShape);

CTOLEObject oleObj = this.msOfficeOf.createCTOLEObject();
oleObj.setType(STOLEType.EMBED);
oleObj.setProgID("Package");
oleObj.setShapeID("_x0000_i1025");
oleObj.setDrawAspect(STOLEDrawAspect.CONTENT);
oleObj.setObjectID("_1302517361");
oleObj.setId(oleID);

javax.xml.namespace.QName qOle = new javax.xml.namespace.QName(
"urn:schemas-microsoft-com:office:office", "OLEObject");
JAXBElement jeOle = this.getJAXBElement(qOle, CTOLEObject.class, oleObj);

ctObj.setDxaOrig(BigInteger.valueOf(1051));
ctObj.setDyaOrig(BigInteger.valueOf(811));
ctObj.getAnyAndAny().add(jeOle);

return ctObj;

}

public Document buildDocument(MainDocumentPart docPart,
WordprocessingMLPackage wp)
{
org.docx4j.wml.ObjectFactory of = new org.docx4j.wml.ObjectFactory();
Body body = of.createBody();
docPart.setJAXBContext(this.getMicrosoftJAXBContext());

P oleP = this.getNextParagraph();
R oleR = wmlOf.createR();
CTObject oleObj = this.createOle(oleEmbID, imagePackID);
JAXBElement jeOle = wmlOf.createRObject(oleObj);

oleR.getRunContent().add(jeOle);
oleP.getParagraphContent().add(oleR);
body.getEGBlockLevelElts().add(oleP);

body.getEGBlockLevelElts().add(sectPr);

}

public static void main(String[] args)
{
try
{
wp = WordprocessingMLPackage.createPackage();
wp.getContentTypeManager().addDefaultContentType("bin", ContentTypes.OFFICEDOCUMENT_OLE_OBJECT);

//Create EMF OlePackage
Part imagePartOle = testDoc.createImagePart("imagePackage.emf");
docPart.addTargetPart(imagePartOle);

//Create OleBinPart
Part oleBinPart = testDoc.createOleBinPart("oleobject1.bin");
docPart.addTargetPart(oleBinPart);



I had to comment out these two lines of code because they gave me an exception:
//((OleObjectBinaryPart)olePart).initPOIFSFileSystem();
//((OleObjectBinaryPart)olePart).writePOIFSFileSystem();

Without them it still generates the oleObject1.bin, but I cannot double-click to open the embedded file. The exception was:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:111)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)

What did I do wrong?

Thanks for your help!

Re: How to create an oleObject.bin

Posted: Thu Apr 30, 2009 3:23 am
by btjandra
Hi Jason,
Attached is my document, created using docx4j. If you rename it to .docx and open it, you will see my little docx icon, which I cannot double-click to open.

Please help!

Thanks!
Betty

Re: How to create an oleObject.bin

Posted: Thu Apr 30, 2009 5:07 am
by jason
btjandra wrote:
I had to comment out these two lines of code because they gave me an exception:
//((OleObjectBinaryPart)olePart).initPOIFSFileSystem();
//((OleObjectBinaryPart)olePart).writePOIFSFileSystem();


I think you will need those lines (or some variant on them) if you want to create an OLE object (even though when I look more closely at the code snippet I posted, I don't have them).

Without that, you will embed a binary blob, but it won't be an OLE object.
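
One quick way to tell the two apart is to look at the first bytes of the file you are about to embed. An OLE2 compound file starts with the signature D0 CF 11 E0 A1 B1 1A E1, whereas a docx is a ZIP archive and starts with "PK". The following is a rough sketch using only the JDK (no docx4j or POI); the class and method names are invented for this example, and readNBytes needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of docx4j or POI.
final class Ole2HeaderCheck {

    // Signature found at offset 0 of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    static boolean isOle2(byte[] head) {
        return head.length >= 8 && Arrays.equals(Arrays.copyOf(head, 8), OLE2_MAGIC);
    }

    // A docx/pptx/xlsx is a ZIP archive, so it starts with "PK".
    static boolean isZip(byte[] head) {
        return head.length >= 2 && head[0] == 'P' && head[1] == 'K';
    }

    // Read up to the first 8 bytes of the file.
    static byte[] readHead(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8];
            int n = in.readNBytes(buf, 0, 8); // Java 9+
            return Arrays.copyOf(buf, n);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2HeaderCheck.java <file>");
            return;
        }
        byte[] head = readHead(Paths.get(args[0]));
        System.out.println(isOle2(head) ? "OLE2 compound file"
                : isZip(head) ? "ZIP archive (e.g. docx)"
                : "neither OLE2 nor ZIP");
    }
}
```

If the bytes you pass to setBinaryData do not report "OLE2 compound file" here, what ends up in oleObject1.bin is the raw file, not an OLE object.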

What exception did they give you?

The docx4j sources contain m2/org/apache/poi/hwpf/3.4.0/hwpf-3.4.0.jar, which is the subset of POI necessary to support this OLE stuff (and import from a binary doc). Are you using that (which you should be, at least for now), or do you have POI proper on your class path?
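
If you are not sure which jar a POI class is actually coming from, something along these lines will tell you. It is only a sketch using standard JDK calls; the class name WhichJar and the messages it returns are made up for this example.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports where a class was loaded from.
final class WhichJar {

    static String locate(String className) {
        try {
            // 'false' = do not run static initializers; we only want the location.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(no code source; probably a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on the class path)";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.poi.poifs.filesystem.POIFSFileSystem";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Run it with the same class path as your application; the printed location should show whether the class comes from the hwpf jar or from a full POI distribution.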

Re: How to create an oleObject.bin

Posted: Thu Apr 30, 2009 5:45 pm
by btjandra
Hi Jason,
Here is the exception I got after adding those 2 statements:


org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:111)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
at org.docx4j.openpackaging.parts.WordprocessingML.OleObjectBinaryPart.initPOIFSFileSystem(OleObjectBinaryPart.java:85)
at com.gdais.emtk.word.services.TestScratchDoc.createOleBinPart(TestScratchDoc.java:2612)
at com.gdais.emtk.word.services.TestScratchDoc.main(TestScratchDoc.java:2953)


Also, if I include a .doc file from Word 2003, it doesn't complain, but it still will not let me double-click the embedded file.

Is there another API that I need for Word 2007?

Thanks for your help!
Betty

Re: How to create an oleObject.bin

Posted: Fri May 01, 2009 12:17 am
by jason
As I said earlier, this seems to be a black art. See the link above about embedding a pdf.

As a general strategy, you might try to round trip successfully first. In other words, create a document in Word containing the thing you want to embed. Then open it with docx4j, and use the POIFS stuff to look at it, then re-write it. If you can still click on the OLE object in Word and open it, you are making progress!
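
To check whether a round trip left the embedded object byte-for-byte intact, you can compare digests of the embedded parts before and after. The sketch below uses only the JDK's zip support, not docx4j or POIFS. It assumes the embedded parts live under word/embeddings/ (as oleObject1.bin does in the example earlier in this thread), and the class name EmbeddingDigests is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: SHA-256 of every part under word/embeddings/ in a docx.
final class EmbeddingDigests {

    private static final String PREFIX = "word/embeddings/"; // assumed location

    static Map<String, String> digest(byte[] docxBytes) {
        Map<String, String> result = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docxBytes))) {
            ZipEntry entry;
            byte[] buf = new byte[8192];
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().startsWith(PREFIX)) continue;
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                int n;
                while ((n = zip.read(buf)) != -1) md.update(buf, 0, n);
                result.put(entry.getName(), hex(md.digest()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java EmbeddingDigests.java <original.docx> <rewritten.docx>");
            return;
        }
        Map<String, String> before = digest(Files.readAllBytes(Paths.get(args[0])));
        Map<String, String> after = digest(Files.readAllBytes(Paths.get(args[1])));
        System.out.println(before.equals(after) ? "embeddings identical" : "embeddings differ");
    }
}
```

Identical digests only show that the bytes survived; whether Word will open the object still depends on the relationships and the w:object markup around it.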

Without having looked at it further, that POI error message sounds like a bug (ie POI is trying to prevent people from embedding a docx file by mistake, when what they really mean to do is simply save it to disk?).

Have a look (or ask) on the POI mailing lists for any example of successful embedding...

As an aside, debugging the POIFS stuff is made harder by the fact that the terminology POI uses is different to the terminology in the spec (because the implementation predates the publication). If it were me, I'd now rename everything to match the spec.

Re: How to create an oleObject.bin

Posted: Fri Sep 11, 2009 3:59 am
by jason
Notes to self regarding OLE and PDF.

See http://blogs.msdn.com/brian_jones/archi ... -file.aspx, which says:

In this post, I am going to show you how to generate the IStorage and the image representing the embedded object by invoking the OLE Server associated with PDF files. To create the underlying data for a non-Office embedded object we need to look up the prog id of the application associated with the file format extension. To get this data we need to look under \HKCR\.XXX within the registry, where XXX is the file format extension (ex. PDF). Under this path you should see at least two sub keys: "(Default)" and "Content Type." The value specified for "(Default)" represents the prog id of the application associated with the file format. On my computer, the prog id associated with PDF files is "AcroExch.Document."

Since we don't know the structure of the embedded object we shouldn't use the content type associated with the file format extension. Instead, we should use the generic content type for embedded objects, which is "application/vnd.openxmlformats-officedocument.oleObject."

Our next step is to create the IStorage and an image representation for the embedded object. As mentioned in the Solution section above, we need to invoke the OLE Server associated with PDF files. Below is the C++ code needed to accomplish this task:


Code:
//********** This snippet is C++ code *************//
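// inputFile: file to embed; outputFile: compound storage (IStorage) to create;
// emfFile: path of the image metafile written further down.
HRESULT PackageOleObject(LPCTSTR inputFile, LPCTSTR outputFile, LPCTSTR emfFile)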
{
HRESULT hr = S_OK;
IStoragePtr pStorage = NULL;
IOleObjectPtr pOle = NULL;
IDataObjectPtr pdo = NULL;
FORMATETC fetc;
STGMEDIUM stgm;
HENHMETAFILE hmeta;

// Create a compound storage document.
hr = StgCreateStorageEx (
outputFile,
STGM_READWRITE | STGM_SHARE_EXCLUSIVE | STGM_CREATE | STGM_TRANSACTED,
STGFMT_DOCFILE,
0,
NULL,
NULL,
IID_IStorage,
reinterpret_cast<void**>(&pStorage));
CheckHr(hr);
   
// Create OLE package from file.
hr = OleCreateFromFile(CLSID_NULL, inputFile, ::IID_IOleObject,
OLERENDER_NONE, NULL, NULL, pStorage, (void**)&pOle);
CheckHr(hr);

hr = OleRun(pOle);
CheckHr(hr);

hr = pOle->QueryInterface(IID_IDataObject, (void**)&pdo);
CheckHr(hr);

fetc.cfFormat = CF_ENHMETAFILE;
fetc.dwAspect = DVASPECT_CONTENT;
fetc.lindex = -1;
fetc.ptd = NULL;
fetc.tymed = TYMED_ENHMF;

stgm.hEnhMetaFile = NULL;
stgm.tymed = TYMED_ENHMF;
hr = pdo->GetData(&fetc, &stgm);
CheckHr(hr);

// Create image metafile for object.
// Keep the returned handle so it can be released below.
hmeta = CopyEnhMetaFile(stgm.hEnhMetaFile, emfFile);

hr = pStorage->Commit(STGC_DEFAULT );
CheckHr(hr);

pOle->Close(0);
DeleteEnhMetaFile(stgm.hEnhMetaFile);
DeleteEnhMetaFile(hmeta);   
   
return hr;
}


The above C++ code snippet will create two output files that represent the IStorage and the image representation for our embedded object.


For OLE via JNI, see http://javabyexample.wisdomplug.com/jav ... va-ii.html

Candidate projects reviewed in that article:


Re: How to create an oleObject.bin (for video or pdf)

Posted: Wed May 22, 2013 6:44 pm
by jason
Plutext now offers a commercial extension for docx4j/pptx4j/xlsx4j called docx4j OLE Helper.

This makes it easy to create a suitable oleObject.bin for video or PDF etc for inclusion in a docx, pptx, or xlsx.

Feel free to email sales@plutext.com for more info.