Docx format changed in Word 2011 for Mac?

Posted: Mon Dec 13, 2010 7:24 pm
by kjetilhp
Hi

I'm trying out docx4j; my main use is replacing content with values from a file/database.

I've tried the simple unmarshal example using ${parameter} and it works great with your sample files in svn, but when creating documents using Word 2010 (on OSX) all replacements end up with the text "null" instead of the right values...

I've also looked into using data controls and at your samples in CustomXmlBinding... it seems that you need Visual Studio to create data-enabled Word documents? I can't find anything in Word 2010 to add more than form elements...

/K

Re: Docx format changed in Word 2010?

Posted: Mon Dec 13, 2010 8:39 pm
by jason
Do you mean Word 2010 (which runs on Windows), or Word 2011 for the Mac?

kjetilhp wrote:I've tried the simple unmarshal example using ${parameter} and it works great with your sample files in svn, but when creating documents using Word 2010 (on OSX) all replacements end up with the text "null" instead of the right values...


I haven't installed Word 2011 for the Mac yet, so if you'd like a quick response, you'll need to attach a sample docx please (renamed .zip) that I can try.

kjetilhp wrote:I've also looked into using data controls and at your samples in CustomXmlBinding... it seems that you need Visual Studio to create data-enabled Word documents? I can't find anything in Word 2010 to add more than form elements...


From http://www.officeformachelp.com/2010/11 ... -features/ it does look like Word 2011 for the Mac may not have the same content control tools built in as the Windows versions (2007 and 2010) have on their developer tab.

You don't need Visual Studio.

I assume Word 2011 for the Mac contains underlying support for content controls and binding, but evidently this needs to be checked!

Re: Docx format changed in Word 2010?

Posted: Mon Dec 13, 2010 11:08 pm
by kjetilhp
Attached are the in/out files; the input file was saved in Word 2011 for Mac.
(archive.zip)

as for content controls, how do you then go about creating documents with this support (on a Mac) if Word doesn't support them? direct XML editing?

Another note: I tried the MediaChartSample files with the CustomXmlBinding and it seems to have unexpected effects. As I read the sample code, it should only change the name, but as you can see from the attached docs it fills in another field and removes some content from other controls. (archive.1.zip)

Re: Docx format changed in Word 2010?

Posted: Thu Dec 16, 2010 11:43 pm
by jason
kjetilhp wrote:as for content controls, how do you then go about creating documents with this support (on a Mac) if Word doesn't support them? direct XML editing?


In some basic testing I did, Word 2011 for Mac seems to open and save docx containing content controls, but there seems to be no way to change them (either via the UI, or via VBA or AppleScript).

So unfortunately, your options are to edit the XML directly (using the Flat OPC representation would be easiest, assuming Word 2011 supports this - if it doesn't you could use docx4j to convert to/from it), or you could modify docx4all (which already has some support for content controls) - this latter I am keen to see done.

If you inject an updated custom xml part, I'm not sure whether Word 2011 will update the document surface to reflect the new values - I didn't try that yet.

kjetilhp wrote:Another note: I tried the MediaChartSample files with the CustomXmlBinding and it seems to have unexpected effects. As I read the sample code, it should only change the name, but as you can see from the attached docs it fills in another field and removes some content from other controls. (archive.1.zip)


I haven't looked at the attachments to compare, but someone previously noted that certain types of content controls (check boxes or lists iirc) were altered. The code would need to be changed to handle these cases.

I will have a look at your document to see where the nulls are coming from, but haven't done so yet.

Re: Docx format changed in Word 2011 for Mac?

Posted: Fri Dec 17, 2010 5:33 pm
by jason
I had a look at your UnmarshallFromTemplate example.

I was able to generate the expected output from it (ie using your document as input, running docx4j on Windows).

So docx4j doesn't seem to have any problem with your docx.

Could you please try it again on your platform, to make sure you can reproduce the problem?

Re: Docx format changed in Word 2011 for Mac?

Posted: Fri Dec 17, 2010 11:48 pm
by kjetilhp
just did, even created a brand new document in Word 2007 on Windows XP (VMware)

code:

Code:
import java.util.HashMap;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.io.SaveToZipFile;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.wml.Document;

// "path" (the directory holding the template) is assumed to be defined elsewhere.
String inputfilepath = path + "CONSULTING_AGREEMENT.docx";
boolean save = true;
String outputfilepath = path + "CONSULTING_AGREEMENT_READY.docx";

// Load the template and get the JAXB object for the main document part.
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
Document wmlDocumentEl = (Document) documentPart.getJaxbElement();

// Marshal the document to an XML string so the ${key} placeholders can be replaced.
String xml = XmlUtils.marshaltoString(wmlDocumentEl, true);

HashMap<String, String> mappings = new HashMap<String, String>();
mappings.put("company", "Bygg og Bedrag AS");
mappings.put("c_name", "Lurium Slavesen");
mappings.put("date", "31.12.2010");
mappings.put("from_date", "01.01.2011");
mappings.put("to_date", "31.12.2011");
mappings.put("address1", "Snikveien 4");
mappings.put("address2", "0101 OSLO");
mappings.put("c_address1", "Blaasbortveien 10");
mappings.put("c_address2", "0707 GOKK");
mappings.put("project", "HighTechIntegration v1");
mappings.put("dollar", "950");
mappings.put("hours", "50");
mappings.put("witness", "Frk. Ella Fryd");

// Substitute the mapped values into the template and unmarshal the result.
Object obj = XmlUtils.unmarshallFromTemplate(xml, mappings);

// Swap the populated document back into the main document part.
documentPart.setJaxbElement((Document) obj);
if (save) {
    SaveToZipFile saver = new SaveToZipFile(wordMLPackage);
    saver.save(outputfilepath);
}


attached are in and out files

Re: Docx format changed in Word 2011 for Mac?

Posted: Mon Dec 20, 2010 4:57 pm
by jason
There are two (related) problems.

Here is the XML for ${date} in your docx:

Code:
<w:r w:rsidRPr="00C35621">
  <w:rPr>
    <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"/>
    <w:b/>
    <w:noProof/>
  </w:rPr>
  <w:t>${</w:t>
</w:r>
<w:r>
  <w:rPr>
    <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"/>
    <w:b/>
    <w:noProof/>
  </w:rPr>
  <w:t>date</w:t>
</w:r>
<w:r w:rsidRPr="00C35621">
  <w:rPr>
    <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"/>
    <w:b/>
    <w:noProof/>
  </w:rPr>
  <w:t>}</w:t>
</w:r>


You can see that the text is split across various nodes. unmarshallFromTemplate won't work properly unless it sees a contiguous string equal to the key. http://dev.plutext.org/trac/docx4j/changeset/1367 improves the behaviour/diagnostics.
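
To illustrate why the split matters, here is a minimal, self-contained sketch of a naive ${key} substitution over a marshalled XML string. It is not docx4j's actual implementation; the class SplitKeyDemo and the helper substitute are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitKeyDemo {

    // Matches "${", then everything up to the next "}".
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    /**
     * Hypothetical helper: replaces each ${key} with its mapped value.
     * Placeholders whose key is not in the map are left unchanged.
     */
    static String substitute(String xml, Map<String, String> mappings) {
        Matcher m = PLACEHOLDER.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = mappings.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("date", "31.12.2010");

        // The key sits in a single w:t element, so the lookup succeeds.
        String contiguous = "<w:r><w:t>${date}</w:t></w:r>";

        // The key is spread over three runs, as in the XML above.
        String split = "<w:r><w:t>${</w:t></w:r>"
                + "<w:r><w:t>date</w:t></w:r>"
                + "<w:r><w:t>}</w:t></w:r>";

        System.out.println(substitute(contiguous, mappings));
        System.out.println(substitute(split, mappings));
    }
}
```

In the split case, the text between "${" and "}" is run markup rather than the word date, so the lookup finds nothing and this sketch leaves the placeholder untouched. A substitution that wrote out a failed lookup without a null check would presumably emit the text "null" instead, which may be what produced the reported output, though that is an assumption.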

Turning off spelling/grammar checking will stop a common cause of splitting up the variable/key name.

Also, the text is inside Word fields. You might need to get rid of them?
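
If retyping the keys in Word is not practical, the split runs could in principle be joined programmatically before substitution. The following is only a rough sketch using the JDK's DOM API on the marshalled XML; RunTextMerger and its methods are hypothetical names, not part of docx4j. It moves all of a paragraph's text into the first w:t whenever a ${key} only appears once the runs are joined, so the formatting of the later runs in that paragraph is lost, and it does nothing about the Word fields.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RunTextMerger {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String XML_NS = "http://www.w3.org/XML/1998/namespace";
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{[^}]*\\}");

    /** Counts complete ${...} placeholders in a piece of text. */
    static int countPlaceholders(String text) {
        Matcher m = PLACEHOLDER.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Joins the w:t text of any paragraph that contains a placeholder split across runs. */
    static String mergeSplitPlaceholders(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
            for (int i = 0; i < paragraphs.getLength(); i++) {
                NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
                StringBuilder joined = new StringBuilder();
                int intact = 0; // placeholders already contained in a single w:t
                for (int j = 0; j < texts.getLength(); j++) {
                    String t = texts.item(j).getTextContent();
                    joined.append(t);
                    intact += countPlaceholders(t);
                }
                // Skip paragraphs where joining the runs reveals no extra placeholder.
                if (countPlaceholders(joined.toString()) <= intact) {
                    continue;
                }
                Element first = (Element) texts.item(0);
                first.setTextContent(joined.toString());
                first.setAttributeNS(XML_NS, "xml:space", "preserve");
                for (int j = 1; j < texts.getLength(); j++) {
                    texts.item(j).setTextContent("");
                }
            }

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process the document XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\">"
                + "<w:r><w:t>${</w:t></w:r><w:r><w:t>date</w:t></w:r><w:r><w:t>}</w:t></w:r>"
                + "</w:p>";
        System.out.println(mergeSplitPlaceholders(xml));
    }
}
```

The output of mergeSplitPlaceholders could then be passed to the substitution step in place of the original XML string. Paragraphs whose placeholders are already intact are left as they were.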

Re: Docx format changed in Word 2011 for Mac?

Posted: Mon Dec 20, 2010 8:34 pm
by kjetilhp
thx, I'll try to modify the doc and see what happens, probably safer to go with the xml data binding for replacing text and values...

Re: Docx format changed in Word 2011 for Mac?

Posted: Mon Dec 20, 2010 11:07 pm
by jason
kjetilhp wrote:probably safer to go with the xml data binding for replacing text and values...


Agreed (lucky you have that Windows VM)