Page 1 of 1

Adding data to an existing xlsx file

PostPosted: Sat Nov 26, 2011 7:10 am
by benpoole
Hi all

This is driving me crazy, so I'd appreciate some guidance. Briefly, I am loading in a pre-existing Excel 2007 file, and looking to add some data in a couple of worksheets: just basic content row-by-row. The file I load contains things like sharedStrings and styles, and that's where I'm coming un-stuck, in terms of adding content.

Every approach I've tried so far results in a crippled file that has to be repaired by Excel as it's opened. This repair involves the wholesale removal of any data I tried to add, plus a report that the styles file had mismatched tags (e.g. "ns2:ext" at the start, but "ns7:ext" at the end), so I'm on a loser.

For example, if I understand things correctly, in order to write a string value to a cell, I need to do something like this (pseudo-ish code obviously!):

Code: Select all
Cell myCell = Context.getsmlObjectFactory().createCell();
myCell.setV(CELL VALUE INDEX FROM SHAREDSTRINGS);
myCell.setT(STCellType.S); // Tells Excel it's a shared string
myCell.setR(CELL REFERENCE); // cell's ref in worksheet (e.g. A3)
row.getC().add(myCell);


Is that right? And if so, what's a good approach to handling sharedstrings and pushing IDs to and from them? It's that bit that confuses me...

Re: Adding data to an existing xlsx file

PostPosted: Fri Dec 02, 2011 3:06 pm
by jason
As a starting point, please see Eric Wite's video at http://youtu.be/fjGM7r13u9A

Re: Adding data to an existing xlsx file

PostPosted: Fri Dec 02, 2011 10:21 pm
by benpoole
Thanks Jason, I will check that out.

Re: Adding data to an existing xlsx file

PostPosted: Mon Dec 05, 2011 9:17 pm
by benpoole
Eric's video confirmed what I read in another post: that it's a lot simpler to ignore shared strings, and simply write to individual cells (i.e. setting "t" to "str" rather than "s"). That's all fine, and it works. The issue I have is that the resulting workbook has errors in it that Excel has to repair. In doing so, the data written to the file is removed. I've pored over correct and incorrect XML for the relevant components and these are my findings:

- styles.xml is always rendered invalid by the process.
The generated XML closes off a load of "ns2:..." namespace tags with "ns:7..." tags, which is obviously wrong. This is easily fixed in a text editor, but I'm really not sure what's happening there: my code doesn't actually manipulate styles parts at all.

- Worksheet parts are generated with the rows in the wrong order. For example, I had rows 4 and 5 preceding rows 2 and 3 in the generated output. Once I corrected the XML by hand, the resulting xlsx could be opened with the data preserved and no errors reported.

I'm so close!

I'm going to try and work out what's going on—presumably there's some way I can stop the rows coming out in the wrong order—but the styles stuff has me really scratching my head.

Re: Adding data to an existing xlsx file

PostPosted: Tue Dec 06, 2011 1:25 am
by jason
Could you please attach your styles part, so I can look at it?

Re: Adding data to an existing xlsx file

PostPosted: Wed Dec 07, 2011 8:56 am
by benpoole
Attached are two styles.xml files. The "OK" one is as per the original working Excel file which my code opens, adds to, and re-saves. The "broken" file is the version we get once it's been saved. Note the mismatch with the "ns7" tags at the end (the rest is "ns2"). Once these are corrected manually the file works OK.

Re: Adding data to an existing xlsx file

PostPosted: Fri Dec 09, 2011 2:36 pm
by jason
Hi Ben

As you describe, the file ends with:

Syntax: [ Download ] [ Hide ]
Using xml Syntax Highlighting
  <ns2:extLst>
    <ns2:ext uri="{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}">
      <x14:slicerStyles xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" xmlns="" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"
                       xmlns:ns7="http://schemas.openxmlformats.org/spreadsheetml/2006/main" defaultSlicerStyle="SlicerStyleLight1"/>
      </ns7:ext>
      </ns7:extLst>
      </ns7:styleSheet>
 
Parsed in 0.001 seconds, using GeSHi 1.0.8.4


What are the steps to reproduce creating that broken styles.xml file? If you can give me those steps, I'll fix it ASAP.

Re: Adding data to an existing xlsx file

PostPosted: Fri Dec 16, 2011 11:31 am
by benpoole
Thanks Jason. Apologies for the delay, been doing some other docx4j pieces!

Anyway, back to broken styles: my code doesn't actually manipulate cell / row styles in any way, but my theory is that it breaks because I'm manipulating an existing file (which has the usual shared strings etc.), and adding new cells using lots of lines like this:

Code: Select all
somecell.setT(STCellType.STR);


… in other words, I'm mixing shared strings (which I don't change or manipulate) with new in-line strings, so as to avoid trying to update the sharedString table. Does that sound like it could be a cause? Certainly the before / after sharedStrings.xml files look quite different (although both are well-formed), and all the tags in the edited sharedStrings xml start with the ns2 prefix.

Re: Adding data to an existing xlsx file

PostPosted: Sat Dec 17, 2011 11:15 am
by jason
What version of JAXB and/or JDK/JRE are you using?

Re: Adding data to an existing xlsx file

PostPosted: Sat Dec 17, 2011 11:52 am
by benpoole
It's IBM's Java 6 JVM in Domino, uses its own JAXB implementation.

Re: Adding data to an existing xlsx file

PostPosted: Sat Dec 17, 2011 10:00 pm
by jason
That's likely to be the problem then.

It could be http://www-01.ibm.com/support/docview.w ... wg1PM28389

Try adding and using the JAXB Reference Implementation.

Re: Adding data to an existing xlsx file

PostPosted: Mon Dec 19, 2011 2:05 am
by benpoole
Well this is a different IBM JVM from that used in WAS, but I wouldn't be surprised if it was the same core issue. It should be using the standard Java 6 JAXB implementation, but who knows for sure!

I will see if I can replicate the problem in a non-Domino context, thus running the code in a different JVM. Thanks!

Re: Adding data to an existing xlsx file

PostPosted: Thu Dec 22, 2011 10:41 am
by benpoole
An update:

I've implemented the basic code (decoupled from the Domino datasource I'm processing) and run it outside of the Domino JVM. I suspect your JAXB theory may have some legs: using the default Mac OS X Java 6 JVM I don't get any issues with styles.xml!

I've noticed something else which was puzzling me though: I *thought* I was adding rows to a specific worksheet in the workbook I open, but the rows I'm creating are actually getting added to sheet1.xml, not sheet3.xml as intended, so this explains the other broken element in my file.

So, looking promising on the processing front… although the Domino JAXB issue, if definite, looks to be rather the show-stopper, unless I can wheel in an alternative JAXB implementation.

Re: Adding data to an existing xlsx file

PostPosted: Sat Dec 24, 2011 5:01 am
by benpoole
Yep that did it! I got my code running outside of Domino (just in Eclipse in OS X / Java 6), and then moved it "as is" to Domino. The same code running on the Domino JVM produces the malformed styles.xml file, which Excel then has to repair.

I downloaded the latest reference JAXB implementation from the Oracle site, and installed that in the standard Domino classpath (jvm/lib/ext)—that seems to have done the trick, as the generated file now opens without requiring repair. This is what you want to see in the console when the code runs:

Code: Select all
INFO org.docx4j.jaxb.Context  - JAXB: RI not present.  Trying Java 6 implementation.
INFO org.docx4j.jaxb.Context  - JAXB: Using Java 6 implementation.
INFO org.docx4j.jaxb.Context  - loading Context jc
INFO org.docx4j.jaxb.Context  - loaded com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl .. loading others ..
INFO org.docx4j.jaxb.Context  - .. others loaded ..
INFO org.docx4j.openpackaging.contenttype.ContentTypeManager  - Detected SpreadhseetMLPackage package


…i.e. it loads "others" ;-)

Merry Christmas!