My company and I continue to make progress manipulating MS Office documents
with docx4j. Having the full-fidelity JAXB tree of every part of every
document type is indispensable. POI is simply unsatisfactory for the XML
formats, as unsatisfactory as the Hawaiian foodstuff.
I have a few patches that I've accumulated. They're separated out here into 4
pieces, so they can be analyzed/accepted/rejected separately. They will all
apply cleanly on trunk as of today's revision 1342.
These patches are submitted in accordance with the docx4j contributor policy
as described here:
http://dev.plutext.org/docx4j/docx4j_In ... butor.docx
- configure the Maven Surefire plugin to stop it from trying to run certain classes as
JUnit tests. The following samples get picked up as tests, since they match
- configure the 'copy-resources' part of the build to pick up xslt and xml
files in the src/main/java and src/pptx/java parts of the tree. Otherwise
these are missing for me from the resultant jar. (Note: are release jars
built some other way than maven that picks up these resources?)
- declare a JAXB type adapter (package-info.java) for java.math.BigInteger
(BigIntegerAdapter.java) in the org/docx4j/wml package. This works around
a bug in old JAXB releases, where null numbers are serialized as the empty
string "", which causes deserialization errors later, on the way back in. I
would normally make sure a proper JAXB is installed, but I need to
run in environments where I cannot easily do so. This simple workaround
avoids great pain.
- re-raise several caught/swallowed Exceptions in XmlUtils.java. They are
re-raised as RuntimeException where the enclosing method signature doesn't
allow any declared exception.
- guard against null elements in a couple of places (ResolvedLayout.java,
TextStyles.java). The null elements guarded against occurred with files
generated by OpenOffice; I imagine that MS Office always has the elements
in question present.
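
To illustrate the BigInteger workaround, here is a rough sketch of the two conversions such an adapter might perform. It is not the code from the patch: the class name BigIntegerTextSketch is made up, and the JAXB wiring (extending XmlAdapter<String, BigInteger> and registering it with @XmlJavaTypeAdapter in package-info.java) is left out so that the sketch compiles on a bare JDK.

```java
import java.math.BigInteger;

// Hypothetical helper, not docx4j code: the conversions a JAXB
// XmlAdapter<String, BigInteger> could delegate to.
final class BigIntegerTextSketch {

    // XML text -> Java value. An empty string (as written by the buggy
    // JAXB) is treated as "no value" instead of being parsed as a number.
    static BigInteger unmarshal(String text) {
        if (text == null || text.trim().isEmpty()) {
            return null;
        }
        return new BigInteger(text.trim());
    }

    // Java value -> XML text. A null number yields null rather than "",
    // so no empty string is produced for a later read to choke on.
    static String marshal(BigInteger value) {
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        System.out.println(unmarshal("42"));
        System.out.println(unmarshal(""));
        System.out.println(marshal(null));
    }
}
```

The point of the design is only that "" never reaches the BigInteger constructor on the way in, and is never emitted on the way out.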
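
For the XmlUtils.java item, the general pattern is roughly the following. This is only an illustration with a made-up class and method (RethrowSketch.toUri), not the actual XmlUtils code: a method that declares no checked exceptions wraps the one it catches in a RuntimeException, keeping the original as the cause, rather than swallowing it.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical example, not taken from XmlUtils.java.
final class RethrowSketch {

    // The signature declares no checked exception, so the checked
    // URISyntaxException cannot simply propagate.
    static URI toUri(String text) {
        try {
            return new URI(text);
        } catch (URISyntaxException e) {
            // Re-raise instead of swallowing; the original exception
            // stays available to callers through getCause().
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUri("http://example.com/doc").getHost());
    }
}
```

Callers that used to get a silent failure now see the failure at the point where it happened, with the underlying cause attached.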