Importing a WordprocessingMLPackage obj for performance

I am using WordprocessingMLPackage.load(new ByteArrayInputStream(b)) to load the wordMLPackage on each request that requires docx manipulation, but I want to explore alternatives that bring the cost down to tens of milliseconds rather than seconds. The options explored so far are:
1. WordprocessingMLPackage.load(bais) - best case is about 500ms using a test docx
2. wordMLPackage.clone() - similar to load(), about 500ms using a test docx
3. Serializing the result of WordprocessingMLPackage.load(bais) - not possible, since WordprocessingMLPackage is not serializable
4. Creating the wordMLPackage from an org.docx4j.xmlPackage.Package using FlatOpcXmlImporter - best case is about 50ms using a test docx
5. Pooling wordMLPackage objects for re-use - overkill at this stage
Option 4 seems positive; however, I am concerned that this approach is not thread safe. In this basic test harness, the FlatOpcXmlImporter object is constructed by reusing the unmarshalled wmlPackageEl Package object, and I am unsure whether this creates a deep clone or simply references the same Part objects, which could cause threading issues. Does anyone have an opinion on this?
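Code:
// Read the Flat OPC XML file fully into memory.
// u is assumed to be a JAXB Unmarshaller for org.docx4j.xmlPackage, created earlier (not shown).
byte[] b;
try (RandomAccessFile f = new RandomAccessFile(inputfilepath, "r")) {
    b = new byte[(int) f.length()];
    f.readFully(b); // plain read() may return before the buffer is full
}
StreamSource source = new StreamSource(new ByteArrayInputStream(b));

// Unmarshal once; this Package object is then reused on every iteration.
@SuppressWarnings("unchecked")
org.docx4j.xmlPackage.Package wmlPackageEl =
        ((JAXBElement<org.docx4j.xmlPackage.Package>) u.unmarshal(source)).getValue();

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    // A new importer per iteration, all built from the same wmlPackageEl.
    FlatOpcXmlImporter xmlPackage = new FlatOpcXmlImporter(wmlPackageEl);
    wordMLPackage = (WordprocessingMLPackage) xmlPackage.get();
    times.add(System.currentTimeMillis() - start); // autoboxed to Long
}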
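One caveat on the numbers: on a JVM the first few iterations usually include class loading and JIT warm-up, so a single "best case" figure can be misleading. A minimal sketch of a timing helper that discards warm-up runs and reports a median is shown here. LoadTimer and its methods are hypothetical names, not part of docx4j, and the workload in main is only a stand-in for the FlatOpcXmlImporter call.

```java
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical helper (not part of docx4j): times a task after a warm-up phase. */
final class LoadTimer {

    /** Runs the task `warmup` times untimed, then returns the duration of each timed run in nanoseconds. */
    static long[] measureNanos(Supplier<?> task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.get(); // untimed: lets class loading and JIT compilation settle
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    /** Median of the samples (upper median for even counts); the input array is not modified. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Stand-in workload; in the harness this would wrap the importer call.
        long[] samples = measureNanos(() -> new StringBuilder("docx").reverse().toString(), 20, 100);
        System.out.println("median ns: " + median(samples));
    }
}
```

System.nanoTime() is used instead of currentTimeMillis() because it is monotonic and has finer resolution, which matters once individual runs drop to tens of milliseconds.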
Also, are there other options that have been successful in improving the performance of obtaining WordprocessingMLPackage objects and avoiding the constant rebuilding of parts and relationships?
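For reference, option 5 need not be elaborate. The sketch below is a rough illustration only: it keeps a small queue of pre-built instances and hands each one out exactly once, so no instance is ever shared between requests and the question of whether two packages reference the same Part objects does not arise at this level. PrebuiltPool is a hypothetical class, not a docx4j API, and the factory passed in is assumed to be whatever load or import call produces a fresh WordprocessingMLPackage.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical sketch (not a docx4j class): hands out pre-built instances, each exactly once. */
final class PrebuiltPool<T> {
    private final BlockingQueue<T> ready;
    private final Supplier<T> factory;

    PrebuiltPool(int capacity, Supplier<T> factory) {
        this.ready = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
    }

    /** Builds instances until the queue is full; intended to run off the request thread. */
    void refill() {
        while (ready.remainingCapacity() > 0) {
            if (!ready.offer(factory.get())) {
                break; // another thread filled the queue first; the extra instance is discarded
            }
        }
    }

    /** Returns a pre-built instance, or builds one on the spot if none is ready. */
    T take() {
        T item = ready.poll();
        return item != null ? item : factory.get();
    }

    /** Number of instances currently waiting to be handed out. */
    int available() {
        return ready.size();
    }

    public static void main(String[] args) {
        // StringBuilder stands in for a mutable package object built by the expensive call.
        PrebuiltPool<StringBuilder> pool = new PrebuiltPool<>(4, () -> new StringBuilder("template"));
        pool.refill();
        StringBuilder doc = pool.take();
        doc.append(" + request-specific edits");
        System.out.println(doc + " | still ready: " + pool.available());
    }
}
```

Instances are never returned to the queue because a manipulated package is no longer a clean template. The build cost is not removed, only moved out of the request path by calling refill() from a background thread or scheduled executor.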