
OutOfMemory at saving PresentationMLPackage

Posted: Thu Mar 16, 2017 1:24 am
by wingmaster
Hi,

We have had some OutOfMemoryErrors when saving pptx files. We use pptx files as templates, change some information (mainly replacing text) and save the result as a new pptx file. Unfortunately, we can't reproduce the error: the same template with the same data causes no trouble when we retry after an error has occurred. There is no memory leak; the memory is freed after each error.

The maximum heap size is 4096 MB. The pptx template has 20 slides and a size of 2.5 MB. The generated pptx file also has 20 slides (none removed or created) and is roughly the same size (2.5 MB). I'm not allowed to share the template because it's confidential, sorry. Any suggestions are greatly appreciated.

This is the code we use for saving:
Code:
File file = new File(fileNameForGeneratedReport);
FileOutputStream outStream = new FileOutputStream(file);
Save save = new Save(presentationMLPackage);
save.save(outStream);    // <-- Exception occurs here
outStream.flush();


The stack trace (always the same):
Code:
java.lang.OutOfMemoryError: Java heap space
   at java.util.Arrays.copyOf(Arrays.java:2453)
   at java.util.Arrays.copyOf(Arrays.java:2427)
   at java.util.ArrayList.grow(ArrayList.java:254)
   at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:228)
   at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:220)
   at java.util.ArrayList.add(ArrayList.java:452)
   at org.docx4j.openpackaging.contenttype.ContentTypeManager.buildTypes(ContentTypeManager.java:731)
   at org.docx4j.openpackaging.contenttype.ContentTypeManager.marshal(ContentTypeManager.java:786)
   at org.docx4j.openpackaging.io3.stores.ZipPartStore.saveContentTypes(ZipPartStore.java:213)
   at org.docx4j.openpackaging.io3.Save.save(Save.java:176)


docx4j version: 3.2.1
We use an IBM Java version that comes with a WebSphere Application Server:
Code:
<prompt>:~>/<path to java>/java_1.7_64/bin/./java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470sr9fp60ifix-20161110_01(SR9 FP60)+IV90630+IV90578))
IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References 20161005_321282 (JIT enabled, AOT enabled)
J9VM - R26_Java726_SR9_20161005_1259_B321282
JIT  - tr.r11_20161001_125404
GC   - R26_Java726_SR9_20161005_1259_B321282_CMPRSS
J9CL - 20161005_321282)
JCL - 20161021_01 based on Oracle jdk7u121-b15

Re: OutOfMemory at saving PresentationMLPackage

Posted: Sun Mar 19, 2017 5:22 pm
by jason
It may be a bit difficult to provide specific assistance here, since:

1. you don't provide a reproducible testcase
2. you can't provide your pptx
3. you use an old version of docx4j
4. you use the IBM JDK, not Sun/Oracle

I don't mean to be critical; I'm just being up front about the things which make this more challenging.

That said, here are some thoughts:

- it's odd that you always get the error in ContentTypeManager.buildTypes, because the lists there should be pretty small... it would be worth verifying they are actually small

- if you increase heap size to 6GB or 8GB, does the error still occur?

- if you use the Oracle JDK instead of IBM, what happens? See also https://www.google.com.au/search?q=ibm+ ... heap+space

- you could try current docx4j (v3.3.3), but that shouldn't make a difference
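
Since the error can't be reproduced on demand, one low-cost way to gather evidence for the heap size question is to log heap usage right before each save. The following is only a minimal sketch using the standard java.lang.Runtime API; the class name HeapSnapshot and its methods are hypothetical and not part of docx4j.

```java
import java.util.Locale;

// Hypothetical helper (not part of docx4j): reports heap usage so it can be
// logged immediately before calling save.save(outStream).
class HeapSnapshot {

    /** Bytes currently in use: heap allocated by the JVM minus the free part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Maximum heap the JVM will attempt to use (normally governed by -Xmx). */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One-line summary suitable for a log statement. */
    static String describe() {
        long mb = 1024L * 1024L;
        return String.format(Locale.ROOT, "heap used=%d MB, max=%d MB",
                usedBytes() / mb, maxBytes() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging HeapSnapshot.describe() before the save would at least show whether the heap was already nearly full when the failure happened, or whether a single huge allocation was to blame.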

By the way, current docx4j contains code to anonymise a docx, so it can be shared without disclosing anything. That code needs a little work to generalise it to support pptx. See https://github.com/plutext/docx4j/blob/ ... ingle.java

Re: OutOfMemory at saving PresentationMLPackage

Posted: Mon Mar 20, 2017 11:00 pm
by wingmaster
Don't worry, I share your opinion that it's difficult to provide specific assistance. Nevertheless, some ideas might lead to the right end or at least the right direction.

Before I respond to your thoughts, I want to add some additional information:

example pptx
I now have permission to share one pptx that caused the trouble (see attachment).
Update: I have permission to share the pptx file only with you personally (not with the entire world). Do you mind giving me your email address?

heap dumps
While responding to your post I realized that I forgot to write something about the heap dumps: we analysed the heap dump for every OutOfMemoryError. Leak suspect number one is always "org/docx4j/openpackaging/contenttype/CTTypes" with more than 1 GB (e.g. see the attached screenshot). The first error occurred while the JVM tried to allocate 3.193.524.368 bytes at once while only 1.471.647.632 bytes were available. We suspect that the 3.193.524.368 bytes were meant for the CTTypes. If you have additional questions about the heap dumps, feel free to ask.
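
To put those two numbers in perspective, the arithmetic can be sketched as follows. This is only a rough estimate under stated assumptions: that the failed allocation was the backing Object[] of the ArrayList seen in the stack trace, and that each reference takes 4 bytes (the JVM reports "Compressed References"). The array header is ignored, and the class name AllocationEstimate is hypothetical.

```java
// Rough estimate only; the figures come from the heap dump analysis above.
class AllocationEstimate {
    static final long REQUESTED_BYTES = 3_193_524_368L; // size of the failed allocation
    static final long AVAILABLE_BYTES = 1_471_647_632L; // heap available at that moment
    static final int BYTES_PER_REFERENCE = 4;           // assumption: compressed references

    /** How many bytes the request exceeded the available heap by. */
    static long shortfallBytes() {
        return REQUESTED_BYTES - AVAILABLE_BYTES;
    }

    /** Number of list slots such an array would hold (array header ignored). */
    static long impliedSlots() {
        return REQUESTED_BYTES / BYTES_PER_REFERENCE;
    }

    /** Requested size expressed in GiB (2^30 bytes). */
    static double requestedGiB() {
        return REQUESTED_BYTES / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println("shortfall bytes: " + shortfallBytes()); // 1721876736
        System.out.println("implied slots:   " + impliedSlots());   // 798381092
        System.out.println("requested GiB:   " + requestedGiB());
    }
}
```

Under these assumptions the list would have been growing towards roughly 800 million entries, which is far beyond anything a 20-slide presentation should need.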



Now about your thoughts:

size of ContentTypeManager.buildTypes
You said it would be worth verifying that they are actually small. What would "small" be? I guess that this question would now be a different one due to the information about the heap dumps.

increase heap size
Since we can't reproduce the error and are not free to choose the heap size, we can't test what happens with more heap :-/.

different JRE
The same as with the heap size: Can't reproduce and are not free to choose.

docx4j version
That's something we can try (and already thought about that step). Feedback would be available after about 7 months, because we have fixed release cycles. The exception occurred only in production systems, not in test environments.

Re: OutOfMemory at saving PresentationMLPackage

Posted: Tue Mar 21, 2017 10:06 pm
by jason
You can email the pptx to jason@plutext.org

3.193.524.368 bytes is roughly 3 GB!

https://github.com/plutext/docx4j/blob/ ... Types.java

By "small" I'd guess 10 KB to 100 KB.