Getting OutOfMemoryError when importing XHTML

Posted: Tue Apr 29, 2014 10:59 am
by tenk
Hi, thank you for the great library. I've been playing with docx4j, which works nicely, but I'm now stuck on the XHTML import feature.

I have a rather large XHTML file (1.23 MB, 21 images) that produces a "java.lang.OutOfMemoryError: Java heap space" error when I try to convert it to DOCX. The Java code for the XHTML to DOCX conversion is taken from the sample file ConvertInXHTMLDocument.java.

Here is the stack trace:
Code:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
   at java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
   at java.awt.image.Raster.createPackedRaster(Raster.java:467)
   at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
   at java.awt.GraphicsConfiguration.createCompatibleImage(GraphicsConfiguration.java:149)
   at java.awt.GraphicsConfiguration.createCompatibleImage(GraphicsConfiguration.java:178)
   at org.docx4j.org.xhtmlrenderer.util.ImageUtil.createCompatibleBufferedImage(ImageUtil.java:117)
   at org.docx4j.org.xhtmlrenderer.util.ImageUtil.convertToBufferedImage(ImageUtil.java:246)
   at org.docx4j.org.xhtmlrenderer.util.ImageUtil$AbstractFastScaler.getScaledInstance(ImageUtil.java:301)
   at org.docx4j.org.xhtmlrenderer.util.ImageUtil.getScaledInstance(ImageUtil.java:174)
   at org.docx4j.org.xhtmlrenderer.util.ImageUtil.getScaledInstance(ImageUtil.java:204)
   at org.docx4j.org.xhtmlrenderer.swing.AWTFSImage$NewAWTFSImage.scale(AWTFSImage.java:72)
   at org.docx4j.org.xhtmlrenderer.docx.Docx4jReplacedElementFactory.createReplacedElement(Docx4jReplacedElementFactory.java:60)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.calcDimensions(BlockBox.java:677)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.calcDimensions(BlockBox.java:631)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:778)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:735)
   at org.docx4j.org.xhtmlrenderer.layout.InlineBoxing.layoutInlineBlockContent(InlineBoxing.java:203)
   at org.docx4j.org.xhtmlrenderer.layout.InlineBoxing.layoutContent(InlineBoxing.java:165)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layoutInlineChildren(BlockBox.java:964)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layoutChildren(BlockBox.java:943)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:818)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:735)
   at org.docx4j.org.xhtmlrenderer.layout.BlockBoxing.layoutContent(BlockBoxing.java:61)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layoutChildren(BlockBox.java:947)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:818)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:735)
   at org.docx4j.org.xhtmlrenderer.layout.BlockBoxing.layoutContent(BlockBoxing.java:61)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layoutChildren(BlockBox.java:947)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:818)
   at org.docx4j.org.xhtmlrenderer.render.BlockBox.layout(BlockBox.java:735)
   at org.docx4j.org.xhtmlrenderer.docx.DocxRenderer.layout(DocxRenderer.java:214)
   at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.convert(XHTMLImporterImpl.java:657)


I'm attaching my HTML with images. I pass the JVM options "-Xms512m -Xmx1024m" when starting my web app with Jetty.

I don't know where the problem is, but while stepping through "xhtmlrenderer" in the debugger, something caught my eye. In class Docx4jReplacedElementFactory.java, line 60 reads "fsImage.scale(cssWidth, cssHeight);", and the variables cssWidth and cssHeight had very high values, around 9000x5000, even though the image is 400x300. Maybe that helps.
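For a sense of scale, here is a rough back-of-the-envelope estimate. It assumes the scaled image is held as one 4-byte int per pixel, which the DataBufferInt frame in the stack trace suggests but which I have not verified in the renderer source. The class and method names are made up for illustration.

```java
// Rough estimate only: assumes 4 bytes per pixel (one int per pixel),
// as suggested by the DataBufferInt frame in the stack trace.
class ScaledImageMemory {

    /** Bytes needed for one int-per-pixel raster of the given size. */
    static long bytesFor(int width, int height) {
        return (long) width * height * Integer.BYTES;
    }

    public static void main(String[] args) {
        long intended = bytesFor(400, 300);   // the image's real size
        long scaled = bytesFor(9000, 5000);   // values seen in the debugger
        System.out.println("400x300:   " + intended + " bytes");
        System.out.println("9000x5000: " + scaled + " bytes");
        // Upper bound if all 21 images were scaled like this and held at once.
        System.out.println("21 images: " + 21 * scaled + " bytes");
    }
}
```

Under that assumption a single 9000x5000 raster needs 180,000,000 bytes, against 480,000 bytes for 400x300. Whether all 21 images are scaled this way, or kept in memory at the same time, is something I can't tell from the trace.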

Thanks, T

Re: Getting OutOfMemoryError when importing XHTML

Posted: Tue Apr 29, 2014 8:30 pm
by jason
If you temporarily try giving it more memory (eg 4 or 8 gig depending on your PC), does it complete as expected (ie with images sized as expected)?

If you remove all (or most) of the img elements, does this significantly alter memory usage?

Re: Getting OutOfMemoryError when importing XHTML

Posted: Wed Apr 30, 2014 5:25 am
by tenk
Hi jason, thank you for the quick response.

jason wrote: If you temporarily try giving it more memory (eg 4 or 8 gig depending on your PC), does it complete as expected (ie with images sized as expected)?


I tried "-Xms4g -Xmx5g" and still had no success. When I removed 7 images it worked, but the image sizes were not as expected (they were stretched).

jason wrote: If you remove all (or most) of the img elements, does this significantly alter memory usage?


With 7 images removed I reduced the heap size to "-Xms1g -Xmx2g" and again got OutOfMemoryError. There is no problem with plain text and just a few images.
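When comparing runs with different heap settings, it can help to confirm the limit the JVM actually sees, since options passed to a launcher do not always reach the process that runs the web app. A minimal sketch using only the standard Runtime API (the class name is made up):

```java
// Prints the maximum heap the running JVM will attempt to use.
class HeapCheck {

    /** Maximum heap in MiB as reported by the JVM. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap visible to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

The reported value may differ somewhat from the -Xmx setting, so treat it as a sanity check rather than an exact figure.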

Re: Getting OutOfMemoryError when importing XHTML

Posted: Tue May 06, 2014 10:05 pm
by geoM
Hi tenk,

I had the same problem.
Try either removing both the width and height attributes from the img elements, or setting both of them.

Hope this helps.
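One way to try the first option without editing the file by hand is to strip the attributes before handing the XHTML to the importer. This is only a sketch: it uses the JDK's built-in XML parser, assumes the input is well-formed and has no DOCTYPE or named entities such as &nbsp; that would need a DTD, and the class and method names are made up. It does not touch docx4j itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ImgSizeAttributes {

    /** Removes width and height from every img element of a well-formed XHTML string. */
    static String stripImgSizes(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            NodeList imgs = doc.getElementsByTagName("img");
            for (int i = 0; i < imgs.getLength(); i++) {
                Element img = (Element) imgs.item(i);
                img.removeAttribute("width");   // no-op if the attribute is absent
                img.removeAttribute("height");
            }

            // Serialize as XML explicitly so empty elements stay self-closed.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XHTML", e);
        }
    }

    public static void main(String[] args) {
        String in = "<html><body><p><img src=\"a.png\" width=\"400\"/></p></body></html>";
        System.out.println(stripImgSizes(in));
    }
}
```

The returned string can then be passed to the importer the same way as before. Note that this only removes width and height attributes; sizes set through a style attribute or a stylesheet are left alone.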