
Different file size on generated xlsx

Posted: Fri Oct 20, 2017 5:14 am
by grohli
Hello,

sometimes my generated document starts with:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:worksheet xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:ns2="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:xdr="http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing" xmlns:ns4="http://schemas.microsoft.com/office/excel/2006/main" xmlns:ns5="http://schemas.microsoft.com/office/excel/2008/2/main">
    <ns2:cols>

So every element carries an ns2 prefix.

But most of the time it starts like this:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:xdr="http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing" xmlns:ns4="http://schemas.microsoft.com/office/excel/2006/main" xmlns:ns5="http://schemas.microsoft.com/office/excel/2008/2/main">
    <cols>


The ns2 prefix seems to be the cause of the increased file size in that case.
When I open the files in Excel, the content is the same. I don't see a practical problem here, but I am confused about why this happens.

My code is based on this sample file: https://github.com/plutext/docx4j/blob/ ... sheet.java

Do you need further information?

Re: Different file size on generated xlsx

Posted: Sat Oct 21, 2017 4:48 am
by jason
In general, you don't need to worry about differences in file size: two XML documents can contain the same information but be serialized slightly differently (see Canonical XML).
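
To illustrate the point about equivalent documents, here is a minimal sketch using only the JDK's DOM parser (not docx4j). The class name PrefixEquivalence and its helper methods are made up for this example, and the two XML strings are cut-down versions of the snippets in the first post. A namespace-aware parser resolves both `ns2:worksheet` and the un-prefixed `worksheet` to the same namespace URI plus local name, even though the prefixed text is longer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of docx4j.
class PrefixEquivalence {
    static final String MAIN_NS =
            "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Parse with namespace awareness so prefixes are resolved to URIs.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Expanded name of the root element: {namespace URI}local-name.
    // The prefix chosen by the serializer does not appear in it.
    static String rootName(String xml) {
        Element root = parseRoot(xml);
        return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
    }

    public static void main(String[] args) {
        String prefixed =
                "<ns2:worksheet xmlns:ns2=\"" + MAIN_NS + "\"><ns2:cols/></ns2:worksheet>";
        String defaulted =
                "<worksheet xmlns=\"" + MAIN_NS + "\"><cols/></worksheet>";

        // Same expanded name for both serializations.
        System.out.println(rootName(prefixed).equals(rootName(defaulted)));
        // The prefixed form simply needs more characters to say the same thing.
        System.out.println(prefixed.length() + " vs " + defaulted.length());
    }
}
```

The extra characters come from repeating the prefix on every start and end tag, which is why a large sheet grows noticeably when the prefix is used.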

That said, when JAXB marshals to XML, we should be using a namespace prefix mapper so that you get the well-known namespace prefixes, and not ns2 etc.

If you can narrow down when it happens, we can set the namespace prefix mapper appropriately. For example: https://github.com/plutext/docx4j/blob/ ... .java#L745

Back to file sizes, for completeness: different zip utilities (e.g. Java vs Microsoft Office) might use different compression methods; see https://en.wikipedia.org/wiki/Zip_(file_format)
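
As a rough illustration of the compression point, the sketch below zips the same bytes twice with java.util.zip at two different deflate levels and compares the archive sizes. The class name ZipSizeDemo, the entry name and the sample content are invented for the example; it does not claim to reproduce what docx4j or Office actually do when saving.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of docx4j.
class ZipSizeDemo {

    // Build some repetitive XML-like content, similar in spirit to a sheet part.
    static byte[] sampleSheet(int rows) {
        StringBuilder sb = new StringBuilder("<worksheet><sheetData>");
        for (int i = 0; i < rows; i++) {
            sb.append("<row><c><v>1</v></c></row>");
        }
        sb.append("</sheetData></worksheet>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Write a single-entry zip in memory and return the archive size in bytes.
    static int zippedSize(byte[] content, int level) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.setLevel(level); // deflate level, 0 (none) to 9 (best)
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the central directory is included.
        return bytes.size();
    }

    public static void main(String[] args) {
        byte[] content = sampleSheet(200);
        System.out.println("uncompressed bytes: " + content.length);
        System.out.println("level 0: " + zippedSize(content, Deflater.NO_COMPRESSION));
        System.out.println("level 9: " + zippedSize(content, Deflater.BEST_COMPRESSION));
    }
}
```

Identical content can therefore end up in archives of quite different sizes, depending only on the settings of the tool that wrote the zip.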