
toHTML, pct table widths are mangled to some fixed width.

Posted: Tue Jan 28, 2020 12:48 pm
by Laszlo
I have a table that is set by MSWord to be 100% (see attached docx):

Code: Select all
<w:tblW w:w="5000" w:type="pct"/>


And for kicks and giggles, I also included a table set to 50%:

Code: Select all
<w:tblW w:w="2500" w:type="pct"/>


However, when I import this docx via Docx4J.toHTML, the resulting HTML gives fixed lengths for the table widths, which just doesn't make any sense.

The 100% table:
Code: Select all
<table ... style="... width: 165mm;">


The 50% table:
Code: Select all
<table ... style="... width: 3.25in;">



From what I can tell, when the type is given as pct, the width value represents fiftieths of a percent (https://standards.iso.org/ittf/PubliclyAvailableStandards/c071691_ISO_IEC_29500-1_2016.zip p. 4546). So 5000/50 = 100%. This is consistent with the 50% table being marked with 2500, which is half of 5000.
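To make the fiftieths-of-a-percent arithmetic concrete, here is a minimal sketch in plain Java. The class and method names (PctWidth, toCssPercent) are made up for illustration and are not part of docx4j; it only shows the conversion I would expect for a w:type="pct" width.

```java
import java.math.BigDecimal;

// Hypothetical helper, not a docx4j class: converts a w:tblW/@w:w value
// (fiftieths of a percent when w:type="pct") into a CSS percentage.
public class PctWidth {

    static String toCssPercent(int fiftieths) {
        // Dividing by 50 always yields a terminating decimal, so no rounding mode is needed.
        BigDecimal pct = BigDecimal.valueOf(fiftieths).divide(BigDecimal.valueOf(50));
        return pct.stripTrailingZeros().toPlainString() + "%";
    }

    public static void main(String[] args) {
        System.out.println("width: " + toCssPercent(5000) + ";"); // the 100% table
        System.out.println("width: " + toCssPercent(2500) + ";"); // the 50% table
    }
}
```

With the two values from the docx above, this gives 100% and 50%, which is what I would hope to see in the HTML.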

The table widths are percentages in the docx file, and they need to stay that way. 165mm isn't going to fit on a lot of screens. :D I could speculate this was done based on the w:pgSz and w:pgMar and some fancy twip math...but that doesn't seem to be true...but if it were true, how do we get that value from docx4j?
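For reference, the twip math in question can be sketched as below. The page values are assumptions (a US Letter page, 12240 twips wide, with 1440 twip margins on each side) and are not read from the attached docx; the class and method names are hypothetical and not docx4j API.

```java
import java.util.Locale;

// Hypothetical helper: text width from page size and margins, all in twips.
public class TwipMath {

    static final double TWIPS_PER_INCH = 1440.0; // 1 twip = 1/20 pt = 1/1440 in
    static final double MM_PER_INCH = 25.4;

    // Usable width = w:pgSz/@w:w minus left and right w:pgMar values.
    static int textWidthTwips(int pageWidth, int marginLeft, int marginRight) {
        return pageWidth - marginLeft - marginRight;
    }

    static double twipsToInches(int twips) {
        return twips / TWIPS_PER_INCH;
    }

    static double twipsToMm(int twips) {
        return twipsToInches(twips) * MM_PER_INCH;
    }

    public static void main(String[] args) {
        // Assumed values: US Letter width with 1 inch margins.
        int textWidth = textWidthTwips(12240, 1440, 1440);
        System.out.println(String.format(Locale.ROOT, "100%% of text width: %.1f mm", twipsToMm(textWidth)));
        System.out.println(String.format(Locale.ROOT, "50%% of text width: %.2f in", twipsToInches(textWidth / 2)));
    }
}
```

Under those assumed page values the text width is 9360 twips, so a fixed width derived this way would scale with the page setup rather than with the browser window.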

Is this some glaring bug, or am I really missing something with my configuration here? Any suggestions? This is a big blocker for me.

Code: Select all
InputStream inputStream = new FileInputStream("100pct_and_50pct_tables.docx");
ByteArrayOutputStream htmlOutputStream = new ByteArrayOutputStream();
WordprocessingMLPackage wordMLPackage = Docx4J.load(inputStream);
HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
htmlSettings.setWmlPackage(wordMLPackage);
SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
Docx4J.toHTML(htmlSettings, htmlOutputStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
String html = new String(htmlOutputStream.toByteArray(), StandardCharsets.UTF_8);

Re: toHTML, pct table widths are mangled to some fixed width

Posted: Thu Jan 30, 2020 12:46 pm
by jason

Re: toHTML, pct table widths are mangled to some fixed width

Posted: Fri Jan 31, 2020 6:30 am
by Laszlo
Awesome!!

I actually came up with almost the identical fix! You beat me to it, which is fine. :D The change to StyleUtil, though, I would not have known to do that!

Thank you so much!
Laszlo