toHTML, pct table widths are mangled to some fixed width.
Posted: Tue Jan 28, 2020 12:48 pm
I have a table that is set by MSWord to be 100% (see attached docx):
- Code: Select all
<w:tblW w:w="5000" w:type="pct"/>
And for kicks and giggles, I also included a table set to 50%:
- Code: Select all
<w:tblW w:w="2500" w:type="pct"/>
However, when I import this docx via Docx4J.toHTML, the resulting HTML gives fixed lengths for the table widths, which just doesn't make any sense.
The 100% table:
- Code: Select all
<table ... style="... width: 165mm;">
The 50% table:
- Code: Select all
<table ... style="... width: 3.25in;">
From what I can tell, when the type is given as pct, the width value shall represent fiftieths of a percent (https://standards.iso.org/ittf/PubliclyAvailableStandards/c071691_ISO_IEC_29500-1_2016.zip p.4546). So 5000/50 = 100%. This is consistent with the 50% table being marked with 2500, which is half of 5000.
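To make that arithmetic concrete, here is a minimal sketch of the conversion. The class and method names (`PctWidth`, `toPercent`, `toCssWidth`) are made up for illustration and are not part of docx4j, and the CSS string is only the kind of output one would hope for, not something docx4j is known to produce.

```java
import java.math.BigDecimal;

public class PctWidth {
    // For w:type="pct", w:w is expressed in fiftieths of a percent.
    private static final BigDecimal FIFTIETHS_PER_PERCENT = BigDecimal.valueOf(50);

    /** Converts a pct-typed w:w value into a percentage, e.g. 5000 becomes 100. */
    static BigDecimal toPercent(int w) {
        // Division by 50 always terminates, so an exact divide is safe here.
        return BigDecimal.valueOf(w).divide(FIFTIETHS_PER_PERCENT);
    }

    /** Hypothetical helper: the CSS declaration a percentage width would map to. */
    static String toCssWidth(int w) {
        return "width: " + toPercent(w).stripTrailingZeros().toPlainString() + "%;";
    }

    public static void main(String[] args) {
        System.out.println(toCssWidth(5000)); // width: 100%;
        System.out.println(toCssWidth(2500)); // width: 50%;
    }
}
```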
The table widths are percentages in the docx file, and they need to stay that way: 165mm isn't going to fit on a lot of screens. I could speculate that the fixed width was computed from w:pgSz and w:pgMar with some fancy twip math, but that doesn't seem to be true. And if it were true, how do we get that value from docx4j?
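For reference, the twip math in question would look roughly like the sketch below. The page values are assumptions (a US Letter page, 12240 twips wide, with 1440-twip left and right margins), not values read from the attached docx, and the class and method names are hypothetical. With those assumed values the result is 6.5in (about 165mm) for 100% and 3.25in for 50%, which is suggestive but does not confirm what docx4j actually does internally.

```java
public class TextAreaWidth {
    static final double TWIPS_PER_INCH = 1440.0; // 1 twip = 1/20 pt, 72 pt per inch
    static final double MM_PER_INCH = 25.4;

    /** Usable text width in twips: page width (w:pgSz w:w) minus left and right margins (w:pgMar). */
    static int textAreaTwips(int pageWidth, int leftMargin, int rightMargin) {
        return pageWidth - leftMargin - rightMargin;
    }

    /** Resolves a pct width (in fiftieths of a percent, so 5000 = 100%) against the text area. */
    static int resolvePct(int textAreaTwips, int pctFiftieths) {
        return (int) Math.round(textAreaTwips * (pctFiftieths / 5000.0));
    }

    static double twipsToInches(int twips) {
        return twips / TWIPS_PER_INCH;
    }

    static double twipsToMm(int twips) {
        return twipsToInches(twips) * MM_PER_INCH;
    }

    public static void main(String[] args) {
        // Assumed page setup: US Letter with 1 inch margins (not taken from the docx).
        int textArea = textAreaTwips(12240, 1440, 1440);
        int full = resolvePct(textArea, 5000);
        int half = resolvePct(textArea, 2500);
        System.out.printf("100%%: %d twips = %.2fin = %.1fmm%n", full, twipsToInches(full), twipsToMm(full));
        System.out.printf(" 50%%: %d twips = %.2fin = %.1fmm%n", half, twipsToInches(half), twipsToMm(half));
    }
}
```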
Is this some glaring bug, or am I really missing something with my configuration here? Any suggestions? This is a big blocker for me.
- Code: Select all
// Load the docx and prepare an in-memory buffer for the HTML output.
InputStream inputStream = new FileInputStream("100pct_and_50pct_tables.docx");
ByteArrayOutputStream htmlOutputStream = new ByteArrayOutputStream();
WordprocessingMLPackage wordMLPackage = Docx4J.load(inputStream);
// Configure the HTML export.
HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
htmlSettings.setWmlPackage(wordMLPackage);
SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
// Convert via the XSLT-based exporter and read the result back as a UTF-8 string.
Docx4J.toHTML(htmlSettings, htmlOutputStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
String html = new String(htmlOutputStream.toByteArray(), StandardCharsets.UTF_8);