
Converting numbering list to html

Posted: Fri Oct 16, 2015 1:40 am
by CrazyHog
Hello.
I have a simple docx file containing a numbered list.
I am trying to convert it to HTML with docx4j 3.2.2 using this code:
Code:
SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());
HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
htmlSettings.setWmlPackage(wordMLPackage);
Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);

But in the resulting HTML file the "display:block" style is applied to the list items and, as a result, the numbers are not displayed:
Code:
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type" /><style><!--/*paged media */ div.header {display: none }div.footer {display: none } /*@media print { */@page { size: A4; margin: 10%; @top-center {content: element(header) } @bottom-center {content: element(footer) } }/*element styles*/ .del  {text-decoration:line-through;color:red;} .ins {text-decoration:none;background:#c0ffc0;padding:1px;}
/* TABLE STYLES */

/* PARAGRAPH STYLES */
.DocDefaults {display:block;margin-bottom: 4mm;line-height: 115%;font-size: 11.0pt;}
.a {display:block;}
.a3 {display:block;position: relative; margin-left: 0.5in;}

/* CHARACTER STYLES */ span.a0 {display:inline;}
--></style><script type="text/javascript"><!--function toggleDiv(divid){if(document.getElementById(divid).style.display == 'none'){document.getElementById(divid).style.display = 'block';}else{document.getElementById(divid).style.display = 'none';}}
--></script></head><body>
 
  <!-- userBodyTop goes here -->
  <div class="document"><ol>
 
  <li class="a3 a DocDefaults " style="position: relative; margin-left: 0.5in;text-indent: -0.25in;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">One</span><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">2</span></li>
 
  <li class="a3 a DocDefaults " style="position: relative; margin-left: 0.5in;text-indent: -0.25in;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Two</span></li>
 
  <li class="a3 a DocDefaults " style="position: relative; margin-left: 0.5in;text-indent: -0.25in;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Three</span><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';white-space:pre-wrap;"> </span></li></ol></div>
  <!-- userBodyTail goes here -->
  </body></html>

It looks like:
One2
Two
Three

How can I fix this problem? Thank you

Re: Converting numbering list to html

Posted: Mon Oct 19, 2015 10:09 am
by jason
Seems like this post is in the wrong forum?

What is your intended output?

Are you looking for HTML which looks the same as the docx in a browser (i.e. the user sees the same numbers)? If that's all you want, try without the TagHandler.

For further assistance, you'll need to post your docx.

Re: Converting numbering list to html

Posted: Mon Oct 19, 2015 7:31 pm
by CrazyHog
jason wrote: Seems like this post is in the wrong forum?

Yes, sorry. Unfortunately, I only noticed it after I had submitted the post.

jason wrote: What is your intended output?

Are you looking for HTML which looks the same as the docx in a browser (i.e. the user sees the same numbers)? If that's all you want, try without the TagHandler.

When I used the docx4j 2.8 library, there was no problem with the numbered list in the same docx file. Users saw the numbers in the browser just as in Word.
Of course, I also tried converting without the TagHandler and got the same result:
Code:
Docx4J.toHTML(wordMLPackage, null, null, os);


jason wrote: For further assistance, you'll need to post your docx.

Please see the attached docx file. However, I can't control the file contents: users can attach any file they want (with simple content, of course).

Re: Converting numbering list to html

Posted: Tue Oct 20, 2015 4:57 pm
by jason
To manually number the paragraphs in v3.2.2, set

htmlSettings.getFeatures().remove(ConversionFeatures.PP_HTML_COLLECT_LISTS);

Otherwise, the HTML <li> element will be relied on to render the number/bullet. The problem in this case is that, as you say, the CSS sets the display property to 'block', which on <li> suppresses the list marker it would normally show.

The following commit to the 3.3.0 branch sets display: list-item

https://github.com/plutext/docx4j/commi ... 3f3dfa7907
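
For anyone who has to stay on 3.2.2 and still wants the browser to draw the numbers from <ol>/<li>, one possible stopgap is to patch the generated HTML so that list items get their marker back. The sketch below is not part of docx4j and was not proposed by anyone in this thread. The class and method names are hypothetical, and it assumes the generated stylesheet ends with `--></style>` as in the output quoted earlier. It relies only on the CSS rule that an `!important` declaration overrides the normal `display:block` set by the class selectors.

```java
class ListItemCssPatch {
    // Assumption: the generated stylesheet is closed by this exact marker,
    // as in the 3.2.2 output quoted in this thread.
    private static final String STYLE_END = "--></style>";

    // !important lets this element selector win over the class rules
    // (.DocDefaults, .a, .a3) that set display:block.
    private static final String OVERRIDE = "li {display: list-item !important;}\n";

    // Returns the HTML with the override rule appended to the first stylesheet,
    // or the input unchanged if no stylesheet end marker is found.
    static String restoreListMarkers(String html) {
        int idx = html.indexOf(STYLE_END);
        if (idx < 0) {
            return html;
        }
        return html.substring(0, idx) + OVERRIDE + html.substring(idx);
    }

    public static void main(String[] args) {
        String html = "<html><head><style><!--.a {display:block;}\n--></style></head>"
                + "<body><ol><li class=\"a\">One</li></ol></body></html>";
        System.out.println(restoreListMarkers(html));
    }
}
```

In practice you would run this on the string written to the output stream by Docx4J.toHTML. Upgrading to a build that contains the commit above is the cleaner route.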

Re: Converting numbering list to html

Posted: Tue Oct 20, 2015 7:50 pm
by jason

Re: Converting numbering list to html

Posted: Tue Oct 20, 2015 11:50 pm
by CrazyHog
Thank you for your prompt response.
When I switched to the 3.3.0 library, I got the following result:
Code:
<!-- userBodyTop goes here -->
  <div class="document">
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">One</span></li>
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Two</span></li>
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Three</span><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';white-space:pre-wrap;"> </span></li></div>
  <!-- userBodyTail goes here -->

i.e. all list elements are displayed as separate bullet items (without a surrounding <ol> tag). Only when I registered the TagHandler:
Code:
SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());

the HTML was generated as a numbered list (with the <ol> tag):
Code:
<div class="document"><ol>
 
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">One</span></li>
 
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Two</span></li>
 
  <li class="a3 a DocDefaults " style="display: list-item;"><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';">Three</span><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';white-space:pre-wrap;"> </span><span class="a0 " style="font-size: 14.0pt;;font-family: 'Times New Roman';white-space:pre-wrap;"> </span></li></ol>


v3.2.2 with the workaround above (removing PP_HTML_COLLECT_LISTS) works too!
Thanks again!
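
The 3.3.0 output quoted above, where <li> elements appear directly inside the <div> with no <ol>, is easy to miss when files come from arbitrary users. A small check along the following lines could flag it after conversion. This is only a sketch: the class and method names are hypothetical, it is not a docx4j API, and it scans tags with a regular expression instead of parsing the HTML properly, which is assumed to be good enough for generated output like the samples in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrphanListItemCheck {
    // Matches opening or closing <ol>, <ul> and <li> tags; \b keeps <link> from matching.
    private static final Pattern TAG =
            Pattern.compile("<(/?)(ol|ul|li)\\b", Pattern.CASE_INSENSITIVE);

    // Counts <li> opening tags that are not nested inside any <ol> or <ul>.
    static int countOrphanListItems(String html) {
        int listDepth = 0;
        int orphans = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            String name = m.group(2).toLowerCase();
            if (name.equals("li")) {
                if (!closing && listDepth == 0) {
                    orphans++;
                }
            } else if (closing) {
                listDepth = Math.max(0, listDepth - 1);
            } else {
                listDepth++;
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        String withoutHandler = "<div class=\"document\"><li>One</li><li>Two</li><li>Three</li></div>";
        String withHandler = "<div class=\"document\"><ol><li>One</li><li>Two</li><li>Three</li></ol></div>";
        System.out.println(countOrphanListItems(withoutHandler));
        System.out.println(countOrphanListItems(withHandler));
    }
}
```

A non-zero count would suggest that the conversion ran without SdtToListSdtTagHandler registered, as in the first 3.3.0 result.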