
Bullets in w:lvlText

Posted: Mon Dec 21, 2009 4:15 pm
by pro-tbl
Hi Jason,

I am handling lists in both directions, html->docx and docx->html. I have a few problems with html->docx.

1. Problem setting the lvlText.

In the case of an OL, I am able to set the value:

Lvl.LvlText levelText = lvl.getLvlText();
levelText.setVal("some value"); // set the value according to the list depth, e.g. "%1." or "%3."

But in the case of a UL, how can I set the exact lvlText value?
With <w:lvlText w:val="?"/>, even some editors are not able to show the value. The values seem to be Unicode characters. Does docx4j have a method to read them?

Re: HTML to docx: bullets

Posted: Sat Jan 02, 2010 1:48 am
by jason
pro-tbl wrote: Even some editors not able to show the values. It seems the values are some unicode characters.


Word evidently uses the "private use area" for a bullet in w:lvlText. (You'll see this if you use SC UniPad or similar to open a document generated by Word 2007 which contains bullets.)
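
If you don't have a Unicode-aware editor to hand, you can also dump the code points from Java. This is only a rough sketch: LvlTextInspector and its methods are names made up for illustration, not part of docx4j, and the U+F0B7 sample value is an assumption about what a Word-generated bullet typically contains, so check it against your own document. In practice the input string would come from levelText.getVal().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a docx4j class): lists the code points of a
// w:lvlText value and flags any that are Unicode private use characters.
public class LvlTextInspector {

    /** True if the code point has the Unicode general category "private use". */
    public static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    /** Returns one entry per code point, e.g. "U+F0B7 (private use)". */
    public static List<String> describe(String lvlText) {
        List<String> out = new ArrayList<>();
        lvlText.codePoints().forEach(cp ->
            out.add(String.format("U+%04X%s", cp,
                    isPrivateUse(cp) ? " (private use)" : "")));
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample: a private use character of the kind Word may write.
        System.out.println(describe("\uF0B7"));
        // An ordered-list pattern contains only ordinary characters.
        System.out.println(describe("%1."));
    }
}
```

Anything flagged as private use has no meaning of its own in Unicode; what is displayed depends entirely on the font applied to that numbering level.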

In relation to symbols, see the description of @char in ECMA-376 > Part 4 > 2 WordprocessingML Reference Material > 2.3 Paragraphs and Rich Formatting > 2.3.3 Run Content > 2.3.3.29 sym (Symbol Character), e.g. at http://www.documentinteropinitiative.org/implnotes/ECMA-376/dde33d13-9c57-472a-939e-f67e187f8667.aspx

I presume the same approach is taken with bullets, but haven't confirmed that in the spec. (If you find the reference, please post it here)

The closest docx4j gets to providing anything here is going the other way in http://dev.plutext.org/trac/docx4j/browser/trunk/docx4j/src/main/java/org/docx4j/convert/out/html/SymbolWriter.java

If you can contribute code to create the various bullets Word commonly uses in lvlText, I'd gratefully accept it. The essence of what is required to reproduce Word's bullets is simply to find their character codes (use e.g. SC UniPad); you can then use Java's \uxxxx syntax.

But I wonder whether there is any reason to use "private use area" bullets when creating bullets in docx4j: why not use the Unicode bullet \u2022 and other Unicode characters? See http://nadeausoftware.com/articles/2007/11/latency_friendly_customized_bullets_using_unicode_characters. So probably all we need to do is define some suitably named constants (i.e. BULLET_X, where X could be the Unicode name/description).
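
To make the idea concrete, here is a minimal sketch of what such constants might look like. BulletChars, its constant names and forLevel are all hypothetical (nothing like this exists in docx4j as far as this thread goes), and the choice of these three characters and the order they cycle in is just an assumption for illustration.

```java
// Hypothetical constants class (not part of docx4j): standard Unicode
// bullet characters that could be used as w:lvlText values instead of
// private use area code points.
public final class BulletChars {

    private BulletChars() {
        // constants holder, not meant to be instantiated
    }

    /** U+2022 BULLET */
    public static final String BULLET = "\u2022";

    /** U+25E6 WHITE BULLET */
    public static final String WHITE_BULLET = "\u25E6";

    /** U+25AA BLACK SMALL SQUARE */
    public static final String BLACK_SMALL_SQUARE = "\u25AA";

    /** Picks a bullet for a zero-based list depth, cycling through the three. */
    public static String forLevel(int ilvl) {
        String[] cycle = { BULLET, WHITE_BULLET, BLACK_SMALL_SQUARE };
        return cycle[Math.floorMod(ilvl, cycle.length)];
    }

    public static void main(String[] args) {
        // Print the code point chosen for the first few list depths.
        for (int ilvl = 0; ilvl < 4; ilvl++) {
            System.out.printf("level %d -> U+%04X%n",
                    ilvl, (int) forLevel(ilvl).charAt(0));
        }
    }
}
```

The returned string would then be passed to levelText.setVal(...) in the same way as the ordered-list case in the first post. The font set on that numbering level still needs to contain the glyph for the bullet to display.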

cheers .. Jason