how can i get the style of the content in a docx?

Posted: Thu Jun 18, 2009 7:38 am
by dqkit
Hi, all.
I am new to docx4j. I am working on checking the format of the content,
and I have run into several problems. :(
Firstly, how can I get the font style in a .docx, such as name, size, color?
Secondly, how can I get the footer information? I especially want the page number.
Finally, can I read the document page by page and then check the font style?
Thanks for any reply!

Re: how can i get the style of the content in a docx?

Posted: Thu Jun 18, 2009 8:16 am
by jason
dqkit wrote:Firstly, how can I get the font style in a .docx, such as name, size, color?


Each run of text (w:r) can potentially have different font properties. These can come from explicit direct formatting, from the style hierarchy, or ultimately from the document defaults.

Have a look at getEffectiveRPr in http://dev.plutext.org/trac/docx4j/brow ... olver.java
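
As a rough illustration of that fallback order, here is a minimal sketch in plain Java. It is a simplified model and not docx4j's API: FontProps, Style and effective are hypothetical names made up for this example, and the sample style values are invented. It only assumes that a value set closer to the run wins, and that sizes are stored in half-points as in w:sz.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of font resolution. These classes are hypothetical, NOT docx4j types.
class EffectiveFontSketch {

    // Font properties of interest; a null field means "not set at this level".
    static final class FontProps {
        final String name;      // font name, e.g. "Arial"
        final Integer halfPts;  // size in half-points, as in w:sz (28 means 14pt)
        final String color;     // hex RGB, as in w:color

        FontProps(String name, Integer halfPts, String color) {
            this.name = name;
            this.halfPts = halfPts;
            this.color = color;
        }

        // Keep the values set here; fill only the gaps from the fallback.
        FontProps orElse(FontProps fallback) {
            if (fallback == null) {
                return this;
            }
            return new FontProps(
                    name != null ? name : fallback.name,
                    halfPts != null ? halfPts : fallback.halfPts,
                    color != null ? color : fallback.color);
        }
    }

    // A style with its own properties and an optional parent (w:basedOn).
    static final class Style {
        final FontProps props;
        final String basedOn;

        Style(FontProps props, String basedOn) {
            this.props = props;
            this.basedOn = basedOn;
        }
    }

    // Direct formatting first, then the style chain, then the document default.
    static FontProps effective(FontProps direct, String styleId,
                               Map<String, Style> styles, FontProps docDefault) {
        FontProps result = direct != null ? direct : new FontProps(null, null, null);
        String id = styleId;
        // The hop limit guards against a cyclic basedOn chain.
        for (int hops = 0; id != null && hops < 50; hops++) {
            Style style = styles.get(id);
            if (style == null) {
                break;
            }
            result = result.orElse(style.props);
            id = style.basedOn;
        }
        return result.orElse(docDefault);
    }

    public static void main(String[] args) {
        Map<String, Style> styles = new HashMap<>();
        styles.put("Normal", new Style(new FontProps("Times New Roman", 24, null), null));
        styles.put("Heading1", new Style(new FontProps("Arial", 28, null), "Normal"));
        FontProps docDefault = new FontProps("Calibri", 22, "000000");

        // A run in Heading1 whose only direct formatting is a 16pt size.
        FontProps direct = new FontProps(null, 32, null);
        FontProps r = effective(direct, "Heading1", styles, docDefault);
        System.out.println(r.name + " " + (r.halfPts / 2.0) + "pt " + r.color);
        // prints "Arial 16.0pt 000000"
    }
}
```

A real document also involves paragraph versus character styles and theme fonts, which this sketch leaves out; getEffectiveRPr is the place to look for the full rules.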

dqkit wrote:Secondly, how can I get the footer information? I especially want the page number.


Have a look at http://dev.plutext.org/trac/docx4j/brow ... olicy.java

but note, the page numbers won't be present explicitly.

What are you trying to do? Sometimes LastRenderedPageBreak is your friend.
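
To show what "not present explicitly" means: a page number in a footer is normally stored as a PAGE field, either as w:fldSimple with a w:instr attribute or as a complex field with w:instrText, and any digit next to it is only a cached result. The sketch below uses the JDK's DOM parser on a footer XML string instead of docx4j, so it runs on its own. PageFieldSketch and its method names are hypothetical, the sample footer is invented, and it assumes a complex field's instruction sits in a single w:instrText element (Word may split it across runs).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: detects a PAGE field in footer XML using only the JDK.
class PageFieldSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static boolean hasPageField(String footerXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(footerXml.getBytes(StandardCharsets.UTF_8)));

            // Simple fields carry the instruction in the w:instr attribute.
            NodeList simple = doc.getElementsByTagNameNS(W, "fldSimple");
            for (int i = 0; i < simple.getLength(); i++) {
                String instr = ((Element) simple.item(i)).getAttributeNS(W, "instr");
                if (isPageInstruction(instr)) {
                    return true;
                }
            }
            // Complex fields carry it as text inside w:instrText.
            NodeList complex = doc.getElementsByTagNameNS(W, "instrText");
            for (int i = 0; i < complex.getLength(); i++) {
                if (isPageInstruction(complex.item(i).getTextContent())) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse footer XML", e);
        }
    }

    // Matches "PAGE" and "PAGE \* MERGEFORMAT", but not "PAGEREF ...".
    static boolean isPageInstruction(String instr) {
        String t = instr.trim().toUpperCase(Locale.ROOT);
        return t.equals("PAGE") || t.startsWith("PAGE ");
    }

    public static void main(String[] args) {
        String xml = "<w:ftr xmlns:w=\"" + W + "\"><w:p>"
                + "<w:r><w:t>Page </w:t></w:r>"
                + "<w:fldSimple w:instr=\" PAGE \\* MERGEFORMAT \">"
                + "<w:r><w:t>1</w:t></w:r></w:fldSimple>"
                + "</w:p></w:ftr>";
        System.out.println(hasPageField(xml)); // prints "true"
    }
}
```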

dqkit wrote:Finally, can I read the document page by page and then check the font style?


See above. LastRenderedPageBreak may be close enough to "page by page" for you; font style will vary as per answer to your first question.
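
A minimal sketch of that idea, again in plain JDK code rather than docx4j so that it runs on its own: walk the main document XML in order and start a new "page" each time a w:lastRenderedPageBreak element turns up. RenderedPagesSketch and textByPage are hypothetical names and the sample XML is invented. The markers only record where Word last laid out the pages, so a file that Word has not saved recently, or that another tool produced, may have none.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper: groups document text by the last rendered page breaks.
class RenderedPagesSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static List<String> textByPage(String documentXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
        List<StringBuilder> pages = new ArrayList<>();
        pages.add(new StringBuilder());
        walk(doc.getDocumentElement(), pages);

        List<String> result = new ArrayList<>();
        for (StringBuilder page : pages) {
            result.add(page.toString());
        }
        return result;
    }

    // Depth-first walk in document order.
    private static void walk(Node node, List<StringBuilder> pages) {
        if (node.getNodeType() == Node.ELEMENT_NODE && W.equals(node.getNamespaceURI())) {
            String name = node.getLocalName();
            if ("lastRenderedPageBreak".equals(name)) {
                pages.add(new StringBuilder()); // text after this marker is on the next page
                return;
            }
            if ("t".equals(name)) {
                pages.get(pages.size() - 1).append(node.getTextContent());
                return;
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, pages);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"" + W + "\"><w:body>"
                + "<w:p><w:r><w:t>First page.</w:t></w:r></w:p>"
                + "<w:p><w:r><w:lastRenderedPageBreak/><w:t>Second page.</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        System.out.println(textByPage(xml)); // prints "[First page., Second page.]"
    }
}
```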

good luck .. Jason

Re: how can i get the style of the content in a docx?

Posted: Thu Jun 18, 2009 10:32 am
by dqkit
jason, thanks very much !
I am trying...

Re: how can i get the style of the content in a docx?

Posted: Thu Jun 18, 2009 2:44 pm
by dqkit
:(
jason, I still have no luck.
What I want to do is check the format of docx files. For example, Heading 1 should use the font Arial, size 14; if someone hands in a document that uses Times New Roman, size 16, I want to tell him that it is wrong.

My docx file is at the following link.
There are a header and a footer in the docx; how can I get their text and style?
Could you please give some example code?
Thanks a lot, jason.

Re: how can i get the style of the content in a docx?

Posted: Thu Jun 18, 2009 4:13 pm
by jason
In HeaderFooterPolicy, there is the following code:

Code:
Hdr hdr = (Hdr) wordmlPackage.getHeaderFooterPolicy().getFirstHeader().getJaxbElement();


http://dev.plutext.org/trac/docx4j/brow ... l/Hdr.java

You can then look at the contents of the header with its getEGBlockLevelElts() -- just like going through the contents of w:body. See the OpenMainDocument and Traverse sample in the samples dir.

If you are still having problems, I'll be happy to have a quick look at your code and example document (as opposed to writing the code from scratch for you). Rather than posting a binary, it's easiest if you do a save as .xml in Word; you can probably just send the relevant header.

Oh, does your document have sections in it? Headers/footers are defined per section. Maybe you care about that, or maybe you don't. HeaderFooterPolicy currently only deals with the sectPr which is the last child of w:body. Another approach would be to iterate through the document rels, looking at each header/footer part.
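
The rels approach can be sketched without docx4j by reading the relationships XML (word/_rels/document.xml.rels inside the zip) and keeping only the entries whose Type ends in /header or /footer. This is a minimal sketch using the JDK's DOM parser; HeaderFooterRelsSketch and headerFooterTargets are hypothetical names, and the sample rels content is invented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists header/footer part targets from a rels file.
class HeaderFooterRelsSketch {
    static final String PKG_RELS = "http://schemas.openxmlformats.org/package/2006/relationships";
    static final String REL_BASE =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships/";

    static List<String> headerFooterTargets(String relsXml) {
        List<String> targets = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(relsXml.getBytes(StandardCharsets.UTF_8)));
            NodeList rels = doc.getElementsByTagNameNS(PKG_RELS, "Relationship");
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                String type = rel.getAttribute("Type");
                // Keep every header and footer part, whichever section uses it.
                if (type.equals(REL_BASE + "header") || type.equals(REL_BASE + "footer")) {
                    targets.add(rel.getAttribute("Target"));
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse rels XML", e);
        }
        return targets;
    }

    public static void main(String[] args) {
        String rels = "<Relationships xmlns=\"" + PKG_RELS + "\">"
                + "<Relationship Id=\"rId1\" Type=\"" + REL_BASE + "styles\" Target=\"styles.xml\"/>"
                + "<Relationship Id=\"rId2\" Type=\"" + REL_BASE + "header\" Target=\"header1.xml\"/>"
                + "<Relationship Id=\"rId3\" Type=\"" + REL_BASE + "footer\" Target=\"footer1.xml\"/>"
                + "</Relationships>";
        System.out.println(headerFooterTargets(rels)); // prints "[header1.xml, footer1.xml]"
    }
}
```

The targets are relative to the word/ folder, so footer1.xml here corresponds to the part /word/footer1.xml.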

Re: how can i get the style of the content in a docx?

Posted: Fri Jun 19, 2009 5:35 am
by dqkit
Thank you, jason!
Even with my poor English, I can now get the content of the footer.
Thanks a lot.

But there is another question: it seems there is no page number in the footer part. Is that correct?

My code:
Code:
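import java.io.FileInputStream;
import java.io.InputStream;
import org.docx4j.openpackaging.io.LoadFromZipNG;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.WordprocessingML.FooterPart;
import org.docx4j.wml.Ftr;
import org.docx4j.wml.P;

// Load the package and fetch the footer part by its part name.
InputStream is = new FileInputStream(fileName);
try {
   org.docx4j.openpackaging.packages.Package pkg = new LoadFromZipNG().get(is);
   FooterPart footer = (FooterPart) pkg.getParts().get(
         new PartName("/word/footer1.xml"));
   Ftr doc = (Ftr) footer.getJaxbElement();
   // Print the toString() output of every non-empty paragraph in the footer.
   for (Object o : doc.getEGBlockLevelElts()) {
      if (o instanceof P) {
         String val = o.toString().trim();
         if (val.length() > 0) {
            System.out.println(val);
         }
      }
   }
} finally {
   // Close the stream even if loading or casting fails.
   is.close();
}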

Re: how can i get the style of the content in a docx?

Posted: Fri Jun 19, 2009 8:07 am
by jason
Given you have

Code:
Ftr doc = (Ftr) footer.getJaxbElement();


Please post its contents here using:

Code:
System.out.println(XmlUtils.marshaltoString(doc, true, true));