
[WORD] - extract information

Posted: Thu Feb 13, 2014 8:56 pm
by marcel_cap
Hello,

I need a way to extract information from a Word document. (I am able to choose the .docx format.)

For instance, I would like to extract a list of elements.

Do you have any ideas, or a lead to follow?


Thanks,

Re: [WORD] - extract information

Posted: Fri Feb 14, 2014 6:53 am
by jason
XPath, TraversalUtil or TextUtils - see the Getting Started document for more info.
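
To illustrate the XPath route for the "list of elements" case, here is a minimal sketch. It does not use the docx4j helpers named above; it uses only the JDK (zip, DOM and XPath 1.0). It rests on some assumptions: the main document part is stored at word/document.xml, and a list item is a paragraph that carries numbering properties (w:numPr) directly. Lists that get their numbering only through a style would be missed. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper class, JDK only (not the docx4j API).
class DocxListItems {
    // Assumption: a list item is a w:p whose w:pPr contains w:numPr.
    static final String LIST_PARAGRAPHS =
        "//*[local-name()='p'][*[local-name()='pPr']/*[local-name()='numPr']]";
    // All text elements (w:t) below the current paragraph.
    static final String TEXT_ELEMENTS = ".//*[local-name()='t']";

    // Returns the text of each list paragraph found in the given document.xml stream.
    static List<String> listItems(InputStream documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(documentXml);
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList paragraphs =
                (NodeList) xpath.evaluate(LIST_PARAGRAPHS, doc, XPathConstants.NODESET);
            List<String> items = new ArrayList<>();
            for (int i = 0; i < paragraphs.getLength(); i++) {
                NodeList texts = (NodeList) xpath.evaluate(
                    TEXT_ELEMENTS, paragraphs.item(i), XPathConstants.NODESET);
                // Join all text elements: one visible item may be split across runs.
                StringBuilder sb = new StringBuilder();
                for (int j = 0; j < texts.getLength(); j++) {
                    sb.append(texts.item(j).getTextContent());
                }
                items.add(sb.toString());
            }
            return items;
        } catch (Exception e) {
            throw new IllegalStateException("could not read document XML", e);
        }
    }

    // Opens a .docx (a zip archive) and reads word/document.xml from it.
    static List<String> listItemsFromDocx(String path) {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalStateException("no word/document.xml in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return listItems(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: DocxListItems <file.docx>");
            return;
        }
        listItemsFromDocx(args[0]).forEach(System.out::println);
    }
}
```

The XPath expressions match on local-name() so that no namespace context has to be registered; with a library that binds the w: prefix for you, an expression such as //w:p[w:pPr/w:numPr] would express the same idea.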

Re: [WORD] - extract information

Posted: Wed Feb 04, 2015 11:11 pm
by marcel_cap
Thank you,

Is it possible to extract text with a regular expression?

Re: [WORD] - extract information

Posted: Thu Feb 05, 2015 11:35 am
by jason
XPath 2.0 has support for regex, so you could try a suitable XPath 2.0 implementation.

Or you could traverse into paragraphs, extract text, and run your regex on that.
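
The second suggestion (traverse paragraphs, extract text, run the regex) could look roughly like the sketch below. It uses only the JDK rather than docx4j's own traversal classes, and it assumes the body text lives in word/document.xml. The class name, method names and the sample pattern are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, JDK only (not the docx4j API).
class DocxRegexExtract {
    // WordprocessingML main namespace.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one string per w:p, built from all w:t elements inside it.
    static List<String> paragraphTexts(InputStream documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(documentXml);
            NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);
                NodeList texts = p.getElementsByTagNameNS(W_NS, "t");
                StringBuilder sb = new StringBuilder();
                for (int j = 0; j < texts.getLength(); j++) {
                    sb.append(texts.item(j).getTextContent());
                }
                result.add(sb.toString());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not read document XML", e);
        }
    }

    // Runs the pattern over each paragraph and collects every match.
    static List<String> findAll(List<String> paragraphs, Pattern pattern) {
        List<String> hits = new ArrayList<>();
        for (String text : paragraphs) {
            Matcher m = pattern.matcher(text);
            while (m.find()) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: DocxRegexExtract <file.docx> <regex>");
            return;
        }
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                System.out.println("no word/document.xml in " + args[0]);
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                List<String> paragraphs = paragraphTexts(in);
                findAll(paragraphs, Pattern.compile(args[1]))
                    .forEach(System.out::println);
            }
        }
    }
}
```

The text is joined per paragraph before matching because Word often splits what looks like one word across several runs, so a regex applied run by run could miss matches that span a run boundary.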