
HTML conversion including page breaks

Posted: Fri Aug 03, 2012 3:39 am
by shukii
Hi there!

Say I have some HTML that I'm parsing with docx4j and writing to the output stream in .docx format:

.....
<p>Blah blah blah</p>
<br/>
<p>Blah blah blah</p>
....

How do I get docx4j to treat the br tag as a page break?

Thanks

Re: HTML conversion including page breaks

Posted: Sat Aug 04, 2012 9:57 am
by jason
There are 2 steps to this:

(1) knowing whether/how FlyingSaucer does it (i.e. what CSS FlyingSaucer needs in order to create a page break in its PDF output)
(2) adding support to docx4j's XHTMLImporter

Regarding (1), I think the CSS is just page-break-before or page-break-after.

So docx4j's XHTMLImporter needs to support page-break-before or page-break-after, which should be doable. I'm not sure whether FlyingSaucer will expose that on a br element, a div, or a p, but that would become clear during implementation.
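
In the meantime, one way to prepare the input is to rewrite each br tag into an element that carries the page-break CSS before the markup goes to the importer. The following is only a rough sketch using plain Java string handling. PageBreakMarkup and brToPageBreak are hypothetical names, not part of docx4j, and the choice of an empty div with page-break-after is an assumption: it only has an effect once XHTMLImporter actually supports that property.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of docx4j or FlyingSaucer.
final class PageBreakMarkup {

    // Matches <br>, <br/> and <br /> in any letter case.
    private static final Pattern BR =
            Pattern.compile("<br\\s*/?>", Pattern.CASE_INSENSITIVE);

    // Assumption: the importer turns this CSS property into a Word page break.
    private static final String PAGE_BREAK =
            "<div style=\"page-break-after: always\"></div>";

    // Returns the markup with every br tag replaced by the page-break div.
    static String brToPageBreak(String xhtml) {
        return BR.matcher(xhtml).replaceAll(Matcher.quoteReplacement(PAGE_BREAK));
    }
}
```

Calling PageBreakMarkup.brToPageBreak(html) on the snippet from the first post would leave the two paragraphs untouched and put the div between them. Note that this replaces every br, including ones meant as ordinary line breaks, so a dedicated class or attribute on the break you care about may be a safer marker.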