Docx4j xhtml to pdf retain styles/css

Posted: Mon Mar 31, 2014 9:04 pm
by praveen065
Hi Jason,
I am writing a Java application with the functionality below, using docx4j.

Input: XHTML div content + .css file
Output: PDF or Word file

I was able to convert the XHTML to PDF successfully, but could not retain the styles.
It works only when the styles are given inline.

e.g.,
Working
Code:
<div style="color:red;margin-left:20px;">Text 1</div>


Not working
Code:
<div class="head-red">Text 1</div>


Is there any way to read the .css file and apply the styles to the XHTML content?
Thanks in advance.

Regards,
Praveen_J

Re: Docx4j xhtml to pdf retain styles/css

Posted: Mon Mar 31, 2014 9:34 pm
by jason
Hi Praveen,

Please see xhtml-import-f28/xhtmlimporter-and-headings-t1823.html#p6235

Does that help?

cheers .. Jason

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 7:55 pm
by praveen065
Hi Jason,
I do not want to add styles one by one with calls like
Code:
tt.addAttributeTransformation("class", "Heading1");
because I have hundreds of styles in my CSS file.
Could you help me load the styles from a .css file?

I tried
Code:
XHTMLImporter.setRunFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
as well as adding the styles to XhtmlNamespaceHandler.css inside docx4j-ImportXHTML-nightly-20140121.jar.
Neither had any effect on the generated PDF.



Actual code:

Code:
         // Read the URL-encoded HTML fragment from the request body
         InputStream is = request.getInputStream();
         ByteArrayOutputStream os = new ByteArrayOutputStream();
         byte[] buf = new byte[4096];
         int r;
         while ((r = is.read(buf)) >= 0) {
            os.write(buf, 0, r);
         }
         String s = new String(os.toByteArray(), "UTF-8");
         String decoded = URLDecoder.decode(s, "UTF-8");

         // Create the document and give it a numbering part (needed for lists)
         wordMLPackage = WordprocessingMLPackage.createPackage();
         NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
         wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
         ndp.unmarshalDefaultNumbering();

         // Import the XHTML; baseURL is used to resolve relative references
         XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
         XHTMLImporter.setHyperlinkStyle("Hyperlink");

         wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(decoded, baseURL));

         // Write the PDF to the response
         FOSettings foSettings = Docx4J.createFOSettings();
         foSettings.setWmlPackage(wordMLPackage);
         // Set the content type before writing; it may be ignored once the response is committed
         response.setContentType("application/pdf");
         // Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
         Docx4J.toFO(foSettings, response.getOutputStream(), Docx4J.FLAG_NONE);

Regards,
Praveen_J

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 8:30 pm
by jason
Your code contains

Code:
wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(decoded, baseURL));


Have you set baseURL appropriately?

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 8:35 pm
by praveen065
Yes:

String baseURL = "C:\\Users\\Praveen\\Desktop";
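
A note on the baseURL above: relative references such as a stylesheet href are normally resolved against a URL, and a bare Windows path like the one shown is not a URL. Whether that was the cause of the problem here is not established in this thread. The sketch below only shows one way to derive a file: URL string from a directory using the standard library; the class name BaseUrlSketch and the method toBaseUrl are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class BaseUrlSketch {

    /** Turns a directory path into a file: URL string that ends with a slash. */
    static String toBaseUrl(Path directory) {
        String url = directory.toAbsolutePath().normalize().toUri().toString();
        // A trailing slash makes relative names resolve inside the directory
        return url.endsWith("/") ? url : url + "/";
    }

    public static void main(String[] args) {
        // Hypothetical: pass the directory that holds the CSS file as the first argument
        Path cssDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(toBaseUrl(cssDir));
    }
}
```

The resulting string could then be passed as the baseURL argument in place of the plain path.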

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 9:49 pm
by jason
And what does your reference to the CSS file in your XHTML look like?

If that is in order, you might need to run your code in a debugger to see what is happening.

Have you got debug level logging enabled on org.docx4j.org.xhtmlrenderer? That ought to give you some clues...

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 10:35 pm
by praveen065
Jason,
While debugging, I could see that styles are loaded only from docx4j-3.0.1.jar\org\docx4j\openpackaging\parts\WordprocessingML\styles.xml.


The input in the request is XHTML like this:

Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
      <title>HTML Generator Sample Page</title>
   </head>
   <body>
      <div class="test-class">Test Text</div>
   </body>
</html>


I want the class styles to be picked up from a separate CSS file.

test.css
Code:
/* a class selector needs a leading dot */
.test-class {
font-style: italic;
}



Could you send your mobile number to my email address, "praveen_spiceboy@yahoo.co.in", so that I can call and explain exactly what is happening?

Regards,
Praveen_J

Re: Docx4j xhtml to pdf retain styles/css

Posted: Tue Apr 01, 2014 11:37 pm
by jason
You have 3 options (short of modifying the source code, which of course you can also do):

Option 1: include a link to the external CSS file in the XHTML in the usual way (I had assumed so far that you were doing this!)

Option 2: set up your docx with the styles you want (with suitable names), and use FormattingOption.CLASS_TO_STYLE_ONLY

Option 3: add the style to XhtmlNamespaceHandler.css (that should've worked...)
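
A minimal sketch of Option 1, assuming the request only carries the div fragment and the page has to be assembled around it. The class name LinkedStylesheetSketch and the method wrapWithStylesheet are hypothetical, and the href is assumed to contain no characters that need XML escaping. The relative href would be resolved against whatever baseURL is passed to the importer.

```java
final class LinkedStylesheetSketch {

    /** Wraps a body fragment in an XHTML page that links to an external stylesheet. */
    static String wrapWithStylesheet(String bodyFragment, String cssHref) {
        return "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
             + "<head><title>Print</title>"
             // The link element must be self-closed for the page to be well-formed XML
             + "<link rel=\"stylesheet\" type=\"text/css\" href=\"" + cssHref + "\"/>"
             + "</head><body>" + bodyFragment + "</body></html>";
    }

    public static void main(String[] args) {
        String div = "<div class=\"test-class\">Test Text</div>";
        System.out.println(wrapWithStylesheet(div, "test.css"));
    }
}
```

The returned string is what would be handed to the convert call shown earlier in the thread, in place of the bare fragment.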

Re: Docx4j xhtml to pdf retain styles/css

Posted: Thu Apr 03, 2014 8:43 pm
by praveen065
Hi Jason,
Thanks for your suggestions.
I made it work by building the whole HTML page as shown below.

Code:
      String header1 = "<html><head><style>";
      String cssStyles = FileUtils.readFileToString( new File("C:\\Workspace_Rad\\PrintDocument\\WebContent\\css\\test.css"));
      String header2 = "</style></head><body>";   
      String divPart = request.getParameter("data"); // Contains only the div to print, sent via AJAX
      String footer = "</body></html>";

      // Actual markup given to docx4j to convert
      String toConvertPDF =  header1 + cssStyles + header2 + divPart + footer;



But after converting, I can see that many classes are not applied.
Could you explain whether the XHTML Importer has any limitations with respect to XHTML/CSS versions?
Please refer to the attachment for details.

Regards,
Praveen_J

Re: Docx4j xhtml to pdf retain styles/css

Posted: Thu Apr 03, 2014 9:40 pm
by jason
As you'd be aware, you're actually doing 2 steps:

1. XHTML to docx
2. docx to PDF

To understand where any discrepancies are creeping in, you should analyse the 2 steps separately.

At present, we don't have comprehensive documentation as to what is and isn't supported.
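
One rough way to look at step 1 on its own, assuming the intermediate package has first been written to disk (for example with wordMLPackage.save(...)): a .docx is a zip archive, so word/document.xml can be read with the standard library and searched for the formatting you expected. The sketch below is not docx4j API; DocxPeekSketch, readEntry and zipOf are hypothetical names, and the in-memory zip exists only so the example runs without a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxPeekSketch {

    /** Returns the text of one entry in a zip (a .docx is a zip), or null if it is absent. */
    static String readEntry(byte[] zipBytes, String entryName) {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                if (e.getName().equals(entryName)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    for (int n = zin.read(buf); n >= 0; n = zin.read(buf)) {
                        out.write(buf, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Builds a one-entry zip in memory; used only to demonstrate readEntry. */
    static byte[] zipOf(String entryName, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content.getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a saved .docx as the first argument, or run without one for the demo zip
        byte[] docx = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : zipOf("word/document.xml", "<w:document><w:rPr><w:i/></w:rPr></w:document>");
        String xml = readEntry(docx, "word/document.xml");
        // Crude check: italic applied through a named style would not show up this way
        System.out.println("italic run property present: " + (xml != null && xml.contains("<w:i/>")));
    }
}
```

If the expected formatting is already missing from the docx, the discrepancy is in the XHTML import; if it is present there but missing in the PDF, it is in the PDF export.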