Core Properties from DOCX to PDF

Posted: Sat Oct 04, 2014 4:54 am
by MikeH
Hello,

I've seen the different posts on Core Properties and how to create them. I have been successful in creating Core Properties (Author (dc:creator), Subject (dc:description) and Title (dc:title)). These are in my docx file after merging xml with a template. (I set the Core Properties with values found from xml).

The problem I'm having is that I can't get these Core Properties to show up in my PDF.

I set up the FO exporter and use Docx4J.toFO to generate the PDF. But I don't see which setting, if any, would carry the Core Properties over as well. (If such a setting/capability even exists.)

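// Optionally dump the intermediate XSL-FO next to the PDF for inspection
boolean saveFO = true;
FOSettings foSettings = Docx4J.createFOSettings();
if (saveFO) {
    foSettings.setFoDumpFile(new java.io.File(outpdf + ".fo"));
}
foSettings.setWmlPackage(wordMLPackage);

// outpdf is the path of the PDF to write
java.io.OutputStream os = new java.io.FileOutputStream(outpdf);

// Convert via XSL-FO (FOP) and write the PDF to os
Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);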

I realize this is probably a FO question as opposed to a Docx4J question, but I'll ask here anyways. Has anyone had Core Properties brought over into a PDF from a DOCX using Docx4J? If so, do you have any sample code anywhere? Or something that can point me in the right direction?

Thank you for any assistance

Mike

Re: Core Properties from DOCX to PDF

Posted: Sat Oct 04, 2014 8:25 am
by jason
Good question.

Per http://xmlgraphics.apache.org/fop/1.1/metadata.html

Originally, you could set some metadata information through FOP's FOUserAgent by using its set*() methods (like setTitle(String) or setAuthor(String)). These values are directly used to set values in the PDF Info object. Since PDF 1.4, adding metadata as an XMP document to a PDF is possible. That means that there are now two mechanisms in PDF that hold metadata.

Apache FOP now synchronizes the Info and the Metadata object in PDF, i.e. when you set the title and the author through the FOUserAgent, the two values will end up in the (old) Info object and in the new Metadata object as XMP content. If instead of FOUserAgent, you embed XMP metadata in the XSL-FO document .. the XMP metadata will be used as-is in the PDF Metadata object and some values from the XMP metadata will be copied to the Info object to maintain backwards-compatibility for PDF readers that don't support XMP metadata.


So what you want is quite do-able, but docx4j will need a small enhancement to do it.
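
To make the second mechanism from that FOP page concrete, here is a minimal sketch of the XMP fragment that would have to end up in the XSL-FO. It only builds the fo:declarations block as a string, using plain Java. The class and method names (XmpFragment, build, escape) are made up for illustration and are not part of docx4j or FOP. The element layout follows the example on the FOP metadata page, where dc:title, dc:creator and dc:description carry the title, author and subject. How this block gets into the FO that docx4j generates is exactly the enhancement mentioned above, so that part is not shown.

```java
public class XmpFragment {

    /** Escapes the five XML special characters so a value is safe as element text. */
    static String escape(String s) {
        return s.replace("&", "&amp;")   // must run first, or later entities get double-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /**
     * Builds an fo:declarations block holding an XMP packet with Dublin Core values.
     * Assumption: the "fo" prefix is already bound on the enclosing fo:root element.
     */
    static String build(String title, String author, String subject) {
        StringBuilder sb = new StringBuilder();
        sb.append("<fo:declarations>\n");
        sb.append("  <x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n");
        sb.append("    <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n");
        sb.append("      <rdf:Description rdf:about=\"\"\n");
        sb.append("          xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n");
        sb.append("        <dc:title>").append(escape(title)).append("</dc:title>\n");
        sb.append("        <dc:creator>").append(escape(author)).append("</dc:creator>\n");
        sb.append("        <dc:description>").append(escape(subject)).append("</dc:description>\n");
        sb.append("      </rdf:Description>\n");
        sb.append("    </rdf:RDF>\n");
        sb.append("  </x:xmpmeta>\n");
        sb.append("</fo:declarations>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values; in practice these would come from the docx Core Properties.
        System.out.print(build("Quarterly Report", "MikeH", "Sales & Marketing"));
    }
}
```

In XSL-FO, fo:declarations sits inside fo:root after fo:layout-master-set and before the first fo:page-sequence, so that is where the generated block would need to be placed.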