HTML Conversion Problem

Posted: Wed Jan 07, 2015 2:53 am
by chris_2
Hi.

I have been running into problems with the docx4j HTML conversion of a docx.

The main problem seems to be that the docx is generated by a tool. If I open the document in Word, it opens in compatibility mode. Here is the output of the conversion routine:

Code:
Information: No MOXy JAXB config found; assume not intended..
Warnung: name: com.sun.xml.internal.bind.namespacePrefixMapper value: org.docx4j.jaxb.NamespacePrefixMapperSunInternal@a10a231 .. trying RI.
Information: Using NamespacePrefixMapper, which is suitable for the JAXB RI
Information: Using JAXB Reference Implementation
Information: Not using MOXy; using com.sun.xml.bind.v2.runtime.JAXBContextImpl
Warnung: Couldn't get resource: docx4j.properties
Warnung: Couldn't find/read docx4j.properties; docx4j.properties not found via classloader.
Information: Using com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
Information: Using com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
Information: Detected WordProcessingML package
Information: Instantiated package of type org.docx4j.openpackaging.packages.WordprocessingMLPackage
Information: xpath implementation: org.apache.xpath.jaxp.XPathFactoryImpl
Schwerwiegend: No subclass found for /word/media/image1.png; defaulting to binary
Schwerwiegend: No subclass found for /word/media/image2.png; defaulting to binary
Information: package read;  elapsed time: 4845 ms
Information: Lazily unmarshalling /word/document.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart, unmarshall via binder
Information: Lazily unmarshalling /word/styles.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart, unmarshall via binder
Information: ---------------------------------------------------------------> HTML-Package
Information: ---------------------------------------------------------------> WML-Package
Information: ---------------------------------------------------------------> CONVERT
Information: Lazily unmarshalling /docProps/app.xml
Information: unmarshalling org.docx4j.openpackaging.parts.DocPropsExtendedPart
Information: Lazily unmarshalling /docProps/core.xml
Information: unmarshalling org.docx4j.openpackaging.parts.DocPropsCorePart
Information: Lazily unmarshalling /word/endnotes.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.EndnotesPart, unmarshall via binder
Information: Lazily unmarshalling /word/footnotes.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.FootnotesPart, unmarshall via binder
Information: Lazily unmarshalling /word/numbering.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.NumberingDefinitionsPart, unmarshall via binder
Information: Lazily unmarshalling /word/theme/theme1.xml
Information: For org.docx4j.openpackaging.parts.ThemePart, unmarshall via binder
Information: Lazily unmarshalling /word/webSettings.xml
Information: Lazily unmarshalling /word/fontTable.xml
Information: Lazily unmarshalling /word/settings.xml
Information: For org.docx4j.openpackaging.parts.WordprocessingML.DocumentSettingsPart, unmarshall via binder
Warnung: Aborting: file:/C:/Windows/FONTS/ALGER.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/arimon__.ttf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/BAUHS93.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/BERNHC.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/BROADW.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/CHILLER.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/Gabriola.ttf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/HARLOWSI.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/HARNGTON.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/HATTEN.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/impact.ttf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/JOKERMAN.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/JUICE___.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/MSJH.TTC (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/MSJHBD.TTC (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/MSYH.TTC (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/MSYHBD.TTC (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/NewsGothicStd-Bold.otf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/NewsGothicStd-BoldOblique.otf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/NewsGothicStd-Oblique.otf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/NewsGothicStd.otf (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/PLAYBILL.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/SNAP____.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/STENCIL.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Warnung: Aborting: file:/C:/Windows/FONTS/TEMPSITC.TTF (can't get EmbedFontInfo[] .. try deleting fop-fonts.cache?)
Information: fontsInUse..
Information: Style with name Normal, id 'Normal' is default paragraph style
Information: Set virtual style, id 'DocDefaults', name 'DocDefaults'
Information: Style with name Default Paragraph Font, id 'Fuentedeprrafopredeter' is default character style
Information: fontsInUse..
Information: Found existing style named DocDefaults
Information: Style with name Normal, id 'Normal' is default paragraph style
Information: Style with name Default Paragraph Font, id 'Fuentedeprrafopredeter' is default character style
Information: starting
Information: Style with name Normal Table, id 'Tablanormal' is default table style
Information: giving TableStyleFontSizeAndJustification primacy, as per this docx w:compatSetting
Warnung: Expected Normal-Tablaconcuadrcula-BR to have <w:basedOn ??
Information: Preparing StyleTree
Information: Outputting well-formed XHTML..
Information: /pkg:package
Warnung: TODO - implement for CTTblStylePr!
Warnung: TODO - implement for CTTblStylePr!
Warnung: ! null rPr for character style Fuentedeprrafopredeter


I can provide the docx via private message if necessary. It needs to be converted online, which is why we decided to use docx4j. The document is converted to HTML and displayed in a browser without loss of information, but it does not look good. If I open the document in Word, save it, and convert it with the same code, it looks perfect. If someone could give me a hint about what may be wrong, it would really help. I can change a lot, but I need to know what to change.

Thanks a lot.

ChriS
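
Editor's note on reading the log above: the line prefixes are the German names of the java.util.logging levels (Information = INFO, Warnung = WARNING, Schwerwiegend = SEVERE). To see at a glance how many entries of each severity a run produced, something like the following could be used. It is only a sketch using the JDK; the class name LogLevelCount and its methods are made up for illustration and are not part of docx4j.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LogLevelCount {

    // German java.util.logging level names as they appear in the log, mapped to the English ones.
    static final Map<String, String> LEVELS = new LinkedHashMap<>();
    static {
        LEVELS.put("Schwerwiegend", "SEVERE");
        LEVELS.put("Warnung", "WARNING");
        LEVELS.put("Information", "INFO");
    }

    /** Counts log lines per English level name; lines with an unknown prefix are ignored. */
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "Level: message" line
            }
            String level = LEVELS.get(line.substring(0, colon).trim());
            if (level != null) {
                counts.merge(level, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three lines taken from the log in the first post.
        List<String> sample = Arrays.asList(
                "Information: Detected WordProcessingML package",
                "Schwerwiegend: No subclass found for /word/media/image1.png; defaulting to binary",
                "Warnung: TODO - implement for CTTblStylePr!");
        System.out.println(count(sample));
    }
}
```

Feeding the full log through count() separates the few SEVERE and WARNING entries from the routine INFO output.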

Re: HTML Conversion Problem

Posted: Wed Jan 07, 2015 7:24 am
by jason
chris_2 wrote: but does not look good


Since you do not provide any detail as to what is wrong, it is not possible to make any suggestions.

Assuming no more than say 4 to 5 issues, the best approach would be to isolate each issue in a separate test doc, and make a new thread for each (including the test case docx).

Re: HTML Conversion Problem

Posted: Wed Jan 07, 2015 7:53 pm
by chris_2
Hi Jason.

Thanks for the reply. I know the information I gave was poor. I will try to modify one document to remove the sensitive information and provide it to the forum.

ChriS

Re: HTML Conversion Problem

Posted: Wed Jan 07, 2015 9:08 pm
by chris_2
OK, please find attached a sample docx from our generator.

This is how it should look (the sample was saved in Word and then processed with docx4j.toHtml()):
Gut.jpg (163.27 KiB)


And this is how the docx from the generator looks:
Schlecht.jpg (155.41 KiB)


ChriS

Re: HTML Conversion Problem

Posted: Thu Jan 08, 2015 2:51 am
by chris_2
Hi Jason.

So far I have solved most of the problems. The major issues were in the table grid settings of the docx (it seems Word fixes them in compatibility mode).

I have just one item left: all the table content is displayed vertically centered in the HTML produced by docx4j. Do I have to change something in the table styles to have the columns displayed top-aligned?

Thanks for the help.

ChriS
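
Editor's note for readers with the same symptom: in WordprocessingML, the vertical alignment of a cell's content is carried by a w:vAlign element inside w:tcPr (values such as top, center, bottom). The thread does not record which change resolved this, so the following is only one possibility, not the poster's actual fix. It is a sketch using the JDK's DOM API on the XML of /word/document.xml; the class name CellVAlign is made up, and it assumes the generator can post-process the XML before packaging.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class CellVAlign {

    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses an XML string with namespace awareness switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /**
     * Sets w:vAlign to "top" in every w:tcPr and returns how many were touched.
     * Cells without a w:tcPr are left alone. A missing w:vAlign is appended at the
     * end of w:tcPr; strict schema order may require a different position
     * if later siblings are present.
     */
    static int topAlign(Document doc) {
        NodeList tcPrs = doc.getElementsByTagNameNS(W, "tcPr");
        int touched = 0;
        for (int i = 0; i < tcPrs.getLength(); i++) {
            Element tcPr = (Element) tcPrs.item(i);
            NodeList existing = tcPr.getElementsByTagNameNS(W, "vAlign");
            Element vAlign;
            if (existing.getLength() > 0) {
                vAlign = (Element) existing.item(0);
            } else {
                vAlign = doc.createElementNS(W, "w:vAlign");
                tcPr.appendChild(vAlign);
            }
            vAlign.setAttributeNS(W, "w:val", "top");
            touched++;
        }
        return touched;
    }

    public static void main(String[] args) {
        String xml = "<w:tbl xmlns:w=\"" + W + "\"><w:tr><w:tc><w:tcPr>"
                + "<w:vAlign w:val=\"center\"/></w:tcPr></w:tc></w:tr></w:tbl>";
        Document doc = parse(xml);
        System.out.println("cells updated: " + topAlign(doc));
    }
}
```

Whether docx4j's HTML output then renders the cells top-aligned is not confirmed by this thread; comparing the generator's document.xml with the copy re-saved by Word is the more reliable way to find the difference.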

Re: HTML Conversion Problem

Posted: Thu Jan 08, 2015 3:17 am
by chris_2
Ha, got it. Everything looks fine now. It took a while to find out where to look, but now it is OK.

But maybe you could give me some information on some of the messages from my initial post. I am pretty sure it would be good to know what they mean:

Code:
Information: No MOXy JAXB config found; assume not intended..
Warnung: name: com.sun.xml.internal.bind.namespacePrefixMapper value: org.docx4j.jaxb.NamespacePrefixMapperSunInternal@a10a231 .. trying RI.
Information: Using NamespacePrefixMapper, which is suitable for the JAXB RI
Information: Using JAXB Reference Implementation
Information: Not using MOXy; using com.sun.xml.bind.v2.runtime.JAXBContextImpl


From other posts I gathered that it is desirable to use MOXy. Is that true, and if so, what is missing in order to use MOXy?

Code:
Warnung: Couldn't get resource: docx4j.properties
Warnung: Couldn't find/read docx4j.properties; docx4j.properties not found via classloader.


This sounds weird, but I am not sure whether it is a problem. Any comments?

Thanks again.

ChriS

Re: HTML Conversion Problem

Posted: Thu Jan 08, 2015 7:37 am
by jason
MOXy: only switch to it if it makes more sense in your environment. See docx-java-f6/moxy-t1242.html

docx4j.properties: get it from https://github.com/plutext/docx4j/tree/ ... _resources
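
Editor's note: the warning in the log says the file was "not found via classloader", so once docx4j.properties has been added to the project, a quick way to confirm it is visible is to look it up as a classpath resource. This is a minimal sketch using only the JDK. The class name ClasspathCheck is made up, and using the thread context classloader is an assumption; docx4j's own lookup may differ.

```java
import java.net.URL;

final class ClasspathCheck {

    /** Returns the URL of the named classpath resource, or null if it is not visible. */
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("docx4j.properties");
        System.out.println(url == null
                ? "docx4j.properties is NOT visible on the classpath"
                : "docx4j.properties found at " + url);
    }
}
```

Run it with the same classpath as the conversion code. If it reports the file as not visible, the file is probably not in a classpath root (for example the resources or classes directory).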