Embedded bold or italic font problem

Posted: Sat Nov 21, 2020 3:15 am
by grieu2
Hi,

I have a problem with embedded fonts when the text contains bold or italic elements (the embedded files are large, so I shared them via a file transfer: https://www.transfernow.net/aNsfED112020).
It works perfectly when the document has no bold or italic elements (see helvetica.docx => helvetica.pdf).

But when a bold or italic style is applied (see helvetica-bold.docx), I get this error (see stack.txt):

2020-11-20 15:35:19 ERROR LazyFont:130 - Failed to read font file file:/home/dev/.docx4all/temporary%20embedded%20fonts/1605882916587-Helvetica-bold.ttf
java.io.FileNotFoundException: /home/dev/.docx4all/temporary embedded fonts/1605882916587-Helvetica-bold.ttf
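
The URL in the first line and the path in the second refer to the same file; `%20` is only the encoded space. A minimal sketch to confirm that, and to check whether the file is still on disk at the moment the converter looks for it. The helper names are hypothetical and not part of docx4j:

```kotlin
import java.io.File
import java.net.URI

// Convert a file: URL as printed in the log into a local File.
// File(URI) decodes percent-escapes, so %20 becomes a space.
fun fileFromLogUrl(url: String): File = File(URI(url))

// Report whether the font file is present. This only inspects the file system;
// it does not call into docx4j (hypothetical helper for diagnosis).
fun describeFontFile(url: String): String {
    val file = fileFromLogUrl(url)
    return if (file.isFile) "present: ${file.path} (${file.length()} bytes)"
    else "missing: ${file.path}"
}
```

Calling `describeFontFile(...)` with the URL from the ERROR line right after the conversion fails would show whether the file was never written or was removed again.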

I can see in the logs that the embedded fonts are listed correctly (see logs.txt):
2020-11-20 15:35:16 INFO FontTablePart:382 - Writing temp embedded fonts 1605882916587
2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Calibri' with fontkey: {01AE7680-03BB-4265-9983-001F0322C3BA}
2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Calibri-bold' with fontkey: {70B9A714-370E-4A7B-99A5-82A2670355A1}
2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Helvetica' with fontkey: {19EF9272-79AE-455E-9F01-9700593173EE}
2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Helvetica-bold' with fontkey: {2CD2AF45-17C0-4228-962B-24EDCDF946C0}
2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Calibri Light' with fontkey: {DEB2FBB5-EF76-48A5-B06B-9DE3C816CA5F}

I upgraded to version 11.2.5 yesterday (it failed before the upgrade and still fails after it :( ):
implementation("org.docx4j:docx4j-JAXB-ReferenceImpl:11.2.5")
implementation("org.docx4j:docx4j-export-fo:11.2.5")

and my code is very simple, with the default configuration:
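import java.io.FileInputStream
import java.io.FileOutputStream
import org.docx4j.Docx4J
import org.docx4j.openpackaging.packages.WordprocessingMLPackage

// Load the docx, then export it through the XSL FO pipeline.
val wordMLPackage: WordprocessingMLPackage = WordprocessingMLPackage.load(FileInputStream(inputFileName))

val foSettings = Docx4J.createFOSettings()
foSettings.opcPackage = wordMLPackage

// use {} closes the output stream even if the export throws.
FileOutputStream(outputFileName).use { os ->
    Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL)
}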

Is there any configuration that fixes this problem?

Thank you for any help.

Re: Embedded bold or italic font problem

Posted: Thu Nov 26, 2020 6:19 am
by jason
Works for me.

"LazyFont:130 - Failed to read font file" doesn't correspond to anything in docx4j-core/src/main/java/org/docx4j/fonts/fop/fonts/LazyFont.java

So maybe this is your clue. What jars do you have on your classpath?

Re: Embedded bold or italic font problem

Posted: Thu Dec 03, 2020 1:44 am
by iceman
I can confirm, we are seeing the same problem.

Everything works as long as we don't have bold or italic fonts. With bold and italic fonts, it crashes with the error grieu2 described. The problem is that the font files get deleted: they are extracted three times, but when the conversion looks for the bold file, the font files are already gone (at least those of the third extraction set). And because the timestamp is part of the filename, it will not use the first two extraction sets.

I think it is https://github.com/plutext/docx4j/issues/289
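
To see which extraction sets are still on disk, the temp files can be grouped by their timestamp prefix. This is only a sketch: the `<timestamp>-<font name>.ttf` pattern and the directory location are assumptions taken from the log lines in this thread, and the function names are hypothetical:

```kotlin
import java.io.File

// Split "1605882916587-Helvetica-bold.ttf" into ("1605882916587", "Helvetica-bold.ttf").
// Returns null for names that do not follow the <digits>-<name> pattern seen in the logs.
fun splitTempFontName(fileName: String): Pair<String, String>? {
    val dash = fileName.indexOf('-')
    if (dash <= 0 || dash == fileName.length - 1) return null
    val prefix = fileName.substring(0, dash)
    if (!prefix.all { it.isDigit() }) return null
    return prefix to fileName.substring(dash + 1)
}

// Group file names by timestamp prefix: one map entry per extraction run.
fun groupByExtraction(fileNames: List<String>): Map<String, List<String>> =
    fileNames.mapNotNull { splitTempFontName(it) }
        .groupBy({ it.first }, { it.second })
        .toSortedMap()

// List the temp directory. The default location is assumed from the log output.
fun listTempFontFiles(
    dir: File = File(System.getProperty("user.home"), ".docx4all/temporary embedded fonts")
): List<String> = dir.list()?.toList() ?: emptyList()
```

Printing `groupByExtraction(listTempFontFiles())` just before and just after the failing export should show whether the set with the timestamp from the error message is the one that disappeared.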

Re: Embedded bold or italic font problem

Posted: Sat Dec 05, 2020 11:23 am
by jason
grieu2 wrote: 2020-11-20 15:35:16 INFO ObfuscatedFontPart:382 - deObfuscating 'Helvetica-bold' with fontkey: {2CD2AF45-17C0-4228-962B-24EDCDF946C0}


Using your helvetica-bold.docx, with debug on, I see:

10:21:39.474 [main] INFO  o.d.o.p.W.ObfuscatedFontPart 89 - deObfuscating 'Helvetica-bold' with fontkey: {BF1EEA01-E85E-4B1B-A453-C0C2D38A72F8}
10:21:39.474 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 98 - BF1EEA01-E85E-4B1B-A453-C0C2D38A72F8
10:21:39.474 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 102 - BF1EEA01E85E4B1BA453C0C2D38A72F8
10:21:39.477 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 130 - wrote: 980756
10:21:39.477 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 147 - Loading from: /home/jharrop/.docx4all/temporary embedded fonts/1607124099385-Helvetica-bold.ttf
10:21:39.480 [main] INFO  o.d.o.p.W.ObfuscatedFontPart 157 - Successfully reloaded Arial-BoldMT
10:21:39.480 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 159 - confirmed embeddable
10:21:39.483 [main] DEBUG o.d.o.p.W.ObfuscatedFontPart 87 - bytes: 1393188


In other words, 1607124099385-Helvetica-bold.ttf actually contains Arial-BoldMT!

This is "normal" behaviour for Office on Windows, caused by a Windows font substitution: see https://office-watch.com/2014/windows-s ... helvetica/ and, in the registry, Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontSubstitutes
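
The name stored inside a font file can also be checked outside docx4j. A minimal sketch using the JDK's `java.awt.Font`; the `FontNames` class and `readFontNames` function are hypothetical helpers, and this is not how docx4j itself reads the font:

```kotlin
import java.awt.Font
import java.io.File

// Names recorded inside the font file itself, independent of the file name.
data class FontNames(val family: String, val fullName: String, val postScriptName: String)

// Load a TrueType file and return the names it declares.
// Throws FontFormatException or IOException if the file is not a readable font.
fun readFontNames(ttf: File): FontNames {
    val font = Font.createFont(Font.TRUETYPE_FONT, ttf)
    return FontNames(font.family, font.fontName, font.getPSName())
}
```

Pointing `readFontNames` at one of the extracted `...-Helvetica-bold.ttf` files should reveal whether it holds a substituted font, as in the log above.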

Note also that there is a difference between:

      <w:r>
        <w:rPr>
          <w:rFonts w:ascii="Helvetica"/>
          <w:b/>
        </w:rPr>
        <w:t>Helvetica b</w:t>
      </w:r>
 


and

      <w:r>
        <w:rPr>
          <w:rFonts w:ascii="Helvetica Bold" />
        </w:rPr>
        <w:t>Helvetica bold</w:t>
      </w:r>
 


Word usually writes the former.
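
To tell the two forms apart in a given document, a run can be inspected with the JDK's DOM parser. This is a rough sketch with hypothetical names (`RunFont`, `describeRun`); it assumes the fragment declares the `w` namespace, which the excerpts above omit, and it looks only at `w:ascii` and `w:b`:

```kotlin
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

const val W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

// What a single run asks for: the ASCII font name and whether bold is switched on.
data class RunFont(val asciiFont: String?, val bold: Boolean)

fun describeRun(runXml: String): RunFont {
    val factory = DocumentBuilderFactory.newInstance().apply { isNamespaceAware = true }
    val doc = factory.newDocumentBuilder().parse(runXml.byteInputStream())

    // w:rFonts/@w:ascii holds the requested font name, if present.
    val fonts = doc.getElementsByTagNameNS(W_NS, "rFonts").item(0) as? Element
    val ascii = fonts?.getAttributeNS(W_NS, "ascii")?.takeIf { it.isNotEmpty() }

    // <w:b/> turns bold on unless w:val explicitly disables it.
    val b = doc.getElementsByTagNameNS(W_NS, "b").item(0) as? Element
    val bold = b != null && b.getAttributeNS(W_NS, "val") !in setOf("0", "false")

    return RunFont(ascii, bold)
}
```

For the first form this gives the font "Helvetica" with bold set; for the second, the font "Helvetica Bold" with bold unset, so the bold face has to come from the font name alone.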

This isn't an answer to your issue, just some observations along the way...

Re: Embedded bold or italic font problem

Posted: Mon Dec 07, 2020 9:47 am
by jason
Tracking the helvetica bold issue as https://github.com/plutext/docx4j/issues/424

iceman, there will be a new release (8.2.6) today, which should fix your issue.