Issue with MathML

Posted:
Thu Aug 11, 2022 7:41 pm
by Milqn
Hi,
Nice work you have done with Docx4j!
I need to convert HTML containing MathML to docx, but the MathML tags are stripped during conversion.
How could I make this work?
Hope you can help me.
Thanks
Milqn
Re: Issue with MathML

Posted:
Fri Aug 12, 2022 7:50 pm
by jason
There is XSLT for converting OMML to MathML, but maybe not for the converse.
Google reveals Python and npm libraries called mathml2omml (and there may be others); if one of these works well enough on your MathML, you could integrate it into docx4j-ImportXHTML.
Re: Issue with MathML

Posted:
Fri Aug 12, 2022 10:06 pm
by Milqn
Thank you for your answer!
I also thought that my issue could be due to the OMML.
The MathML tags are not preserved, so LibreOffice can't display the math either.
Re: Issue with MathML

Posted:
Tue Aug 16, 2022 7:37 am
by Milqn
I hope you don't mind my asking, but what would be the best way to include MML2OMML?
I have found the MML2OMML.XSL stylesheet.
How would you do it?
Thanks a lot
Re: Issue with MathML

Posted:
Fri Aug 19, 2022 4:41 am
by Milqn
Any assistance would be greatly appreciated.
jason wrote:There is XSLT for converting OMML to MathML, but maybe not for the converse.
Google reveals python and npm libraries called mathml2omml (and there may be others); if one of these work well enough on your MathML, you could integrate them into docx4j-ImportXHTML.
Re: Issue with MathML

Posted:
Tue Aug 23, 2022 9:57 pm
by jason
Here's a sketch of how to do it to help you get started, mostly untested, but based on importing the following sample XHTML:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>MathML in XHTML</title>
  </head>
  <body>
    <p>
      Follows:
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mfrac>
          <mn>1</mn>
          <msqrt>
            <mn>2</mn>
          </msqrt>
        </mfrac>
      </math>
    </p>
  </body>
</html>
In XHTMLImporterImpl, at approx line 1383, add and flesh out the following block of code:
Code:
} else if (e.getNodeName().equals("math")) {
    // handle me
    System.out.println("TODO: Handle mathml \n\r" + XmlUtils.w3CDomNodeToString(e));

    // Prepare to transform Element e
    Templates xslt = null; // your mathml2omml.xslt

    // Use constructor which takes Unmarshaller, rather than JAXBContext,
    // so we can set JaxbValidationEventHandler
    JAXBContext jc = Context.jc;
    Unmarshaller u;
    try {
        u = jc.createUnmarshaller();
        u.setEventHandler(new org.docx4j.jaxb.JaxbValidationEventHandler());
        jakarta.xml.bind.util.JAXBResult result = new jakarta.xml.bind.util.JAXBResult(u);
        XmlUtils.transform(new DOMSource(e), xslt, null, result);

        // What happened?
        Object o = result.getResult();

        // Attach it to the document
        this.contentContextStack.peek().getContent().add(); // or addAll
    } catch (JAXBException e1) {
        e1.printStackTrace();
    }
    return;
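A side note on the element test in the sketch above: getNodeName() returns the qualified name, so equals("math") would miss MathML written with a prefix such as m:math. Below is a minimal, standalone sketch of a namespace-based check using only the JDK's DOM API. The class name MathElementFinder and its methods are hypothetical and not part of docx4j, and it assumes a namespace-aware DOM; whether the DOM that XHTMLImporterImpl hands you is namespace-aware is something to verify before relying on this.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of docx4j.
class MathElementFinder {

    static final String MATHML_NS = "http://www.w3.org/1998/Math/MathML";

    /** True if the element is a MathML math root, whatever prefix it uses. */
    static boolean isMathRoot(Element e) {
        return MATHML_NS.equals(e.getNamespaceURI()) && "math".equals(e.getLocalName());
    }

    /** Parses XHTML with a namespace-aware parser and collects every MathML root. */
    static List<Element> findMathRoots(String xhtml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed for getNamespaceURI() and getLocalName()
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        NodeList nodes = doc.getElementsByTagNameNS(MATHML_NS, "math");
        List<Element> roots = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            roots.add((Element) nodes.item(i));
        }
        return roots;
    }

    public static void main(String[] args) throws Exception {
        // Prefixed MathML: the node name is "m:math", not "math".
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:m=\"" + MATHML_NS + "\">"
                + "<body><p><m:math><m:mn>1</m:mn></m:math></p></body></html>";
        for (Element e : findMathRoots(xhtml)) {
            System.out.println(e.getNodeName() + " -> isMathRoot: " + isMathRoot(e));
        }
    }
}
```

For the sample XHTML in this thread, which declares MathML as the default namespace on the math element, both checks agree; the difference only shows up with prefixed input.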
Re: Issue with MathML

Posted:
Wed Aug 24, 2022 10:41 pm
by jason
The following proof of concept code works on my sample XHTML:
Code:
} else if (e.getNodeName().equals("math")) {
    // handle me
    System.out.println("Handling mathml \n\r" + XmlUtils.w3CDomNodeToString(e));
    try {
        // Prepare to transform Element e
        Source xsltSource = new StreamSource(
                ResourceUtils.getResource("mml2omml.xsl")); // https://raw.githubusercontent.com/Marti ... l2omml.xsl
        /* You need to add the template:
         *
            <xsl:template match="/|*">
              <oMath>
                <xsl:apply-templates mode="mml" />
              </oMath>
            </xsl:template>
         */
        Templates xslt = XmlUtils.getTransformerTemplate(xsltSource);

        // Use constructor which takes Unmarshaller, rather than JAXBContext,
        // so we can set JaxbValidationEventHandler
        JAXBContext jc = Context.jc;
        Unmarshaller u = jc.createUnmarshaller();
        u.setEventHandler(new org.docx4j.jaxb.JaxbValidationEventHandler());
        jakarta.xml.bind.util.JAXBResult result = new jakarta.xml.bind.util.JAXBResult(u);
        XmlUtils.transform(new DOMSource(e), xslt, null, result);

        // What happened?
        org.docx4j.math.CTOMath math =
                (org.docx4j.math.CTOMath) XmlUtils.unwrap(result.getResult());

        org.docx4j.math.ObjectFactory mathObjectFactory = new org.docx4j.math.ObjectFactory();
        // Create object for oMathPara (wrapped in JAXBElement)
        CTOMathPara omathpara = mathObjectFactory.createCTOMathPara();
        JAXBElement<org.docx4j.math.CTOMathPara> omathparaWrapped =
                mathObjectFactory.createOMathPara(omathpara);
        omathpara.getOMath().add(math);

        P wP = new P();
        wP.getContent().add(omathparaWrapped);

        // Attach it to the document
        this.contentContextStack.peek().getContent().add(wP);
    } catch (Exception e1) {
        throw new Docx4JException("Error processing MathML", e1);
    }
    return;
Note that the heavy lifting is done by https://raw.githubusercontent.com/Marti ... l2omml.xsl
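To try the XSLT wiring on its own, outside docx4j, something like the following plain JAXP sketch may help. It is not the code above: it writes the result to a string instead of a JAXBResult, and TOY_XSLT is a made-up stand-in that only understands mn, used here purely so the example is self-contained. For a real conversion you would load mml2omml.xsl instead. The class name MathmlXsltSketch and the m: prefix are likewise assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical standalone sketch; not part of docx4j or docx4j-ImportXHTML.
class MathmlXsltSketch {

    // Stand-in stylesheet, NOT mml2omml.xsl: it only handles <mn>, to show the wiring.
    static final String TOY_XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:mml='http://www.w3.org/1998/Math/MathML'"
        + " xmlns:m='http://schemas.openxmlformats.org/officeDocument/2006/math'"
        + " exclude-result-prefixes='mml'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        // Root template in the spirit of the one added in the post above
        + "<xsl:template match='/|*'>"
        + "<m:oMath><xsl:apply-templates mode='mml'/></m:oMath>"
        + "</xsl:template>"
        // Toy rule: a MathML number becomes an OMML run with text
        + "<xsl:template mode='mml' match='mml:mn'>"
        + "<m:r><m:t><xsl:value-of select='.'/></m:t></m:r>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Parses a MathML string and runs it through the toy stylesheet. */
    static String transform(String mathml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT matching relies on namespaces
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(mathml)));

        // Compile once; a Templates object can be reused across transforms
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(TOY_XSLT)));

        StringWriter out = new StringWriter();
        templates.newTransformer().transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String mathml = "<math xmlns=\"http://www.w3.org/1998/Math/MathML\"><mn>2</mn></math>";
        System.out.println(transform(mathml));
    }
}
```

Inspecting the string output this way can make it easier to see what the stylesheet produces before handing the result to a JAXB Unmarshaller as in the proof of concept.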
Re: Issue with MathML

Posted:
Thu Aug 25, 2022 9:03 am
by infomladen
That's what I'm looking for. Great work.
Thanks a lot, Jason.