Reading Hyperlink from Word doc

Posted: Thu Mar 02, 2017 10:37 am
by laurenquintanilla
Hello,

I have a hyperlink in a Word document that points to http://www.docx4java.org.
Using the following code:
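// "input.docx" is a placeholder path
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File("input.docx"));
String xml = wordMLPackage.getMainDocumentPart().getXML();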

I get the XML below. I can use XSLT to read the hyperlink and its style, but I don't see the URL of the target. How do I get it? Docx4J.toHTML finds the target somehow, but I don't see where it gets it from.

<w:document mc:Ignorable="w14 wp14" xmlns:dsp="http://schemas.microsoft.com/office/drawing/2008/diagram" xmlns:odx="http://opendope.org/xpaths" xmlns:xdr="http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing" xmlns:odgm="http://opendope.org/SmartArt/DataHierarchy" xmlns:dgm="http://schemas.openxmlformats.org/drawingml/2006/diagram" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:ns9="http://schemas.openxmlformats.org/schemaLibrary/2006/main" xmlns:ns12="http://schemas.openxmlformats.org/drawingml/2006/chartDrawing" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:ns33="http://schemas.openxmlformats.org/drawingml/2006/lockedCanvas" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:ns32="http://schemas.openxmlformats.org/drawingml/2006/compatibility" xmlns:ns17="urn:schemas-microsoft-com:office:excel" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography" xmlns:c="http://schemas.openxmlformats.org/drawingml/2006/chart" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:oda="http://opendope.org/answers" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:odc="http://opendope.org/conditions" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:ns23="http://schemas.microsoft.com/office/2006/coverPageProps" xmlns:odi="http://opendope.org/components" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" 
xmlns:ns21="urn:schemas-microsoft-com:office:powerpoint" xmlns:odq="http://opendope.org/questions">
<w:body>
<w:p w:rsidR="00690910" w:rsidP="005C58B0" w:rsidRDefault="001A7177">
<w:pPr>
<w:pStyle w:val="Heading1"/>
</w:pPr>
<w:hyperlink w:history="true" r:id="rId6">
<w:r w:rsidRPr="001A7177">
<w:rPr>
<w:rStyle w:val="Hyperlink"/>
</w:rPr>
<w:t>Header1ExternalLinkExample</w:t>
</w:r>
</w:hyperlink>
<w:bookmarkStart w:name="_GoBack" w:id="0"/>
<w:bookmarkEnd w:id="0"/>
</w:p>
<w:sectPr w:rsidR="00690910">
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
</w:document>

Thanks!

Re: Reading Hyperlink from Word doc

Posted: Thu Mar 02, 2017 2:33 pm
by jason
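It comes from the relationship identified by w:hyperlink/@r:id (rId6 in your document). The URL itself is not in the document XML; it is stored in the main document part's relationships part.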
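
To make the lookup concrete, here is a minimal sketch using only the JDK, no docx4j. It assumes you have the text of word/document.xml and of word/_rels/document.xml.rels as strings. The class name RelsLookup, the method name hyperlinkTargets and the rels snippet in main are made up for illustration; the real rels part of the document above will contain other relationships as well.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves w:hyperlink/@r:id against a relationships part.
public class RelsLookup {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the *NS lookups below
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    static List<String> hyperlinkTargets(String documentXml, String relsXml) throws Exception {
        // Index every Relationship by its Id attribute.
        Map<String, String> targetById = new HashMap<>();
        NodeList rels = parse(relsXml).getElementsByTagNameNS(RELS_NS, "Relationship");
        for (int i = 0; i < rels.getLength(); i++) {
            Element rel = (Element) rels.item(i);
            targetById.put(rel.getAttribute("Id"), rel.getAttribute("Target"));
        }
        // For each w:hyperlink, follow its r:id to the relationship's Target.
        List<String> targets = new ArrayList<>();
        NodeList links = parse(documentXml).getElementsByTagNameNS(W_NS, "hyperlink");
        for (int i = 0; i < links.getLength(); i++) {
            String rId = ((Element) links.item(i)).getAttributeNS(R_NS, "id");
            if (targetById.containsKey(rId)) {
                targets.add(targetById.get(rId));
            }
        }
        return targets;
    }

    public static void main(String[] args) throws Exception {
        // Cut-down version of the document XML from the question.
        String documentXml =
            "<w:document xmlns:w='" + W_NS + "' xmlns:r='" + R_NS + "'><w:body><w:p>"
            + "<w:hyperlink w:history='true' r:id='rId6'><w:r><w:t>Header1ExternalLinkExample</w:t></w:r></w:hyperlink>"
            + "</w:p></w:body></w:document>";
        // Illustrative rels part (an assumption, not taken from the real file).
        String relsXml =
            "<Relationships xmlns='" + RELS_NS + "'>"
            + "<Relationship Id='rId6' Type='" + R_NS + "/hyperlink'"
            + " Target='http://www.docx4java.org' TargetMode='External'/>"
            + "</Relationships>";
        System.out.println(hyperlinkTargets(documentXml, relsXml));
    }
}
```

Hyperlinks that only carry w:anchor (links to a bookmark inside the document) have no r:id, so they are skipped here.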

For what it's worth, the HTML output uses https://github.com/plutext/docx4j/blob/ ... l.java#L90

Probably best to ignore that...

Instead, assuming your hyperlink is in MainDocumentPart mdp, you'll get the Relationship using mdp.getRelationshipsPart().getRelationshipByID(id), where id is the value of w:hyperlink/@r:id. You read that from your P.Hyperlink object (getId()). You then read the URL from the Relationship object (getTarget()).
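
The steps above could look roughly like this in docx4j. Treat it as a sketch, not a tested recipe: the class name HyperlinkTargets, the method name targets and the file name input.docx are placeholders, and finding the hyperlinks via getJAXBNodesViaXPath is just one option (you may already have your P.Hyperlink objects from elsewhere).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.relationships.Relationship;
import org.docx4j.wml.P;

// Hypothetical helper class; only the docx4j calls themselves are library API.
public class HyperlinkTargets {

    static List<String> targets(MainDocumentPart mdp) throws Exception {
        List<String> result = new ArrayList<>();
        // true = re-sync the XPath binder with the in-memory content before querying
        for (Object o : mdp.getJAXBNodesViaXPath("//w:hyperlink", true)) {
            Object unwrapped = XmlUtils.unwrap(o); // may arrive wrapped in a JAXBElement
            if (!(unwrapped instanceof P.Hyperlink)) {
                continue;
            }
            String id = ((P.Hyperlink) unwrapped).getId(); // value of @r:id
            if (id == null) {
                continue; // internal link that uses w:anchor instead of a relationship
            }
            Relationship rel = mdp.getRelationshipsPart().getRelationshipByID(id);
            if (rel != null) {
                result.add(rel.getTarget()); // the URL for an external hyperlink
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // "input.docx" is a placeholder path
        WordprocessingMLPackage pkg = WordprocessingMLPackage.load(new File("input.docx"));
        for (String target : targets(pkg.getMainDocumentPart())) {
            System.out.println(target);
        }
    }
}
```

If you need to tell external links from other relationship kinds, rel.getTargetMode() and rel.getType() are available on the same Relationship object.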