The most reliable way to do this is to use Word itself.
If that is not feasible for you, LibreOffice driven by JODConverter is an alternative.
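As a rough illustration of the LibreOffice route, here is a minimal sketch that skips JODConverter and simply shells out to LibreOffice's headless command line (JODConverter manages the office process for you, which is the better fit for a long-running service). It assumes the goal is turning a binary .doc into .docx, that `soffice` is on the PATH, and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of docx4j or JODConverter.
class HeadlessDocToDocx {

    // Build the LibreOffice command line.
    // Assumption: the "soffice" executable is on the PATH.
    static List<String> buildCommand(Path input, Path outputDir) {
        return List.of(
                "soffice",
                "--headless",               // run without a GUI
                "--convert-to", "docx",     // target format (assumed here)
                "--outdir", outputDir.toString(),
                input.toString());
    }

    // Run the conversion and return LibreOffice's exit code.
    static int convert(Path input, Path outputDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outputDir))
                .inheritIO()                // show soffice's own output
                .start();
        return process.waitFor();
    }
}
```

Calling `HeadlessDocToDocx.convert(Path.of("in.doc"), Path.of("out"))` should leave the converted file in the `out` directory if LibreOffice is installed. How faithful the result is depends entirely on LibreOffice's import filter.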
In principle, Plutext's PDF Converter could also do this, but we do not currently expose that functionality.
Until recently, docx4j included a simple proof of concept that used Apache POI's HWPF to read the binary .doc format, but it was far from complete.
Suffice it to say, this is non-trivial.