
From VariableReplace to OpenDoPE data binding

April 28th, 2018 by Jason

This blog post is a walkthrough of how to easily move from variable replacement to OpenDoPE content control data binding.

Introduction

Variable replacement is quite a popular way to get started generating Word documents.

I guess that’s because developers expect this sort of approach to be available, and it’s easy: all you have to do is add your variables to the document, then bang, you replace them.

But it’s not all a bed of roses; there are some thorns in there too:

  1. the so-called “split run” problem, in which Word has split your variable name across more than one XML element because of formatting, spelling/grammar markup, etc.
  2. variable replacement is great if you just want to replace text, but what if you want to replace images, conditionally include/exclude content, or repeat table rows or list entries?
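To make the split run problem concrete, here is a minimal sketch in plain Java. It doesn't use docx4j at all: the list of strings simply stands in for the text of consecutive runs, and the class name, method names and the ${name} placeholder syntax are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitRunDemo {

    // Naive approach: look for the placeholder inside each run's text separately.
    static List<String> replacePerRun(List<String> runs, String placeholder, String value) {
        List<String> out = new ArrayList<>();
        for (String run : runs) {
            out.add(run.replace(placeholder, value));
        }
        return out;
    }

    // Workaround sketch: join the run texts first, then replace.
    // This finds the placeholder, but any per-run formatting is lost.
    static String replaceAfterJoining(List<String> runs, String placeholder, String value) {
        return String.join("", runs).replace(placeholder, value);
    }

    public static void main(String[] args) {
        // Word has split "${name}" across two runs
        List<String> runs = Arrays.asList("Dear ${na", "me}, welcome.");
        System.out.println(replacePerRun(runs, "${name}", "Alice"));
        System.out.println(replaceAfterJoining(runs, "${name}", "Alice"));
    }
}
```

The per-run version leaves the text untouched, because no single run contains the whole placeholder. Joining the runs fixes that, at the cost of flattening the formatting, which is one reason a content control is a more robust marker than text on the document surface.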

Content control data binding is a great solution to these problems.

Your data (provided in XML format) is bound to content controls using XPath. With the OpenDoPE conventions, this approach also covers the cases above: images, conditional content, and repeating table rows or list entries.

Some users create very complex contracts and reports this way.
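If XPath binding is new to you, the idea can be sketched with nothing but the JDK's built-in XPath support. The XML below is a made-up data file (it is not the format the AddIn uses), and the class and method names are hypothetical. In a real template docx4j does this evaluation for you; the sketch just shows what "bound using XPath" means.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathBindingDemo {

    // Hypothetical data file, for illustration only
    static final String XML =
            "<contract><sponsor>CSIRO</sponsor><wantIndemnity>true</wantIndemnity></contract>";

    // What a bound content control would display: the string value of the XPath
    static String resolve(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    // What a condition boils down to: an XPath evaluated as a boolean
    static boolean holds(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(expr,
                    new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(XML, "/contract/sponsor"));
        System.out.println(holds(XML, "/contract/wantIndemnity = 'true'"));
    }
}
```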

Automated Migration

The good news is that docx4j contains code to automatically migrate a document which has variables on its surface, to one which contains OpenDoPE content controls.

The code is in FromVariableReplacement.java

Have a look at the main method to see how to use it.

There have been some fixes recently, so you should use docx4j-nightly-20180428.jar (or 3.3.8 when released) or later.

OK, let’s assume you now have a docx file with content controls in it.  You may want to further develop your template.  For this you need an OpenDoPE Authoring tool.

OpenDoPE Authoring

The friendliest OpenDoPE authoring tool is the “No-XML” Word AddIn.

With this it is very easy to add conditions, but the limitation is that it assumes a fixed XML format.  If you want to use your own XML format (or, I suspect, to bind escaped XHTML), you’ll need to use one of the older add-ins.

Here we’ll walk through adding a condition with the “No-XML” AddIn.

For this example, we’ll use the Commonwealth of Australia’s model Confidentiality Agreement, available at http://www.business.gov.au/IPToolkit

Here’s what the first few blanks look like, represented as content controls with the “No XML” AddIn’s ribbon showing in Word:

NoXMLAddIn1

I had pressed the “Show tags” button to be able to see the content controls in orange above.

Further down, there’s an optional Indemnity clause.

Since it’s optional, let’s wrap that in a conditional content control.  First, we need a question: “Do you want the Indemnity?”  It works this way because this AddIn is aimed primarily at the interactive use case, that is, a user can answer questions in their web browser to generate an instance document.

But the resulting template can be used just as easily for the more common non-interactive / entirely programmatic case.

So click the “Insert Q/A” button.  I did this with my cursor somewhere in the middle of the Indemnity clause.

Fill in the form (for Multiple Choice choose yes):

qa_mcq

Click Next, then on the next page, choose type boolean (true/false), then OK.

You’ll see a content control inserted where your cursor was.  We don’t really want that, so it’s a bit annoying (you can/should delete it).  You’ll see why we did this just below.

Now select the clause heading and body, and click “Wrap with Condition”.  You’ll see something like:

condition1

In the condition builder, define the following condition:

indemnity_condition

then click OK. (Now you can see why we needed to set up that question first)

Our resulting conditional clause appears as follows:

indemnity_result

That’s all you need to do.  We can now try generating an instance document from this template.

Document generation runtime

To generate a document, use docx4j code to populate an Answers object, then call Docx4J.bind. For example:

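import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.opendope.answers.Answers;

WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
        .load(new java.io.File(inputfilepath));

// answers is a field, so that addAnswer can populate it
answers = new Answers();

addAnswer("Sponsor_name_ACNABN_oW", "CSIRO of Some St, Sydney");
addAnswer("want__Indemnity_clause_K8", "true"); // or "false"
// etc

// Flags are combined with bitwise OR (|), not AND (&)
Docx4J.bind(wordMLPackage, answers, Docx4J.FLAG_BIND_INSERT_XML | Docx4J.FLAG_BIND_BIND_XML);
Docx4J.save(wordMLPackage, new java.io.File(outputfilepath), Docx4J.FLAG_NONE);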

where addAnswer is just:

// Wrap a key/value pair in an Answer and add it to the answers field
private void addAnswer(String key, String value) {
    Answer a = new Answer();
    a.setId(key);
    a.setValue(value);
    answers.getAnswerOrRepeat().add(a);
}
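A note on the flags argument to Docx4J.bind: option flags of this kind are normally single-bit constants, so you combine them with bitwise OR. ANDing two distinct single-bit flags gives you zero, which is probably not what you meant. Here's a small sketch of the difference. The constant values are hypothetical, chosen for illustration, and I'm not claiming they match docx4j's actual values.

```java
class BindFlagsDemo {

    // Hypothetical single-bit flag values, for illustration only
    static final int FLAG_NONE = 0;
    static final int FLAG_INSERT_XML = 1;
    static final int FLAG_BIND_XML = 2;

    // True if the given single-bit flag is switched on in flags
    static boolean isSet(int flags, int flag) {
        return (flags & flag) != 0;
    }

    public static void main(String[] args) {
        int ored = FLAG_INSERT_XML | FLAG_BIND_XML;   // both bits on
        int anded = FLAG_INSERT_XML & FLAG_BIND_XML;  // no bits on

        System.out.println("OR  has insert: " + isSet(ored, FLAG_INSERT_XML));
        System.out.println("OR  has bind:   " + isSet(ored, FLAG_BIND_XML));
        System.out.println("AND is none:    " + (anded == FLAG_NONE));
    }
}
```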

How do you know what key to use?  Look in the answers part in the docx and use the corresponding ID.  (Yes, you should be able to see this in the AddIn; the reason you can’t is that for the interactive use case, you never need to know.)  Alternatively, just invoke Docx4J.bind with debug level logging enabled for org.docx4j.model.datastorage, and it will print out the relevant part.

answersPart
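If you'd rather not unzip the docx by hand, something like the following could list the IDs for you. It's a rough sketch using only the JDK, and it rests on a few assumptions: that the answers part lives under customXml/ in the package, that it mentions the namespace http://opendope.org/answers, and that each answer element carries an id attribute. The class name is made up, and the regex is a shortcut; a real tool would parse the XML properly.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class AnswerIdLister {

    // Assumed namespace of the OpenDoPE answers part
    static final String ANSWERS_NS = "http://opendope.org/answers";

    // Matches <answer id="..."> (optionally prefixed), but not <answers ...>
    private static final Pattern ANSWER_ID =
            Pattern.compile("<(?:\\w+:)?answer\\s[^>]*?\\bid=\"([^\"]+)\"");

    // Scan the docx (a zip) for custom XML parts and collect answer IDs
    static List<String> listAnswerIds(InputStream docx) {
        List<String> ids = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().startsWith("customXml/")) continue;
                String xml = new String(readAll(zip), StandardCharsets.UTF_8);
                if (!xml.contains(ANSWERS_NS)) continue;
                Matcher m = ANSWER_ID.matcher(xml);
                while (m.find()) ids.add(m.group(1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    // Read the current zip entry to its end
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) buf.write(chunk, 0, n);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            listAnswerIds(in).forEach(System.out::println);
        }
    }
}
```

Run it with the path to your template as the only argument; each ID it prints is a candidate key for addAnswer.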

That’s about it.  If you have questions, they are probably best posted in the relevant docx4j forum or on StackOverflow.