
Numbering and document merging: Is it possible?

Posted: Fri Mar 02, 2012 11:08 pm
by elextra
Hi,

I am trying to assemble a Word document out of several sections or clauses. Say we have 4 sections:
section1.docx
1. Clause ABC
This is some text
section2.docx
2. Clause DEF
This is some text
section3.docx
3. Clause GHI
This is some text
section4.docx
4. Clause JKL
This is some text

Each clause document contains its own styling, numbering, etc.
Now, depending on some run-time conditions, I need to insert some (or all) of these clauses into a Word report document at run time.

Using AltChunk seemed like a good choice to me, so I went ahead with that and was able to build my document out of the different sections, except for one major detail: numbering.
For example, say I need to insert section2.docx and section4.docx in my final document. With what I have today, my final document looks like this:
2. Clause DEF
This is some text
4. Clause JKL
This is some text

The numbering is obviously off. I would like to have this instead:
1. Clause DEF
This is some text
2. Clause JKL
This is some text

Is there a way to solve this with docx4j? If not, can you point me to another tool which could help me achieve something like that?
I thought of leaving the numbers out of the section documents and using content controls instead, so that the numbers are bound and replaced at run time, but that does not look like a natural way of doing it to me, and I can see it getting out of hand very quickly with sub-numbering (i, ii, iii, etc.).

Thank you.

Re: Numbering and document merging: Is it possible?

Posted: Sat Mar 03, 2012 1:13 am
by jason
There may be some way you can control what Word does when it resolves the altChunks: you'll need to consult the OpenXML spec to clarify.

If not, the following approaches spring to mind.

(1) put all your optional sections/clauses into content controls, and bind your document to an XML data file which says which of these should be included or dropped from the output document. docx4j has good support for this approach. (Google OpenDoPE for more info)

A variation on this approach would be to put your paragraph content into the XML data file, possibly as escaped HTML. But you'd probably run into numbering problems this way again. The advantage of this variation is that you can inject wholly new section/clause content via the XML data file at run time.

(2) MergeDocx paid extension for docx4j, which can resolve the altChunks, and do a better job with numbering. Contact Plutext off list if you are interested in this approach.

hope this helps .. Jason

Re: Numbering and document merging: Is it possible?

Posted: Sat Mar 03, 2012 9:58 am
by elextra
Hi There,

Thank you for your tips.
I have downloaded and run a little sample using OpenDoPE. I have used the "include docx" feature (processed by docx4j) to try to solve my problem.
The document assembly happens properly; however, it does not seem to handle numbering at all, unless I have missed something. I tried several combinations, without success.

I will try to get hold of Plutext with this particular issue, hopefully they have an answer for me.
Cheers,

Re: Numbering and document merging: Is it possible?

Posted: Sat Mar 03, 2012 8:54 pm
by jason
I was suggesting that you try using conditional content controls (i.e. content controls with tag od:condition, in which the XPath evaluates to true or false, and the content is only included if it evaluates to true), not included documents.

The "include docx" feature uses altChunk under the covers, leaving the merging work to Word (unless you are a MergeDocx user). So you'd expect the same behaviour as you were getting before.

One thing: if you want Word to merge the lists, do they have the same nsid? (unzip the docx, and look at the relevant list in numbering.xml)
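To compare nsid values across several clause documents without unzipping each one by hand, a minimal sketch using only the JDK (no docx4j) might look like the following. It assumes the numbering part sits at word/numbering.xml, which is where Word normally writes it; the class and method names are made up for illustration.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists the nsid of every w:abstractNum in a docx.
final class NsidLister {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses numbering.xml content and returns "abstractNumId=nsid" entries. */
    static List<String> extractNsids(InputStream numberingXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(numberingXml);

        List<String> result = new ArrayList<>();
        NodeList abstractNums = doc.getElementsByTagNameNS(W_NS, "abstractNum");
        for (int i = 0; i < abstractNums.getLength(); i++) {
            Element abstractNum = (Element) abstractNums.item(i);
            String id = abstractNum.getAttributeNS(W_NS, "abstractNumId");
            NodeList nsids = abstractNum.getElementsByTagNameNS(W_NS, "nsid");
            String nsid = nsids.getLength() > 0
                    ? ((Element) nsids.item(0)).getAttributeNS(W_NS, "val")
                    : "(none)";
            result.add(id + "=" + nsid);
        }
        return result;
    }

    /** Opens the docx as a zip and reads word/numbering.xml (assumed location). */
    static List<String> listNsids(String docxPath) throws Exception {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/numbering.xml");
            if (entry == null) {
                return new ArrayList<>(); // document has no numbering part
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return extractNsids(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        for (String path : args) {
            System.out.println(path + " -> " + listNsids(path));
        }
    }
}
```

Running it over, say, section2.docx and section4.docx prints one entry per abstract list definition, so it is easy to see whether the two clause lists share an nsid.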

Re: Numbering and document merging: Is it possible?

Posted: Wed Mar 07, 2012 9:42 am
by elextra
Hi,

I just read your answer...I was actually working on evaluating MergeDocx to see if it actually solves the problem.
What I found out is that numbering works when the same nsid is used (I know, I just read your answer :-) ). I found that out accidentally when merging "clause1.docx" with "Copy of clause1.docx".
- Is there a simple way to inspect the XML inside a Word docx? (I use Word's save-as for now, but I would have expected a view in MS Word to give you that automatically, for instance in the developer ribbon or something of that sort.)

I am also going to try your suggestion about conditional content controls, as I believe it gives a certain flexibility as well. One thing I was wondering about (and it is a concern of mine) when using content controls in general: is there a way to remove the content control tags/aliases once the document is generated, so that the end user does not see them? He/She only sees the content, but not the tags. I have seen some docx4j samples showing how to remove block-level elements, but not how to remove a particular sdtPr while leaving the sdtContent behind (see the example below). It could probably be done using straight XML parsing, but that could break any refs/links... I don't know. Wouldn't that be something useful to have in the API?
----
...
<w:sdt>
  <w:sdtPr>
    <w:alias w:val="companyname"/>
    <w:tag w:val="companyname"/>
    <w:id w:val="12580208"/>
    <w:placeholder>
      <w:docPart w:val="93F7B7EB148C45FCB22AFA9F4B091F59"/>
    </w:placeholder>
    <w:showingPlcHdr/>
    <w:dataBinding w:prefixMappings="xmlns:ns0='http://schemas.test2'" w:xpath="/ns0:test2[1]/ns0:en1[1]/ns0:compname[1]" w:storeItemID="{2...}"/>
    <w:text/>
  </w:sdtPr>
  <w:sdtContent>
    ...<w:t>the content to keep</w:t>
  </w:sdtContent>
</w:sdt>
----

Thank you,

Re: Numbering and document merging: Is it possible?

Posted: Fri Mar 09, 2012 4:52 pm
by jason
elextra wrote: is there a way to remove the content control tags/aliases once the document is generated, so that the end user does not see them. He/She only sees the content, but not the tags.


RemovalHandler does this. See the ContentControlBindingExtensions sample.
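For comparison, the "straight XML parsing" route raised in the question can be sketched with the JDK's DOM API alone. This is not docx4j's RemovalHandler and makes no claim about how that class works; it is only an illustration of the idea, with hypothetical class and method names. It replaces each w:sdt with the children of its w:sdtContent and drops the w:sdtPr, and it does not touch the custom XML parts that any data bindings point to.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: strips content control wrappers but keeps their content.
final class SdtUnwrapper {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses an XML string (for example the text of document.xml), namespace-aware. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Replaces every w:sdt with the children of its w:sdtContent; returns the count. */
    static int unwrapSdts(Document doc) {
        // getElementsByTagNameNS returns a live list, so snapshot it before editing.
        NodeList live = doc.getElementsByTagNameNS(W_NS, "sdt");
        List<Element> sdts = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            sdts.add((Element) live.item(i));
        }
        for (Element sdt : sdts) {
            Node parent = sdt.getParentNode();
            for (Node child = sdt.getFirstChild(); child != null;
                    child = child.getNextSibling()) {
                boolean isContent = W_NS.equals(child.getNamespaceURI())
                        && "sdtContent".equals(child.getLocalName());
                if (!isContent) {
                    continue; // w:sdtPr and anything else is dropped with the w:sdt
                }
                // insertBefore moves each node out of w:sdtContent, in order.
                while (child.getFirstChild() != null) {
                    parent.insertBefore(child.getFirstChild(), sdt);
                }
            }
            parent.removeChild(sdt);
        }
        return sdts.size();
    }
}
```

Nested content controls are handled because inner w:sdt elements are moved along with the rest of the content and then unwrapped in turn against their new parent.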