Find merge fields in table row

Posted: Tue Jan 12, 2016 1:29 am
by PierreR
Hello,

I am seeing some strange behavior when finding merge fields in a table row. I have a Word docx with several tables, each with merge fields that define the table title, header titles and cell values. To construct the table for the merge, I locate the template row by looking for a merge field with a certain suffix. This way I can add a row for each record I need to merge.
The code works fine, but when I edit and save the Word doc, sometimes the merge fields in the table row are no longer found. I then need to edit and save again (sometimes several times) before the code finds the merge fields again. This is not workable and very annoying. When I perform the mail merge with MailMerger.performMerge(wordMLPackage, mergeMap, true), the merge fields in the table are found, so they do exist.

Is there something I'm missing here? Or is there another (better) way to find merge fields in a table row?

Code:
      // table, endsWith, MERGEFIELD_PATTERN and mergeRow (initially null)
      // are declared elsewhere in the method or class (not shown).
      ClassFinder rowFinder = new ClassFinder(Tr.class);
      new TraversalUtil(table.getContent(), rowFinder);
      log.info("TR Rows found = " + rowFinder.results.size());
      for (Object objRow : rowFinder.results) {
         log.info("Process row " + rowFinder.results.indexOf(objRow));
         Tr row = (Tr) objRow;
         // Collect the paragraphs in this row that contain complex fields.
         ComplexFieldLocator fl = new ComplexFieldLocator();
         new TraversalUtil(row.getContent(), fl);
         log.info("Complex fields found: " + fl.getStarts().size());
         if (!fl.getStarts().isEmpty()) {
            List<FieldRef> fieldRefs = new ArrayList<FieldRef>();
            for (P p : fl.getStarts()) {
               FieldsPreprocessor.canonicalise(p, fieldRefs);
            }
            log.info("Field refs found: " + fieldRefs.size());
            for (FieldRef fr : fieldRefs) {
               // Each instruction object is matched on its own.
               for (Object o : fr.getInstructions()) {
                  o = XmlUtils.unwrap(o);
                  String fieldName = "";
                  if (o instanceof Text) {
                     String instr = ((Text) o).getValue();
                     Matcher matcher = MERGEFIELD_PATTERN.matcher(instr);
                     if (matcher.find() && matcher.groupCount() == 1) {
                        fieldName = matcher.group(1).trim();
                     }
                  } else {
                     // Not a text instruction: log its class name instead.
                     fieldName = o.getClass().getName();
                  }
                  if (!fieldName.isEmpty()) {
                     log.info("Field name = " + fieldName);
                     if (fieldName.endsWith(endsWith)) {
                        mergeRow = row;
                        log.info("Merge Row found! (" + fieldName + ")");
                        break;
                     }
                  }
               }
               if (mergeRow != null) {
                  break;
               }
            }
            if (mergeRow != null) {
               break;
            }
         }
      }

      return mergeRow;

Re: Find merge fields in table row

Posted: Tue Jan 12, 2016 1:41 pm
by jason
PierreR wrote: "when I edit and save the Word doc, sometimes the merge fields in the table row are not found anymore"


When this happens, unzip your docx and look at the XML representing one of the merge fields which no longer works.

I guess the field's text content is split across several elements (w:instrText for the instruction, w:t for the displayed result)?
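
If that guess is right, a pattern applied to each text fragment separately can miss the field or see only part of its name, while the same pattern applied to the concatenated fragments finds it. Here is a minimal sketch in plain Java (no docx4j), assuming the instruction has already been pulled out as a list of strings. The class name, method names and the pattern are hypothetical; the MERGEFIELD_PATTERN from the original post is not shown, so this one is only a stand-in.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitInstructionDemo {
    // Hypothetical stand-in for the poster's MERGEFIELD_PATTERN (not shown in the thread).
    static final Pattern MERGEFIELD =
            Pattern.compile("^\\s*MERGEFIELD\\s+(\\S+)", Pattern.CASE_INSENSITIVE);

    /** Matches each fragment on its own and returns the first hit, or null. */
    static String matchPerFragment(List<String> fragments) {
        for (String fragment : fragments) {
            Matcher m = MERGEFIELD.matcher(fragment);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    /** Concatenates all fragments first, then matches once; returns null if no match. */
    static String matchJoined(List<String> fragments) {
        Matcher m = MERGEFIELD.matcher(String.join("", fragments));
        return m.find() ? m.group(1) : null;
    }
}
```

With the fragments " MERGEFIELD Customer" and "_Row \* MERGEFORMAT ", the per-fragment version returns the truncated name Customer (so a suffix test on _Row fails), while the joined version returns Customer_Row. If the split falls inside the keyword itself (" MERGE" and "FIELD Customer_Row "), the per-fragment version finds nothing at all.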

Re: Find merge fields in table row

Posted: Wed Jan 13, 2016 5:40 am
by p3consulting
A MERGEFIELD is basically the following sequence of objects:
a w:fldChar of type "begin"
one or more runs containing w:instrText (a Text object in docx4j); once concatenated, the instruction looks like " MERGEFIELD fieldname \* MERGEFORMAT". A simple field (CTSimpleField) carries the same instruction in its w:instr attribute instead.
a w:fldChar of type "separate"
one or more runs of w:t; once concatenated, the displayed text looks like «MergeField displayedfieldname»
a w:fldChar of type "end"

You need to be careful, because the runs between "separate" and "end" may be interleaved with other kinds of information, such as proofing marks and formatting changes (font, size, ...),
AND the displayedfieldname MAY BE DIFFERENT from the field name found in the " MERGEFIELD fieldname \* MERGEFORMAT" string…
(don't forget: the field name may contain spaces...)
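
To illustrate the point about spaces: Word puts a field name that contains spaces in double quotes inside the instruction, so a parser that stops at the first whitespace returns only half the name. A rough sketch of pulling the name out of an already concatenated instruction string follows. The class name, method name and pattern are my own, purely illustrative, and not part of docx4j.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MergeFieldInstruction {
    // Group 1: quoted name (may contain spaces). Group 2: unquoted name,
    // which ends at whitespace or at the backslash that starts a switch.
    private static final Pattern NAME = Pattern.compile(
            "^\\s*MERGEFIELD\\s+(?:\"([^\"]*)\"|([^\\s\\\\]+))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the field name from a complete instruction, or null if it is not a MERGEFIELD. */
    static String fieldName(String instruction) {
        Matcher m = NAME.matcher(instruction);
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }
}
```

For example, fieldName(" MERGEFIELD \"First Name\" \\* MERGEFORMAT ") gives First Name, and fieldName(" MERGEFIELD Customer_Row \\* MERGEFORMAT ") gives Customer_Row. The name returned here is the one to compare against your data, whatever the displayed text between "separate" and "end" says.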

If you write software that does merge field replacement, you should use the field name found in the begin-separate interval, not the displayed one found in the separate-end interval, and of course the whole sequence from the "begin" to the "end" w:fldChar should be replaced by your new run of text.
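
The replacement logic can be sketched as a small state machine. This is not docx4j code: it works on a simplified, hypothetical token list (Kind, Token and replaceFields are names invented for the sketch) that stands in for the fldChar and text content of the runs. It assumes fields are not nested and that field names are unquoted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldReplacer {
    // Simplified stand-ins for fldChar begin/separate/end, instruction text and display text.
    enum Kind { BEGIN, INSTR, SEPARATE, TEXT, END }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    private static final Pattern NAME = Pattern.compile("^\\s*MERGEFIELD\\s+(\\S+)");

    /** Replaces each begin..end sequence whose field name has a value by one TEXT token. */
    static List<Token> replaceFields(List<Token> tokens, Map<String, String> values) {
        List<Token> out = new ArrayList<>();
        List<Token> pending = null;        // tokens of the field being read; null outside a field
        StringBuilder instruction = null;  // concatenated instruction of the current field
        boolean pastSeparate = false;
        for (Token t : tokens) {
            if (pending == null) {
                if (t.kind == Kind.BEGIN) {
                    pending = new ArrayList<>();
                    pending.add(t);
                    instruction = new StringBuilder();
                    pastSeparate = false;
                } else {
                    out.add(t);
                }
                continue;
            }
            pending.add(t);
            if (t.kind == Kind.SEPARATE) {
                pastSeparate = true;
            } else if (t.kind == Kind.INSTR && !pastSeparate) {
                instruction.append(t.text);
            } else if (t.kind == Kind.END) {
                // The name comes from the begin-separate interval, never from the displayed text.
                Matcher m = NAME.matcher(instruction);
                if (m.find() && values.containsKey(m.group(1))) {
                    out.add(new Token(Kind.TEXT, values.get(m.group(1))));
                } else {
                    out.addAll(pending);   // unknown or non-merge field: keep it unchanged
                }
                pending = null;
            }
        }
        if (pending != null) {
            out.addAll(pending);           // unterminated field: keep it unchanged
        }
        return out;
    }
}
```

The displayed text between "separate" and "end" is buffered and then dropped along with the rest of the field, so a stale or differently spelled display name never influences the result. A field with no matching value is passed through untouched rather than blanked.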