find replace text problem

Posted: Wed Nov 27, 2013 5:21 am
by sfmorais
Hi,

I know this problem has come up repeatedly on the forum, but I can't find a solution that works for my example.

I created a simple document (attached to this post).

I want to run a find/replace on its text.
My variables are delimited by a pair of '@@' markers on each side (here, @@variable@@).

I searched for solutions, and I think I have the 'split text' problem: Word stores the placeholder across several runs, so no single text element contains the whole of @@variable@@. One of the workarounds I found is to programmatically remove the rsids and the spelling (proofing) marks. I tried that, but I still have the same problem.
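
To illustrate the 'split text' problem, here is a minimal stand-alone sketch in plain Java (no docx4j involved). It assumes each string in a list stands for the value of one <w:t> element; the class and method names are made up for this example. An exact-match replace finds nothing once the placeholder is spread over several elements, even though the joined text still reads @@variable@@.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each string stands for the value of one <w:t> element.
class SplitTextDemo {

    // Replaces an element only when its whole value equals the placeholder,
    // and returns how many elements were replaced.
    static int replaceExact(List<String> texts, String placeholder, String value) {
        int replaced = 0;
        for (int i = 0; i < texts.size(); i++) {
            if (texts.get(i).equals(placeholder)) {
                texts.set(i, value);
                replaced++;
            }
        }
        return replaced;
    }

    public static void main(String[] args) {
        List<String> oneRun = new ArrayList<>(Arrays.asList("@@variable@@"));
        List<String> splitRuns = new ArrayList<>(Arrays.asList("@@", "varia", "ble", "@@"));

        System.out.println(replaceExact(oneRun, "@@variable@@", "text to replace"));    // prints 1
        System.out.println(replaceExact(splitRuns, "@@variable@@", "text to replace")); // prints 0
        System.out.println(String.join("", splitRuns));                                 // prints @@variable@@
    }
}
```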

I am now testing with version 3.0 (I started testing with older versions).

Many thanks in advance

Code: Select all
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.bind.JAXBElement;

import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.ContentAccessor;
import org.docx4j.wml.Text;

// Class name added only so that the snippet compiles on its own.
public class FindReplaceExample {

   // Recursively collects every object of type toSearch found below obj.
   private static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
      List<Object> result = new ArrayList<Object>();
      if (obj instanceof JAXBElement) {
         obj = ((JAXBElement<?>) obj).getValue();
      }
      if (obj.getClass().equals(toSearch)) {
         result.add(obj);
      } else if (obj instanceof ContentAccessor) {
         List<?> children = ((ContentAccessor) obj).getContent();
         for (Object child : children) {
            result.addAll(getAllElementFromObject(child, toSearch));
         }
      }
      return result;
   }

   // Replaces a placeholder only when a single Text element holds exactly that placeholder.
   private static void replacePlaceholder(WordprocessingMLPackage template, String name, String placeholder) {
      List<Object> texts = getAllElementFromObject(template.getMainDocumentPart(), Text.class);
      for (Object text : texts) {
         Text textElement = (Text) text;
         if (textElement.getValue().equals(placeholder)) {
            textElement.setValue(name);
         }
      }
   }

   public static void main(String[] args) throws Exception {

      WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new FileInputStream(new File("c:\\documento_template_simples.docx")));

      // Strip proofing marks, content controls and rsids before searching.
      WordprocessingMLPackage.FilterSettings filterSettings = new WordprocessingMLPackage.FilterSettings();
      filterSettings.setRemoveProofErrors(true);
      filterSettings.setRemoveContentControls(true);
      filterSettings.setRemoveRsids(true);
      wordMLPackage.filter(filterSettings);

      replacePlaceholder(wordMLPackage, "text to replace", "@@variable@@");
      wordMLPackage.save(new File("c:\\documento_final.docx"));
   }
}


The part of document.xml that contains the variables is:

Code: Select all
<w:p w:rsidR="00EE275D" w:rsidRDefault="000A40A3" w:rsidP="00EE275D">
    <w:pPr>
        <w:pStyle w:val="SemEspaamento"/>
    </w:pPr>
    <w:r>
        <w:t>@@</w:t>
    </w:r>
    <w:proofErr w:type="spellStart"/>
    <w:r>
        <w:t>varia</w:t>
    </w:r>
    <w:r w:rsidR="002E2992">
        <w:t>ble</w:t>
    </w:r>
    <w:proofErr w:type="spellEnd"/>
    <w:r>
        <w:t>@@</w:t>
    </w:r>
</w:p>
<w:p w:rsidR="00EE275D" w:rsidRDefault="005D49F6" w:rsidP="00EE275D">
    <w:pPr>
        <w:pStyle w:val="SemEspaamento"/>
    </w:pPr>
    <w:r>
        <w:t>@@</w:t>
    </w:r>
    <w:r w:rsidR="00EE275D">
        <w:t>varia</w:t>
    </w:r>
    <w:r w:rsidR="002E2992">
        <w:t>ble</w:t>
    </w:r>
    <w:r w:rsidR="00EE275D">
        <w:t>@@</w:t>
    </w:r>
</w:p>
<w:sectPr w:rsidR="00EE275D">
    <w:pgSz w:w="11906" w:h="16838"/>
    <w:pgMar w:top="1417" w:right="1701" w:bottom="1417" w:left="1701" w:header="708" w:footer="708" w:gutter="0"/>
    <w:cols w:space="708"/>
    <w:docGrid w:linePitch="360"/>
</w:sectPr>


documento_template_simples.docx
my example docx

Re: find replace text problem

Posted: Wed Nov 27, 2013 7:44 am
by jason
Well, you still seem to have both rsids (w:rsidR) and spell checking (<w:proofErr w:type="spellStart"/>..<w:proofErr w:type="spellEnd"/>) turned on in Word. You could try turning them off again.

Alternatively, you can run your docx through https://github.com/plutext/docx4j/blob/ ... epare.java

There's a main method at the end which shows how to use it.

Re: find replace text problem

Posted: Wed Nov 27, 2013 11:58 pm
by sfmorais
OK Jason, many thanks.

Your solution solves my problem only partially; it works in some situations but not in others.

Following up on my previous post: @@variable1@@ was fixed by the prepare method of the VariablePrepare sample, but @@variable2@@ is still split.

As you can see, after the prepare method runs, 'variable2' is still split: its second pair of '@' is separated from the rest by the <w:bookmarkStart> and <w:bookmarkEnd> elements.

Code: Select all
        <w:p>
            <w:pPr>
                <w:pStyle w:val="SemEspaamento"/>
            </w:pPr>
            <w:r>
                <w:t>@@variable2</w:t>
            </w:r>
            <w:bookmarkStart w:name="_GoBack" w:id="0"/>
            <w:bookmarkEnd w:id="0"/>
            <w:r>
                <w:t>@@</w:t>
            </w:r>
        </w:p>
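
The XML above is consistent with a merge step that only joins runs sitting directly next to each other. The following plain-Java sketch models that behaviour; it is an assumption about the general technique, not the actual VariablePrepare code. In this hypothetical model a plain string is the text of one run, and a string starting with "<" stands for a non-run element such as a bookmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one paragraph: plain strings are run texts,
// strings starting with "<" stand for non-run elements such as <w:bookmarkStart/>.
class RunMergeDemo {

    static boolean isMarker(String item) {
        return item.startsWith("<");
    }

    // Joins neighbouring run texts; a marker closes the current group and stays in place.
    static List<String> mergeAdjacentRuns(List<String> paragraph) {
        List<String> merged = new ArrayList<>();
        StringBuilder current = null;
        for (String item : paragraph) {
            if (isMarker(item)) {
                if (current != null) {
                    merged.add(current.toString());
                    current = null;
                }
                merged.add(item);
            } else {
                if (current == null) {
                    current = new StringBuilder();
                }
                current.append(item);
            }
        }
        if (current != null) {
            merged.add(current.toString());
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> paragraph = Arrays.asList(
                "@@", "variable2", "<w:bookmarkStart/>", "<w:bookmarkEnd/>", "@@");
        System.out.println(mergeAdjacentRuns(paragraph));
        // prints [@@variable2, <w:bookmarkStart/>, <w:bookmarkEnd/>, @@]
    }
}
```

In this model the trailing "@@" can never join the first group while the bookmark markers sit between them, which matches the split seen for @@variable2@@.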


My main class:

Code: Select all
import java.io.File;
import java.util.ArrayList;
import java.util.List;

import javax.xml.bind.JAXBElement;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.ContentAccessor;
import org.docx4j.wml.Text;

// Class name added only so that the snippet compiles on its own.
// Docx4jUtil is the attached helper class, not part of docx4j.
public class FindReplaceExample {

   // Recursively collects every object of type toSearch found below obj.
   private static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
      List<Object> result = new ArrayList<Object>();
      if (obj instanceof JAXBElement) {
         obj = ((JAXBElement<?>) obj).getValue();
      }
      if (obj.getClass().equals(toSearch)) {
         result.add(obj);
      } else if (obj instanceof ContentAccessor) {
         List<?> children = ((ContentAccessor) obj).getContent();
         for (Object child : children) {
            result.addAll(getAllElementFromObject(child, toSearch));
         }
      }
      return result;
   }

   // Replaces a placeholder only when a single Text element holds exactly that placeholder.
   private static void replacePlaceholder(WordprocessingMLPackage template, String name, String placeholder) {
      List<Object> texts = getAllElementFromObject(template.getMainDocumentPart(), Text.class);
      for (Object text : texts) {
         Text textElement = (Text) text;
         if (textElement.getValue().equals(placeholder)) {
            textElement.setValue(name);
         }
      }
   }

   public static void main(String[] args) throws Exception {

      WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File("c:\\documento_template_simples.docx"));

      // Join split runs, then dump the resulting XML for inspection.
      Docx4jUtil.prepare(wordMLPackage);
      System.out.println("After:\n" + XmlUtils.marshaltoString(wordMLPackage.getMainDocumentPart().getJaxbElement(), true, true));

      replacePlaceholder(wordMLPackage, "text 1 to replace", "@@variable1@@");
      replacePlaceholder(wordMLPackage, "text 2 to replace", "@@variable2@@");

      wordMLPackage.save(new File("c:\\documento_final.docx"));
   }
}


Attached is my Docx4jUtil class (it contains the main methods of the VariablePrepare sample).

Many thanks in advance
Regards

documento_template_simples.docx
input file

Docx4jUtil.java
contains the main methods of VariablePrepare sample

Re: find replace text problem

Posted: Thu Nov 28, 2013 4:41 am
by sfmorais
Hi Jason,
I think I have found the solution.

To solve my problem I looked at the 'BookmarksDeleter' sample.

So, before replacing the variables, I run two text 'clean-up' operations: the prepare method of the VariablePrepare sample and the fixRange method of the BookmarksDeleter sample. (I don't know whether this is an efficient approach.)

So far I have run some tests and it works.

Thanks
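
For anyone reading later, the clean-up-then-replace order described above can be sketched in plain Java. This is only a model under stated assumptions, not the code of the docx4j samples: strings starting with "<" stand for the bookmark elements shown earlier in the thread, the other strings are run texts, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: plain strings are run texts; strings starting with "<" are bookmark markers.
class CleanThenReplaceDemo {

    // Step 1 (stand-in for the bookmark clean-up): drop the marker elements.
    static List<String> dropMarkers(List<String> paragraph) {
        List<String> runs = new ArrayList<>();
        for (String item : paragraph) {
            if (!item.startsWith("<")) {
                runs.add(item);
            }
        }
        return runs;
    }

    // Step 2 (stand-in for the prepare step): join the remaining run texts into one string.
    static String joinRuns(List<String> runs) {
        return String.join("", runs);
    }

    // Step 3: replace the placeholder in the joined text.
    static String fill(List<String> paragraph, String placeholder, String value) {
        return joinRuns(dropMarkers(paragraph)).replace(placeholder, value);
    }

    public static void main(String[] args) {
        List<String> paragraph = Arrays.asList(
                "@@variable2", "<w:bookmarkStart/>", "<w:bookmarkEnd/>", "@@");
        System.out.println(fill(paragraph, "@@variable2@@", "text 2 to replace"));
        // prints text 2 to replace
    }
}
```

Unlike the equals check in the code posted earlier, this sketch uses String.replace, so it would also match a placeholder embedded in a longer sentence. Two caveats apply to the model: joining every run into one string discards any per-run formatting, and dropping bookmarks is only safe when nothing else in the document refers to them, so both are worth checking against a real template.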