getJAXBNodesViaXPath deletes page breaks

Posted: Thu Jun 26, 2014 8:27 pm
by flopflop
Hi,
(Please excuse my poor English.)
I use docx4j to concatenate existing docx files, merge XML data, insert an image, and so on.

I have some questions:

- When concatenating existing files that contain tables, I have to add both a Br and a P (see the addPageBreak method); if I don't add the P, the second page is missing a line. Is this normal?

- After the concatenation, I just want to parse the XML tree (with getJAXBNodesViaXPath) in order to replace some text with an image, but merely parsing and saving the docx file deletes the page breaks! Why?

- If I open d.docx in Word, save it, and comment out the "append(true, doc4, doc1, doc2, doc3);" line, the bug disappears!

What am I doing wrong?
Thanks for the help...

Code:
import java.io.File;
import org.docx4j.Docx4J;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.Br;
import org.docx4j.wml.P;
import org.docx4j.wml.STBrType;

public class Test {
   public static void main(String[] args) throws Exception {
      String doc1 = "y:/tests/a.docx";
      String doc2 = "y:/tests/b.docx";
      String doc3 = "y:/tests/c.docx";
      String doc4 = "y:/tests/d.docx";
      String doc5 = "y:/tests/e.docx";

      // Concatenate doc1..doc3 into doc4
      append(true, doc4, doc1, doc2, doc3);

      // The bug appears here: load, run an XPath query, save
      WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(doc4));
      wordMLPackage.getMainDocumentPart().getJAXBNodesViaXPath("//w:p", false);
      wordMLPackage.save(new File(doc5)); // The page breaks have disappeared!
   }
   
   public static void append(boolean pageBreak, String output_DOCX, String... input_DOCX) throws Docx4JException {
      // Load all packages
      WordprocessingMLPackage[] wordMLs = new WordprocessingMLPackage[input_DOCX.length];
      for (int i = 0; i < input_DOCX.length; i++) {
         wordMLs[i] = WordprocessingMLPackage.load(new java.io.File(input_DOCX[i]));
      }
      // Concat packages
      WordprocessingMLPackage all = wordMLs[0];
      for (int i = 1; i < wordMLs.length; i++) {
         addPageBreak(all);
         all.getMainDocumentPart().getContent().addAll(wordMLs[i].getMainDocumentPart().getContent());
      }
      // Save
      Docx4J.save(all, new File(output_DOCX), Docx4J.FLAG_NONE);
   }
   
   private static void addPageBreak(WordprocessingMLPackage wordML) {
      Br np = new Br();
      np.setType(STBrType.PAGE);
      wordML.getMainDocumentPart().getContent().add(np);
      wordML.getMainDocumentPart().getContent().add(new P()); // Otherwise there is a line missing
      //wordML.getMainDocumentPart().addParagraphOfText("\u00A0");
   }
}

Re: getJAXBNodesViaXPath deletes page breaks

Posted: Fri Jun 27, 2014 6:39 pm
by jason
Code:
Br np = new Br();
np.setType(STBrType.PAGE);
wordML.getMainDocumentPart().getContent().add(np);

This creates invalid content: w:br should not be a sibling of w:p. It is paragraph content, so it belongs inside a run (w:r) within a paragraph.

My guess is that the invalid content is dropped when getJAXBNodesViaXPath unmarshals the content.

So your first step should be to add your w:br element at the correct place in the hierarchy.
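
To check whether a generated file has this problem, you could unzip the docx and inspect word/document.xml. Below is a minimal sketch that uses only the JDK (not docx4j) to count w:br elements sitting directly under w:body. The class name BrPlacementCheck and the method countBodyLevelBreaks are made up for this illustration, and the two XML strings are hand-written stand-ins for a real document.xml.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of docx4j.
class BrPlacementCheck {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts w:br elements that are direct children of w:body (invalid placement).
    static int countBodyLevelBreaks(String documentXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for the w: prefix to resolve
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(documentXml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return W_NS.equals(uri) ? "w" : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                return Collections.singletonList("w").iterator();
            }
        });
        Double n = (Double) xpath.evaluate(
            "count(/w:document/w:body/w:br)", doc, XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        String open = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>";
        String close = "</w:body></w:document>";
        // w:br as a sibling of w:p: the invalid structure
        String bad = open + "<w:p/><w:br w:type=\"page\"/><w:p/>" + close;
        // w:br inside w:p/w:r: the valid structure
        String good = open + "<w:p><w:r><w:br w:type=\"page\"/></w:r></w:p>" + close;
        System.out.println(countBodyLevelBreaks(bad));  // prints 1
        System.out.println(countBodyLevelBreaks(good)); // prints 0
    }
}
```

A non-zero count means at least one break was added at body level and may be lost on a later load and save.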

Re: getJAXBNodesViaXPath deletes page breaks

Posted: Fri Jun 27, 2014 6:55 pm
by flopflop
Thanks a lot, this works!

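Code:
// New body of addPageBreak(); additionally needs: import org.docx4j.wml.R;
P p = new P();
R r = new R();
Br br = new Br();
br.setType(STBrType.PAGE);
r.getContent().add(br);   // the break goes inside a run (w:r)
p.getContent().add(r);    // the run goes inside a paragraph (w:p)
wordML.getMainDocumentPart().getContent().add(p);   // only the paragraph is added to the body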