
how to remove white space before "addAltChunk"

Posted: Wed Aug 08, 2012 11:14 am
by sarad
Hi,
I am creating a Word file that has two parts.
In the first part, I replace a couple of words using:
Code:
Object obj = XmlUtils.unmarshallFromTemplate(xml, mappings);
documentPart.setJaxbElement((Document) obj);


In the second part, I insert HTML using:
Code:
wordMLPackage.getMainDocumentPart().addAltChunk(AltChunkType.Html, htmlCode.getBytes());

This works fine. However, a lot of white space (perhaps 7-8 blank lines) appears between the first and second parts. I want my HTML part (the altChunk) to appear immediately after the first part.
Is there anything I am missing?

Thanks in advance!!

Re: how to remove white space before "addAltChunk"

Posted: Thu Aug 09, 2012 6:36 pm
by jason
I guess you need to work out where the blank lines are coming from.

Are they empty w:p elements?

Were they in the document before you added the altChunk? If not, do they correspond to content in your HTML?

This sounds like a question about the behaviour of Microsoft Word; if that is correct, which version of Word?
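
One way to answer the first two questions is to unzip the docx and outline the direct children of w:body in word/document.xml. The following is only a minimal sketch using the JDK's DOM parser rather than docx4j; the class name BodyOutline and the rule "a paragraph is empty if it has no child elements other than w:pPr" are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of docx4j): outlines the children of w:body.
final class BodyOutline {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumption: "empty" means no child elements other than w:pPr.
    static boolean isEmptyParagraph(Element p) {
        for (Node c = p.getFirstChild(); c != null; c = c.getNextSibling()) {
            boolean isPPr = W_NS.equals(c.getNamespaceURI())
                    && "pPr".equals(c.getLocalName());
            if (c.getNodeType() == Node.ELEMENT_NODE && !isPPr) {
                return false;
            }
        }
        return true;
    }

    // Returns one label per direct child element of w:body, in document order.
    static List<String> outline(String documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(documentXml)));
            Node body = doc.getElementsByTagNameNS(W_NS, "body").item(0);
            List<String> labels = new ArrayList<>();
            for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue;
                }
                Element e = (Element) n;
                boolean isPara = W_NS.equals(e.getNamespaceURI())
                        && "p".equals(e.getLocalName());
                labels.add(isPara && isEmptyParagraph(e)
                        ? "p (empty)" : e.getLocalName());
            }
            return labels;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse document.xml", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
                + "<w:p><w:r><w:t>Intro</w:t></w:r></w:p>"
                + "<w:p><w:pPr><w:jc w:val=\"center\"/></w:pPr></w:p>"
                + "<w:p/><w:altChunk/><w:sectPr/>"
                + "</w:body></w:document>";
        System.out.println(outline(xml));
    }
}
```

A run of "p (empty)" labels directly before "altChunk" would suggest the blank lines were already in the document rather than produced by the HTML.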

Re: how to remove white space before "addAltChunk"

Posted: Fri Aug 10, 2012 4:23 am
by sarad
Hi Jason,
Yes, the white space comes from the <w:p> elements shown below:
Code:
  <w:p w:rsidR="00521AD4" w:rsidRPr="00B73858" w:rsidRDefault="00521AD4" w:rsidP="006B573E">
    <w:pPr>
      <w:jc w:val="center"/>
      <w:rPr>
        <w:color w:val="800080"/>
      </w:rPr>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRPr="002F6217" w:rsidRDefault="00521AD4" w:rsidP="006B573E">
    <w:pPr>
      <w:jc w:val="center"/>
      <w:rPr>
        <w:color w:val="000000"/>
      </w:rPr>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRDefault="00521AD4" w:rsidP="002F6217">
    <w:pPr>
      <w:ind w:left="1440" w:firstLine="720"/>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRPr="00B73858" w:rsidRDefault="00521AD4" w:rsidP="002F6217">
    <w:pPr>
      <w:ind w:left="1440" w:firstLine="720"/>
      <w:rPr>
        <w:color w:val="800080"/>
      </w:rPr>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRDefault="00521AD4" w:rsidP="00E87F3F">
    <w:pPr>
      <w:ind w:left="1440" w:firstLine="720"/>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRPr="00B73858" w:rsidRDefault="00521AD4" w:rsidP="007E713E">
    <w:pPr>
      <w:ind w:left="1440" w:firstLine="720"/>
      <w:rPr>
        <w:color w:val="800080"/>
      </w:rPr>
    </w:pPr>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRPr="00B73858" w:rsidRDefault="00521AD4" w:rsidP="007E713E">
    <w:pPr>
      <w:ind w:left="1440" w:firstLine="720"/>
      <w:rPr>
        <w:sz w:val="16"/>
        <w:szCs w:val="16"/>
      </w:rPr>
    </w:pPr>
    <w:r>
      <w:rPr>
        <w:sz w:val="16"/>
        <w:szCs w:val="16"/>
      </w:rPr>
      <w:tab/>
    </w:r>
  </w:p>
  <w:p w:rsidR="00521AD4" w:rsidRDefault="00521AD4"/>
  <w:p w:rsidR="00000000" w:rsidRDefault="00093A7C">
    <w:pPr>
      <w:pStyle w:val="NormalWeb"/>
      <w:divId w:val="107701330"/>
    </w:pPr>

 

Strangely, the space appears only if the file already contains text before the HTML is inserted. If I inject the HTML into a blank Word file, the result is as desired, with no blank space at the top.
I am using Word 2010 for this test.

Also, is there a dedicated method (or example) for getting rid of the white space (the <w:p> elements)?

Thanks for the help!!

Re: how to remove white space before "addAltChunk"

Posted: Fri Aug 10, 2012 9:10 am
by jason
Are there corresponding elements in the HTML?

As for dedicated methods to remove empty paragraphs, no, there aren't any. You could easily roll your own, but it wouldn't be much use would it, if you have to open the docx in Word first to convert the altChunk from HTML to WordML?

Note that docx4j can import XHTML (see the samples), so if it is well-formed xml, you could try doing that instead.
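
For what "roll your own" might look like, here is a rough sketch that works on the document.xml markup directly with the JDK's DOM API, not a docx4j method. The class name EmptyParagraphs is hypothetical, and the rule is an assumption: a paragraph is removed only if it is a direct child of w:body, has no child elements other than w:pPr, and does not carry a w:sectPr (which would mark a section break). As noted above, it can only see paragraphs already present in document.xml, not whatever Word later generates from the altChunk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of docx4j): strips empty body-level paragraphs.
final class EmptyParagraphs {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static boolean isW(Node n, String localName) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && W_NS.equals(n.getNamespaceURI())
                && localName.equals(n.getLocalName());
    }

    // Assumption: removable means "only w:pPr inside, and no section break".
    static boolean isRemovable(Element p) {
        for (Node c = p.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            if (!isW(c, "pPr")) {
                return false; // a run, hyperlink, bookmark, math, etc.
            }
            if (((Element) c).getElementsByTagNameNS(W_NS, "sectPr").getLength() > 0) {
                return false; // paragraph marks the end of a section
            }
        }
        return true;
    }

    static String removeEmptyParagraphs(String documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(documentXml)));
            Node body = doc.getElementsByTagNameNS(W_NS, "body").item(0);

            // Collect first, then remove, so iteration is not disturbed.
            List<Node> doomed = new ArrayList<>();
            for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (isW(n, "p") && isRemovable((Element) n)) {
                    doomed.add(n);
                }
            }
            for (Node n : doomed) {
                body.removeChild(n);
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not process document.xml", ex);
        }
    }
}
```

The rule is deliberately conservative: a paragraph containing only a tab run, like one of those in the listing above, is kept, and paragraphs inside table cells are left alone because a cell needs at least one paragraph.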

Re: how to remove white space before "addAltChunk"

Posted: Sat Aug 11, 2012 6:20 am
by sarad
Hi Jason,
Thanks for your responses so far. I ran a few more tests and realized that the white space appears when I load an existing file into a WordprocessingMLPackage. When I create a new package, no such space appears.

So I am thinking: instead of changing the loaded file, why not move all the JAXB elements and the altChunk (HTML) into a new package?

So, roughly, my design would be:

Code:
// 1. Load the existing package and create a new, empty one.
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(path));
WordprocessingMLPackage wordMLPackage2 = WordprocessingMLPackage.createPackage();

// 2. Fetch the main document part of each package.
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
MainDocumentPart documentPart2 = wordMLPackage2.getMainDocumentPart();

// 3. Get the document element from the first (loaded) document part.
org.docx4j.wml.Document wmlDocumentEl = (org.docx4j.wml.Document) documentPart.getJaxbElement();

// 4. Marshal the document element to an XML string.
String xml = XmlUtils.marshaltoString(wmlDocumentEl, true);

// 5. Create a map for the replacement mappings.
HashMap<String, String> mappings = new HashMap<String, String>();

// 6. Convert the XML, with the values replaced, back to object form.
Object obj = XmlUtils.unmarshallFromTemplate(xml, mappings);

// 7. Set the result on the document part of the newly created package
//    (the second one) instead of the loaded one.
documentPart2.setJaxbElement((org.docx4j.wml.Document) obj);

// 8. Also add the HTML as an altChunk.
wordMLPackage2.getMainDocumentPart().addAltChunk(AltChunkType.Html, html.getBytes());

// 9. And save it.
wordMLPackage2.save(new java.io.File("C:/test2.docx"));


I am not sure whether this strategy makes sense. When I run it, Word cannot open the resulting file and complains that a part is missing or invalid (Location: Part: /word/document.xml, Line: 1, Column: 0).
I am new to JAXB and docx4j and would appreciate your guidance.

Thanks
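
One thing that may be worth ruling out (an assumption, not a confirmed diagnosis of the error above) is that the copied document.xml still refers to relationship ids, for example for headers, footers or images, that exist only in the original package. The sketch below uses plain JDK DOM code rather than docx4j; it takes the text of word/document.xml and word/_rels/document.xml.rels from the saved file and lists ids that are referenced but not defined. The class name DanglingRelationships is hypothetical.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical diagnostic (not part of docx4j).
final class DanglingRelationships {
    // Namespace of r:id, r:embed and similar attributes in document.xml.
    static final String R_NS =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    // Namespace of the Relationship elements in the .rels part.
    static final String RELS_NS =
            "http://schemas.openxmlformats.org/package/2006/relationships";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    // Ids referenced from document.xml that have no Relationship entry.
    static Set<String> missingIds(String documentXml, String relsXml) {
        Set<String> referenced = new TreeSet<>();
        NodeList all = parse(documentXml).getElementsByTagNameNS("*", "*");
        for (int i = 0; i < all.getLength(); i++) {
            NamedNodeMap attrs = all.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node a = attrs.item(j);
                if (R_NS.equals(a.getNamespaceURI())) {
                    referenced.add(a.getNodeValue());
                }
            }
        }
        NodeList rels = parse(relsXml).getElementsByTagNameNS(RELS_NS, "Relationship");
        for (int i = 0; i < rels.getLength(); i++) {
            referenced.remove(((Element) rels.item(i)).getAttribute("Id"));
        }
        return referenced;
    }
}
```

An empty result would point away from this explanation; a non-empty one would mean the new package also needs the parts those ids refer to.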