Carriage return in content controls not working

Posted: Fri Oct 11, 2013 10:05 am
by sylnsr
I am using CR+LF characters in my custom XML part, which is mapped to content controls. I can use the Word 2007 Content Control Toolkit to edit and confirm that I have working XML with CR+LF characters. I can open the document in Word and confirm that the new lines are shown in the document.

However, when I use docx4j to do the substitution on the same XML and save the file as a PDF (not sure whether the PDF step makes any difference), all the new lines are gone.

I am using docx4j 2.8.1.

Attachments:
1.png: Original XML data showing CR+LF
2.png: View from Word showing that CR+LF works in content control
3.png: Replacement XML with different text but still using CR+LF
4.png: PDF file created with docx4j using replacement XML showing new lines missing

Re: Carriage return in content controls not working

Posted: Fri Oct 11, 2013 10:30 am
by sylnsr
Am I supposed to be using "\n\r\f" instead of "\n\r"?

Re: Carriage return in content controls not working

Posted: Fri Oct 11, 2013 1:07 pm
by sylnsr
Scratch the last post, since \f (form feed) is an illegal character in XML.
I went ahead and rummaged about in the source and got to BindingTraverserXSLT.java, where I see it has:

StringTokenizer st = new StringTokenizer(r, "\n\r\f");
// tokenize on the newline character, the carriage-return character, and the form-feed character

.. so that appears to be invalid, since \f is not valid to put into the XML in the first place.

In any case the tokenizer seems to be pointless because when the string is extracted from the XML data-part with this line of code:

String r = BindingHandler.xpathGetString(pkg, customXmlDataStorageParts, storeItemId, xpath, prefixMappings);

.. then the string comes out as "I amthe newfield 1", so effectively any "\n\r\f", "\n\r" or "\n" gets removed completely, and there will never be anything for the string tokenizer to find in that string.

Based on this, I would say that this appears to be a multi-part bug.
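
One way to test this suspicion outside docx4j is to evaluate an XPath against a small document using only the JDK's javax.xml.xpath API and look at the control characters in the result. The sketch below is not docx4j's BindingHandler code; the class name, the helper xpathString and the element names root and field1 are made up for illustration. Note that, per the XML specification, a parser normalizes a literal CR+LF pair to a single LF, so a line feed is what should be left in the extracted string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathLineBreakCheck {

    // Hypothetical helper: parses an XML string and returns the
    // string value of the given XPath expression.
    static String xpathString(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }

    public static void main(String[] args) throws Exception {
        // Illustrative data part containing literal CR+LF line breaks.
        String xml = "<root><field1>I am\r\nthe new\r\nfield 1</field1></root>";
        String value = xpathString(xml, "/root/field1");

        // Escape the control characters so they are visible when printed.
        System.out.println(value.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

If the printed value still shows line breaks here but not in the generated document, the breaks are probably being stripped somewhere before or after the XPath step rather than by the XPath evaluation itself.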

Re: Carriage return in content controls not working

Posted: Sat Oct 12, 2013 1:41 pm
by jason
That tokenizer is saying "split on \n, \r or \f"; it does not require \f to be present, and it is agnostic as to whether \f is a legal character in XML or not.

In practice, you'd indicate a line break with '\n' or '\r' or both of them. Using both won't result in 2 line breaks.

To see, try:

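        import java.util.StringTokenizer;

        public class TokenizerDemo {
            public static void main(String[] args) {
                // Each character in "-+" is a delimiter on its own, not a sequence to match
                StringTokenizer st = new StringTokenizer("this-is+a-+test so there", "-+");
                while (st.hasMoreTokens()) {
                    System.out.println(st.nextToken());
                }
            }
        }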
 
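
The same behaviour can be checked with the line break delimiters themselves. This is a minimal sketch using the "\n\r\f" delimiter string quoted from BindingTraverserXSLT.java earlier in the thread; the class name LineBreakTokens and the helper splitLines are invented for this example and are not part of docx4j.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class LineBreakTokens {

    // Splits text on any one of \n, \r or \f. Adjacent delimiters
    // (such as a CR+LF pair) do not produce empty tokens.
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, "\n\r\f");
        while (st.hasMoreTokens()) {
            lines.add(st.nextToken());
        }
        return lines;
    }

    public static void main(String[] args) {
        // CR+LF breaks and plain LF breaks give the same three tokens.
        System.out.println(splitLines("I am\r\nthe new\r\nfield 1"));
        System.out.println(splitLines("I am\nthe new\nfield 1"));
    }
}
```

Both calls print [I am, the new, field 1]. Because StringTokenizer never returns empty tokens, a deliberately blank line (two breaks in a row) would also collapse into a single split.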

Re: Carriage return in content controls not working

Posted: Sun Oct 13, 2013 6:20 am
by sylnsr
Thanks for pointing that out. I am new to StringTokenizer and didn't realize it splits on any one of the delimiter characters rather than on the whole sequence. I resolved another bug in my own code which was stripping my breaks, and now it's working perfectly.

BTW, thank you for this awesome tool. Once I start making money from my site, I'll buy you a beer!

Re: Carriage return in content controls not working

Posted: Sun Oct 13, 2013 9:13 am
by jason
No worries :-)