
Web Interface - Directory Listing

web.xml includes the following servlet mapping:

<servlet-mapping>
  <servlet-name>ServletGetAs</servlet-name>
  <url-pattern>/repository/*</url-pattern>
</servlet-mapping>

This servlet can display a directory listing, which offers each document in several formats, one per column.

A document is either chunked or not. "Chunked" means that the document.xml file in the package has been replaced by a series of content controls, each stored separately in the JCR, plus a skeleton.xml file which identifies them by ID. Typically, a document is chunked when it is first opened via the Word add-in (which gets the document with a ?word query string).

This page describes, column by column, the formats in which a docx can be fetched. The format is selected by the query string. ?word is described first, since it is the only one used in production. The others are occasionally useful to developers.
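The mapping from query string to format can be pictured roughly as follows. This is only a sketch: the enum FetchFormat and the method fromQueryString are hypothetical names invented for illustration, and the servlet's actual dispatch code is not shown on this page.

```java
/** Hypothetical enum; the constants mirror the columns described on this page. */
enum FetchFormat {
    WORD, STREAM, SHOW_CHUNKS, DUMP, ROUND_TRIP_TEST;

    /**
     * Maps the raw query string (without the leading '?') to a format.
     * Assumption: an absent or empty query string means "stream".
     */
    static FetchFormat fromQueryString(String query) {
        if (query == null || query.isEmpty()) {
            return STREAM;
        }
        switch (query) {
            case "word":       return WORD;
            case "showchunks": return SHOW_CHUNKS;
            case "dump":       return DUMP;
            case "rtt":        return ROUND_TRIP_TEST;
            default:
                throw new IllegalArgumentException("Unknown query string: " + query);
        }
    }
}
```

A servlet's doGet could, for example, pass request.getQueryString() to such a method and branch on the result.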

For Word (?word)

Of all the various formats in which a document can be fetched, this is the only one used in production.

The document is returned as a docx package that conforms to the Open Packaging Conventions. This is the format the Word 2007 add-in requests. If you are on Windows with Word 2007 (or Word 2003 with the compatibility pack installed), you should be able to click on this link and have the document open in Word.

If it is a plain docx document, it is chunked first (i.e. content controls and a skeleton.xml are written to the repository) and then returned as described. To return it as a conforming package, a Main Document part is constructed by writing the content controls in the order specified in skeleton.xml.
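The reassembly step can be sketched as below. This is a minimal illustration, not the servlet's code: MainDocumentAssembler and assemble are hypothetical names, the IDs are assumed to have already been read from skeleton.xml (whose format is not described here), and each stored content control is assumed to be a <w:sdt> fragment that uses the w: prefix without redeclaring it.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper that rebuilds a Main Document part from stored chunks. */
final class MainDocumentAssembler {
    /** The WordprocessingML main namespace. */
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /**
     * Concatenates the content controls in skeleton order inside w:body.
     * idsInSkeletonOrder: IDs in the order skeleton.xml lists them (assumed input).
     * sdtXmlById: XML of each stored content control, keyed by ID (assumed input).
     */
    static String assemble(List<String> idsInSkeletonOrder,
                           Map<String, String> sdtXmlById) {
        StringBuilder xml = new StringBuilder();
        xml.append("<w:document xmlns:w=\"").append(W_NS).append("\"><w:body>");
        for (String id : idsInSkeletonOrder) {
            String sdt = sdtXmlById.get(id);
            if (sdt == null) {
                throw new IllegalStateException("No content control stored for ID " + id);
            }
            xml.append(sdt);
        }
        xml.append("</w:body></w:document>");
        return xml.toString();
    }
}
```

The point of the sketch is that the order comes from the skeleton, not from the order in which the chunks happen to be stored.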

Stream (No query string)

The document is streamed as the Jackrabbit WebDAV servlet would normally stream it. It is assembled from its child nodes.

If the document is chunked, Word 2007 will *not* be able to open it, since the document will not conform to the Open Packaging Conventions (as described above, it doesn't have a Main Document part).

It can still be useful to download the file and unzip it to inspect its parts.
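Since a docx is a zip archive, the downloaded file can also be inspected programmatically. The sketch below lists the entries using the standard java.util.zip classes. PackageInspector and its methods are hypothetical names, and the check assumes the Main Document part sits at the conventional word/document.xml location (strictly, its location is given by the package relationships).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper for inspecting a downloaded docx package. */
final class PackageInspector {
    /** Returns the names of all zip entries in the stream, in archive order. */
    static List<String> listParts(InputStream docx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Assumption: the Main Document part uses the conventional name. */
    static boolean hasConventionalMainDocument(List<String> partNames) {
        return partNames.contains("word/document.xml");
    }
}
```

For a chunked document fetched without a query string, one would expect the second method to return false, which is consistent with Word refusing to open the file.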

Show Chunks (?showchunks)

The XML comprising each content control is rendered as plain text in your web browser. A content control is represented in WordprocessingML by a "structured document tag" <w:sdt>.

This is useful to see quickly how the document has been chunked. Usually this will be by paragraph, in which case you'll see a <w:p> within the <w:sdtContent> element.
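The same check can be made in code. The following sketch parses one content control with the JDK's DOM parser and reports whether its <w:sdtContent> directly contains a <w:p>. ChunkInspector and isParagraphChunk are hypothetical names, and the input is assumed to be a standalone <w:sdt> fragment that declares the w: namespace itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper for examining a single chunk. */
final class ChunkInspector {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** True if the first w:sdtContent has a w:p element as a direct child. */
    static boolean isParagraphChunk(String sdtXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the *NS lookups below
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(sdtXml.getBytes(StandardCharsets.UTF_8)));
            NodeList contents = doc.getElementsByTagNameNS(W_NS, "sdtContent");
            if (contents.getLength() == 0) {
                return false;
            }
            for (Node child = contents.item(0).getFirstChild();
                 child != null; child = child.getNextSibling()) {
                if (child instanceof Element
                        && W_NS.equals(child.getNamespaceURI())
                        && "p".equals(child.getLocalName())) {
                    return true;
                }
            }
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a parseable w:sdt fragment", e);
        }
    }
}
```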

Dump (?dump)

The tree of JCR nodes is walked and the contents dumped as plain text in your web browser.

This helps you to see how the document is stored in the repository.

You can also see that the various changes made to a document over time are stored in the transforms property on the document's jcr:content node. This is for the purpose of communicating them to other active editing sessions. All changes are also stored as versions in the repository.

Round trip test (?rtt)

Probably not very useful.

This tests the load and save routines by saving the document back to the JCR with a name -n.docx. The document is chunked along the way.