Archive for 2013

docx4j 3.0 and Maven

November 28th, 2013 by Jason

blog/2011/10/hello-maven-central/ walks you through the basics of using docx4j in an Eclipse project with the help of m2eclipse.

This post is about the different ways you can set up docx4j 3.0 with the help of Maven.

We’ll be using the following skeleton pom.xml:


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
	<modelVersion>4.0.0</modelVersion>
		
	<groupId>your.group</groupId>
	<artifactId>your.artifactp</artifactId>
	<name>nameless</name>
	<version>0.0.1-SNAPSHOT</version>
	<description>
		some description
	</description>


	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	</properties>

	<build>
		<plugins>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-dependency-plugin</artifactId>
				<version>2.0</version>
			</plugin>
		</plugins>
	</build>
	
	<dependencies>
	
		<!-- dependencies go here -->

				 	
	</dependencies>

</project>

Adding the core dependency

To use docx4j, including its LGPL XHTML import capability, just include the following dependency in your pom.xml:


		<dependency>
			<groupId>org.docx4j</groupId>
			<artifactId>docx4j-ImportXHTML</artifactId>
			<version>3.0.0</version>
		</dependency>

That’ll drag in docx4j, and all the other dependencies (you should be able to see then in Eclipse under Maven Dependencies, or by running mvn dependency:tree at a command prompt).
If you don’t want the XHTML import stuff, just use:

		<dependency>
			<groupId>org.docx4j</groupId>
			<artifactId>docx4j</artifactId>
			<version>3.0.0</version>
		</dependency>

(You should consider adding a docx4j.properties to your classpath)
Logging
Both of the above default to using log4j.  If you are happy with log4j, you’ll want a log4j.xml file unless you already have it on your classpath.  If you don’t, you can configure https://github.com/plutext/docx4j/blob/master/src/samples/_resources/log4j.xml to suit.
If you want to use something other than log4j for logging, well you can, since docx4j uses slf4j.
First you need to exclude the log4j stuff.

		<dependency>
			<groupId>org.docx4j</groupId>
			<artifactId>docx4j-ImportXHTML</artifactId>
			<version>3.0.0</version>
			<exclusions>
				<exclusion>
					  <groupId>org.slf4j</groupId>
					  <artifactId>slf4j-log4j12</artifactId>
				</exclusion>
				<exclusion>
					<groupId>log4j</groupId>
					<artifactId>log4j</artifactId>				
				</exclusion>
			</exclusions>
		</dependency>

Then you add in the dependencies for your other logging frameworks.   See further http://www.slf4j.org/ and slf4j in search.maven.org
JAXB
docx4j relies very heavily on JAXB.  With Java 6 or 7, usually it’ll use the JAXB included in that (though things can be different with application servers – see the deployment forums for details).
The point here is that there is an alternative JAXB implementation, called EclipseLink MOXy (see http://www.eclipse.org/eclipselink/moxy.php), which is very well supported by its developers.  You can try it with docx4j.  To do so, just include the following additional dependencies:


org.docx4j
docx4j-MOXy-JAXBContext
3.0.0


org.eclipse.persistence
org.eclipse.persistence.moxy
2.5.1

/sourcecode]

Since using MOXy with docx4j is all quite new, you may run into some minor issues.  If you do, please let us know in the docx4j forums (with sufficient info for us to reproduce what you are seeing!).  Thanks.

docx4j 3.0 released

November 26th, 2013 by Jason

On behalf of everyone who has contributed to docx4j, Plutext is pleased to announce that version 3 was released today.

You can get it from Maven Central, or from http://www.docx4java.org/docx4j/ (the jar, the dependencies, or everything including documentation zipped up)

Source code is available at GitHub or from the Maven Central link above.  Javadoc is at Maven Central.

For what you need to know about docx4j 3.0, please see this post.

The XHTML Import stuff is now a separate project (since it and its dependencies are LGPL, not ASLv2 like docx4j).

  • the three jars you need (docx4j-ImportXHTML, xhtmlrenderer, and iText) are included for convenience in the zip file above.  You can delete them if you don’t need or want XHTML import.
  • or you can get it from Maven Central

docx4j 3.0 uses slf4j for logging.  For convenience, log4j is the default implementation.  A follow-up post will explain more about logging config.

Thanks to everyone who has helped to make this release our best yet!

If you have questions pertaining to the use of docx4j, please post them in our forum, or on StackOverflow (rather than in comments to this post).

docx4j 3.0 beta

November 7th, 2013 by Jason

A beta of docx4j 3.0 is now available, at:

http://www.docx4java.org/docx4j/docx4j-3_0-beta2.zip [link updated 15 Nov]

That zip file contains docx4j, and all its dependencies.  To use it, add all the jars to your classpath.

Alternatively, Maven users can get the beta from our staging repo on GitHub.

<repositories>
    <repository>
        <id>docx4j-mvn-repo</id>
        <url>https://raw.github.com/plutext/docx4j/mvn-repo/</url>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
        </snapshots>
    </repository>
</repositories>

docx4j 3.0 beta is:


<dependency>
<groupId>org.docx4j</groupId>
<artifactId>docx4j</artifactId>
<version>3.0.0-SNAPSHOT</version>
</dependency>

Our last blog post outlines the major things to be aware of in v3.

Additional notes:

  • For convenience, the zip file also contains docx4j-ImportXHTML, and its dependencies, which are LGPL.  You can delete these if you wish.  They aren’t in the mvn staging repo.
  • To see any logging, you’ll need to add an slf4j implementation.
  • You might want to add a docx4j.properties file

You can find updated Getting Started guide in docx|pdf formats at http://www.docx4java.org/docx4j/.

Feedback welcome.  You can reply here, or to the post in the docx4j forums.

All going smoothly, we’ll progress to final release over the next couple of weeks, so the sooner your feedback, the better!

docx4j 3.0 – what you need to know

October 18th, 2013 by Jason

docx4j 3.0 (beta for which will be available shortly) contains a lot of changes, some big, some small.

Here are the most visible (see our changelog for the rest):

Logging

docx4j 3.0 uses slf4j, instead of log4j.

As the slf4j website puts it:

The Simple Logging Facade for Java (SLF4J) serves as a simple facade or abstraction for various logging frameworks (e.g. java.util.logging, logback, log4j) allowing the end user to plug in the desired logging framework at deployment time.

So you need the slf4j api jar on your classpath:

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.5</version>
</dependency>

If you want to use log4j, then include it, and:

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.5</version>
</dependency>

XHTML Import

The XHTML Import functionality is now a separate project on GitHub.

The reason being that its main dependency – Flying Saucer – is licensed under LGPL v2.1 (as opposed to ASL v2, which docx4j’s other dependencies use).

If you want this functionality, you have to add these jars to your classpath.  We’ll update this post with their coordinates once they are in Maven Central.

Docx4j facade

3.0 contains a facade providing clean access to some typical uses of docx4j:
  • Loading a document
  • Saving a document
  • Binding xml to content controls in a document
  • Exporting the document (to HTML, or PDF and other formats supported by the FO renderer)

You don’t have to use this – in that existing code should continue to work – but the facade is the right way to do things.  Behind the facade is a major rethink/cleanup to the export architecture/implementation, contributed by Alberto.

MOXy

The key technology underlying docx4j – and a major differentiator from Apache POI – is JAXB.

There is a JAXB reference implementation; the JAXB baked into Java 6 and 7 is based on it.

Prior to v3, you had to use the reference implementation, or the implementation included in the JDK.

With v3, you can choose to use EclipseLink MOXy instead.  To do so, simply include docx4j-MOXy-JAXBContext-3.0.0.jar and the MOXy jars on your classpath.

Sample code

The docx4j samples have relocated to src/samples

docx4j in a single page

May 15th, 2013 by Jason

Here’s a single A4 page reference/overview of docx4j aka a cheat sheet, in PDF or PNG format.

This one is focused on docx files (WordprocessingML).

I’ll create something similar for pptx and xlsx over coming days.

docx4j/pptx/xlsx online code generation

May 15th, 2013 by Jason

Just launched is http://webapp.docx4java.org

You should be able to see it in the menu at the top right of this website (if not, reload the web page…).

There are three things you can do with it right now:

• Explore your docx/pptx/xlsx and its representation in docx4j

• Convert  docx to PDF or XSL FO

• Merge docx files (eg cover letter plus contract) into a single docx, using Plutext’s MergeDocx. Or the same thing for pptx files, using MergePptx.

Here I want to focus on the first of these.

After you’ve uploaded your docx/pptx/xlsx, the first thing you see is like docx4j’s PartsList sample:

Here, I’ll click in the left hand column to look at the main document part, document.xml

When I do that, I see the XML:

No surprises there.

But notice the hyperlinks.  Here I’ll just click on the first w:p.

What you get back, is Java source code to create that complete structure:-

As you can see from the image above, both styles of code (as described in docx4j’s Getting Started document) are produced for you.  With a bit of luck, you can cut/paste either into your IDE (Eclipse or whatever), and just run with it!

To actually see the created object in an Office document, you’ll still need to add the created object to a part.  See Getting Started, or the cheat sheet for how to do that.

I hope this helps you to create/modify your Office documents more efficiently,with docx4j!

Do let us know what you think in the comments, or in docx4j’s forums.