Subscribe for updates on posts
Be the first to read the latest news

Create and populate your database independent of the db server using ant, dbunit and hibernate

January 31st, 2007 by Martin Paraschiv in Java, General

Database creation is such a common task that you must have some scripts and tips and tricks at hand, in order to have it done as quickly as possible. What about when there’s more than one type of RDBMS in your project that you need to handle. Things get a little messy.
How about using just one XML file and two or three libraries, regardless of the RDBMS type. Wouldn’t that be cool?
I’m saying, all you need is ant, dbunit and Hibernate as libraries and a couple of files for data and database connection configuration. With this you’ll be able to populate with only one file init-dataset.xml different rdbms like Oracle, MySql, MsSql, Postgres. For a complete list of supported rdbms have a look at dbunit SupportedRDBMS
Database creation and population process is in four steps:
1.Create an ant build.xml file and setup a project with path for used libraries
2.Generate DDL from Hibernate mapping
3.Create database schema running generated DDL
4.With DBUnit populate database created above, providing as input a seed XML file with data you’re database must be populated with.

<project name=”OMS” basedir=”.” default=”deploy-reload-classes”>
<property file=”web/WEB-INF/classes/“/>
<property name=”ddl.file” value=”ddl.sql”/>
<property name=”lib.dir” value=”lib”/>

<path id=”project.classpath”>
<fileset dir=”${lib.dir}”>
<include name=”**/*.jar”/>

<!– dbnunit task definition –>
<taskdef name=”dbunit”


Typical ant project setup. In order to use dbunit task we need to define it.

<target name=”hbm2ddl-schema” description=”Generate DDL from Hibernate mapping”     depends=”build”>
<java classname=”net.sf.hibernate.tool.hbm2ddl.SchemaExport” fork=”true”>

<!– Hibernate Properties –>
<jvmarg value=”-Dhibernate.dialect=${databaseDialect}”/>

<!– Mapping Files –>
<arg value=”ObjectModel.hbm.xml”/>

<classpath refid=”project.classpath”/>


These lines should be familiar to an hibernate user.

<target name=”sql-run-ddl” description=”Run DDL script” depends=”hbm2ddl- schema”>
<sql url=”${databaseURL}” driver=”${databaseDriver}”
userid=”${databaseUser}” password=”${databasePassword}”
onerror=”continue” autocommit=”true” print=”yes”>
<classpath refid=”project.classpath”/>
<transaction src=”${ddl.file}”/>

all database related variables should be defined into a properties file, I called it, see step 1.

<target name=”dbunit-init-db” description=”using dbunit for testing” depends=”sql-run-ddl”>
<dbunit driver=”${databaseDriver}”
<operation type=”DELETE”
<operation type=”INSERT” format=”flat”

after initializing database we can populate it.

For a simple database setup the script above should do it.
But sometimes we need to initialize more complex databases, where we have some foreign key circular dependency.

Some Tips & Tricks

Unfortunately dbunit is not perfect, meaning it has some bugs. One is that it doesn’t handle too well circular reference between tables, and second is when a db is initialized. At initialization operation, in the seed dataset file each new row for each new table must have all fields populated, otherwise unpopulated fields (from first row, but populated in subsequent rows) will be set to null;
Here are some workarounds.
1.instead of using a row with missing column, use one with complete columns.
For example:

<user ID=’1′ />
<user ID=’2′ NAME=’user name’/>

will insert null in all NAME columns of user table. The solution is to use

<user ID=’2′ NAME=’user name’ />
<user ID=’1′ />

1.instead of using INSERT operation, use REFRESH operation.At REFRESH operation, data of existing rows are updated and non-existing rows get inserted. For clean initalization, perform CLEAN before.

<user ID=’1′ col2=’val1’/>
<user ID=’2′ NAME=’user name’ col2=’val2’/>
<!– if met twice id=2, than the second time it means there’s an update –>
<user ID=’2′ NAME=’user name’ />

In the update part you must update only fields that you know they were setup to null.
The second method is also useful when you need to populate a db with data that contains circular reference. Let’s suppose we have users and portfolios. And for some reason we would like to have a reference of an portfolio into user’s table, and a reference of user into portfolio table. The data will look something like this:

<user ID=’1′ name=’default user’ portfolio_id=’1′ />
<portfolio ID=’1′ user_id=’1′ name=”default portfolio”/>

If we’ll use the dataset above with INSERT operation, dbunit will report an error, that portfolio with id=1 doesn’t exists.

Using REFRESH operation, with an appropriate dataset the problems will disappear.

<user ID=’1′ name=’default user’ />
<portfolio ID=’1′ user_id=’1′ name=”default portfolio”/>
<!– if met twice id=2, than the second time it means there’s an update –>
<user ID=’2′ NAME=’user name’ portfolio_id=’1’/>

As you can see portfolio_id is missing from the first row. So dbunit gets fooled. Next record will be inserted with reference to user table. And the last record will update reference of user record to portfolio record.

The bottom line: in spite of it’s imperfections, dbunit has reached 2.2 version after more than one year breakend, it can be used very efficiently for other purposes than database testing. A database setup operation independent of RDBMS type can be done easily enough with DBUnit. I hope you find this interpretation useful.

Good luck!

You might also like

DBUnit loses data If not used properly Dbunit can loose data. I'm using dbunit version 2.1 which has the following bug...
Hibernate annotations – default value If you wanted to set the default value using hibernate annotations, you've probably had some difficulties,...
How to set the default charset to utf-8 for create table when using hibernate with java persistence annotations Yesterday I encountered a problem when trying to persist a String value into a MySQL column of type 'text'...
Hibernate Annotations 3.2.1 – Lucene sort Keeping database and Lucene index in sync is easily done using the new Lucene-Hibernate integration. Just...

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.