Hibernate, UTF-8 and SQL Server 2005

I found out today that MS SQL Server handles Unicode in its own way. Instead of offering Unicode support at the database or table level, it requires each Unicode column to be created with a “national” type: nchar, nvarchar, or ntext.
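To see why the column type matters, here is a minimal sketch in plain Java that round-trips a string through two encodings. It assumes ISO-8859-1 as a stand-in for a single-byte varchar code page (the real encoding depends on the column collation) and UTF-16LE as a stand-in for the two-byte encoding used by the national types. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class NationalColumnDemo {

    // Encodes the text into the given charset and decodes it back,
    // roughly what happens when a value is stored in and read from a column.
    static String roundTrip(String text, Charset columnCharset) {
        return new String(text.getBytes(columnCharset), columnCharset);
    }

    public static void main(String[] args) {
        // Russian "Privet", written with escapes so the source file stays ASCII.
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // Single-byte code page (assumed stand-in for varchar): characters are lost.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // prints "??????"

        // Two-byte Unicode encoding (assumed stand-in for nvarchar): text survives.
        System.out.println(roundTrip(text, StandardCharsets.UTF_16LE).equals(text)); // prints "true"
    }
}
```

The single-byte round trip replaces every Cyrillic character with a question mark, which is the same symptom you would typically see when Unicode text ends up in a non-national column.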

MS SQL Server 2005 goes one step further and announces that the ntext, text, and image types will be removed in a future version.

From the SQL Server 2005 notes:

“ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.”
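The replacements named in that note can be kept in a small lookup table, for example when reviewing an existing schema. This is only a sketch: the class name and helper method are hypothetical, and the mapping contains nothing beyond the three pairs quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class DeprecatedTypes {

    // Deprecated type -> replacement, as listed in the SQL Server 2005 notes.
    static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("ntext", "nvarchar(max)");
        REPLACEMENTS.put("text", "varchar(max)");
        REPLACEMENTS.put("image", "varbinary(max)");
    }

    // Returns the suggested replacement, or the input unchanged
    // when the type is not one of the three deprecated ones.
    static String replacementFor(String sqlType) {
        return REPLACEMENTS.getOrDefault(sqlType.toLowerCase(Locale.ROOT), sqlType);
    }

    public static void main(String[] args) {
        System.out.println(replacementFor("ntext")); // prints "nvarchar(max)"
        System.out.println(replacementFor("int"));   // prints "int"
    }
}
```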

Hibernate does not seem to ship a dialect that maps character columns to these Unicode types. You have to write a custom dialect that extends SQLServerDialect and registers the new data types:

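import java.sql.Types;

import org.hibernate.dialect.SQLServerDialect;

/**
 * Hibernate dialect that maps character columns to SQL Server's
 * Unicode (national) types.
 *
 * @author icocan
 */
public class UnicodeSQLServerDialect extends SQLServerDialect {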

    public UnicodeSQLServerDialect() {
        super();

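        // Map character types to their Unicode (national) equivalents.
        // The VARCHAR mapping applies only to lengths up to 255; longer
        // columns fall back to the parent dialect's mapping.
        registerColumnType(Types.VARCHAR, 255, "nvarchar($l)");
        registerColumnType(Types.CHAR, "nchar(1)");
        // nvarchar(max) replaces the deprecated ntext type for large text.
        registerColumnType(Types.CLOB, "nvarchar(max)");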

        // Microsoft SQL Server 2000 supports bigint and bit
        registerColumnType(Types.BIGINT, "bigint");
        registerColumnType(Types.BIT, "bit");
    }
}
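
To use the dialect, Hibernate has to be pointed at it through the hibernate.dialect setting. The sketch below builds that setting as a plain java.util.Properties object. The package name com.example.dialect is an assumption, so substitute wherever the class actually lives, and the hbm2ddl value is only one possible choice.

```java
import java.util.Properties;

class DialectSettings {

    // Hypothetical package; adjust to the real location of UnicodeSQLServerDialect.
    static final String DIALECT_CLASS = "com.example.dialect.UnicodeSQLServerDialect";

    static Properties hibernateProperties() {
        Properties props = new Properties();
        // Tells Hibernate which dialect class to load.
        props.setProperty("hibernate.dialect", DIALECT_CLASS);
        // Schema generation is where the registered column types are used.
        props.setProperty("hibernate.hbm2ddl.auto", "update");
        return props;
    }

    public static void main(String[] args) {
        hibernateProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same two entries can go into hibernate.properties, or the Properties object can be passed to Configuration.addProperties. Keep in mind that the registered column types only affect the DDL Hibernate generates; columns that already exist keep their current types until they are altered.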