Yesterday I encountered a problem when trying to persist a String value into a MySQL column of type ‘text‘ (the problem also occurs for column types ‘tinytext‘, ‘mediumtext‘ etc.)

The first confusing thing was that the error message returned by mysql was
java.sql.BatchUpdateException: Data truncation: Data too long for column 'my_column'

The confusing thing was that the column was of type ‘text‘ (length=65536), and I then changed it to type ‘mediumtext‘ (length=16777215), but the value that I was trying to persist had only around 6000 characters, so it was not a problem with the data length.
Instead the one thing that was noticeable about my String value was that it contained non-ASCII characters (particularly Romanian characters with diacritics).

After a little search on the web, I found the following mysql bug description
A comment on that page ([8 Mar 2006 9:53] [ name withheld ]) also contains some more links to similar bug descriptions.

Now it was a little clearer that the problem was caused by the Romanian characters (and the error message was just stupid), so the next thing to do was to figure out why the database table didn’t store UTF-8 characters.

Although I created the database with
create database mydb character set utf8 collate utf8_general_ci;
still the tables created inside it had the collation latin1_swedish_ci.

My application uses Hibernate for persistence and uses java persistence annotations to specify the hibernate mappings.

In my case the code is something like:

import javax.persistence.*;

@Table(name = "my_table")
public class MyClass implements Serializable {
    private String myValue;

    @Column(name = "my_column", columnDefinition = "mediumtext", length = 16777215)
    public String getMyValue() {
        return this.myValue;

    public void setMyValue(String value) {
        this.myValue = value;

The database schema is automatically generated based on the classes and hibernate mappings, by using the following ant target:

    <taskdef name="hibernatetool"

    <target name="hbm2ddl-schema"
            description="Generates the database schema from hibernate mappings">
        <hibernatetool destdir="">
            <classpath refid="project.classpath"/>
            <annotationconfiguration configurationfile="${classes.dir}/hibernate.cfg.xml"/>
            <hbm2ddl export="true" drop="false" create="true" haltonerror="true"/>

If I had created the database tables by writing the SQL DDL by hand, I would have used the following script:

  create table `my_table` (
      `Id` int(11) NOT NULL auto_increment,
      `my_column` mediumtext NOT NULL default ”,

But in my case the database schema is automatically generated by HibernateTools, so I have to specify the default charset for the table somewhere in the @Table annotation, or hibernate configuration, or hibernatetool parameters.

After trying a few options (and I must admit I did not search thoroughly, so if you know of some better way to specify the default charset, please leave a comment), I chose the following solution:
I implemented my own custom org.hibernate.dialect.Dialect, by subclassing org.hibernate.dialect.MySQL5InnoDBDialect:

import org.hibernate.dialect.MySQL5InnoDBDialect;

 * Extends MySQL5InnoDBDialect and sets the default charset to be UTF-8
 * @author Sorin Postelnicu
 * @since Aug 13, 2007
public class CustomMysqlDialect extends MySQL5InnoDBDialect {

    public String getTableTypeString() {
        return " ENGINE=InnoDB DEFAULT CHARSET=utf8";

and used it in my hibernate.cfg.xml:

        <property name="hibernate.dialect">my.package.CustomMysqlDialect</property>

10 responses to “How to set the default charset to utf-8 for create table when using hibernate with java persistence annotations

  1. I’ve tried a few other options, but I had no luck and I dont know why…

    I tried changing the con.url

    i tried adding a hibernate charset property

    the only sollition that actually changed my tables to UTF-8 was yours..

  2. Ik kicked out my tags *smile*..

    First try:
    hibernate.connection.url = jdbc:mysql://localhost:3306/kaartenbalie_test2?characterEncoding=UTF-8

    Second try:
    hibernate.connection.charSet = UTF-8

    no luck..

  3. If your application is the only application on DB, than you may try to find the property for default charset in DB server config.

    For example in MySQL it is: “default-character-set=utf8”

  4. This is the way you do it.

    In your hibernate.cfg.xml you use another dialect.


    depending on if you use mysql 4.x

    Then to get utf8 on the tables you have to create the database as utf8.
    create database someDatabase default charset utf8;

    Then hibernate will automatically set the charset for the tables to utf8.
    Hope this helps


  5. Ops, I see my properties didnt appear. Here it is:

    property name=”hibernate.dialect” value=”org.hibernate.dialect.MySQL5InnoDBDialect”


    property name=”hibernate.dialect” value=”org.hibernate.dialect.MySQLInnoDBDialect”

  6. I have also used your technic, because I encountered the same problem, and it works very well.

  7. Easier than using your own dialact might be switching the database to UTF-8 e.g.


Leave a Comment:

Your email address will not be published. Required fields are marked *