CFMX Mysql Query String for UTF8 and Multiple Queries

While testing some of the code we've written for UTF8 support for double byte characters (Chinese, Japanese, Malaysian) I discovered that there are some voodoo JDBC connection string parameters that have to be set to make it work.

Firstly, CFMX used UTF8 by default so it's already configured. You need to make sure that your MySQL DB table has a Charset of utf8 as per example below:

CREATE TABLE `utf8test` (
   `id`    int(11) NOT NULL,
   `data`    varchar(25) NULL,
   PRIMARY KEY(`id`)
)CHARACTER SET utf8 COLLATE utf8_general_ci;

Then you need to add the following params to the connection string:

  • characterEncoding=UTF8
  • characterSetResults=UTF8

I also add the "allowMultiQueries=true" string which allows you to execute multiple queries in the one statement. It is disabled by default as it can open up a window for SQL Injection attacks, but this can be mitigated by always using CFQUERYPARAM's.

So the final connection string looks like this:

Hope it helps, Cheers, Mark

Comments
Brontojoris's Gravatar I was under the impression that by setting your tables to UTF-8, which I always do anyway when I use the MySQL Data Migration Wizard, that ColdFusion automatically returned UTF-8 results. I say this because my curly quotes and other funky characters came though as expected anyway.

Are you able to provide a demonstration?
# Posted By Brontojoris | 2/10/08 10:17 AM
Mark Lynch's Gravatar Hi Brontojoris,

Here is the sample code I wrote when testing it - not sure how well it will work in the comments:

<cfset datasource = "demo">
<cfset testData = "相思红豆生南">
<cfquery name="test" datasource="demo">
   DELETE FROM utf8test;
</cfquery>

<cfquery name="test" datasource="demo">
   INSERT INTO utf8test (id,data,datalong)
   VALUES(1,<cfqueryparam value="#testData#">,<cfqueryparam value="#testData#">)
</cfquery>

<cfquery name="test" datasource="demo">
   SELECT * FROM utf8test
</cfquery>

<cfdump var='#test#'>

I used the following table for testing:

CREATE TABLE `utf8test` (
`id` int(11) NOT NULL,
`data` varchar(25) default NULL,
`datalong` text,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


Without the parameters on the connection string it was returning ????? instead of the chinese characters.

Interested to see if you get the same results - please let me know.

Regards,
Mark
# Posted By Mark Lynch | 2/10/08 10:50 AM
Mark Lynch's Gravatar I just updated the comment above - my DB connection for this wasn't allowing the Chinese UTF8 characters.

Brontojoris - let me know if you see the same results - it could be to do with the fact that my db's were not all initially created as UTF8.

Cheers,
Mark
# Posted By Mark Lynch | 2/10/08 11:01 AM
Rob Shaw's Gravatar Thanks Mark the connection string worked brilliantly I have a very happy language department now :)
# Posted By Rob Shaw | 5/29/08 6:08 AM
Belinda Weinbrecht's Gravatar I am having the same problem with CFMX 7 and MYSQL 5.
Our database is configured as collation latin1_swedish_ci.

When I set the connection string on the JDBC URL to use useUnicode=true&characterEncoding=UTF8, I still get ??.

any ideas?
# Posted By Belinda Weinbrecht | 7/10/08 6:23 PM
Mark Lynch's Gravatar Hi Belinda,
The collation is set to latin1_general_ci then your table is probably of type latin1. You need to make sure you create it with character set utf8 as per the example above.

Also, the strings you were using were not the ones I had success with:
characterEncoding=UTF8&characterSetResults=UTF8

Cheers,
Mark
# Posted By Mark Lynch | 7/13/08 12:18 PM
Gavin Cooney's Gravatar I've found the same problem with UTF characters with MySQL, using a MySQL GUI (Navicat) and the command line interface (for batch queries etc).
In the connection properties of Navicat I was able to pick UTF-8 as the default character encoding, and with the command line interface I added --default-character-set=UTF-8 to the command.
# Posted By Gavin Cooney | 8/27/08 2:18 AM
BlogCFC was created by Raymond Camden. This blog is running version 5.1.004.