On Wed, Aug 12, 2009 at 02:01:35PM +0200, Christoph Burgmer wrote:
> A string type column with a utf8_bin collation will not be converted to a
> Python Unicode string, but instead will be returned as a utf8 (byte) string.
> 
> The MySQL documentation though clearly states: "A nonbinary string has a
> character set and is converted to another character set in many cases, even
> when the string has a _bin collation"[1].
> 
> I understand that a string with utf8_bin collation is still a string and
> thus should not be dealt with differently. The utf8_bin collation is
> essential when working with Unicode without wanting the Unicode collation
> algorithm to kick in.
> 
> How to reproduce:
> 
> CREATE TABLE t1 (
>     a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
> );
> 
> INSERT INTO t1 VALUES ('ü');
> 
> In Python:
> >>> import MySQLdb
> >>> db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
> >>> cur = db.cursor()
> >>> cur.execute("SELECT a FROM t1;")
> 1L
> >>> cur.fetchall()
> (('\xc3\xbc',),)
> 
> Choosing utf8_general_ci instead of utf8_bin will properly yield Unicode
> objects:
> 
> >>> cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
> 1L
> >>> cur.fetchall()
> ((u'\xfc',),)
> 
> [1] http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

On Sat, Dec 10, 2011 at 02:50:27PM +0100, Philipp Spitzer wrote:
> It is still present in upstream python-mysqldb 1.2.3.

Is this bug still present in the python-mysqldb package in unstable?
That is version 1.3.6-1, which is based on the mysqlclient fork.
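
To make the check concrete, here is a minimal sketch of how the two results
quoted above differ. It does not talk to a server: the sample rows are copied
from the report, and the helper names undecoded_values and as_text are
hypothetical, not part of MySQLdb.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers for inspecting rows as returned by cursor.fetchall().


def undecoded_values(rows):
    """Return every value in rows that is still a byte string."""
    return [value for row in rows for value in row if isinstance(value, bytes)]


def as_text(value, charset="utf8"):
    """Decode a byte string with charset; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


# Sample rows copied from the report, not fetched from a live database.
rows_bin = ((b'\xc3\xbc',),)  # result reported for the utf8_bin column
rows_ci = ((u'\xfc',),)       # result reported with COLLATE utf8_general_ci

print(undecoded_values(rows_bin))  # non-empty: the value came back as bytes
print(undecoded_values(rows_ci))   # empty: the value came back as text
print(as_text(rows_bin[0][0]) == rows_ci[0][0])
```

If undecoded_values() returns anything for the rows fetched from the utf8_bin
column with use_unicode=True, the behaviour described in the report is
presumably still there.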
-- 
Brian May <b...@debian.org>
