On Wed, Aug 12, 2009 at 02:01:35PM +0200, Christoph Burgmer wrote:
> A string type column with a utf8_bin collation will not be converted to a
> Python Unicode string, but instead will be returned as a utf8 (byte) string.
>
> The MySQL documentation though clearly states: "A nonbinary string has a
> character set and is converted to another character set in many cases, even
> when the string has a _bin collation"[1].
>
> I understand that a string with utf8_bin collation is still a string and
> thus should not be dealt with differently. The utf8_bin collation is
> essential when working with Unicode without wanting the Unicode collation
> algorithm to kick in.
>
> How to reproduce:
>
> CREATE TABLE t1 (
>   a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
> );
>
> INSERT INTO t1 VALUES ('ü');
>
> In Python:
> >>> import MySQLdb
> >>> db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
> >>> cur = db.cursor()
> >>> cur.execute("SELECT a FROM t1;")
> 1L
> >>> cur.fetchall()
> (('\xc3\xbc',),)
>
> Choosing utf8_general_ci instead of utf8_bin will properly yield Unicode
> objects:
>
> >>> cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
> 1L
> >>> cur.fetchall()
> ((u'\xfc',),)
>
> [1] http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

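The two results quoted above differ only by a UTF-8 decode: '\xc3\xbc' is the
UTF-8 encoding of the character that the second query returns as u'\xfc'. For
anyone hitting this in the meantime, a rough client-side workaround could look
like the sketch below. Assumptions: it is written for Python 3 (the session
above is Python 2), decode_row is a hypothetical helper and not part of
MySQLdb, and the sample row is hard-coded from the report rather than fetched
from a server.

```python
def decode_row(row, encoding="utf-8"):
    """Return a copy of row with every bytes value decoded to str.

    Hypothetical helper for illustration; not part of MySQLdb.
    Non-bytes values (numbers, None, already-decoded text) pass through.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# The value reported for the utf8_bin column, as a Python 3 bytes literal.
rows = ((b"\xc3\xbc",),)

decoded = tuple(decode_row(row) for row in rows)

# Same character that the utf8_general_ci query yields (U+00FC).
assert decoded == (("\u00fc",),)
```

Note that this would also decode values from genuinely binary columns, so it
is only safe where every bytes value in the row is known to be UTF-8 text.
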
On Sat, Dec 10, 2011 at 02:50:27PM +0100, Philipp Spitzer wrote:
> It is still present in upstream python-mysqldb 1.2.3.

Is this bug still present in the python-mysqldb package in unstable?
The version there is 1.3.6-1, based on the mysqlclient fork.
-- 
Brian May <b...@debian.org>