Edit report at http://bugs.php.net/bug.php?id=52453&edit=1

 ID:                 52453
 Updated by:         and...@php.net
 Reported by:        jeroen at asystance dot nl
 Summary:            Connection charset seems arbitrary
-Status:             Open
+Status:             Feedback
 Type:               Feature/Change Request
 Package:            MySQLi related
 Operating System:   Linux
 PHP Version:        5.3.3
-Assigned To:
+Assigned To:        mysql
 Block user comment: N

 New Comment:

Are you using mysqlnd or libmysql?

libmysql uses its own default charset unless a different one is set in
my.cnf or by an explicit call to mysqli_options() made after mysqli_init()
but before mysqli_real_connect().

mysqlnd, by contrast, always assumes the server's default charset. The
server sends this charset during the connection handshake/authentication,
and mysqlnd adopts it.


Previous Comments:
------------------------------------------------------------------------
[2010-07-27 12:23:25] gerben at asystance dot nl

I connected to the same remote database from a Windows/WAMP system and got
exactly the same results.

------------------------------------------------------------------------
[2010-07-27 12:09:15] jeroen at asystance dot nl

BTW: when I make the same connection as in the first example, but use the
mysql CLI client instead of PHP, I get

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

So PHP's mysql_connect() behaves differently from the mysql CLI client!

------------------------------------------------------------------------
[2010-07-27 11:00:11] jeroen at asystance dot nl

Description:
------------
The connection_charset used by a mysql/mysqli connection seems arbitrary;
at least, I cannot figure out how it is determined. The documentation
provides no clues as to which charset is used by default. I've tried
connecting to different mysql servers from different shell servers and
can't figure out how the default charset is determined.

To find out which charset is used, open a mysql/mysqli connection
(procedural or object-oriented, it doesn't matter) and call
mysql_client_encoding() / mysqli_get_charset(), or run
"SHOW VARIABLES LIKE 'character_set%';".

This is probably just a documentation problem, but maybe the default could
be chosen more sensibly: for example, the charset of the database on the
mysql server seems the most sensible default.

For example, connecting from a shell that has en_US.UTF-8 as locale, I
get:

  character_set_client:     utf8
  character_set_connection: utf8
  character_set_database:   utf8
  character_set_filesystem: binary
  character_set_results:    utf8
  character_set_server:     utf8
  character_set_system:     utf8
  character_sets_dir:       /usr/share/mysql/charsets/

Switching to en_US.iso88591 doesn't change anything. So it would seem some
server setting determines the charset, right?

However, connecting to the same mysql server from another system (though
from the intranet instead of the internet), I get:

  character_set_client:     latin1
  character_set_connection: latin1
  character_set_database:   utf8
  character_set_filesystem: binary
  character_set_results:    latin1
  character_set_server:     utf8
  character_set_system:     utf8
  character_sets_dir:       /usr/share/mysql/charsets/

Again, the client locale doesn't influence this.

------------------------------------------------------------------------