From:             anomie at users dot sourceforge dot net
Operating system: Linux
PHP version:      5.3CVS-2008-12-26 (snap)
PHP Bug Type:     JSON related
Bug description:  json_encode chokes on characters outside the BMP

Description:
------------
json_encode encodes characters above U+1FFFF incorrectly; sometimes it
incorrectly encodes them as characters in the U+10000-U+1FFFF range, and
sometimes it just errors out. Note that this is not caused by the source
not being UTF8; as you can see below, I am building the UTF8-encoded text
byte-by-byte.

5.2.6 has the same problem, although instead of null it returns "aa" for
those cases due to bug 43941.

It looks like there are actually two unrelated bugs here:

1. utf8_to_utf16 in ext/json/utf8_to_utf16.c should use "c -= 0x10000;" at
line 49 instead of "c &= 0xFFFF;". This is why values over U+1FFFF are
incorrectly encoded as U+10000-U+1FFFF.

2. utf8_decode_next in ext/json/utf8_decode.c should use 0xF8 instead of
0xF1 at line 168. This is why UTF8 characters beginning with an F1 or F3
byte error out.

Reproduce code:
---------------
for($i=1; $i<=16; $i++){
    print json_encode("aa".chr(0xf0|($i>>2)).chr(0x8f|($i&3)<<4)."\xbf\xbdzz")."\n";
}

Expected result:
----------------
"aa\ud83f\udffdzz"
"aa\ud87f\udffdzz"
"aa\ud8bf\udffdzz"
"aa\ud8ff\udffdzz"
"aa\ud93f\udffdzz"
"aa\ud97f\udffdzz"
"aa\ud9bf\udffdzz"
"aa\ud9ff\udffdzz"
"aa\uda3f\udffdzz"
"aa\uda7f\udffdzz"
"aa\udabf\udffdzz"
"aa\udaff\udffdzz"
"aa\udb3f\udffdzz"
"aa\udb7f\udffdzz"
"aa\udbbf\udffdzz"
"aa\udbff\udffdzz"

Actual result:
--------------
"aa\ud83f\udffdzz"
"aa\ud83f\udffdzz"
"aa\ud83f\udffdzz"
null
null
null
null
"aa\ud83f\udffdzz"
"aa\ud83f\udffdzz"
"aa\ud83f\udffdzz"
"aa\ud83f\udffdzz"
null
null
null
null
"aa\ud83f\udffdzz"

-- 
Edit bug report at http://bugs.php.net/?id=46944&edit=1
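
To illustrate the two proposed fixes in isolation, here is a minimal standalone sketch in C. It is not the code from ext/json: the function names (is_four_byte_lead, encode_masked, encode_subtracted) and the struct are hypothetical, and I am assuming that the lead-byte test has the form "(byte & mask) == 0xF0" and that the surrogates are built as "0xD800 | (c >> 10)" and "0xDC00 | (c & 0x3FF)". Both assumptions are consistent with the results shown in the report, but I have not copied them from the source.

```c
#include <assert.h>
#include <stdint.h>

/* A UTF-16 surrogate pair. Hypothetical type, used only in this sketch. */
struct surrogate_pair {
    uint16_t high;
    uint16_t low;
};

/* Assumed shape of the test for the lead byte of a 4-byte UTF-8 sequence
 * (bit pattern 11110xxx). The mask is a parameter so that the two values
 * named in the report, 0xF1 and 0xF8, can be compared side by side. */
int is_four_byte_lead(unsigned char byte, unsigned char mask)
{
    return (byte & mask) == 0xF0;
}

/* Reported behaviour: masking keeps only the low 16 bits of the code
 * point, so the plane number is lost and every supplementary plane
 * collapses onto U+10000-U+1FFFF. */
struct surrogate_pair encode_masked(uint32_t c)
{
    struct surrogate_pair p;
    assert(c >= 0x10000 && c <= 0x10FFFF);
    c &= 0xFFFF;
    p.high = (uint16_t)(0xD800 | (c >> 10));
    p.low  = (uint16_t)(0xDC00 | (c & 0x3FF));
    return p;
}

/* Proposed fix: subtract 0x10000, leaving a 20-bit value whose top ten
 * bits go into the high surrogate and bottom ten into the low one. */
struct surrogate_pair encode_subtracted(uint32_t c)
{
    struct surrogate_pair p;
    assert(c >= 0x10000 && c <= 0x10FFFF);
    c -= 0x10000;
    p.high = (uint16_t)(0xD800 | (c >> 10));
    p.low  = (uint16_t)(0xDC00 | (c & 0x3FF));
    return p;
}
```

Under these assumptions, encode_masked(0x2FFFD) gives D83F DFFD, which is the wrong value seen on the second line of the actual result, while encode_subtracted(0x2FFFD) gives D87F DFFD as expected. Likewise, is_four_byte_lead(0xF1, 0xF1) is 0 (the F1 byte is rejected), but is_four_byte_lead(0xF1, 0xF8) is 1.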