#30692 [NEW]: Behaviour change in SAX causes breakage from php4 -> php5
From: chrivers at iversen-net dot dk
Operating system: Linux 2.6.5, Debian Sarge
PHP version: 5.0.2
PHP Bug Type: XML related
Bug description: Behaviour change in SAX causes breakage from php4 -> php5

Description:
When converting my pages to the PHP5 SAX XML parser, they broke because of an apparent incompatibility. The chardata handler is called in a different pattern than in PHP4. Before, it seemed to be called once per character block. Now, it seems, the buffer is flushed before each block of high-bit characters. This is unexpected and (seemingly?) impossible to change.

Reproduce code:
---
$str";
$xml_parser = xml_parser_create();
#xml_set_element_handler($xml_parser, "es", "ee");
xml_set_character_data_handler($xml_parser, "cd");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "iso-8859-1");
if (xml_parse($xml_parser, $buffer) == false)
    die(sprintf("TV import error: %s at line %d col %d\n%s",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser),
        xml_get_current_column_number($xml_parser),
        $buffer));
xml_parser_free($xml_parser);
?>

Expected result:
expected: [ISO:æøå:ISO]
php4:     [ISO:æøå:ISO]

Actual result:
--
[ISO:]
[æøå:ISO]

--
Edit bug report at http://bugs.php.net/?id=30692&edit=1
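The report turns on how many times the character data handler fires for one block of text. A minimal sketch of one way to make a script independent of that call pattern, assuming it is acceptable to collect text per leaf element: append in the character data handler and only use the text once the end tag arrives. The function name collect_text is hypothetical, and the sketch targets UTF-8 rather than the iso-8859-1 used in the report.

```php
<?php
// Hypothetical helper (not from the report): gathers the text of each
// leaf element, no matter how many pieces the parser delivers it in.
function collect_text(string $xml): array
{
    $texts  = [];  // one finished string per closed element
    $buffer = '';  // text seen since the most recent start tag

    $parser = xml_parser_create();
    // Assumption: UTF-8 output; the report used iso-8859-1 instead.
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

    xml_set_element_handler(
        $parser,
        // Start tag: begin a fresh buffer.
        function ($p, $name, $attrs) use (&$buffer) {
            $buffer = '';
        },
        // End tag: the buffer now holds the complete text.
        function ($p, $name) use (&$texts, &$buffer) {
            $texts[] = $buffer;
            $buffer  = '';
        }
    );

    // May be called once or several times per text block; appending
    // makes the difference invisible to the rest of the script.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$buffer) {
            $buffer .= $data;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $texts;
}

// Usage: one array entry per element, with the high-bit characters
// kept in the same string as the surrounding ASCII text.
var_dump(collect_text("<tag>ISO:\u{e6}\u{f8}\u{e5}:ISO</tag>"));
```

This only handles elements without child elements; mixed content would need a stack of buffers instead of a single one.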
[PHP-BUG] Bug #64874 [NEW]: json_decode handles whitespace and case-sensitivity incorrectly
From: chrivers at iversen-net dot dk
Operating system: All
PHP version: 5.4.15
Package: JSON related
Bug Type: Bug
Bug description: json_decode handles whitespace and case-sensitivity incorrectly

Description:
There are 2 problems with the json_decode function.

1) It only sometimes disregards whitespace

The RFC clearly says:
"""Insignificant whitespace is allowed before or after any of the six structural characters."""

2) It only sometimes enforces lowercase-ness of identifiers

The RFC clearly says:
"""The literal names MUST be lowercase. No other literal names are allowed."""

The test script demonstrates this.

Test script:
---

Expected result:
true * 5

Actual result:
--
bool(true)
bool(false)
bool(false)
bool(true)
bool(true)

--
Edit bug report at https://bugs.php.net/bug.php?id=64874&edit=1
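The two rules quoted in the report can be written down as a small check. This is only a sketch of the reporter's reading of the RFC, applied to the three literal names; is_json_literal is a hypothetical helper and says nothing about how json_decode is implemented.

```php
<?php
// Hypothetical helper (not part of PHP): applies the two quoted rules
// to a string that should hold one of the JSON literal names.
function is_json_literal(string $text): bool
{
    // RFC 4627 defines insignificant whitespace as space, horizontal
    // tab, line feed and carriage return. Assumption: it may surround
    // a bare literal, which is the reporter's reading.
    $trimmed = trim($text, " \t\n\r");

    // Strict, case-sensitive comparison: only lowercase names qualify.
    return in_array($trimmed, ['true', 'false', 'null'], true);
}

var_dump(is_json_literal('true'));   // bool(true)
var_dump(is_json_literal(' true'));  // bool(true)
var_dump(is_json_literal('true '));  // bool(true)
var_dump(is_json_literal('tRue'));   // bool(false)
```

Under this reading, leading and trailing whitespace are treated the same way, and a mixed-case name is always rejected.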
Bug #64874 [Opn]: json_decode handles whitespace and case-sensitivity incorrectly
Edit report at https://bugs.php.net/bug.php?id=64874&edit=1

ID: 64874
User updated by: chrivers at iversen-net dot dk
Reported by: chrivers at iversen-net dot dk
Summary: json_decode handles whitespace and case-sensitivity incorrectly
Status: Open
Type: Bug
Package: JSON related
Operating System: All
PHP Version: 5.4.15
Block user comment: N
Private report: N

New Comment:

Well, the part of the RFC that you're quoting describes the "JSON-text" type, which indeed must be non-primitive. However, the json_decode() function is documented as taking a "json value", which according to the spec is

"""
A JSON value MUST be an object, array, number, or string, or one of the following three literal names:

   false null true
"""

So that's perfectly fine, really.

There are other errors, too. For example, " true" WORKS while "true " fails, which makes no sense at all.

I've created an updated test case:

0], $error[json_decode($x) !== $y] );
}

printf($fmt, "JSON", "Expected", "Actual", "JSON_ERROR", "PASS");
printf("-- \n");

// works
json_cmp("true", true);

// fails - is actually true
json_cmp("tRue", NULL);

// fails - is actually NULL
json_cmp("true ", true);

// works
json_cmp("[true ] ", array(true));
json_cmp("[ true ] ", array(true));
json_cmp("[true] ", array(true));

// works, even though the non-array version fails
json_cmp("[tRue]", NULL);

json_cmp("0", 0);
json_cmp("1", 1);
json_cmp("false", false);
json_cmp("'foo'", NULL);
json_cmp('"foo"', "foo");
json_cmp('1.123', 1.123);
json_cmp('1.123 ', 1.123);
json_cmp(' 1.123', 1.123);
json_cmp('42', 42);
json_cmp('42 ', 42);
json_cmp(' 42', 42);
json_cmp(".123", 0.123);
?>

Which gives the following results:

JSON          Expected              Actual                JSON_ERROR  PASS
--
'true'        true                  true                  -           -
'tRue'        NULL                  true                  -           FAIL
'true '       true                  NULL                  FAIL        FAIL
'[true ] '    array ( 0 => true,)   array ( 0 => true,)   -           -
'[ true ] '   array ( 0 => true,)   array ( 0 => true,)   -           -
'[true] '     array ( 0 => true,)   array ( 0 => true,)   -           -
'[tRue]'      NULL                  NULL                  FAIL        -
'0'           0                     0                     -           -
'1'           1                     1                     -           -
'false'       false                 false                 -           -
'\'foo\''     NULL                  NULL                  FAIL        -
'"foo"'       'foo'                 'foo'                 -           -
'1.123'       1.123                 1.123                 -           -
'1.123 '      1.123                 NULL                  FAIL        FAIL
' 1.123'      1.123                 1.123                 -           -
'42'          42                    42                    -           -
'42 '         42                    NULL                  FAIL        FAIL
' 42'         42                    42                    -           -
'.123'        0.123                 0.123                 -           -

I see "FAIL" 4 times, so that seems like 4 bugs to me.

Previous Comments:
--------------------
[2013-05-18 15:00:32] cmbecker69 at gmx dot de

RFC 4627[1] also states in section 2:

| A JSON text is a serialized object or array.
|
|    JSON-text = object / array

According to that definition the $json argument of examples 1-3 is not a valid JSON-text. Furthermore:

   json_decode('true ');
   var_dump(json_last_error() === JSON_ERROR_SYNTAX);

prints:

   bool(true)

So the returned NULL is actually correct according to the documentation.

[1] <http://www.ietf.org/rfc/rfc4627.txt>

--------------------
[2013-05-17 21:48:29] chrivers at iversen-net dot dk

Description:
There are 2 problems with the json_decode function.

1) It only sometimes disregards whitespace

The RFC clearly says:
"""Insignificant whitespace is allowed before or after any of the six structural characters."""

2) It only sometimes enforces lowercase-ness of identifiers

The RFC clearly says:
"""The literal names MUST be lowercase. No other literal names are allowed."""

The test script demonstrates this.

Test script:
---

Expected result:
true * 5

Actual result:
--
bool(true)
bool(false)
bool(false)
bool(true)
bool(true)
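Both comments rely on the fact that a NULL return from json_decode() can mean either a decoded null or a failure, and that json_last_error() tells the two apart. A minimal sketch of that check, assuming a wrapper that returns the value together with a success flag; decode_checked is a hypothetical name, not a PHP function.

```php
<?php
// Hypothetical wrapper (not part of PHP): pairs the decoded value with
// a flag, so a legitimate null is distinguishable from a syntax error.
function decode_checked(string $json): array
{
    $value = json_decode($json, true);

    // json_last_error() reports on the most recent json_decode() call.
    $ok = (json_last_error() === JSON_ERROR_NONE);

    return [$value, $ok];
}

// A JSON null decodes to NULL without an error being recorded.
var_dump(decode_checked('null')[1]);   // bool(true)

// Single-quoted strings are not JSON, so this NULL is a real failure.
var_dump(decode_checked("'foo'")[1]);  // bool(false)
```

The inputs disputed in the thread ('true ', 'tRue') are left out of the usage lines on purpose, since their results are exactly what is in question.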