Edit report at https://bugs.php.net/bug.php?id=62010&edit=1

 ID:                 62010
 Comment by:         votefordevnull at gmail dot com
 Reported by:        tklingenberg at lastflood dot net
 Summary:            json_decode produces invalid byte-sequences
 Status:             Open
 Type:               Bug
 Package:            JSON related
 Operating System:   Windows
 PHP Version:        5.3.13
 Block user comment: N
 Private report:     N

 New Comment:

Successfully reproduced on Linux


Previous Comments:
------------------------------------------------------------------------
[2012-05-11 22:46:34] tklingenberg at lastflood dot net

It looks like bug #41067 (https://bugs.php.net/bug.php?id=41067) was not fully
fixed.

------------------------------------------------------------------------
[2012-05-11 22:12:42] tklingenberg at lastflood dot net

Description:
------------
It's a typical case that the JSON *and* UTF-16 specifications warn about:
decoding of non-existent UTF-16 code points (here, a lone high surrogate):

    json_decode('"\ud834"')

should give NULL because \ud834 is *invalid*. But instead it starts some party,
gets boozed and offers this as a UTF-8 byte sequence:

    1110 1101  1010 0000  1011 0100
    1110 xxxx  10xx xxxx  10xx xxxx
               1101 1000  0011 0100
               D8         34

U+D834 is not a valid Unicode character.
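
To make the bit layout above easier to check, here is a minimal sketch of
what a decoder does if it applies the 3-byte UTF-8 pattern without rejecting
surrogates. The function name naive_utf8_3byte is hypothetical and is not
taken from the PHP source; it only illustrates the arithmetic, not the actual
implementation in ext/json.

```php
<?php
// Hypothetical helper (not part of PHP): encodes a 16-bit value with the
// 3-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx and does NOT check
// whether the value is a surrogate (U+D800..U+DFFF).
function naive_utf8_3byte($cp)
{
    return chr(0xE0 | ($cp >> 12))            // top 4 bits
         . chr(0x80 | (($cp >> 6) & 0x3F))    // middle 6 bits
         . chr(0x80 | ($cp & 0x3F));          // low 6 bits
}

$bytes = naive_utf8_3byte(0xD834);
echo strtoupper(bin2hex($bytes)), "\n"; // EDA0B4
```

The bytes ED A0 B4 are exactly the three octets shown in the diagram, which
is why the result is not well-formed UTF-8.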



Test script:
---------------
if (NULL !== json_decode('"\ud834"')) {
    echo "json_decode is still broken.";
}

Expected result:
----------------
NULL, because the JSON is invalid.

Actual result:
--------------
PHP tries to create UTF-8 out of it and fails, producing an invalid UTF-8
byte sequence.
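
For anyone who wants to see which of the two results their build gives, the
following is a rough sketch. The helper contains_encoded_surrogate is a
hypothetical name introduced here; it relies on the fact that in UTF-8 the
lead byte ED followed by a byte in A0..BF can only come from an encoded
surrogate (U+D800..U+DFFF).

```php
<?php
// Hypothetical helper: reports whether a byte string contains a UTF-16
// surrogate (U+D800..U+DFFF) that was encoded as three UTF-8 bytes.
function contains_encoded_surrogate($bytes)
{
    // Byte-wise match (no /u modifier): ED, then A0..BF, then any continuation byte.
    return preg_match('/\xED[\xA0-\xBF][\x80-\xBF]/', $bytes) === 1;
}

$decoded = json_decode('"\ud834"');

if ($decoded === null) {
    echo "json_decode rejected the lone surrogate.\n";
} elseif (contains_encoded_surrogate($decoded)) {
    echo "json_decode returned invalid UTF-8: ", bin2hex($decoded), "\n";
}
```

Which branch runs depends on the PHP build; the report above describes the
second one.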


------------------------------------------------------------------------


