Edit report at https://bugs.php.net/bug.php?id=63898&edit=1

 ID:                 63898
 Comment by:         programming at stefan-koch dot name
 Reported by:        sreed at ontraport dot com
 Summary:            json_encode sets string to null for invalid
                     characters
 Status:             Open
 Type:               Bug
 Package:            JSON related
 Operating System:   All
 PHP Version:        5.4.10
 Block user comment: N
 Private report:     N

 New Comment:

I was able to locate the bug, but I am not familiar enough with the PHP source 
to know how best to fix it.

For keys, just like for values, "json_escape_string" is used. In PHP 5.4 
(unlike PHP 5.2) it checks for invalid UTF-8 sequences. PHP 5.2.0 had no such 
check; when the input was either invalid or empty, an empty string was printed 
instead.

So the location of the problem is line 432 in ext/json/json.c (PHP 5.4.12) or 
around line 442 in git master (commit ac9f53dd9c0b184bab14d669c72971c0405ed488).

My idea would be, if one wants to keep printing 'null', to pass an additional 
argument to "json_escape_string" that tells it whether the string is a key or a 
value (they seem to need different treatment, since null is not allowed as a 
key in JSON).
An alternative would be to insert an empty string when a UTF-8 sequence is 
invalid. This would be a very easy fix that goes back to the old behavior. 
However, I guess somebody introduced null for a reason.
Or you could return false if an error occurred, but from my Python experience I 
really dislike this treatment. It is correct, but it very often leads to 
non-working code due to encoding problems, at least when you receive data from 
somewhere else.
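
As a userland illustration of the first two options (not a patch to 
ext/json/json.c), the following sketch treats keys and values differently 
before calling json_encode: an invalid key becomes an empty string, an invalid 
value becomes null. The helper names "is_valid_utf8" and "sanitize_for_json" 
are hypothetical and exist only for this example; the validity check relies on 
preg_match with the /u modifier failing on malformed UTF-8.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// preg_match() with the /u modifier returns false for a malformed subject.
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: sanitize an array so that json_encode() cannot be
// handed invalid UTF-8. Keys and values are treated differently, because
// null is a legal JSON value but not a legal JSON object key.
function sanitize_for_json(array $data)
{
    $out = array();
    foreach ($data as $key => $value) {
        if (is_string($key) && !is_valid_utf8($key)) {
            // Old (5.2-style) fallback for keys. Note that several
            // invalid keys would collide on the same empty key.
            $key = '';
        }
        if (is_array($value)) {
            $value = sanitize_for_json($value);
        } elseif (is_string($value) && !is_valid_utf8($value)) {
            $value = null;
        }
        $out[$key] = $value;
    }
    return $out;
}

$array = array("Foo " . chr(163) => "", "bar" => "Baz " . chr(163));
echo json_encode(sanitize_for_json($array)), "\n";
```

Whether silently replacing data like this is acceptable depends on the 
application; it only shows that the key case and the value case can be handled 
separately without ever emitting invalid JSON.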


Previous Comments:
------------------------------------------------------------------------
[2013-01-06 11:35:39] Sjon at hortensius dot net

This actually worked fine before 5.3.14 but was broken in 5.3.14:
 
http://3v4l.org/Eouni#v5314

5.2.0 - 5.2.6 would truncate the character without notice but wouldn't produce 
invalid json either

------------------------------------------------------------------------
[2013-01-04 01:06:40] sreed at ontraport dot com

.

------------------------------------------------------------------------
[2013-01-04 01:04:31] sreed at ontraport dot com

Description:
------------
When you use json_encode with an invalid UTF-8 byte sequence in a string, PHP 
will generate a warning (with display_errors set to off) and the function 
returns an invalid JSON-encoded string. The string with the invalid UTF-8 byte 
sequence is replaced with null (for example: {null:""}). This is invalid JSON 
and cannot be decoded with json_decode.

I would think the expected behavior is that json_encode should never return an 
invalid JSON-encoded string. It should either return false on failure, as the 
documentation states, or the invalid UTF-8 byte sequence should be handled in 
a way that does not corrupt the JSON string.

Test script:
---------------
$key = "Foo " . chr(163);

$array = array($key => "");

var_dump($array);

$json = json_encode($array);

echo $json."\n";

var_dump(json_decode($json));

Expected result:
----------------
I would expect the returned json string to be valid or for json_encode to 
return false.
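
Until the encoder itself behaves this way, a caller can approximate the 
expected behavior with a small wrapper. This is only a sketch: the function 
name "safe_json_encode" is hypothetical, and it assumes that json_last_error() 
reports the UTF-8 failure on the affected versions even though json_encode 
still returns a string there.

```php
<?php
// Hypothetical wrapper: returns the JSON text, or false if encoding failed.
// Checking json_last_error() as well as the return value is meant to cover
// versions that return a (possibly invalid) string and only flag the error.
function safe_json_encode($value)
{
    $json = json_encode($value);
    if ($json === false || json_last_error() !== JSON_ERROR_NONE) {
        return false;
    }
    return $json;
}

$key = "Foo " . chr(163);
var_dump(safe_json_encode(array($key => "")));
```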

Actual result:
--------------
array(1) {
  ["Foo �"]=>
  string(0) ""
}
{null:""}
NULL



------------------------------------------------------------------------


