On Tue, Oct 2, 2012 at 8:37 AM, Tomas Doran <[email protected]> wrote:
> I entirely agree - there should never be explicit code like this needed.
>
> However, currently, if you remove it - then the tests fail.
>
I just looked at the tests. It's easy to make them mostly pass, but there
are parts of that test that I don't understand.
Just so we agree, the serialized data should be utf8-encoded octets, and
deserialized back into characters. That's what JSON does automatically.
And it's expected that character data is correctly flagged -- e.g. utf8
octets brought in from, say, a database are correctly decoded
(pg_enable_utf8 for PostgreSQL, as an example).
I also use JSON (or JSON::XS), so it's not clear to me whether JSON::Any
behaves the same.
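To make that expectation concrete, here is a minimal round-trip sketch. It assumes the core JSON::PP module as a stand-in for JSON / JSON::XS, and a plain hash rather than the real class under test, so treat it as an illustration rather than what the storage code actually does.

```perl
use strict;
use warnings;
use utf8;        # the literal below is a character string
use JSON::PP;    # assumption: stand-in for JSON / JSON::XS

# The ->utf8 option makes encode() emit UTF-8 octets and decode() expect them.
my $codec = JSON::PP->new->utf8;

my $data = { utf8_string => "ネットスーパー" };   # 7 characters
my $json = $codec->encode($data);                 # serialized: octets
my $back = $codec->decode($json);                 # deserialized: characters

printf "characters after decode: %d\n", length $back->{utf8_string};
printf "round trip is lossless:  %s\n",
    $back->{utf8_string} eq $data->{utf8_string} ? "yes" : "no";
```

The file itself has to be saved as UTF-8 for "use utf8" to read the literal correctly.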
The test has inline utf8 *character* strings but fails to set "use utf8" at
the start of the test. So, this attribute:
    has 'utf8_string' => (
        is      => 'rw',
        isa     => 'Str',
        default => sub { "ネットスーパー (Internet Shopping)" }
    );
means that the utf8_string is not flagged as character data. That sets in
motion the failure of the rest of the tests.
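A small sketch of the difference, assuming the file is saved as UTF-8 (the variable names are only for illustration):

```perl
use strict;
use warnings;

my ($without, $with);
{
    # No "use utf8" in scope: the literal is the raw UTF-8 bytes from the file.
    $without = "ネットスーパー";
}
{
    use utf8;    # lexically scoped: the literal is decoded into characters
    $with = "ネットスーパー";
}

printf "without use utf8: length %2d, flagged: %s\n",
    length($without), utf8::is_utf8($without) ? "yes" : "no";
printf "with use utf8:    length %2d, flagged: %s\n",
    length($with), utf8::is_utf8($with) ? "yes" : "no";
# without use utf8: length 21, flagged: no
# with use utf8:    length  7, flagged: yes
```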
So, I added "use utf8;" at the top of the test.
When comparing the serialized data, we then need to encode the test character
string to utf8 octets (because we are comparing to serialized octets).
    is($json,
        encode_utf8('{"__CLASS__":"Foo","utf8_string":"ネットスーパー (Internet Shopping)"}'),
        '... got the right JSON');
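A sketch of why the expected value needs encoding. It again assumes JSON::PP as a stand-in serializer and a one-key hash instead of the Foo class:

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode_utf8);
use JSON::PP;    # assumption: stand-in serializer

# Expected value written as characters; serializer output is UTF-8 octets.
my $expected = '{"utf8_string":"ネットスーパー"}';
my $json     = JSON::PP->new->utf8->encode({ utf8_string => "ネットスーパー" });

print "octets eq characters:         ",
    ($json eq $expected ? "match" : "differ"), "\n";
print "octets eq encoded characters: ",
    ($json eq encode_utf8($expected) ? "match" : "differ"), "\n";
# octets eq characters:         differ
# octets eq encoded characters: match
```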
But, I'm confused by the last test set that starts like this:
    my $test_string;
    {
        use utf8;
        $test_string = "ネットスーパー (Internet Shopping)";
        no utf8;
    }
Ok, so now we have a character string.
But then the test forces the utf8 bit off:
    Encode::_utf8_off($test_string);
So, I'm just not clear what these tests are trying to do. What's the point
of testing that a character string with its utf8 flag forced off works
correctly?
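For reference, a minimal sketch of what that call does to a string like this one. It uses a shortened string and only illustrates the mechanics, not the intent of the test:

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode_utf8);

my $chars  = "ネットスーパー";    # character string, 7 characters
my $forced = $chars;              # independent copy
Encode::_utf8_off($forced);       # drop the flag; internal bytes are untouched

printf "characters: length %d\n", length $chars;
printf "flag off:   length %d\n", length $forced;
print  "same as encode_utf8: ",
    ($forced eq encode_utf8($chars) ? "yes" : "no"), "\n";
# characters: length 7
# flag off:   length 21
# same as encode_utf8: yes
```

So with the flag forced off, the string is once again the undecoded octets.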
--
Bill Moseley
[email protected]