* Lars Wirzenius <[email protected]>, 2009-05-03, 19:36:
$ man utf-8 | grep -A 2 UTF-16 | sed -e 's/^ *//'
The UCS code values 0xd800–0xdfff (UTF-16 surrogates) as well as 0xfffe
and 0xffff (UCS non-characters) should not appear in  conforming  UTF-8
streams.

$ s='\xed\xa0\x88\xed\xbd\x85' # 0xd808 + 0xdf45
$ printf $s | isutf8 && echo $?
0
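
For reference, the six bytes in the test case above can be decoded by hand to confirm that they really are the two surrogates named in the comment. This is only a minimal sketch in Python, not code from isutf8 or from the attached patch; decode_3byte, is_surrogate and strict_utf8_ok are hypothetical helper names chosen for illustration.

```python
def decode_3byte(seq: bytes) -> int:
    """Decode one 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)."""
    if (len(seq) != 3 or seq[0] & 0xF0 != 0xE0
            or any(b & 0xC0 != 0x80 for b in seq[1:])):
        raise ValueError("not a 3-byte UTF-8 sequence")
    # 4 payload bits from the lead byte, 6 from each continuation byte.
    return ((seq[0] & 0x0F) << 12) | ((seq[1] & 0x3F) << 6) | (seq[2] & 0x3F)


def is_surrogate(cp: int) -> bool:
    """True for the UTF-16 surrogate range 0xd800-0xdfff."""
    return 0xD800 <= cp <= 0xDFFF


def strict_utf8_ok(data: bytes) -> bool:
    """Ask Python's own strict UTF-8 decoder whether the bytes are valid."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


data = b"\xed\xa0\x88\xed\xbd\x85"  # same bytes as $s in the report
code_points = [decode_3byte(data[i:i + 3]) for i in (0, 3)]

for cp in code_points:
    print(hex(cp), "surrogate" if is_surrogate(cp) else "ok")

print("strict decoder accepts input:", strict_utf8_ok(data))
```

Both sequences have the shape of a 3-byte UTF-8 sequence, so a validator that checks only lead and continuation bytes lets them through; the code-point range has to be checked separately. Python 3's strict "utf-8" codec does make that check and rejects this input.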

Thanks for the bug report. You report very clear bugs!

Attached is a patch that should fix the issue. Jakub, could you test it
and verify that I've understood things correctly and that it really
fixes the problem?

Looks fine to me.

--
Jakub Wilk


