Please make sure that on an error only the first byte of the invalid sequence is replaced, and that UTF-8 parsing resumes at the second byte.

For instance, 0xE0 followed by 0x20 should produce an error indicator followed by a space. It appears this code will produce only an error indicator.

In addition, this does not appear to detect and reject overlong forms.
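
As a rough sketch of the two points above (this is not the code under review, and the names utf8_decode_one, utf8_decode_all and UTF8_ERROR are hypothetical), a decoder can default to consuming a single byte and only consume more once the whole sequence has been validated. Using U+FFFD as the error indicator and rejecting surrogates and values above U+10FFFF are assumptions on my part, not something the patch specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Error indicator; U+FFFD is an assumption, any marker would do. */
#define UTF8_ERROR 0xFFFDu

/* Decode one code point from s[0..len). The number of bytes consumed is
 * stored in *used. On any error, UTF8_ERROR is returned and *used is 1,
 * so the caller resumes parsing at the second byte. */
static uint32_t utf8_decode_one(const unsigned char *s, size_t len, size_t *used)
{
    uint32_t cp, min;
    size_t need, i;

    assert(len > 0);
    *used = 1;                      /* default: consume only the first byte */

    if (s[0] < 0x80)
        return s[0];
    if ((s[0] & 0xE0) == 0xC0) {
        need = 1; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 2; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 3; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return UTF8_ERROR;          /* stray continuation byte or 0xF8..0xFF */
    }

    if (len - 1 < need)
        return UTF8_ERROR;          /* truncated sequence */
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return UTF8_ERROR;      /* e.g. 0xE0 0x20: the 0x20 is decoded next */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min)
        return UTF8_ERROR;          /* overlong form */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return UTF8_ERROR;          /* out of range or surrogate */

    *used = need + 1;
    return cp;
}

/* Decode a whole buffer; out must have room for at least len entries.
 * Returns the number of code points written. */
static size_t utf8_decode_all(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t i = 0, n = 0, used;

    while (i < len) {
        out[n++] = utf8_decode_one(s + i, len - i, &used);
        i += used;
    }
    return n;
}
```

With this shape, 0xE0 0x20 yields the error indicator followed by a space, and the overlong pair 0xC0 0x80 yields two error indicators rather than U+0000.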

For terminal display I have found it very useful to display the error bytes as though they were ISO8859-1 or CP1252 bytes. This keeps the result readable if ISO8859-1 text is accidentally output to the terminal.
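
A minimal sketch of that fallback, assuming the decoder hands each invalid byte to a helper (the name error_byte_to_display is hypothetical). Bytes 0xA0..0xFF map to the same code point in ISO8859-1 and CP1252; 0x80..0x9F use the CP1252 table. Showing U+FFFD for the five bytes CP1252 leaves undefined is my assumption; mapping them to the C1 controls would be the pure ISO8859-1 reading.

```c
#include <stdint.h>

/* CP1252 mappings for bytes 0x80..0x9F; 0 marks a byte CP1252 leaves undefined. */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Map a byte that failed UTF-8 decoding to the code point to display. */
static uint32_t error_byte_to_display(unsigned char byte)
{
    if (byte >= 0x80 && byte <= 0x9F) {
        uint16_t cp = cp1252_high[byte - 0x80];
        return cp ? cp : 0xFFFDu;   /* undefined in CP1252: assumption, show U+FFFD */
    }
    return byte;                    /* identical in ISO8859-1 and CP1252 */
}
```

For example, a stray 0xE9 from Latin-1 text is shown as U+00E9 instead of a generic error marker.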

Sorry to be a pain about this but bad UTF-8 handling is one of the things that really annoys me.
_______________________________________________
wayland-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/wayland-devel
