[Haskell-cafe] Re: File path programme

Aaron Denney Sun, 30 Jan 2005 14:03:56 -0800

On 2005-01-30, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote:
> Glynn Clements <[EMAIL PROTECTED]> writes:
>
>> And it isn't a theoretical issue. E.g. in an environment where EUC-JP
>> is used, filenames may begin with <ESC>$)B (designate JISX0208 to G1),
>> or they may not (because G1 is assumed to contain JISX0208 initially).
>
> I think such encodings are never used as default encodings of a Unix
> locale.
>
>>> The various UTF encodings do not have this particular problem; if a UTF
>>> string is valid, then it is a unique representation of a Unicode string.
>
> BOM is a problem. Unfortunately Unicode mandates that FEFF at the
> start of a UTF-8 text stream is a mark which doesn't belong to the
> text.

Right.

> It provides variants of UTF-16/32 with and without a BOM, but
> UTF-8 only has the variant with a BOM. This makes UTF-8 a stateful
> encoding.

I think you mean "UTF-8 only has the variant without a BOM".  Otherwise
I'd like to see a citation from the standard for this, because that's
not the reading I get from <http://www.unicode.org/faq/utf_bom.html>.
Instead, it seems that whether a BOM is included is a function of the
protocol, and that UTF-8 streams themselves do not include the BOM.
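
To make the distinction concrete, here is a minimal sketch in Python
(not Haskell, simply because its standard library exposes both
behaviours under separate codec names). It assumes nothing beyond the
standard "utf-8" and "utf-8-sig" codecs; the sample name "file.txt" and
the two helper functions are hypothetical, chosen only for illustration.
The plain codec keeps a leading U+FEFF as an ordinary character, while
the signature-aware codec treats it as a marker and drops it, so the
choice sits with whoever picks the decoder:

```python
import codecs

# The UTF-8 encoding of U+FEFF is the three bytes EF BB BF.
SAMPLE_NAME = "file.txt"  # hypothetical sample text
data_without_bom = SAMPLE_NAME.encode("utf-8")
data_with_bom = codecs.BOM_UTF8 + data_without_bom


def decode_plain(raw: bytes) -> str:
    """Decode as plain UTF-8; a leading U+FEFF stays in the result."""
    return raw.decode("utf-8")


def decode_with_signature(raw: bytes) -> str:
    """Decode with the 'utf-8-sig' codec, which drops one leading BOM."""
    return raw.decode("utf-8-sig")


if __name__ == "__main__":
    print(repr(decode_plain(data_with_bom)))
    print(repr(decode_with_signature(data_with_bom)))
    print(repr(decode_with_signature(data_without_bom)))
```

With the plain codec the two byte strings decode to different text, so
a consumer that compares names has to know which convention the
producer followed.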

-- 
Aaron Denney
-><-

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe