On 8/30/2013 4:01 AM, Anne van Kesteren wrote:
> On Fri, Aug 30, 2013 at 9:40 AM, Gervase Markham <g...@mozilla.org> wrote:
>> We don't want people to try and move to UTF-8, but move back because
>> they haven't figured out how (or are technically unable) to label it
>> correctly and "it comes out all wrong".
>
> You also don't want it to be wrong half of the time. Given that full
> content scans won't fly (we try to restrict scanning for encodings as
> much as possible), that's a very real possibility, especially given
> forums such as in OP that are mostly ASCII.
>
> Labeling is what people ought to do, and it's very easy:
> <meta charset=utf-8> (if all other files end up unlabeled, they'll
> inherit from this one).

The problem I have with this approach is that it assumes the page is authored by someone who definitively knows the charset, and that does not always hold. Suppose you have a page that serves up the contents of a plain text file, so your source data carries no indication of its charset. What charset should the page report? The choice is between guessing (presumably UTF-8) and saying nothing (which generally causes the browser to guess Windows-1252).
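
To make the two options concrete, here is a minimal server-side sketch in Python. The function name guess_content_type is hypothetical, and the policy it encodes (label as UTF-8 only when the bytes decode cleanly, otherwise send no charset) is one possible compromise, not something anyone in this thread has proposed or implemented. It is still a guess: a clean decode makes UTF-8 very likely, not certain.

```python
def guess_content_type(data: bytes) -> str:
    """Pick a Content-Type value for plain text of unknown charset.

    Hypothetical helper: labels the text as UTF-8 only if every byte
    sequence in it is valid UTF-8; otherwise it omits the charset and
    leaves the browser to apply its own fallback.
    """
    try:
        # Strict decoding raises on any invalid UTF-8 byte sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Not UTF-8; we still do not know what it is, so say nothing.
        return "text/plain"
    # Pure ASCII also lands here, where the UTF-8 label is harmless.
    return "text/plain; charset=utf-8"
```

Unlike the browser, the server here already has the whole file in hand, so the cost of scanning all of it is a different trade-off from the one discussed above.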

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
