I think we don't have any data on how many users run with a non-UTF-8 (and non-C) locale such as ja_JP.eucJP. We should get that data via telemetry.
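As a rough sketch of what such a probe might report, here is how the locale's codeset could be bucketed. This is Python purely for illustration. The function names and bucket labels are hypothetical and not an existing telemetry API, and the list of names treated as the C locale is an assumption that is not exhaustive.

```python
import locale


def classify_codeset(codeset: str) -> str:
    """Bucket a codeset name such as the one nl_langinfo(CODESET) returns."""
    # Ignore case and the "-" / "_" separators, so "utf8" and "UTF-8" match.
    key = codeset.upper().replace("-", "").replace("_", "")
    if key == "UTF8":
        return "utf-8"
    # Assumption: names commonly reported for the C/POSIX locale
    # (glibc reports "ANSI_X3.4-1968"); this list is not exhaustive.
    if key in {"ANSIX3.41968", "ASCII", "USASCII", "646"}:
        return "c"
    return "other"


def current_codeset_bucket() -> str:
    """Classify the codeset of the locale taken from the environment (Unix only)."""
    locale.setlocale(locale.LC_CTYPE, "")
    return classify_codeset(locale.nl_langinfo(locale.CODESET))


def is_valid_utf8(native_path: bytes) -> bool:
    """Report whether native filename bytes can be decoded as UTF-8."""
    try:
        native_path.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Under ja_JP.eucJP the probe would land in the "other" bucket. A filename stored as EUC-JP bytes, for example `b"\xa4\xa2"`, makes `is_valid_utf8` return false, which is the kind of filename that a UTF-8-only code path cannot handle.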
-- Makoto

On Thu, Nov 30, 2017 at 9:02 PM, Masatoshi Kimura <vyv03...@nifty.ne.jp> wrote:
> I intentionally ignored non-UTF-8 UNIX locales because our support for
> those locales is already half-broken and almost nobody cares about that.
> For example, OS.File assumes that the filesystem encoding is always
> UTF-8 on UNIX while nsIFile does not. This discrepancy caused a bug[1]
> that did not get much attention.
>
> I think it's time to stop pretending to support non-UTF-8 UNIX locales.
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1342659
>
> On 2017/11/30 7:09, Karl Tomlinson wrote:
>> I've always found this confusing, and so I'll write down the
>> understanding I've reached, in the hope that either it will help
>> others, or others can help me by correcting if these are
>> misunderstandings.
>>
>> On Unix systems:
>>
>>   `nativePath`
>>
>>     contains the bytes corresponding to the native filename used
>>     by native system calls.
>>
>>   `path`
>>
>>     is a UTF-16 encoding of an attempt to provide a human
>>     readable version of the native filename. This involves
>>     interpreting native bytes according to the character encoding
>>     specified by the current locale of the application as
>>     indicated by nl_langinfo(CODESET).
>>
>>     For different locales, the same file can have a different
>>     `path`.
>>
>>     The native bytes may not be valid UTF-8, and so if the
>>     character encoding is UTF-8, then there may not be a valid
>>     `path` that can be encoded to produce the same `nativePath`.
>>
>>   It is best to use `nativePath` for working with filenames,
>>   including conversion to URI, but use `path` when displaying
>>   names in the UI.
>>
>> On WINNT systems:
>>
>>   `path`
>>
>>     contains wide characters corresponding to the native filename
>>     used by native wide character system APIs. For at least most
>>     configurations, I assume wide characters are UTF-16, in which
>>     case this is also human readable.
>>
>>   `nativePath`
>>
>>     is an attempt to represent the native filename in the native
>>     multibyte character encoding specified by the current locale
>>     of the application.
>>
>>     For different locales, I assume the same file can have a
>>     different `nativePath`.
>>
>>     I assume there is not necessarily a valid multibyte character
>>     encoding, and so there may not be a valid `nativePath` that
>>     can be decoded to produce the same `path`.
>>
>>   It is best to use `path` for working with filenames.
>>   Conversion to URI involves assuming `path` is UTF-16 and
>>   converting to UTF-8.
>>
>> The parameters mean very different things on different systems,
>> and so it is not generally possible to write XP code with either
>> of these, but Gecko attempts to do so anyway.
>>
>> The numbers of applications not using UTF-8 and filenames not
>> valid UTF-8 are much smaller on Unix systems than the numbers of
>> applications not using UTF-8 and non-ASCII filenames on WINNT
>> systems, and so choosing to work with `path` provides more
>> compatibility than working with `nativePath`.
>> _______________________________________________
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform