http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28315
Laszlo Ersek <lacos at caesar dot elte.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org,
                   |                            |lacos at caesar dot elte.hu

--- Comment #1 from Laszlo Ersek <lacos at caesar dot elte.hu> 2013-03-29 13:17:21 UTC ---

gcc has defaulted to UTF-8 rather than the locale's codeset in
_cpp_default_encoding() [libcpp/charset.c] since the following 2004 hunk:

http://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d856c8a6#patch25

(
The default encoding is selected for both "input_charset" (overridable
with -finput-charset) and "narrow_charset" (overridable with
-fexec-charset):

  cpp_create_reader() [libcpp/init.c]
  ~ narrow_charset = _cpp_default_encoding()
  ~ input_charset = _cpp_default_encoding()

The "overrides" are implemented in c_common_handle_option()
[gcc/c-family/c-opts.c].
)

Considering the encodings of source files "in the wild" that gcc has been
used to compile in the last 8+ years (i.e. while the "&& 0" that disables
the locale lookup has been in place):

- UTF-8 (of which 7-bit ASCII is a subset) worked.

- Any non-UTF-8 encoding that utilized the MSB (e.g. ISO-8859-2) required
  the -finput-charset option. People who would have originally wanted gcc
  to take that codeset from the locale were probably *developing* the
  source code in question, hence they could easily add -finput-charset to
  their makefiles.

Much of the world must have migrated to UTF-8-encoded locales by now.
Reverting the "&& 0" would:

- not affect people with such a distro-default locale who build UTF-8 /
  ASCII sources: their locale codeset matches the current hardwired
  default,

- not affect people building sources with non-UTF-8 8-bit codesets (e.g.
  ISO-8859-2), since those projects already have to use the
  -finput-charset option in their makefiles,

- affect people who have stuck to 7-bit ASCII or non-UTF-8 8-bit codesets
  in their locales, and compile real UTF-8 sources.

People in the last group (which includes me :)) would be forced either
(a) to modify their locale when building such sources as end-users, or
(b) to find out about -finput-charset=UTF-8 and pass it via (b1) Makefile
hacking or (b2) ./configure settings (env vars, or command line options).

I think that's unreasonable; building random projects from the tubes would
break for this small but existent group of users. Therefore I suggest
keeping the logic as-is and updating the docs instead
("gcc/doc/cppopts.texi"): "-finput-charset" should not refer to the
locale.
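
To make the two behaviours discussed above concrete, here is a minimal
sketch of the selection logic. It is not the libcpp source: the names
default_encoding(), revert_changes_default() and FALLBACK_CHARSET are
hypothetical, and the runtime flag use_locale merely stands in for the
compile-time "&& 0". It assumes a POSIX system providing nl_langinfo().

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name for the hard-wired default; not GCC's identifier. */
#define FALLBACK_CHARSET "UTF-8"

/* use_locale == 0 models the current behaviour ("&& 0" in place):
   the locale is never consulted.
   use_locale != 0 models the reverted behaviour: ask the locale first.
   Note that the latter calls setlocale() and so changes the process
   locale as a side effect.  */
const char *
default_encoding (int use_locale)
{
  const char *enc = NULL;

  if (use_locale)
    {
      setlocale (LC_CTYPE, "");
      enc = nl_langinfo (CODESET);
    }

  /* Fall back to the hard-wired default if nothing usable was found.  */
  if (enc == NULL || *enc == '\0')
    enc = FALLBACK_CHARSET;

  return enc;
}

/* Nonzero if, in the current environment, reverting the "&& 0" would
   pick a different default than the hard-wired one.  */
int
revert_changes_default (void)
{
  return strcmp (default_encoding (0), default_encoding (1)) != 0;
}
```

Under a UTF-8 locale revert_changes_default() should return 0, which
corresponds to the first "not affect" group; under an ASCII or ISO-8859-2
locale it should return nonzero, which corresponds to the group that
would have to start passing -finput-charset=UTF-8.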