https://gcc.gnu.org/g:abd8e6af5e64bd1e97a8abb15f27fa3cdcb4ba82

commit r16-8387-gabd8e6af5e64bd1e97a8abb15f27fa3cdcb4ba82
Author: Sandra Loosemore <[email protected]>
Date:   Wed Apr 1 08:36:40 2026 +0000

    doc: Update docs for character set support and environment variables [PR70917]

    The "Environment Variables" section of the GCC manual had long-obsolete
    (20+ years!) information about environment variables affecting
    character sets and encodings.  In particular, GCC stopped using
    environment variables to control how input files are parsed in 2004.

    gcc/ChangeLog
            PR preprocessor/70917
            * doc/invoke.texi (Environment Variables): Clarify that LC_ALL,
            LC_CTYPE, LC_MESSAGES, and LANG affect only diagnostics and
            informational output from GCC, not the encodings of input and
            output files.  Remove separate bit-rotten entry for LANG.

Diff:
---
 gcc/doc/invoke.texi | 41 ++++++++---------------------------------
 1 file changed, 8 insertions(+), 33 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2c1b60df17cf..7749513bfb44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -37440,28 +37440,25 @@ GNU Compiler Collection (GCC) Internals}.
 @c @itemx LC_TIME
 @itemx LC_ALL
 These environment variables control the way that GCC uses
-localization information which allows GCC to work with different
-national conventions.  GCC inspects the locale categories
+localization information that allows GCC to work with different
+national conventions for diagnostic and informational output.
+GCC inspects the locale categories
 @env{LC_CTYPE} and @env{LC_MESSAGES} if it has been configured to do
 so.  These locale categories can be set to any value supported by your
 installation.  A typical value is @samp{en_GB.UTF-8} for English in the United
 Kingdom encoded in UTF-8.
 
-The @env{LC_CTYPE} environment variable specifies character
-classification.  GCC uses it to determine the character boundaries in
-a string; this is needed for some multibyte encodings that contain quote
-and escape characters that are otherwise interpreted as a string
-end or escape.
-
-The @env{LC_MESSAGES} environment variable specifies the language to
-use in diagnostic messages.
-
 If the @env{LC_ALL} environment variable is set, it overrides the value
 of @env{LC_CTYPE} and @env{LC_MESSAGES}; otherwise, @env{LC_CTYPE}
 and @env{LC_MESSAGES} default to the value of the @env{LANG}
 environment variable.  If none of these variables are set, GCC
 defaults to traditional C English behavior.
 
+These environment variables do not affect the encodings of input files
+and strings in output files produced by GCC, which are controlled by the
+@option{-finput-charset} and @option{-fexec-charset} options, respectively.
+@xref{Preprocessor Options}.
+
 @vindex TMPDIR
 @item TMPDIR
 If @env{TMPDIR} is set, it specifies the directory to use for temporary
@@ -37538,28 +37535,6 @@ using GCC also uses these directories when searching for ordinary
 libraries for the @option{-l} option (but directories specified with
 @option{-L} come first).
 
-@vindex LANG
-@cindex locale definition
-@item LANG
-This variable is used to pass locale information to the compiler.  One way in
-which this information is used is to determine the character set to be used
-when character literals, string literals and comments are parsed in C and C++.
-When the compiler is configured to allow multibyte characters,
-the following values for @env{LANG} are recognized:
-
-@table @samp
-@item C-JIS
-Recognize JIS characters.
-@item C-SJIS
-Recognize SJIS characters.
-@item C-EUCJP
-Recognize EUCJP characters.
-@end table
-
-If @env{LANG} is not defined, or if it has some other value, then the
-compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale to
-recognize and translate multibyte characters.
-
 @vindex GCC_EXTRA_DIAGNOSTIC_OUTPUT
 @item GCC_EXTRA_DIAGNOSTIC_OUTPUT
 If @env{GCC_EXTRA_DIAGNOSTIC_OUTPUT} is set to one of the following values,
