On 8/15/19 6:42 PM, Carroll, Paul wrote:
> This is a proposed patch to libiberty, with an accompanying patch to GCC.
> The purpose of this patch is to make it possible for Windows-hosted
> toolchains to have the ability to control whether Canonicalized
> filenames are converted to all lower-case.
> Most Windows users are not affected by this behavior.
> However, we have at least one customer who does use Windows systems
> where their hard drives are case-sensitive.  Thus, they desire this
> ability.
> 
> The first implementation of Windows support in libiberty/lrealpath.c was
> added back in 2004.  The new code included a call to GetFullPathName(),
> to Canonicalize the filename.  Next,  if there was sufficient room in
> the buffer, the following code was run:
> 
>         /* The file system is case-preserving but case-insensitive,
>            Canonicalize to lowercase, using the codepage associated
>            with the process locale.  */
>         CharLowerBuff (buf, len);
>         return strdup (buf);
> 
> In effect, the assumption of the code is that all Windows file systems
> will be case-preserving but case-insensitive, so converting a filename
> to lowercase is not a problem.  And tools would always find the
> resulting file on the file system.  That turns out not to be true, but
> lrealpath() was mostly used just for system header files, so no one
> noticed.
> 
> However, in April 2014, libcpp was patched, to cause even non-system
> headers on Windows systems to be Canonicalized.  This patch has caused
> problems for users that have case-sensitive file systems on their
> Windows systems.  A patch to libcpp was proposed to do additional
> Canonicalize non-system headers on Windows systems.
> The discussion on the patch starts at
> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00009.html
> As is noted in the comments:
>       For DOS based file system, we always try to shorten non-system
> headers, as DOS has a tighter constraint on max path length.
> The okay to add the patch was given May 9, 2014 at
> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00557.html , with the note:
>       I've spoke with Kai a bit about this and he thinks it's
> appropriate and desirable to shorten paths on these kinds of filesystems.
> 
> The libcpp change meant that lrealpath() began to get called for
> non-system header files.  There is other code in both bfd and libiberty
> that can reach the lrealpath() function, but apparently those functions
> are not as much of a problem, as in not really being used in most tools
> (obviously since our customer had no complaints before 2014).
> 
> What I am proposing is to modify lrealpath() to only call CharLowerBuff
> when the user has a filesystem where changing the case of the filename
> is not an issue.
> That is actually most users, so the default should be to call
> CharLowerBuff.
> But I want to give the user the ability to control that behavior.
> 
> My proposal is to change that section of code in libiberty/lrealpath.c
> as follows:
> 
>      else
>        {
> -       /* The file system is case-preserving but case-insensitive,
> -          Canonicalize to lowercase, using the codepage associated
> +       /* The file system is normally case-preserving but
> case-insensitive,
> +          Canonicalize to lowercase if desired, using the codepage
> associated
>            with the process locale.  */
> -        CharLowerBuff (buf, len);
> +       if (libiberty_lowerpath)
> +          CharLowerBuff (buf, len);
>          return strdup (buf);
>        }
> 
> In effect, this just adds a control to let the user decide whether or
> not a pathname is converted to lowercase on Windows systems.
> 
> I also added a global definition for that control at the start of the
> libiberty/lrealpath.c source, setting it so the default behavior on
> Windows is to convert pathnames to lowercase:
> 
> +/* External option to control whether a pathname is converted
> +   to all lower-case on Windows systems.  The default behavior
> +   is to convert.
> +*/
> +#if defined (_WIN32)
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +unsigned char libiberty_lowerpath = 1;
> +#ifdef __cplusplus
> +}
> +#endif
> +#endif
> 
> And, for use by tools that link to libiberty.a, I added an external
> reference to that control in include/libiberty.h:
> 
> +#if defined (_WIN32)
> +/* Determine if filenames should be converted to lower case */
> +extern unsigned char libiberty_lowerpath;
> +#endif
> 
> 
> Adding the above code to include/libiberty.h and libiberty/lrealpath.c
> results in a libiberty.a that behaves exactly the same as it does today
> for most users.
> It also makes available a method of affecting whether or not a given
> tool affects canonicalized pathnames on Windows by also converting those
> pathnames to lowercase.
> 
> The questions to discuss:
> 1.    Is this a reasonable solution to this problem, by adding such a
> controlling symbol?
> 2.    Is it reasonable to use ‘libiberty_lowerpath’ as the symbol’s
> name?  There are other global symbols defined in libiberty that use a
> similar name, so this seems reasonable.
> 3.    Should the symbol be called ‘libiberty_lowerpath’ or something
> else, such as ‘libiberty_lowerfilename’?  Does that matter?
> 4.    While the ‘CharLowerBuff()’ code is within a _WIN32 section of
> code, should the global symbol also be defined so it only exists in
> libiberty.a for Windows hosts?  Or should that global symbol always be
> defined in libiberty.a, regardless of whether it is for Windows or
> Linux?  Obviously that symbol is only meaningful for Windows hosts.  But
> the existence (or non-existence)  of the symbol obviously affects how
> tools write their code to modify that symbol.  As presently written,
> tools would only modify this symbol when being built for running on
> Windows.
> 
> Depending on the answers to the above, a followup can show the proposed
> changes to tools, to add options to allow control of changing filenames
> to lower case on Windows.
> 
> If the above change (or something similar) is done to libiberty, then
> GCC should also be changed to add an option for accessing this
> variable.  I would propose adding the '-flowerpath' option, which would
> be the default.  That also adds '-fno-lowerpath', which is what users
> would use to keep their filenames from being converted to lower-case. 
> Most users, of course, could just ignore this option.  And it would only
> be meaningful for Windows-hosted tools.
> 
> Here are the patches that would implement the above:
> 
> diff -raup a/include/libiberty.h b/include/libiberty.h
> --- a/include/libiberty.h    2019-08-07 08:16:05.269694038 -0700
> +++ b/include/libiberty.h    2019-08-07 09:02:05.771381502 -0700
> @@ -750,6 +750,11 @@ extern unsigned long libiberty_len;
>     (char *) memcpy (libiberty_nptr, libiberty_optr, libiberty_len))
>  #endif
> 
> +#if defined (_WIN32)
> +/* Determine if filenames should be converted to lower case */
> +extern unsigned char libiberty_lowerpath;
> +#endif
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff -raup a/libiberty/lrealpath.c b/libiberty/lrealpath.c
> --- a/libiberty/lrealpath.c    2019-08-07 08:16:05.957694774 -0700
> +++ b/libiberty/lrealpath.c    2019-08-07 09:03:03.271408306 -0700
> @@ -72,6 +72,20 @@ extern char *canonicalize_file_name (con
>  # endif
>  #endif
> 
> +/* External option to control whether a pathname is converted
> +   to all lower-case on Windows systems.  The default behavior
> +   is to convert.
> +*/
> +#if defined (_WIN32)
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +unsigned char libiberty_lowerpath = 1;
> +#ifdef __cplusplus
> +}
> +#endif
> +#endif
> +
>  char *
>  lrealpath (const char *filename)
>  {
> @@ -159,10 +173,11 @@ lrealpath (const char *filename)
>        return strdup (filename);
>      else
>        {
> -    /* The file system is case-preserving but case-insensitive,
> -       Canonicalize to lowercase, using the codepage associated
> +    /* The file system is normally case-preserving but case-insensitive,
> +       Canonicalize to lowercase if desired, using the codepage associated
>         with the process locale.  */
> -        CharLowerBuff (buf, len);
> +    if (libiberty_lowerpath)
> +          CharLowerBuff (buf, len);
>          return strdup (buf);
>        }
>    }
> diff -raup a/gcc/common.opt b/gcc/common.opt
> --- a/gcc/common.opt    2019-08-05 09:42:54.889054582 -0700
> +++ b/gcc/common.opt    2019-08-05 10:09:01.708902089 -0700
> @@ -1543,6 +1543,10 @@ floop-nest-optimize
>  Common Report Var(flag_loop_nest_optimize) Optimization
>  Enable the loop nest optimizer.
> 
> +flowerpath
> +Common Report
> +Convert pathnames to all lowercase on Windows (ignore elsewhere).
> +
>  fstrict-volatile-bitfields
>  Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
>  Force bitfield accesses to match their type width.
> diff -raup a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> --- a/gcc/doc/invoke.texi    2019-08-05 09:42:55.213053535 -0700
> +++ b/gcc/doc/invoke.texi    2019-08-05 10:27:53.749415626 -0700
> @@ -174,7 +174,8 @@ in the following sections.
>  -pass-exit-codes  -pipe  -specs=@var{file}  -wrapper  @gol
>  @@@var{file}  -ffile-prefix-map=@var{old}=@var{new}  @gol
>  -fplugin=@var{file}  -fplugin-arg-@var{name}=@var{arg}  @gol
> --fdump-ada-spec@r{[}-slim@r{]}  -fada-spec-parent=@var{unit}
> -fdump-go-spec=@var{file}}
> +-fdump-ada-spec@r{[}-slim@r{]}  -fada-spec-parent=@var{unit} @gol
> +-fdump-go-spec=@var{file} -fno-lowerpath}
> 
>  @item C Language Options
>  @xref{C Dialect Options,,Options Controlling C Dialect}.
> diff -raup a/gcc/opts.c b/gcc/opts.c
> --- a/gcc/opts.c    2019-08-05 09:42:55.385052980 -0700
> +++ b/gcc/opts.c    2019-08-05 10:09:01.724902054 -0700
> @@ -2412,6 +2412,12 @@ common_handle_option (struct gcc_options
>        opts->x_flag_stack_usage_info = value != 0;
>        break;
> 
> +    case OPT_flowerpath:
> +#ifdef __WIN32
> +      libiberty_lowerpath = value;
> +#endif
> +      break;
> +
>      case OPT_g:
>        set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, arg, opts,
> opts_set,
>                         loc);
So what I dislike here is using global variables to pass around state
between gcc and libiberty.  ISTM that providing a function based API
that libiberty consumers can use would be better.

THoughts?

Jeff

Reply via email to