Hi Lasse,

Thanks for the explanations and patch.

As I understand it, the patch modifies the conversion
  wchar_t[] (UTF-16)  -->  char[]
used for file names, and it does so only for opendir+readdir.
(And the pages that you refer to - "Filename smuggling" - explain
why this is desirable: it stops FRACTION SLASH and DIVISION SLASH
from being mapped to '/'.)
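
To make the mapping in question concrete, here is a toy model in C. It is
only a sketch, not the patch and not the real Windows conversion: the
function to_narrow() and its best_fit flag are made-up names, and it
models nothing beyond ASCII plus the two slash characters mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Toy model of a wchar_t[] --> char[] file name conversion.
   Hypothetical helper, not a Gnulib or Windows API.
   Returns the number of bytes stored (excluding the NUL), or -1 if a
   character cannot be converted or the buffer is too small.  */
int
to_narrow (const wchar_t *src, char *dst, size_t dstsize, int best_fit)
{
  size_t n = 0;

  assert (src != NULL && dst != NULL && dstsize > 0);
  for (; *src != L'\0'; src++)
    {
      unsigned long wc = (unsigned long) *src;
      char c;

      if (wc < 0x80)
        c = (char) wc;                /* ASCII passes through unchanged */
      else if (best_fit && (wc == 0x2044 || wc == 0x2215))
        c = '/';                      /* FRACTION SLASH, DIVISION SLASH */
      else
        return -1;                    /* strict mode: refuse to guess */

      if (n + 1 >= dstsize)
        return -1;                    /* no room for c plus the NUL */
      dst[n++] = c;
    }
  dst[n] = '\0';
  return (int) n;
}
```

With best_fit = 1, the name L"a\u2044b" comes out as "a/b", that is, one
file name turns into a path with two components. With best_fit = 0 the
conversion fails instead, and the caller has to decide what to do with
that entry.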

Two remarks on this:

  * Such a change of the file name conversion mapping needs to be
    done consistently, for all operations from open() to chdir(),
    from stat() to truncate().

    Otherwise ill effects will be seen. For example, readdir()
    might tell the caller that a directory has no entries, but
    rmdir() on this directory will then fail.

    Therefore, in the scope of Gnulib, what would be needed is a
    "transversal" module, that is, a module which affects the behaviour
    of other modules (e.g. the module 'sigpipe' or 'windows-stat-timespec').

  * This change should be opt-in, not enabled by default.
    Microsoft's file name conversion mapping has been in use for 30 years,
    and therefore corresponds to the expectations of application
    developers.

In addition to this, some applications might want to go full-UTF-8 with
file names. (Not all applications, but Emacs and GNU clisp are packages
where things would get nicer without the limitations of an 8-bit "ANSI"
codepage. [Emacs has such a thing already, but it's separate from Gnulib.])
This too could be a transversal module.
(One way to enforce UTF-8 for file names is a 'manifest' file, like you
showed, but it affects many more things than file names. Also, it
does not work in older versions of Windows, and supporting those is a
requirement for Emacs.)

Bruno



