FWIW, a user on Stack Overflow just reported the same issue with list.files
running R 4.3.z on Windows. They do not observe the issue running R-devel,
with Tomas' patch (r84960). It is still the case that their file names did
not exceed 260 wide characters.
https://stackoverflow.com/q/77527167/12685768
Mikael
On 2023-08-17 6:00 am, r-devel-requ...@r-project.org wrote:
Date: Wed, 16 Aug 2023 16:00:13 +0200
From: Tomas Kalibera <tomas.kalib...@gmail.com>
To: Ivan Krylov <krylov.r...@gmail.com>
Cc: "r-devel@r-project.org" <r-devel@r-project.org>
Subject: Re: [Rd] R-4.3 version list.files function could not work
correctly in chinese
On 8/16/23 13:22, Ivan Krylov wrote:
On Wed, 16 Aug 2023 09:42:09 +0200
Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
Fixed in R-devel (84960). Please let me know if you see any problem
with the fix.
Thank you for implementing the fix! I gave 叶月光 the link to the
GitHub Action build of the r84960 installer.
Thanks, and thanks for looking at the change.
I'm worried that 叶月光 was seeing FindNextFileA fail for a different
reason (all the examples given at the Capital of Statistics forum
seemed to use fewer than 256/4 = 64 characters per file name...), but
maybe this won't reappear with the switch to FindNextFileW. If this
keeps happening, it might be worth producing a warning when
FindNextFileW() fails with an unexpected GetLastError() value.
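
To make the 256/4 = 64 estimate concrete: the narrow (ANSI) find-data
structure stores the name in a fixed buffer of MAX_PATH = 260 bytes, so
what matters is the encoded byte length of a name, not its character
count. The following is only a minimal Python sketch of that arithmetic.
It assumes the narrow API returns names in UTF-8, and the helper names
and the long sample name are made up for illustration.

```python
# Assumption: names come back from the narrow API encoded as UTF-8.
# MAX_PATH is the Windows constant (260); the helpers are hypothetical.
MAX_PATH = 260

def utf8_len(name: str) -> int:
    """Number of bytes the name occupies once encoded as UTF-8."""
    return len(name.encode("utf-8"))

def fits_narrow_buffer(name: str) -> bool:
    """True if the UTF-8 name plus its NUL terminator fits in MAX_PATH bytes."""
    return utf8_len(name) + 1 <= MAX_PATH

short_name = "叶月光.txt"          # 3 CJK characters + 4 ASCII characters
long_name = "数" * 100 + ".txt"    # made-up name, 104 characters

# Print character count, UTF-8 byte count, and whether the name fits.
for name in (short_name, long_name):
    print(len(name), utf8_len(name), fits_narrow_buffer(name))
```

Each CJK character here takes 3 bytes in UTF-8, so a name can overflow
the byte buffer well before it reaches 260 characters, while names under
64 characters always fit even at 4 bytes per character.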
I've added a warning to R-devel when list.files() on Windows stops
listing a directory due to an error.
There is probably nothing more we can do unless there is a revised bug
report of the original problem.
fs::dir_ls() uses NtQueryDirectoryFile() and WideCharToMultiByte()
instead of FindNextFileW() and wcstombs(), but maybe this shouldn't
matter. In particular, both list.files() and fs::dir_ls() would fail
given a file name that cannot be represented in UTF-8 (invalid UTF-16
surrogate pairs?).
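
The surrogate case can be illustrated outside R. Windows file names are
sequences of 16-bit units that are not required to form valid UTF-16, so
an unpaired surrogate can occur, and such a name has no UTF-8 encoding.
The Python sketch below only mimics the conversion step; the helper name
utf16le_to_utf8 is hypothetical, and this is not what list.files() or fs
do internally.

```python
from typing import Optional

def utf16le_to_utf8(raw: bytes) -> Optional[bytes]:
    """Convert UTF-16LE bytes to UTF-8; return None if the input is not valid UTF-16."""
    try:
        return raw.decode("utf-16-le").encode("utf-8")
    except UnicodeDecodeError:
        # Unpaired surrogates are rejected by the strict decoder.
        return None

valid = "叶月光".encode("utf-16-le")
# 0xD800 (a high surrogate) followed by 'a' instead of a low surrogate.
lone_high = b"\x00\xd8" + "a".encode("utf-16-le")

print(utf16le_to_utf8(valid))
print(utf16le_to_utf8(lone_high))
```

A well-formed name converts cleanly, while the name containing the lone
high surrogate is reported as unconvertible rather than silently mangled.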
Right, R only supports file names that are valid strings. This assumption
is present in many places in the code, so it is fine and consistent for it
to hold here as well. The choice of opendir/readdir in R was probably
motivated by minimization of platform-specific code.
Best
Tomas