Hi Bruno,

On Mon, Aug 18, 2025 at 03:54:22PM +0200, Bruno Haible wrote:
> It is good that you explain the rationale. But the essential part of these
> explanations ought to be contained in the code, since 10 years from now,
> people will read the code, not the ChangeLog entry or git history. Therefore,
> can you please add sensible comments in the code?

Yes, I could do that.


> Also, since the problem occurs only in a particular situation (/proc file
> system and FTS_TIGHT_CYCLE_CHECK), can you please submit a unit test as well?
> Ideally, as a new file tests/test-fts-2.c.

Making test for this is not simple, see the strace:

1: newfstatat(AT_FDCWD, "/proc/self/fd", {st_mode=S_IFDIR|0500,
st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
2: write(1, " entering: /proc/self/fd\n", 26) = 26
3: write(1, "Read: fd", 8)                 = 8
4: openat(AT_FDCWD, "/proc/self/fd",
O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 3
5: newfstatat(3, "", {st_mode=S_IFDIR|0500, st_size=0, ...}, AT_EMPTY_PATH) = 0
6: fcntl(3, F_GETFL)                       = 0x38800 (flags
O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW|O_DIRECTORY)
7: fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
8: newfstatat(3, "", {st_mode=S_IFDIR|0500, st_size=0, ...}, AT_EMPTY_PATH) = 0

On line 1 the directory is stat() the first time and it's inserted to
the hash immediately after printing the debug message on line 2.
Then fts_read() returns and one can process the FTSENT in his loop.
When fts_read() is entered again, libc opendir() gets called, which
makes syscalls 4 to 7. Failure of these is propagated by libc and is
correctly handled by the gnulib. Then on the line 8 the problematic stat
is called, which failure was not handled correctly.

As the control flow is not passed back to the caller between calls
7 and 8, the only way how to make a test for this would be by overriding
the fstat call provided by glibc or using a trace mechanism to stop the
caller at the right time and then tinker with the FS.

Neither of this is portable and also depends on the libc implementation,
which makes it fragile. I have tried grepping other tests and I haven't
found any doing something similar.

Let me know if I should try to make the test by overriding glibc fstat
function of if it's too hackish.
BR,
  Petr

Reply via email to