On 01/27/2018 06:45 PM, Peng Yu wrote:
glusterfs doesn't provide D_TYPE information:
getdents(4, {{d_ino=10054722685526780333, ..., d_type=DT_UNKNOWN} ...
Nevertheless, it is strange that find calls newfstatat() also
in the case of "-maxdepth 1" - it shouldn't need to.
Should this be considered as a performance bug of 'find'?
Well, maybe.
I could reproduce this case with sshfs, where getdents also returns DT_UNKNOWN:
$ mkdir -p ~/tmp/d1 \
&& seq 10000 | xargs env -C ~/tmp/d1 touch
$ mkdir -p ~/tmp/mnt \
&& sshfs localhost:tmp/d1 ~/tmp/mnt
$ strace -ve getdents,newfstatat find ~/tmp/mnt -maxdepth 1
$ strace -ve getdents,newfstatat find -D search ~/tmp/mnt -maxdepth 1 -name doesntmatter
The problem seems to be that gnulib's fts_read() already tries to determine
whether the current item is a directory [1]:
[...]
getdents(4, [], 32768) = 0
newfstatat(5, "8793", {st_dev=makedev(0, 46), st_ino=2, st_mode=S_IFREG|0644, ...}, AT_SYMLINK_NOFOLLOW) = 0
before find itself sees the entry [2]:
consider_visiting (early): ‘/home/berny/tmp/mnt/8793’: fts_info=FTS_F , [...]
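
To make the expectation concrete, here is a minimal Python sketch. It is
neither find's nor gnulib's code, and the function names list_depth1() and
classify() are made up for illustration. The first function only needs the
names, so it never asks for an entry's type and no per-entry stat call is
needed. The second one asks for the type; as far as I know, os.scandir()
answers that from d_type when the file system provides it and falls back to
an lstat() only when d_type is DT_UNKNOWN, which is the case on sshfs and
glusterfs.

```python
import os


def list_depth1(root):
    """Return the paths a 'find ROOT -maxdepth 1' style listing would print.

    Only names are required here, so no entry is asked for its type and
    os.scandir() has no reason to stat the individual entries.
    """
    paths = [root]
    with os.scandir(root) as entries:
        for entry in entries:
            paths.append(entry.path)
    return paths


def classify(root):
    """Map each entry name to 'dir' or 'other'.

    is_dir(follow_symlinks=False) can use the d_type delivered with the
    directory entry; a stat-like call is expected only for entries whose
    d_type is DT_UNKNOWN (assumption based on the documented behaviour
    of os.scandir()).
    """
    result = {}
    with os.scandir(root) as entries:
        for entry in entries:
            is_dir = entry.is_dir(follow_symlinks=False)
            result[entry.name] = "dir" if is_dir else "other"
    return result
```

Running either function under strace against ~/tmp/mnt should show whether
the per-entry stat calls appear only in the second case.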
@James: do you have an idea how to work around this?
[1] https://git.sv.gnu.org/cgit/gnulib.git/tree/lib/fts.c?id=d4f6a210f44a#n1054
[2] https://git.sv.gnu.org/cgit/findutils.git/tree/find/ftsfind.c?id=040f20b91e#n559
Have a nice day,
Berny