On 01/27/2018 06:45 PM, Peng Yu wrote:
>> glusterfs doesn't provide D_TYPE information:
>>
>> getdents(4, {{d_ino=10054722685526780333, ..., d_type=DT_UNKNOWN} ...
>>
>> Nevertheless, it is strange that find calls newfstatat() also
>> in the case of "-maxdepth 1" - it shouldn't need to.
>
> Should this be considered as a performance bug of 'find'?

Well, maybe.

I could reproduce this case with sshfs where getdents also returns DT_UNKNOWN.

  $ mkdir -p ~/tmp/d1 \
      && seq 10000 | xargs env -C ~/tmp/d1 touch

  $ mkdir -p ~/tmp/mnt \
      && sshfs localhost:tmp/d1 ~/tmp/mnt

  $ strace -ve getdents,newfstatat find ~/tmp/mnt -maxdepth 1

  $ strace -ve getdents,newfstatat find -D search ~/tmp/mnt -maxdepth 1 \
      -name doesntmatter
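
To compare the number of newfstatat() calls with the number of directory
entries, the trace can be written to a file and counted. This is only a
minimal sketch: count_syscall is a hypothetical helper (not part of strace or
findutils), and it assumes strace's default output format, where each line
starts with the syscall name (i.e. no PID prefix from -f).

```shell
# Hypothetical helper: count how often a syscall appears in an strace log.
# $1 = syscall name (e.g. newfstatat), $2 = path to the strace log file.
# Assumes each log line begins with the syscall name followed by "(".
count_syscall() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the helper usable under "set -e".
    grep -c "^$1(" "$2" || true
}
```

Possible usage (the log path /tmp/find.trace is just an example):

  $ strace -o /tmp/find.trace -e trace=getdents,newfstatat \
      find ~/tmp/mnt -maxdepth 1 >/dev/null
  $ count_syscall newfstatat /tmp/find.trace

If every entry is stat'ed, the count should be close to the 10000 files
created above. On systems where the C library uses getdents64, that name
would have to be traced and counted instead.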

The problem seems to be that gnulib's fts_read() already tries to determine
whether the current item is a directory [1]:

  [...]
  getdents(4, [], 32768)                  = 0
  newfstatat(5, "8793", {st_dev=makedev(0, 46), st_ino=2, st_mode=S_IFREG|0644, ...}, AT_SYMLINK_NOFOLLOW) = 0

before find() sees it [2]:

  consider_visiting (early): ‘/home/berny/tmp/mnt/8793’: fts_info=FTS_F , [...]

@James: do you have an idea how to work around this?

[1]
https://git.sv.gnu.org/cgit/gnulib.git/tree/lib/fts.c?id=d4f6a210f44a#n1054
[2]
https://git.sv.gnu.org/cgit/findutils.git/tree/find/ftsfind.c?id=040f20b91e#n559

Have a nice day,
Berny
