URL: <https://savannah.gnu.org/bugs/?57693>
Summary: find wastefully calls stat for leaves
Project: findutils
Submitted by: vpanteleev
Submitted on: Wed 29 Jan 2020 12:49:28 PM UTC
Category: find
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: None
Fixed Release: None

_______________________________________________________

Details:

Consider the invocation:

    find /dir -mindepth 1 -maxdepth 1

The expected behavior is for find to print the full paths of all directory entries in /dir, which it does. However, as far as I can see, this task should not require find to perform stat calls on the directory entries of /dir. Nevertheless, it does so.

In certain situations, a mere directory listing is much faster than also calling stat on every member. In my case, I am seeing a considerable performance difference when enumerating snapshots (~2000 total) on a btrfs filesystem located on a HDD. `ls /dir | cat` is almost instantaneous (here, the output is piped through `cat` so that `ls` doesn't attempt to colorize entries, which would require `stat` calls). However, the aforementioned `find` invocation, as well as just `ls`, takes several minutes.

The find manual states the following for the -D option:

    stat   Print messages as files are examined with the stat and lstat
           system calls. The find program tries to minimise such calls.

However, I can observe that find does call stat, without even printing anything with `-D stat` prepended to its command line.

I can see that find calls stat by attaching to it, as it is running, with gdb, and examining its backtrace.
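To illustrate the claim that entry names can be printed without a per-entry stat, here is a minimal Python sketch. It is not code from find or gnulib; the function names `list_entries` and `list_entries_with_stat` are hypothetical and exist only for this example. It relies on `os.scandir()`, which on POSIX systems reads the directory itself and yields names without stat-ing each entry.

```python
import os
import tempfile


def list_entries(directory):
    """Return the full paths of all entries in `directory`.

    Only the directory itself is read; no stat/lstat call is made for
    the individual entries, since their names are all that is needed.
    """
    with os.scandir(directory) as entries:
        return sorted(entry.path for entry in entries)


def list_entries_with_stat(directory):
    """Same listing, but with one explicit lstat() per entry.

    This mirrors the behavior described in the report: the output is
    identical, yet every entry costs an additional system call.
    """
    paths = []
    with os.scandir(directory) as entries:
        for entry in entries:
            os.lstat(entry.path)  # the extra per-entry call in question
            paths.append(entry.path)
    return sorted(paths)


if __name__ == "__main__":
    # Small self-contained demonstration on a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("snap1", "snap2"):
            os.mkdir(os.path.join(tmp, name))
        for path in list_entries(tmp):
            print(path)
```

Both functions return the same list of paths, so any difference in running time comes from the per-entry lstat calls. How large that difference is depends on the filesystem and storage.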
With findutils 28f11d689dc61f9202de44078d67299419fbad26 and gnulib a7903da07d3d18c23314aa0815adbb4058fd7cec, here is one such backtrace:

Thread 1 (process 277972):
#0  0x00007f2aa82bdddf in __fxstatat64 () from /usr/lib/libc.so.6
#1  0x00005557b564ff96 in fstatat (__flag=256, __statbuf=0x5557b6204a48, __filename=<optimized out>, __fd=<optimized out>) at /usr/include/sys/stat.h:477
#2  fts_stat (sp=sp@entry=0x5557b61cbf40, p=p@entry=0x5557b62049d0, follow=follow@entry=false) at fts.c:1827
#3  0x00005557b565208b in rpl_fts_read (sp=0x5557b61cbf40) at fts.c:1044
#4  0x00005557b5634012 in find (arg=0x7ffc358806ea "/mnt/2016-hdd-8t-raid/home") at ftsfind.c:561
#5  0x00005557b5633aea in process_all_startpoints (argv=<optimized out>, argc=<optimized out>) at ftsfind.c:625
#6  main (argc=<optimized out>, argv=<optimized out>) at ftsfind.c:734

It looks like stat is not being called by find directly, but rather by the fts feature of gnulib, so there is possibly a second bug here (-D stat not reporting stat calls made in gnulib).

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?57693>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/