Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?

James Youngman Sun, 21 Jan 2018 05:51:08 -0800

On Sat, Jan 20, 2018 at 10:16 AM, Peng Yu <pengyu...@gmail.com> wrote:


> Hi,
>
> There are ~7000 .txt files in a directory on glusterfs. Here are the run
> times of the following two commands. Does anybody know why the find command
> is much slower than *.txt? Is there a way to change the API that `find`
> uses to search files so that it can be more friendly to
> glusterfs?
>
> $ time echo *.txt > /dev/null
>
> real    0m2.206s
> user    0m0.039s
> sys     0m0.056s
> $ time find -name '*.txt' > /dev/null
>
> real    0m18.558s
> user    0m0.317s
> sys     0m0.663s



Is this an apples-to-apples comparison?   For example, does . contain
subdirectories?    A comparison of the output of strace -c for both commands
will probably be illuminating.   Perhaps stat calls are relatively
expensive on glusterfs (this happens on at least some other cluster
filesystems, because obtaining a correct value for st_size requires finding
the consensus answer for the current length of the file, while obtaining
the list of items in a directory may not require the same amount of locking
or consensus work).
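
One way to check the apples-to-apples question is to count what each
approach actually matches. The following is only a minimal sketch, not a
diagnosis of the glusterfs case: the function names are hypothetical helpers
made up for illustration, and it assumes a find that supports -maxdepth (GNU
find does) and file names without embedded newlines.

```shell
# Count what the shell glob matches: only non-hidden entries that sit
# directly in the given directory.
count_glob() {
    (
        cd "$1" || exit 1
        set -- *.txt
        # An unmatched glob is left as the literal pattern; report zero then.
        if [ "$#" -eq 1 ] && [ ! -e "$1" ]; then
            echo 0
        else
            echo "$#"
        fi
    )
}

# Count what find matches: by default it descends into every subdirectory.
count_find() {
    find "$1" -name '*.txt' | wc -l | tr -d ' '
}

# The same search limited to the top-level directory, which is closer to
# what the glob does.
count_find_flat() {
    find "$1" -maxdepth 1 -name '*.txt' | wc -l | tr -d ' '
}
```

If count_find reports more matches than count_glob for the same directory,
find is walking subdirectories that the glob never touches (find also matches
hidden names such as .x.txt, which the glob skips). For the system call
comparison itself, something like `strace -c -f bash -c 'echo *.txt > /dev/null'`
and `strace -c find -name '*.txt' > /dev/null` prints a per-syscall summary on
stderr for each command.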

James.
