Niklas Hambüchen added the comment:
A small update / summary so far:
From here, this developed into a coreutils discussion:
#29921 O(n^2) performance of rm -r
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921
and finally a `linux-fsdevel` discussion:
O(n^2) deletion performa
Niklas Hambüchen added the comment:
For people who pass by, this issue has been taken on again in:
https://bugs.python.org/issue9998
--
nosy: +nh2
___
Python tracker
<http://bugs.python.org/issue2
Niklas Hambüchen added the comment:
I've filed https://bugs.python.org/issue32453, which is about O(n^2) deletion
behaviour for large directories.
--
nosy: +nh2
New submission from Niklas Hambüchen:
See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a
for the explanation and equivalent fix in coreutils.
The gist is that deleting entries in inode order can improve deletion
performance dramatically.
To obtain inode numbers
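To illustrate the idea of deleting in inode order, here is a minimal sketch. It is not the coreutils patch and not the shutil.rmtree() implementation; the helper name unlink_in_inode_order is hypothetical, and it only handles the non-directory entries directly inside one directory. It relies on os.scandir(), whose DirEntry.inode() returns the inode number of each entry.

```python
import os
import tempfile


def unlink_in_inode_order(path):
    """Unlink the non-directory entries directly inside `path`,
    lowest inode number first. Hypothetical helper, for illustration only.

    Returns the list of removed paths in the order they were unlinked.
    """
    with os.scandir(path) as it:
        # Collect (inode, path) pairs first so the directory is not
        # modified while it is still being iterated.
        entries = [
            (entry.inode(), entry.path)
            for entry in it
            if not entry.is_dir(follow_symlinks=False)
        ]
    entries.sort()  # sort by inode number
    for _inode, entry_path in entries:
        os.unlink(entry_path)
    return [entry_path for _inode, entry_path in entries]


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    for i in range(1000):
        open(os.path.join(demo_dir, "file%d" % i), "w").close()
    removed = unlink_in_inode_order(demo_dir)
    print("removed", len(removed), "files")
    os.rmdir(demo_dir)
```

Whether this ordering helps in practice presumably depends on the file system and the storage device, so it should be measured rather than assumed.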
Niklas Hambüchen added the comment:
Serhiy, did you run your benchmark on an SSD or a spinning disk?
The coreutils bug mentions that the problem is seek times.
My tests on a spinning disk with 400k files suggest that indeed rmtree() is
~30x slower than `rm -r`:
# time (mkdir dirtest
Niklas Hambüchen added the comment:
It turns out I was wrong when saying that there's some cache we're hitting.
In fact, `rm -r` is still precisely O(n^2), even with the coreutils patch I
linked.
Quick overview table of the benchmark:
nfiles real user sys
100.
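For anyone who wants to check the scaling on their own machine, a rough sketch of such a measurement in Python follows. The function names time_rmtree and growth_exponent are hypothetical, the directory name dirtest is borrowed from the shell snippet above, and the file counts are placeholders. It times shutil.rmtree() rather than `rm -r`, and it does not drop caches, so the results are only indicative.

```python
import math
import os
import shutil
import tempfile
import time


def time_rmtree(nfiles):
    """Create `nfiles` empty files in a fresh directory and return the
    wall-clock seconds that shutil.rmtree() takes to remove it."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, "dirtest")
    os.mkdir(target)
    for i in range(nfiles):
        open(os.path.join(target, str(i)), "w").close()
    start = time.perf_counter()
    shutil.rmtree(target)
    elapsed = time.perf_counter() - start
    os.rmdir(root)
    return elapsed


def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (n, t) measurements.
    A value near 1 suggests linear behaviour, near 2 quadratic."""
    return math.log(t2 / t1) / math.log(n2 / n1)


if __name__ == "__main__":
    # Placeholder sizes; much larger directories are needed to see
    # the effect discussed in this issue.
    n_small, n_large = 10000, 20000
    t_small = time_rmtree(n_small)
    t_large = time_rmtree(n_large)
    print("exponent estimate:",
          growth_exponent(n_small, t_small, n_large, t_large))
```

Two data points give a noisy estimate; repeating the run over several sizes and on the actual disk in question would be more convincing.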
Niklas Hambüchen added the comment:
> Did you try to sync and flush caches before running `rm -r`?
Yes, it doesn't make a difference for me, I still see the same O(n²) behaviour
in `rm -r`.
I've sent an email "O(n^2) performance of rm -r" to bug-coreut...@gnu.org jus
Niklas Hambüchen added the comment:
OK, my coreutils email is at
http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00054.html
Niklas Hambüchen added the comment:
A better location to view the whole coreutils thread is:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921