[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2021-10-30 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: A small update / summary so far: From here this developed into a coreutils discussion: #29921 O(n^2) performance of rm -r https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921 and finally a `linux-fsdevel` discussion: O(n^2) deletion performa

[issue2936] ctypes.util.find_library() doesn't consult LD_LIBRARY_PATH

2017-05-13 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: For people who pass by, this issue has been taken on again in: https://bugs.python.org/issue9998 -- nosy: +nh2 ___ Python tracker <http://bugs.python.org/issue2

[issue28564] shutil.rmtree is inefficient due to listdir() instead of scandir()

2017-12-29 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: I've filed https://bugs.python.org/issue32453, which is about O(n^2) deletion behaviour for large directories. -- nosy: +nh2

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2017-12-29 Thread Niklas Hambüchen
New submission from Niklas Hambüchen: See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a for the explanation and equivalent fix in coreutils. The gist is that deleting entries in inode order can improve deletion performance dramatically. To obtain inode numbers
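
For readers passing by, the idea of deleting entries in inode order might be sketched as follows. This is only an illustration, not the patch proposed for shutil.rmtree: the helper name `remove_files_in_inode_order` is hypothetical, and it assumes that `os.scandir()` and `DirEntry.inode()` are an acceptable way to read inode numbers from the directory listing.

```python
import os


def remove_files_in_inode_order(directory):
    """Unlink every non-directory entry directly inside `directory`,
    lowest inode number first. Hypothetical helper for illustration."""
    with os.scandir(directory) as it:
        # DirEntry.inode() reports the inode number of each entry.
        entries = [
            (entry.inode(), entry.path)
            for entry in it
            if not entry.is_dir(follow_symlinks=False)
        ]
    # Sorting the (inode, path) tuples orders the deletions by inode number.
    entries.sort()
    for _inode, path in entries:
        os.unlink(path)
    return len(entries)
```

Subdirectories are left alone in this sketch; a full rmtree replacement would have to recurse into them and remove them afterwards.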

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2017-12-31 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: Serhiy, did you run your benchmark on an SSD or a spinning disk? The coreutils bug mentions that the problem is seek times. My tests on a spinning disk with 400k files suggest that indeed rmtree() is ~30x slower than `rm -r`: # time (mkdir dirtest
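
A rough way to reproduce this kind of measurement from Python could look like the sketch below. It is not the benchmark from the message: the function name `time_rmtree` is made up for illustration, the files are empty, and nothing is done about the page cache, so small runs on an SSD will likely not show the effect.

```python
import os
import shutil
import tempfile
import time


def time_rmtree(nfiles):
    """Create `nfiles` empty files in a fresh directory and return the
    seconds shutil.rmtree() takes to delete it. Hypothetical helper."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, "dirtest")
    os.mkdir(target)
    for i in range(nfiles):
        open(os.path.join(target, str(i)), "w").close()
    start = time.perf_counter()
    shutil.rmtree(target)
    elapsed = time.perf_counter() - start
    os.rmdir(root)  # remove the now-empty scratch directory
    return elapsed


if __name__ == "__main__":
    # If deletion were quadratic, doubling nfiles would roughly
    # quadruple the time; linear behaviour would roughly double it.
    for n in (1000, 2000, 4000):
        print(n, time_rmtree(n))
```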

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2017-12-31 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: It turns out I was wrong when saying that there's some cache we're hitting. In fact, `rm -r` is still precisely O(n^2), even with the coreutils patch I linked. Quick overview table of the benchmark: nfiles real user sys 100.

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2017-12-31 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: > Did you try to sync and flush caches before running `rm -r`? Yes, it doesn't make a difference for me, I still see the same O(n²) behaviour in `rm -r`. I've sent an email "O(n^2) performance of rm -r" to bug-coreut...@gnu.org jus

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2017-12-31 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: OK, my coreutils email is at http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00054.html

[issue32453] shutil.rmtree can have O(n^2) performance on large dirs

2018-01-01 Thread Niklas Hambüchen
Niklas Hambüchen added the comment: A better location to view the whole coreutils thread is: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921