Hello,

the commit "rm -r: avoid O(n^2) performance for a directory with very many
entries" (http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a)
says that `rm -r` "now displays linear performance, even when operating on
million-entry directories on ext3 and ext4".

I found this not to be the case on my systems running ext4 on Linux 4.9 on a
WD 4TB spinning disk, with coreutils rm 8.28. As reported on
https://bugs.python.org/issue32453#msg309303, I got:

  nfiles      real     user    sys
  100000      0.51s    0.07s   0.43s
  200000      2.46s    0.15s   0.89s
  400000     10.78s    0.26s   2.21s
  800000     44.72s    0.58s   6.03s
  1600000   180.37s    1.06s  10.70s

Each doubling of the number of files roughly quadruples the deletion time
(e.g. going from 800000 to 1600000 files, 180.37s / 44.72s ≈ 4.0), making
for a clean O(n^2) quadratic curve.

I'm testing this with:

  set -e
  rm -rf dirtest/
  echo  100000 && (mkdir dirtest && cd dirtest && seq 1  100000 | xargs touch) && time rm -r dirtest/
  echo  200000 && (mkdir dirtest && cd dirtest && seq 1  200000 | xargs touch) && time rm -r dirtest/
  echo  400000 && (mkdir dirtest && cd dirtest && seq 1  400000 | xargs touch) && time rm -r dirtest/
  echo  800000 && (mkdir dirtest && cd dirtest && seq 1  800000 | xargs touch) && time rm -r dirtest/
  echo 1600000 && (mkdir dirtest && cd dirtest && seq 1 1600000 | xargs touch) && time rm -r dirtest/

On another system (Ubuntu 16.04, coreutils rm 8.25, Linux 4.10, ext4) I get:

  nfiles      real     user    sys
  100000      0.94s    0.06s   0.516s
  200000      2.94s    0.10s   1.060s
  400000     10.88s    0.30s   2.508s
  800000     34.60s    0.48s   4.912s
  1600000   203.87s    0.99s  11.708s

Also quadratic.

Same machine on XFS:

  nfiles      real     user    sys
  100000      3.37s    0.04s   0.98s
  200000      7.20s    0.06s   2.03s
  400000     22.52s    0.16s   5.11s
  800000     53.31s    0.37s  11.46s
  1600000   200.83s    0.76s  22.41s

Quadratic.

What is the complexity of `rm -r` supposed to be? Was it really linear in
the past, as the patch describes, and is this a regression? Can we make it
work linearly again?

Thanks!
Niklas
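
P.S. For convenience, the benchmark above can also be written as a loop.
This is a minimal sketch assuming bash (for the `time` keyword) and the
same `dirtest` scratch directory as above:

  #!/bin/bash
  set -e
  rm -rf dirtest/
  # Double the entry count each round: a linear rm -r should roughly
  # double the wall-clock time per round, a quadratic one quadruple it.
  for n in 100000 200000 400000 800000 1600000; do
    echo "$n"
    (mkdir dirtest && cd dirtest && seq 1 "$n" | xargs touch)
    time rm -r dirtest/
  done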
