Re: [Python-Dev] PEP 471 Final: os.scandir() merged into Python 3.5
Hi,

On Sun, 8 Mar 2015 at 12:33 Ben Hoyt wrote:

> Others: if you want to benchmark this, the simplest way is to use my
> os.walk() benchmark.py test program here:
> https://github.com/benhoyt/scandir -- it compares the built-in os.walk()
> implemented with os.listdir() with a version of walk() implemented with
> os.scandir(). I see huge gains on Windows (12-50x) and modest gains on my
> Linux VM (3-5x).

I have a MacBook Pro running OS X 10.10.2. I did the following:

- hg update -r 8ef4f75a8018
- patch -p1 < scandir-8.patch
- ./configure --with-pydebug && make -j7

I then ran ./python.exe ~/Workspace/python/scandir/benchmark.py and I got:

Creating tree at /Users/rstuart/Workspace/python/scandir/benchtree: depth=4, num_dirs=5, num_files=50
Using slower ctypes version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 1/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 2/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 3/3...
os.walk took 0.184s, scandir.walk took 0.158s -- 1.2x as fast

I then did ./python.exe ~/Workspace/python/scandir/benchmark.py -s and got:

Using slower ctypes version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 1/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 2/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 3/3...
os.walk size 23400, scandir.walk size 23400 -- equal
os.walk took 0.483s, scandir.walk took 0.463s -- 1.0x as fast

Hope this helps.

Cheers

> Note that the actual CPython version of os.walk() doesn't yet use
> os.scandir(). I intend to open a separate issue for that shortly (or Victor
> can). But that part should be fairly straightforward, as I already have a
> version available in my GitHub project.
>
> -Ben
>
> On Sat, Mar 7, 2015 at 9:13 PM, Victor Stinner wrote:
>
>> Hi,
>>
>> FYI I committed the implementation of os.scandir() written by Ben Hoyt.
>> I hope that it will be part of Python 3.5 alpha 2 (Ben just sent the
>> final patch today).
>>
>> Please test this new feature. You may benchmark here.
>> http://bugs.python.org/issue22524 contains some benchmark tools and
>> benchmark results of older versions of the patch.
>>
>> The implementation was tested on Windows and Linux. I'm now watching
>> for buildbots to see how other platforms like os.scandir().
>>
>> Bad news: OpenIndiana doesn't support d_type: the dirent structure has
>> no d_type field. I already fixed the implementation to support this
>> case. os.scandir() is still useful on OpenIndiana, because the stat
>> result is cached in a DirEntry, so only one syscall is required,
>> instead of multiple, when multiple DirEntry methods are called (ex:
>> entry.is_dir() and not entry.is_symlink()).
>>
>> Victor

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
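The DirEntry caching that Victor describes, and the idea of a walk() built on os.scandir(), can be illustrated with a minimal sketch. It is not the code from Ben's GitHub project or from CPython: the function name scandir_walk is hypothetical, and the sketch assumes only the os.scandir() API from PEP 471 (entry.name, entry.path, entry.is_dir(), entry.is_symlink()). It mirrors os.walk() with its default followlinks=False, but leaves out the topdown and onerror options.

```python
import os


def scandir_walk(top):
    """Yield (dirpath, dirnames, filenames), top-down, for the tree at top.

    Hypothetical helper for illustration only.
    """
    dirnames, filenames, subdirs = [], [], []
    for entry in os.scandir(top):
        # The file type is cached on the DirEntry object, so calling
        # is_dir() and then is_symlink() usually costs at most one
        # system call per entry, and often none.
        if entry.is_dir():
            dirnames.append(entry.name)
            # Like os.walk(), list symlinks to directories
            # but do not descend into them.
            if not entry.is_symlink():
                subdirs.append(entry.path)
        else:
            filenames.append(entry.name)
    yield top, dirnames, filenames
    for path in subdirs:
        yield from scandir_walk(path)
```

Calling `for dirpath, dirnames, filenames in scandir_walk("."): ...` should visit the same directories as os.walk("."), although the order of names within each list is not guaranteed to match.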
Re: [Python-Dev] PEP 471 Final: os.scandir() merged into Python 3.5
Hi Ben,

On Mon, 9 Mar 2015 at 21:58 Ben Hoyt wrote:

> Note that this benchmark is invalid for a couple of reasons. (...)

Thanks a lot for the guidance Ben, greatly appreciated. I'm just starting to take an interest in the development of CPython, so running a benchmark seemed like as good a place as any to start.

Since I want to get comfortable with compiling from source, I tried this again. Instead of applying the patch, since the issue is now closed, I just compiled from the tip of the default branch, which at the time was 94920:0469af231d22. I also didn't configure with --with-pydebug. Here are the new results:

Ryans-MacBook-Pro:cpython rstuart$ ./python.exe ~/Workspace/python/scandir/benchmark.py
Using Python 3.5's builtin os.scandir()
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 1/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 2/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 3/3...
os.walk took 0.061s, scandir.walk took 0.012s -- 5.2x as fast

Ryans-MacBook-Pro:cpython rstuart$ ./python.exe ~/Workspace/python/scandir/benchmark.py -s
Using Python 3.5's builtin os.scandir()
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 1/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 2/3...
Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 3/3...
os.walk size 23400, scandir.walk size 23400 -- equal
os.walk took 0.109s, scandir.walk took 0.049s -- 2.2x as fast

This is on a Retina Mid 2012 MacBook Pro with an SSD.

Cheers

> you're compiling Python in debug mode (--with-pydebug), which produces
> significantly slower code in my tests -- for example, on Windows
> benchmark.py is about twice as slow when Python is compiled in debug
> mode.
>
> Second, as the output above shows, benchmark.py is "Using slower
> ctypes version of scandir" and not a C version at all. If os.scandir()
> is available, benchmark.py should use that, so there's something wrong
> here -- maybe the patch didn't apply correctly or maybe you're testing
> with a different version of Python than the one you built?
>
> In any case, the easiest way to test it now is to download Python 3.5
> alpha 2 which just came out:
> https://www.python.org/downloads/release/python-350a2/
>
> I just tried this on my Mac Mini (i5 2.3GHz, 2 GB RAM, HFS+ on
> rotational drive) and got the following results:
>
> Using Python 3.5's builtin os.scandir()
> Comparing against builtin version of os.walk()
> Priming the system's cache...
> Benchmarking walks on benchtree, repeat 1/3...
> Benchmarking walks on benchtree, repeat 2/3...
> Benchmarking walks on benchtree, repeat 3/3...
> os.walk took 0.074s, scandir.walk took 0.016s -- 4.7x as fast
>
>> I then did ./python.exe ~/Workspace/python/scandir/benchmark.py -s and
>> got:
>
> Also note that "benchmark.py -s" tests the system os.walk() against a
> get_tree_size() function using scandir's DirEntry.stat().st_size,
> which provides huge gains on Windows (because stat().st_size doesn't
> require an OS call) but only modest gains on POSIX systems, which
> still require an OS stat call to get the size (though not the file
> type, so at least it's only one stat call). I get "2.2x as fast" on my
> Mac for "benchmark.py -s".
>
> -Ben
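To make Ben's description of the "-s" mode concrete, here is a minimal sketch of a get_tree_size() function built on DirEntry.stat().st_size. It follows the shape of the example in PEP 471 and is an assumption about what such a function looks like, not a copy of the code in benchmark.py. Symlinks are deliberately not followed.

```python
import os


def get_tree_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            # Recurse into real subdirectories only.
            total += get_tree_size(entry.path)
        else:
            # On Windows the size comes with the directory listing;
            # on POSIX this triggers one stat call per file, whose
            # result is then cached on the DirEntry object.
            total += entry.stat(follow_symlinks=False).st_size
    return total
```

An os.walk()-based equivalent would call os.path.getsize() on every file, which costs a stat call per file on every platform; that difference is presumably where the gain reported for "benchmark.py -s" on Windows comes from.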