Neal Norwitz wrote:
> On Nov 30, 2007 7:16 PM, Brett Cannon <[EMAIL PROTECTED]> wrote:
>> On Nov 30, 2007 12:02 PM, Neil Toronto <[EMAIL PROTECTED]> wrote:
>>> On both of my systems, using -O2 reduces execution time in pystone by 9%
>>> and in pybench by 8%. It's function inlining: "-O3
>>> -fno-inline-functions" works just as well as "-O2". Removing "-g" has
>>> little effect on the result.
>>>
>>> Systems:
>>> - AMD Athlon 64 X2 Dual Core 4600+, 512 KB cache (desktop)
>>> - Intel T2300 Dual Core 1.66GHz, 512 KB cache (laptop)
>>>
>>> Both are Ubuntu 7.04, GCC 4.1.2.
>>>
>>> Does anybody else see this?
>>>
>>> It may be GCC being stupid (which has happened before) or not enough
>>> cache on my systems (definitely possible). If it's not one of those, I'd
>>> say it's because CPython core functions are already very large, and
>>> almost everything that ought to be inlined is already a macro.
>>>
>> That's quite possible. Previous benchmarks by AMK have shown that
>> perhaps -0m (or whatever the flag is to optimize for size) sometimes
>> is the best solution. It has always been believed that the eval loop
>> is already large and manages to hit some cache sweet spot.
>
> The flag is -Os. I suspect you will do better to limit the size of
> inlining rather disabling it completely. The option is
> -finline-limit=number. I don't know the default value or what you
> should try. I would be interested to hear more results though.

I've got some pystones (500000) results for the Athlon. The default for
-finline-limit is 600. This is for the current trunk.

Global options                      pystones/sec (median of 3)
--------------                      ------------
-O3                                 50454.1
-O2                                 57273.8
-Os                                 52798.3
-O3 -fno-inline-functions           54824.6
-O3 -finline-limit=300              51229.7
-O3 -finline-limit=150              51177.7
-O3 -finline-limit=75               51759.8
-O3 -finline-limit=25               53821.3

ceval.c options (-O3 for others)    pystones/sec (median of 3)
---------------                     ------------
-O2                                 55066.1
-Os                                 57012.5
-O3 -fno-inline-functions           55679.3
-O3 -finline-limit=300              51440.3
-O3 -finline-limit=150              50916.5
-O3 -finline-limit=75               51387.5
-O3 -finline-limit=25               52631.6

Now that's interesting. -O2 seems to be the best global option, and -Os
seems to be best for ceval.c. One more test then:

Global -O2, ceval.c -Os             56753.7

Weird.

If you're going to run these benchmarks yourself, make sure you "make
clean" before building with different options. (I don't know why it's
necessary, but it is.)

To change options for just ceval.c, add this to Makefile.pre.in under
"Special rules":

Python/ceval.o: $(srcdir)/Python/ceval.c
	$(CC) -c $(PY_CFLAGS) -Os \
		-o $@ $(srcdir)/Python/ceval.c

The last -O flag should override any other.

Neil
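
To make the global-option numbers easier to compare, here is a minimal sketch that expresses each result relative to the -O3 baseline. The figures are copied from the global-options table; the helper names (throughput_gain, time_reduction) are made up for this illustration. It also converts the throughput gain into a reduction in execution time, since the two percentages are not the same thing.

```python
# Global-option results from the table (pystones/sec, median of 3).
RESULTS = {
    "-O3": 50454.1,
    "-O2": 57273.8,
    "-Os": 52798.3,
    "-O3 -fno-inline-functions": 54824.6,
    "-O3 -finline-limit=300": 51229.7,
    "-O3 -finline-limit=150": 51177.7,
    "-O3 -finline-limit=75": 51759.8,
    "-O3 -finline-limit=25": 53821.3,
}


def throughput_gain(results, baseline="-O3"):
    """Percent change in pystones/sec relative to the baseline options."""
    base = results[baseline]
    return {opts: (rate / base - 1.0) * 100.0 for opts, rate in results.items()}


def time_reduction(results, baseline="-O3"):
    """Percent reduction in run time implied by the same rates.

    Run time is proportional to 1/rate, so the reduction is 1 - base/rate.
    """
    base = results[baseline]
    return {opts: (1.0 - base / rate) * 100.0 for opts, rate in results.items()}


if __name__ == "__main__":
    gains = throughput_gain(RESULTS)
    cuts = time_reduction(RESULTS)
    # Print the fastest configuration first.
    for opts in sorted(RESULTS, key=RESULTS.get, reverse=True):
        print(f"{opts:<28} {gains[opts]:+6.1f}% rate  {cuts[opts]:+6.1f}% time")
```

The same two functions work on the ceval.c-only results if that table is substituted for RESULTS, keeping an all -O3 build as the baseline.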