[issue21955] ceval.c: implement fast path for integers with a single digit

2020-10-01 Thread STINNER Victor
STINNER Victor added the comment: New changeset bd0a08ea90e4c7a2ebf29697937e9786d4d8e5ee by Victor Stinner in branch 'master': bpo-21955: Change my nickname in BINARY_ADD comment (GH-22481) https://github.com/python/cpython/commit/bd0a08ea90e4c7a2ebf29697937e9786d4d8e5ee

[issue21955] ceval.c: implement fast path for integers with a single digit

2020-10-01 Thread STINNER Victor
Change by STINNER Victor: pull_requests: +21500 pull_request: https://github.com/python/cpython/pull/22481

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread Roundup Robot
Roundup Robot added the comment: New changeset 61fcb12a9873 by Victor Stinner in branch 'default': Issue #21955: Please don't try to optimize int+int https://hg.python.org/cpython/rev/61fcb12a9873 -- nosy: +python-dev

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread STINNER Victor
Changes by STINNER Victor: resolution: fixed -> rejected

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread STINNER Victor
STINNER Victor added the comment: The fastest patch (inline2.patch) has a negligible impact on benchmarks. The purpose of an optimization is to make Python faster; that is not the case here, so I am closing the issue. Using timeit, the largest speedup is 1.29x. Using performance, spectral_norm

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread STINNER Victor
STINNER Victor added the comment: fastint6_inline2_json.tar.gz: archive of JSON files - fastint6.json - inline2.json - master.json - timeit-fastint6.json - timeit-inline2.json - timeit-master.json -- Added file: http://bugs.python.org/file45150/fastint6_inline2_json.tar.gz

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread STINNER Victor
STINNER Victor added the comment: Result of performance 0.3.3 (and perf 0.8.3). No major benchmark is faster. A few benchmarks seem to be even slower using fastint6.patch (but I don't really trust pybench). == fastint6.patch == $ python3 -m perf compare_to master.json fastint6.json --group-

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-10-20 Thread STINNER Victor
STINNER Victor added the comment: Between inline2.patch and fastint6.patch, it seems like inline2.patch is faster (between 9% and 12% faster than fastint6.patch). Microbenchmark on Python default (rev 554fb699af8c), compilation using LTO (./configure --with-lto), GCC 6.2.1 on Fedora 24, Intel(

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-04-22 Thread Stefan Krah
Stefan Krah added the comment: #14757 has an implementation of inline caching, which at least seemed to slow down some use cases. Then again, whenever someone posts a new speedup suggestion, it seems to slow down things I'm working on. At least Case Van Horsen independently verified the phenom

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-04-22 Thread STINNER Victor
STINNER Victor added the comment: Maybe we should adopt a different approach. There is something called "inline caching": put the cache between instructions, in the same memory block. Example of a paper on CPython: "Efficient Inline Caching without Dynamic Translation" by Stefan Brunthaler (20

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-10 Thread STINNER Victor
STINNER Victor added the comment: > The test suite can be run directly from the source tree. The test suite > includes timing information for individual tests and for the entire test. > Sample invocation: I extracted the slowest test (test_polyroots_legendre) and put it in a loop of 5 ite

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-09 Thread Yury Selivanov
Yury Selivanov added the comment: > Maybe we should just close the issue? I'll take a closer look at gmpy later. Please don't close.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-09 Thread STINNER Victor
STINNER Victor added the comment: Case Van Horsen added the comment: > I ran the mpmath test suite with the fastint6 and fastint5_4 patches. > > fastint6 results > without gmpy: 0.25% faster > with gmpy: 3% slower > > fastint5_4 results > without gmpy: 1.5% slower > with gmpy: 5.5% slower I'm mo

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-09 Thread Case Van Horsen
Case Van Horsen added the comment: I ran the mpmath test suite with the fastint6 and fastint5_4 patches. fastint6 results without gmpy: 0.25% faster with gmpy: 3% slower fastint5_4 results without gmpy: 1.5% slower with gmpy: 5.5% slower

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-08 Thread Yury Selivanov
Yury Selivanov added the comment: > I ran the mpmath test suite with Python 3.6 and with the fastint6 patch. The > overall increase when using Python long type was about 1%. When using gmpy2's > mpz type, there was a slowdown of about 2%. > I will run more tests tonight. Please try to test fa

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-08 Thread Case Van Horsen
Case Van Horsen added the comment: mpmath is a library for arbitrary-precision floating-point arithmetic. It uses Python's native long type or gmpy2's mpz type for computations. It is available at https://pypi.python.org/pypi/mpmath. The test suite can be run directly from the source tree. The

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-08 Thread STINNER Victor
STINNER Victor added the comment: Yury Selivanov: > Alright, I ran a few benchmarks myself. (...) > From what I can see there is no negative impact of the patch on stable macro > benchmarks. I'm disappointed by the results. In short, these patches have *no* impact on macro benchmarks, other th

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Yury Selivanov
Yury Selivanov added the comment: Attaching another patch - fastint6.patch that only optimizes longs (no FP fast path). > #26288 brought a great speedup for floats. With fastint5_4.patch *on top of > #26288* I see no improvement for floats and a big slowdown for _decimal. What benchmark did y

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Be careful with test suites: first, they might exercise code that would never be a critical point for performance in a real-world application; second and most important, unittest seems to have gotten slower between 2.x and 3.x, so you would really be comparing

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Case Van Horsen
Case Van Horsen added the comment: Can I suggest the mpmath test suite as a good benchmark? I've used it to test the various optimizations in gmpy2 and it has always been a valuable real-world benchmark. And it is slower in Python 3 than in Python 2.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Stefan Krah
Stefan Krah added the comment: #26288 brought a great speedup for floats. With fastint5_4.patch *on top of #26288* I see no improvement for floats and a big slowdown for _decimal.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Yury Selivanov
Yury Selivanov added the comment: > Please don't commit it right now. Yes, due to using macros the patch looks > simple, but the macros expand to complex code. We need more statistics. But what will you use to gather statistics? The test suite isn't representative, and we already know what wil

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Please don't commit it right now. Yes, due to using macros the patch looks simple, but the macros expand to complex code. We need more statistics.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-07 Thread Yury Selivanov
Yury Selivanov added the comment: > From what I can see there is no negative impact of the patch on stable macro > benchmarks. There is quite a detectable positive impact on most integer and float operations from my patch. 13-16% on nbody and spectral_norm benchmarks is still impressive. A

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread Yury Selivanov
Yury Selivanov added the comment: Alright, I ran a few benchmarks myself. In rigorous mode regex_v8 has the same performance on my 2013 Macbook Pro and an 8-year-old i7 CPU (Linux). Here are the results of "perf.py -b raytrace,spectral_norm,meteor_contest,nbody ../cpython/python.exe ../cpython-gi

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread Yury Selivanov
Yury Selivanov added the comment: > ### regex_v8 ### > Min: 0.041323 -> 0.048099: 1.16x slower > Avg: 0.041624 -> 0.049318: 1.18x slower I think this is a random fluctuation; that benchmark (and the re lib) doesn't use the operators much. It can't be THAT much slower just because of optimizing a f
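
The "1.16x slower" and "1.18x slower" figures quoted above are simply the ratio of the patched timing to the reference timing. A minimal sketch of that arithmetic, using only the four numbers from the quoted report (the function name `slowdown_ratio` is made up for this illustration and is not part of perf.py):

```python
def slowdown_ratio(before: float, after: float) -> float:
    """Return after/before; a value above 1.0 means the second build is slower."""
    return after / before


# Timings in seconds, copied from the regex_v8 report quoted above.
min_ratio = slowdown_ratio(0.041323, 0.048099)
avg_ratio = slowdown_ratio(0.041624, 0.049318)

print(f"Min: {min_ratio:.2f}x slower")
print(f"Avg: {avg_ratio:.2f}x slower")
```

A ratio like this says nothing about run-to-run noise, which is the point being argued in the message.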

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread Yury Selivanov
Yury Selivanov added the comment: You're also running a very small subset of all benchmarks available. Please try the '-b all' option. I'll also run benchmarks on my machines.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread STINNER Victor
STINNER Victor added the comment: Benchmark on fastint5_4.patch. python3 -u perf.py --affinity=2-3,6-7 --rigorous ../default/python.orig ../default/python_fastint5_4 Report on Linux smithers 4.3.4-300.fc23.x86_64 #1 SMP Mon Jan 25 13:39:23 UTC 2016 x86_64 x86_64 Total CPU cores: 8 ### django

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread STINNER Victor
STINNER Victor added the comment: Benchmark on inline-2.patch. No speedup, only slowdown. I'm now running benchmark on fastint5_4.patch. $ python3 -u perf.py --affinity=2-3,6-7 --rigorous ../default/python.orig ../default/python.inline-2 Report on Linux smithers 4.3.4-300.fc23.x86_64 #1 SMP M

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > I see two main trends: optimize most cases (optimize most operators for int > and float, ex: fastint5_4.patch) versus optimize very few cases to limit > changes and to limit effects on ceval.c (ex: inline-2.patch). I agree that maybe optimizing very few

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: myself> Ok. Now I'm lost. We have so many patches :-) Which one do you prefer? I reread this *old* issue in full; well, *almost* all messages. It's clear that no consensus has been found yet :-) I see two main trends: optimize most cases (optimize most oper

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: msg223186, Serhiy Storchaka about inline.patch: "Confirmed speed up about 20%. Surprisingly it affects even integers outside of the preallocated small integers (-5...255)." The optimization applies to Python int with 0 or 1 digit so in range [-2^30+1; 2^30
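
To make the "0 or 1 digit" range concrete: CPython stores an int as an array of fixed-width digits, and the width is exposed as `sys.int_info.bits_per_digit`. A minimal sketch, assuming the usual sign-magnitude layout, of the range that fits in a single digit (`single_digit_range` is a name invented for this illustration):

```python
import sys


def single_digit_range(bits_per_digit: int) -> tuple[int, int]:
    """Inclusive range of ints whose magnitude fits in one internal digit."""
    limit = 2 ** bits_per_digit - 1
    return -limit, limit


# bits_per_digit is 30 on most 64-bit builds and 15 on some other builds.
low, high = single_digit_range(sys.int_info.bits_per_digit)
print(f"single-digit ints on this build: [{low}; {high}]")
```

With 30-bit digits the lower bound is -2^30+1, which matches the range quoted in the message and is far wider than the preallocated small-integer cache.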

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: msg222985: Raymond Hettinger "There also used to be a fast path for binary subscriptions with integer indexes. I would like to see that performance regression fixed if it can be done cleanly." The issue #26280 was opened to track this optimization.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: > Ok. Now I'm lost. We have so many patches :-) Which one do you prefer? To no one's surprise I prefer fastint5, because it optimizes almost all binary operators on both ints and floats. inline-2.patch optimizes only + and -, and only for ints. If + and - per

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: inline-2.patch: more complete version of inline.patch. Optimize the same instructions as Python 2: BINARY_ADD, INPLACE_ADD, BINARY_SUBTRACT, INPLACE_SUBTRACT. Quick & *dirty* microbenchmark: $ ./python -m timeit -s 'x=1' 'x+x+x+x+x+x+x+x+x+x+x+x+x+x+x+x+

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Changes by Yury Selivanov: Added file: http://bugs.python.org/file41831/fastint5_4.patch

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Changes by Yury Selivanov: Added file: http://bugs.python.org/file41830/fastint5_3.patch

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: Attached is the new version of fastint5 patch. I fixed most of the review comments. I also optimized %, << and >> operators. I didn't optimize other operators because they are less common. I guess we have to draw a line somewhere... Victor, thanks a lot fo

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: Serhiy Storchaka: "My patches were just samples. I'm glad that Yury incorporated the main idea and that this helps." Oh, if even Serhiy prefers Yury's patches, I should read them again :-) -- I read fastint5.patch one more time and I finally understood the fo

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: Thanks, Serhiy, > But I don't quite understand why it adds any gain. Perhaps, and this is just a guess - the fast path does only a couple of eq tests & one call for the actual op. If it's long+long then long_add will be called directly. PyNumber_Add has more
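
The shape of the fast path described in that guess can be modeled in pure Python: two exact-type tests, then one direct call, with the generic protocol as the fallback. This is only an illustrative model and not the C code from any of the attached patches; `add_with_fast_path` is a hypothetical name, `int.__add__` stands in for calling long_add directly, and `operator.add` stands in for PyNumber_Add.

```python
import operator


def add_with_fast_path(left, right):
    """Model of a fast path: exact-type checks first, generic dispatch otherwise."""
    if type(left) is int and type(right) is int:
        # Both operands are exact ints: call the type's own addition directly.
        return int.__add__(left, right)
    # Anything else (floats, strings, int subclasses) takes the generic route.
    return operator.add(left, right)


print(add_with_fast_path(2, 3))      # exact ints, fast path
print(add_with_fast_path(1.5, 2))    # mixed types, generic route
print(add_with_fast_path("a", "b"))  # strings, generic route
```

Note that `type(x) is int` rejects subclasses such as bool, so those operands always pay for the failed checks before reaching the generic route.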

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: My patches were just samples. I'm glad that Yury incorporated the main idea and that this helps. If any patch is applied, I would prefer fastint5.patch. But I don't quite understand why it adds any gain. Is this just due to the overhead of calling PyNumber_Add? Then w

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: As to whether we want this patch committed or not, here's a mini-macro-something benchmark: $ ./python.exe -m timeit -s "x=2" "x + 10 + x * 20 + x* 10 + 20 -x" 1000 loops, best of 3: 0.115 usec per loop $ python3.5 -m timeit -s "x=2" "x + 10 + x * 20 +

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: Anyways, if it's about macro vs non-macro, I can inline the macro by hand (which I think is an inferior approach here). But I'd like the final code to use my approach of using slots directly, instead of modifying longobject/floatobject to export lots of *inte

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: > Between fastintfloat_alt.patch and fastint5.patch, I prefer > fastintfloat_alt.patch which is much easier to read, so probably much easier > to debug. I hate huge macros when I have to debug code in gdb :-( I also like > very much the idea of *reusing* existi

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: My analysis of benchmarks. Even using CPU isolation to run benchmarks, the results look unreliable for very short benchmarks like 3 ** 2.0: I don't think that fastint_alt can make the operation 16% slower since it doesn't touch this code, no? Well... as expec

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file41823/compare_to.txt

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file41822/compare.txt

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: bench_long2.py: my updated microbenchmark to test many types and more operations. compare.txt: compare Python original, fastint_alt.patch, fastintfloat_alt.patch and fastint5.patch. "(*)" marks the minimum of the line, percents are relative to the minimum (if

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: > Regardless of the performance, the fastint5.patch looks like the least invasive approach to me. It also doesn't incur as much maintenance overhead as the others do. Thanks. It's a result of an enlightenment that can only come after running benchmarks all day

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: >> Unless there are any objections, I'll commit fastint5.patch in a day or two. > Please don't. I would like to have time to benchmark all these patches (there > are now 9 patches attached to the issue :-)) and I would like to hear > Serhiy's feedback on your

Re: [issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread M.-A. Lemburg
On 05.02.2016 16:14, STINNER Victor wrote: > > Please don't. I would like to have time to benchmark all these patches (there > are now 9 patches attached to the issue :-)) and I would like to hear > Serhiy's feedback on your latest patches. Regardless of the performance, the fastint5.patch look

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread STINNER Victor
STINNER Victor added the comment: > Unless there are any objections, I'll commit fastint5.patch in a day or two. Please don't. I would like to have time to benchmark all these patches (there are now 9 patches attached to the issue :-)) and I would like to hear Serhiy's feedback on your latest

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-05 Thread Yury Selivanov
Yury Selivanov added the comment: Unless there are any objections, I'll commit fastint5.patch in a day or two.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: Attaching another approach -- fastint5.patch. Similar to what fastint4.patch does, but doesn't export any new APIs. Instead, similarly to abstract.c, it uses type slots directly. -- Added file: http://bugs.python.org/file41815/fastint5.patch

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: Antoine, FWIW I agree on most of your points :) And yes, numpy, scipy, numba, etc rock. Please take a look at my fastint4.patch. All tests pass, no performance regressions, no crazy inlining of floating point exceptions etc. And yet we have a nice improvem

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: tl;dr I'm attaching a new patch - fastint4 -- the fastest of them all. It incorporates Serhiy's suggestion to export long/float functions and use them. I think it's reasonably complete -- please review it, and let's get it committed. == Benchmarks == spec

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Hi Yury, > I'm not sure how to respond to that. Every performance aspect *is* > important. Performance is not a religion (not any more than security or any other matter). It is not helpful to brandish results on benchmarks which have no relevance to real-world

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: > People should stop getting hung up about benchmark numbers and instead > should first think about what they are trying to *achieve*. FP performance in > pure Python does not seem like an important goal in itself. I'm not sure how to respond to that. Every

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: People should stop getting hung up about benchmark numbers and instead should first think about what they are trying to *achieve*. FP performance in pure Python does not seem like an important goal in itself. Also, some benchmarks may show variations which ar

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Stefan Krah
Stefan Krah added the comment: I mean, if you run the benchmark 10 times and the unpatched result is always between 11.3 and 12.0 for floats while the patched result is between 12.3 and 12.9, for me the situation is clear.
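
The argument here is that two sets of repeated timings whose ranges do not overlap cannot be explained by run-to-run noise alone. A minimal sketch of that check; only the four range endpoints come from the message, the individual runs are not given, and `ranges_overlap` is a name invented for this illustration:

```python
def ranges_overlap(a, b):
    """True if the [min, max] intervals of the two samples intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


# Range endpoints quoted in the message above (the 10 individual runs are not shown).
unpatched = [11.3, 12.0]
patched = [12.3, 12.9]

if ranges_overlap(unpatched, patched):
    print("ranges overlap: the difference may be noise")
else:
    print("ranges are disjoint: the difference is consistent across runs")
```

This is a deliberately crude test; it says the difference is consistent, not how large it is.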

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Stefan Krah
Stefan Krah added the comment: I've never seen 20% fluctuation in that benchmark between runs. The benchmark is very stable if you take the average of 10 runs.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It is easy to extend fastint_alt.patch to support floats too. Here is a new patch. > It's instructive to run ./python Modules/_decimal/tests/bench.py (Hit Ctrl-C > after the first cdecimal result, 5 repetitions or so). Note that this benchmark is not very stab

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: > > Stefan Krah added the comment: > > It's instructive to run ./python Modules/_decimal/tests/bench.py (Hit Ctrl-C > after the first cdecimal result, 5 repetitions or so). > > fastint2.patch speeds up floats enormously and slows down decimal by 6%. > fastint

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Stefan Krah
Stefan Krah added the comment: It's instructive to run ./python Modules/_decimal/tests/bench.py (Hit Ctrl-C after the first cdecimal result, 5 repetitions or so). fastint2.patch speeds up floats enormously and slows down decimal by 6%. fastint_alt.patch slows down float *and* decimal (5% or so)

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: >> But it is faster. That's visible on many benchmarks. Even simple > timeit oneliners can show that. Probably it's because such > benchmarks usually combine floats and ints, i.e. "2 * smth" instead of > "2.0 * smth". > > So it's not about FP-related calcul

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Le 04/02/2016 15:18, Yury Selivanov a écrit : > > But it is faster. That's visible on many benchmarks. Even simple timeit oneliners can show that. Probably it's because such benchmarks usually combine floats and ints, i.e. "2 * smth" instead of "2.0 * smth"

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: > But the next question is then the overhead on the "slow" path, which requires > a benchmark too! For example, use a subtype of int. telco is such a benchmark (although it's very unstable). It uses decimals extensively. I've tested it many times on three diff
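
The quoted suggestion, benchmarking the slow path with a subtype of int, can be sketched with the standard timeit module. This is an illustration rather than the telco benchmark mentioned in the reply; `MyInt` and `time_add` are hypothetical names, and the absolute numbers depend entirely on the machine and interpreter build.

```python
import timeit


class MyInt(int):
    """Hypothetical int subclass: it fails an exact-type check, so it takes the slow path."""


def time_add(setup: str, number: int = 100_000) -> float:
    """Best-of-3 time in seconds for `number` executions of x + x."""
    timings = timeit.repeat(
        "x + x", setup=setup, repeat=3, number=number, globals={"MyInt": MyInt}
    )
    return min(timings)


exact = time_add("x = 12345")
subclass = time_add("x = MyInt(12345)")
print(f"exact int:    {exact:.6f} s")
print(f"int subclass: {subclass:.6f} s")
```

Running the same script on a patched and an unpatched interpreter shows whether the extra type checks cost anything measurable for operands that never hit the fast path.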

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Le 04/02/2016 14:54, Yury Selivanov a écrit : > > 30% faster floats (sic!) is a serious improvement, that shouldn't > just be discarded. Many applications have floating point calculations one way > or another, but don't use numpy because it's overkill. Can y

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread STINNER Victor
STINNER Victor added the comment: > Why not combine my patch and Serhiy's? First we check if left & right are > both longs. Then we check if they are unicode (for +). And then we have a > fastpath for floats. See my comment on Serhiy's patch. Maybe we can start by checking that the type of bo

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Yury Selivanov
Yury Selivanov added the comment: > I agree with Marc-Andre, people doing FP-heavy math in Python use Numpy > (possibly with Numba, Cython or any other additional library). > Micro-optimizing floating-point operations in the eval loop makes little > sense IMO. I disagree. 30% faster floats (

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread STINNER Victor
STINNER Victor added the comment: I prefer fastint_alt.patch design, it's simpler. I added a comment on the review. My numbers, best of 5 timeit runs: $ ./python -m timeit -s "x = 12345" -- "x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; " * original: 299 ns * fastint2.patch: 282 ns (-17

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: fastint2.patch adds a small regression for string multiplication: $ ./python -m timeit -s "x = 'x'" -- "x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; x*2; " Unpatched: 1.46 usec per loop Patched: 1.54 usec per loop Here is an alternative patch. It just uses

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread STINNER Victor
STINNER Victor added the comment: > I agree with Marc-Andre, people doing FP-heavy math in Python use Numpy > (possibly with Numba, Cython or any other additional library). > Micro-optimizing floating-point operations in the eval loop makes little > sense IMO. Oh wait, maybe I misunderstood M

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: I agree with Marc-Andre, people doing FP-heavy math in Python use Numpy (possibly with Numba, Cython or any other additional library). Micro-optimizing floating-point operations in the eval loop makes little sense IMO. The point of optimizing integers is that

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 04.02.2016 09:01, STINNER Victor wrote: > > "In a numerics heavy application it's likely that all fast paths will trigger > somewhere, but those will likely be better off using numpy or numba. For a > text heavy application such as a web server, only a few

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread STINNER Victor
STINNER Victor added the comment:
+if (Py_SIZE(left) != 0) {
+if (Py_SIZE(right) != 0) {
+
+#ifdef HAVE_LONG_LONG
+mul = PyLong_FromLongLong(
+(long long)SINGLE_DIGIT_LONG_AS_LONG(left) *
+SINGLE_DIGIT_LONG_AS

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-04 Thread STINNER Victor
STINNER Victor added the comment: "In a numerics heavy application it's likely that all fast paths will trigger somewhere, but those will likely be better off using numpy or numba. For a text heavy application such as a web server, only a few fast paths will trigger and so the various checks only

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 04.02.2016 07:02, Yury Selivanov wrote: > Attaching a second version of the patch. (BTW, Serhiy, I tried your idea of > using a switch statement to optimize branches > (https://github.com/1st1/cpython/blob/fastint2/Python/ceval.c#L5390) -- no > detecta

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment: Attaching a second version of the patch. (BTW, Serhiy, I tried your idea of using a switch statement to optimize branches (https://github.com/1st1/cpython/blob/fastint2/Python/ceval.c#L5390) -- no detectable speed improvement). I decided to add fast path fo

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > BTW, what do you think about using __int128 when available? That way we can > also optimize twodigit PyLongs. __int128 is not always available and it would add too much complexity for possibly little gain. There are many ways to optimize the code and we sh

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment: > I don't think. I run benchmarks (for __int128) :-) Never mind... Seems that __int128 is still an experimental feature and some versions of clang even had bugs with it.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread STINNER Victor
STINNER Victor added the comment: I don't think. I run benchmarks (for __int128) :-)

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment: > Fast path is already implemented in long_mul(). Maybe we should just use > this function if both arguments are exact int, and apply the switch > optimization inside. Agree. BTW, what do you think about using __int128 when available? That way we can also

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Fast path is already implemented in long_mul(). Maybe we should just use this function if both arguments are exact int, and apply the switch optimization inside.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment: Antoine, yeah, it's probably turbo boost related. There is no easy way to turn it off on Mac OS X, though. I hope Victor's patch to perf.py will help to mitigate this. Victor, Marc-Andre, Updated results of nano-bench (best of 10): -m timeit -s "loops=tup

Re: [issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread M.-A. Lemburg
On 03.02.2016 18:05, STINNER Victor wrote:
>> python -m timeit "sum([x * x * 1 for x in range(100)])"
> If you only want to benchmark x*y, x+y and list-comprehension, you should use a tuple for the iterator.
... and precalculate that in the setup: python -m timeit -s "loops=tuple(range(1

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne
Zach Byrne added the comment: I ran 6 benchmarks on my work machine (not the same one as the last set) overnight: two with just the BINARY_ADD change, two with the BINARY_SUBSCR change, and two with both. I'm attaching the output from all my benchmark runs, but here are the highlights In this ta

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Antoine Pitrou
Antoine Pitrou added the comment: On 03/02/2016 18:21, Yury Selivanov wrote:
> Yury Selivanov added the comment:
>> Yury suggested running perf.py twice with the binaries swapped
> Yeah, I had some experience with perf.py when its results were skewed depending on what you test first

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne
Zach Byrne added the comment:
> Could you please take a look at the updated patch?
Looks ok to me, for whatever that's worth.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment:
> Yury suggested running perf.py twice with the binaries swapped
Yeah, I had some experience with perf.py when its results were skewed depending on what you test first. Hopefully Victor's new patch will fix that: http://bugs.python.org/issue26275

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne
Zach Byrne added the comment:
> I don't understand what this table means (why 4 columns?). Can you explain what you did?
Yury suggested running perf.py twice with the binaries swapped. So "faster" and "slower" underneath "Baseline Reference" are runs where the unmodified python binary was the

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> In this table I've flipped the results for running the modified build as the reference, but in the new attachment, slower in the right column means faster, I think :)
I don't understand what this table means (why 4 columns?). Can you explain what you d

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread STINNER Victor
STINNER Victor added the comment:
> python -m timeit "sum([x * x * 1 for x in range(100)])"
If you only want to benchmark x*y, x+y and list-comprehension, you should use a tuple for the iterator.
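For readers following along, the point about the iterator can be tried with the standard timeit module. This is only a rough sketch, not something posted in the thread: the helper name best_time and the number/repeat values are made up for illustration, and the timings will differ per machine. It compares building range(100) inside the timed statement against iterating over a tuple named loops that is built once in the setup.

```python
import timeit


# Hypothetical helper (not from the thread); number and repeat are arbitrary.
def best_time(stmt, setup="pass", number=2000, repeat=5):
    """Return the best per-loop time in seconds across `repeat` runs."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number


# The range() object is created inside the timed statement.
with_range = best_time("sum([x * x * 1 for x in range(100)])")

# The tuple is built once in setup, so only iteration, the arithmetic,
# the list comprehension and sum() are timed.
with_tuple = best_time(
    "sum([x * x * 1 for x in loops])",
    setup="loops = tuple(range(100))",
)

print(f"range in statement: {with_range * 1e6:.2f} usec per loop")
print(f"tuple from setup:   {with_tuple * 1e6:.2f} usec per loop")
```

Taking the minimum over several repeats mirrors the "best of N" figures quoted elsewhere in the thread; it reduces, but does not remove, noise from CPU frequency scaling.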

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Yury Selivanov
Yury Selivanov added the comment: Attaching a new patch -- rewritten to optimize -, *, +, -=, *= and +=. I also removed the optimization of the [] operator -- that should be done in a separate patch and in a separate issue.
Some nano-benchmarks (best of 3): python -m timeit "sum([x + x + 1 for

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-02 Thread Yury Selivanov
Yury Selivanov added the comment: unpack_sequence contains 400 lines of this: "a, b, c, d, e, f, g, h, i, j = to_unpack". This code doesn't even touch BINARY_SUBSCR or BINARY_ADD. Zach, could you please run your benchmarks in rigorous mode (perf.py -r)? I'd also suggest experimenting with pu

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: Any change that increases the cache or branch predictor footprint of the evaluation loop may make the interpreter slower, even if the change doesn't seem related to a particular benchmark. That may be the reason here.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-02 Thread Zach Byrne
Zach Byrne added the comment: I took another look at this, and tried applying it to 3.6 and running the latest benchmarks. It applied cleanly, and the benchmark results were similar; this time unpack_sequence and spectral_norm were slower. Spectral norm makes sense: it's doing lots of FP addit

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-02 Thread Yury Selivanov
Yury Selivanov added the comment: I'm assigning this patch to myself to commit it in 3.6 later.
-- assignee: -> yselivanov components: +Interpreter Core stage: -> patch review versions: +Python 3.6 -Python 3.5

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-01-11 Thread Zach Byrne
Zach Byrne added the comment:
> Can you figure why unpack_sequence and other benchmarks were slower?
I didn't look really closely. A few of the slower ones were floating point heavy, which would incur the slow path penalty, but I can dig into unpack_sequence.

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-01-11 Thread Yury Selivanov
Yury Selivanov added the comment:
> Anybody still looking at this? I can take another stab at it if it's still in scope.
There were some visible speedups from your patch -- I think we should merge this optimization. Can you figure why unpack_sequence and other benchmarks were slower?
