Re: [Cython] 0.16 release
2012/1/26 Jason Grout:
> On 1/25/12 11:39 AM, Robert Bradshaw wrote:
>>
>> install
>>
>> https://sage.math.washington.edu:8091/hudson/view/ext-libs/job/sage-build/lastSuccessfulBuild/artifact/cython-devel.spkg
>> by downloading it and running "sage -i cython-devel.spkg"
>
> In fact, you could just do
>
> sage -i
> https://sage.math.washington.edu:8091/hudson/view/ext-libs/job/sage-build/lastSuccessfulBuild/artifact/cython-devel.spkg
>
> and Sage will (at least, should) download it for you, so that's even one
> less step!
>
> Jason

Thanks for the detailed instructions! I've successfully built it.

"sage -t -gdb ./" doesn't work. Is that a bug?

    vitja@mchome:~/Downloads/sage-4.8$ ./sage -t -gdb devel/sage/sage/combinat/sf/macdonald.py
    sage -t -gdb "devel/sage/sage/combinat/sf/macdonald.py"

    Type r at the (gdb) prompt to run the doctests.
    Type bt if there is a crash to see a traceback.

    gdb --args python /home/vitja/.sage//tmp/macdonald_6182.py
    starting cmd gdb --args python /home/vitja/.sage//tmp/macdonald_6182.py
    ImportError: No module named site
    [0.2 s]

    --
    The following tests failed:

    sage -t -gdb "devel/sage/sage/combinat/sf/macdonald.py"
    Total time for all tests: 0.2 seconds

I've found another way to run the tests (using sage -sh and then running
python ~/.sage/tmp/...py directly).

So I found one of the problems. Here is a minimal Cython example:

    def foo(values):
        return (0,)*len(values)

    foo([1,2,3])

len(values) is somehow passed as an integer to PyNumber_Multiply().

--
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
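For reference, here is a minimal sketch of what the example is expected to return under plain Python semantics, assuming the compiled Cython function should behave the same way. Multiplying a tuple by an integer count repeats the tuple, so the count must reach the multiplication as a Python object.

```python
def foo(values):
    # Sequence repetition: a one-element tuple times an integer count.
    return (0,) * len(values)


result = foo([1, 2, 3])
empty = foo([])
print(result, empty)
```

Running the same function through Cython and comparing against this result would be one way to confirm the reported problem.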
Re: [Cython] AddTraceback() slows down generators
Stefan Behnel, 27.01.2012 09:02:
> any exception *propagation* is
> still substantially slower than necessary, and that's a general issue.

Here's a general take on a code object cache for exception propagation.

https://github.com/scoder/cython/commit/ad18e0208

When I raise an exception in test code that propagates through a Python
call hierarchy of four functions before being caught, the cache gives me
something like a 2x speedup in total. Not bad. When I do the same for cdef
functions, it's more like 4-5x.

The main idea is to cache the code objects in a reallocable C array and
bisect into it based on the C code "__LINE__" of the exception, which should
be unique enough for a given module.

It's a global cache that doesn't limit the lifetime of code objects (well,
up to the lifetime of the module, obviously). I don't know if that's a
problem, because the number of code objects is only bounded by the number of
exception origination points in the C source code, which is usually quite
large. However, only a tiny fraction of those will ever raise or propagate
an exception in practice, so the real number of cached code objects will be
substantially smaller.

Maybe thorough test suites with lots of failure testing would notice a
difference in memory consumption, even though a single code object isn't
all that large either...

What do you think?

Stefan
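The cache idea described here can be sketched roughly as follows. This is a Python illustration only, not the code from the linked commit: the class name, method names and the sample line numbers are made up, and two Python lists stand in for the reallocable C array.

```python
import bisect


class CodeObjectCache:
    """Sorted-array cache keyed by C line number (illustrative only)."""

    def __init__(self):
        self._lines = []    # sorted C line numbers (the keys)
        self._objects = []  # cached objects, parallel to self._lines

    def find(self, c_line):
        # Binary search for the key; return None on a miss.
        pos = bisect.bisect_left(self._lines, c_line)
        if pos < len(self._lines) and self._lines[pos] == c_line:
            return self._objects[pos]
        return None

    def insert(self, c_line, obj):
        # Keep both arrays sorted; overwrite an existing entry for the key.
        pos = bisect.bisect_left(self._lines, c_line)
        if pos < len(self._lines) and self._lines[pos] == c_line:
            self._objects[pos] = obj
        else:
            self._lines.insert(pos, c_line)
            self._objects.insert(pos, obj)

    def __len__(self):
        return len(self._lines)


cache = CodeObjectCache()
for line in (1520, 310, 987):
    cache.insert(line, "code object for C line %d" % line)
print(cache.find(310), cache.find(311), len(cache))
```

Lookups are logarithmic in the number of cached entries, while an insertion has to shift the tail of the array. That matches the expectation that entries are created rarely and looked up often.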
Re: [Cython] AddTraceback() slows down generators
On 28 January 2012 18:38, Stefan Behnel wrote:
> Here's a general take on a code object cache for exception propagation.
>
> https://github.com/scoder/cython/commit/ad18e0208
>
> [...]
>
> Maybe thorough test suites with lots of failure testing would notice a
> difference in memory consumption, even though a single code object isn't
> all that large either...
>
> What do you think?

Nice. I have a question: couldn't you save the frame object instead of
the code object?

I do think PyCodeObject is rather large. On my 64-bit machine it is
120 bytes, not counting any of the objects it holds (not saying
that's a problem, just pointing it out).

Would it help if we passed in the position information as a string
object constant, to avoid the PyString_FromFormat()? That optimization
would only save 14% though. But additionally, the function name could be a
string object constant, which could be shared by all exceptions in one
function, avoiding another PyString_FromString().
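To get a feel for the size question, the footprint of a code object can be inspected from Python. This is only a rough sketch: `sys.getsizeof` reports the shallow size, the numbers depend on the CPython version and platform, and they will not necessarily match the 120 bytes quoted above.

```python
import sys


def foo():
    raise ValueError


code = foo.__code__
# Size of the code object itself, excluding the objects it refers to.
shallow_size = sys.getsizeof(code)
# A few of the objects it holds: function name, file name and bytecode.
held = [code.co_name, code.co_filename, code.co_code]
held_size = sum(sys.getsizeof(obj) for obj in held)
print(shallow_size, held_size)
```

The second number hints at why sharing the name and file name strings between code objects could matter more than the struct size alone.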
Re: [Cython] AddTraceback() slows down generators
2012/1/28 Stefan Behnel:
> Here's a general take on a code object cache for exception propagation.
>
> https://github.com/scoder/cython/commit/ad18e0208
>
> [...]
>
> What do you think?

We already have the --no-c-in-traceback flag, which disables C line numbers
in tracebacks. What about enabling it by default?

--
vitja.
Re: [Cython] AddTraceback() slows down generators
On 28 January 2012 19:41, Vitja Makarov wrote:
> We already have the --no-c-in-traceback flag, which disables C line numbers
> in tracebacks. What about enabling it by default?

I'm quite attached to that feature actually :), it would be pretty
annoying to disable that flag every time. And what would disabling
that option gain, as the current code still formats the filename and
function name?
Re: [Cython] AddTraceback() slows down generators
On 28 January 2012 19:48, mark florisson wrote:
> On 28 January 2012 19:41, Vitja Makarov wrote:
>> We already have the --no-c-in-traceback flag, which disables C line numbers
>> in tracebacks. What about enabling it by default?
>
> I'm quite attached to that feature actually :), it would be pretty
> annoying to disable that flag every time. And what would disabling
> that option gain, as the current code still formats the filename and
> function name?

Ah, you mean it would cache fewer code objects for multiple possible
errors in expressions (or statements) on a single source line?
Re: [Cython] AddTraceback() slows down generators
2012/1/28 mark florisson:
> On 28 January 2012 19:41, Vitja Makarov wrote:
>> We already have the --no-c-in-traceback flag, which disables C line numbers
>> in tracebacks. What about enabling it by default?
>
> I'm quite attached to that feature actually :), it would be pretty
> annoying to disable that flag every time. And what would disabling
> that option gain, as the current code still formats the filename and
> function name?

It's rather useful for developers or for debugging. Most people don't
need it.

Here is a simple benchmark:

    # upstream/master: 6.38ms
    # upstream/master (no-c-in-traceback): 3.07ms
    # scoder/master: 1.31ms
    def foo():
        raise ValueError

    def testit():
        cdef int i
        for i in range(1):
            try:
                foo()
            except:
                pass

Stefan's branch wins, but:
 - there is only one item in the cache and it's always hit
 - we can still avoid calling PyString_FromString() by making the function
   name and source file name Python constants (I've tried it and I get
   2.28ms)

--
vitja.
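A comparable measurement can be set up in plain Python with `timeit`. This is a sketch under assumptions: the loop count of 1000 and the repeat settings are arbitrary choices, and the code exercises the interpreter's own exception handling, not the AddTraceback() path in Cython-generated C code, so the timings above cannot be reproduced with it.

```python
import timeit


def foo():
    raise ValueError


def testit(n=1000):
    # Raise and immediately catch an exception n times.
    for _ in range(n):
        try:
            foo()
        except ValueError:
            pass


# Best of five runs, reported per call of testit().
best = min(timeit.repeat(testit, number=10, repeat=5)) / 10
print("%.3f ms per testit() call" % (best * 1e3))
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the differences are in the millisecond range.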
Re: [Cython] AddTraceback() slows down generators
2012/1/28 mark florisson:
> On 28 January 2012 19:48, mark florisson wrote:
>> [...] And what would disabling
>> that option gain, as the current code still formats the filename and
>> function name?
>
> Ah, you mean it would cache fewer code objects for multiple possible
> errors in expressions (or statements) on a single source line?

Not exactly. I mean that PyString_FromFormat() is called to add the C
filename and C line number to the Python function name.

--
vitja.
Re: [Cython] AddTraceback() slows down generators
mark florisson, 28.01.2012 20:07:
> On 28 January 2012 18:38, Stefan Behnel wrote:
>> Here's a general take on a code object cache for exception propagation.
>>
>> https://github.com/scoder/cython/commit/ad18e0208
>>
>> [...]
>
> Nice. I have a question, couldn't you save the frame object instead of
> the code object?

Technically, yes. However, eventually, I'd like to make the CodeObject
constant for the whole function and let CPython calculate the Python code
source line based on the frame's "f_lasti" field when the line number is
actually accessed. For now, I wouldn't mind caching the whole frame until
the above optimisation gets implemented.

> I do think PyCodeObject is rather large, on my 64 bit machine it is
> 120 bytes, not accounting for any of the objects it holds (not saying
> that's a problem, just pointing it out).

Hmm, ok. That's not really cheap, especially given the amount of redundancy
in the content. Maybe we could intern the strings after creating them.

> Would it help if we would pass in the position information string
> object constant, to avoid the PyString_FromFormat? That optimization would
> only save 14% though.

PyString_FromFormat() is impressively expensive. So, yes, as Vitja figured
out, that would help. But I actually like the feature.

> But additionally, the function name could be a
> string object constant, which could be shared by all exceptions in one
> function, avoiding another PyString_FromString.

Yes, and in many cases, certainly for all Python functions, it's already
there anyway. I originally rejected that idea because more string constants
add to the module initialisation time (performance analysis pending here).
It may still be worth doing at least for Python functions.

Stefan
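The "f_lasti" idea relies on CPython being able to map a bytecode offset back to a source line on demand. A minimal sketch of that mapping in Python, assuming CPython (`sys._getframe` is implementation-specific) and using helper names that are made up for illustration:

```python
import bisect
import dis
import sys


def line_for_offset(code, offset):
    # (offset, line) pairs mark where each source line starts in the bytecode.
    starts = list(dis.findlinestarts(code))
    offsets = [off for off, _ in starts]
    # The last line start at or before the offset is the one in effect.
    pos = bisect.bisect_right(offsets, offset) - 1
    return starts[pos][1] if pos >= 0 else None


def where_am_i():
    frame = sys._getframe()
    return line_for_offset(frame.f_code, frame.f_lasti), frame.f_lineno


computed, reported = where_am_i()
print(computed, reported)
```

The point of the design is that this lookup only has to happen when something actually asks for the line number, so a single code object per function could serve every exception raised in it.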
Re: [Cython] AddTraceback() slows down generators
Vitja Makarov, 28.01.2012 20:58:
> 2012/1/28 mark florisson:
>> On 28 January 2012 19:41, Vitja Makarov wrote:
>>> We already have the --no-c-in-traceback flag, which disables C line
>>> numbers in tracebacks. What about enabling it by default?
>>
>> I'm quite attached to that feature actually :), it would be pretty
>> annoying to disable that flag every time. And what would disabling
>> that option gain, as the current code still formats the filename and
>> function name?
>
> It's rather useful for developers or debugging. Most of the people
> don't need it.

Not untrue. However, at least a majority of developers should be able to
make use of it when it's there, and code is several times more often built
for testing and debugging than for production. So I consider it a virtue
that it's on by default.

> Here is simple benchmark:
> # upstream/master: 6.38ms
> # upstream/master (no-c-in-traceback): 3.07ms
> # scoder/master: 1.31ms
> def foo():
>     raise ValueError
>
> def testit():
>     cdef int i
>     for i in range(1):
>         try:
>             foo()
>         except:
>             pass
>
> Stefan's branch wins but:
> - there is only one item in the cache and it's always hit

Even if there were substantially more, binary search is so fast you'd
hardly notice the difference.

(BTW, I just noticed that my binary search implementation is buggy - not a
complete surprise. I'll add some tests for it.)

> - we can still avoid calling PyString_FromString() making function
> name and source file name a python const (I've tried it and I get
> 2.28ms)

I wouldn't mind, but it would be nice to get lazy initialisation for them.

Stefan
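Hand-written binary searches are easy to get subtly wrong, so checking one exhaustively on small inputs is cheap insurance. The following is a generic sketch in Python, not the implementation from the branch under discussion: it pairs a half-open binary search with a brute-force comparison against the standard library's `bisect.bisect_left`.

```python
import bisect


def bisect_left_manual(keys, key):
    # Half-open binary search: returns the first index whose element
    # is >= key, i.e. the insertion point that keeps keys sorted.
    lo, hi = 0, len(keys)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo


def check_against_stdlib(max_len=8):
    # Compare with bisect.bisect_left on small sorted arrays, probing
    # below, between, on and above the stored keys.
    for n in range(max_len + 1):
        keys = [10 * i for i in range(n)]
        for key in range(-5, 10 * n + 5):
            if bisect_left_manual(keys, key) != bisect.bisect_left(keys, key):
                return False
    return True


print(check_against_stdlib())
```

Covering the empty array and the keys just outside both ends is deliberate, since those are the cases where off-by-one errors usually hide.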
Re: [Cython] AddTraceback() slows down generators
On 28 January 2012 19:59, Vitja Makarov wrote:
> 2012/1/28 mark florisson:
>> Ah, you mean it would cache fewer code objects for multiple possible
>> errors in expressions (or statements) on a single source line?
>
> Not exactly. I mean that PyString_FromFormat() is called to add the C
> filename and C line number to the Python function name.

Ah indeed, the source lineno is only added to the code object of course.
Re: [Cython] AddTraceback() slows down generators
2012/1/29 Stefan Behnel:
> Vitja Makarov, 28.01.2012 20:58:
>> It's rather useful for developers or debugging. Most of the people
>> don't need it.
>
> Not untrue. However, at least a majority of developers should be able to
> make use of it when it's there, and code is several times more often built
> for testing and debugging than for production. So I consider it a virtue
> that it's on by default.
>
>> Here is simple benchmark:
>> [...]
>> Stefan's branch wins but:
>> - there is only one item in the cache and it's always hit
>
> Even if there were substantially more, binary search is so fast you'd
> hardly notice the difference.

Yes, I'm a little bit worried about insertions. But anyway I like it.
With --no-c-in-traceback, the Python lineno should be used as the key.

> (BTW, I just noticed that my binary search implementation is buggy - not a
> complete surprise. I'll add some tests for it.)
>
>> - we can still avoid calling PyString_FromString() making function
>> name and source file name a python const (I've tried it and I get
>> 2.28ms)
>
> I wouldn't mind, but it would be nice to get lazy initialisation for them.

--
vitja.
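One reading of the key suggestion is sketched below. Only the flag name comes from the thread: the function, its parameters and the sample line numbers are hypothetical. With C line information enabled, each C-level origination point gets its own cache entry. Without it, all origination points generated from one source line collapse into a single entry.

```python
def cache_key(c_line, py_line, c_line_in_traceback=True):
    # Key on the C line when it is part of the traceback text,
    # otherwise on the Python source line.
    return c_line if c_line_in_traceback else py_line


# Hypothetical (C line, source line) pairs: two C-level failure points
# generated from source line 12, one from source line 30.
origins = [(1520, 12), (1534, 12), (2210, 30)]

with_c_lines = {cache_key(c, py, True) for c, py in origins}
without_c_lines = {cache_key(c, py, False) for c, py in origins}
print(len(with_c_lines), len(without_c_lines))
```

Under that reading, disabling C line numbers would also shrink the cache, which fits the earlier question about caching fewer code objects per source line.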