Re: [Cython] Bug in Cython producing incorrect C code
On 26 January 2012 06:39, Vitja Makarov wrote: > 2012/1/25 Stefan Behnel : >> mark florisson, 24.01.2012 14:53: >>> On 24 January 2012 11:37, Konrad Hinsen wrote: Compiling the attached Cython file produced the attached C file which has errors in lines 532-534: __pyx_v_self->xx = None; __pyx_v_self->yy = None; __pyx_v_self->zz = None; There is no C symbol "None", so this doesn't compile. I first noticed the bug in Cython 0.15, but it's still in the latest revision from Github. >>> >>> Hm, it seems the problem is that the call to the builtin float results >>> in SimpleCallNode being replaced with PythonCApiNode, which then >>> generates the result code, but the list of coerced nodes are >>> CloneNodes of the original rhs, and CloneNode does not generate the >>> result code of the original rhs (i.e. allocate and assign to a temp), >>> which results in a None result. >> >> Back to the old idea of separating the type analysis into 1) a basic >> typing, inference and entry creation step and 2) a proper type analysis, >> coercion, etc. step. >> > > Yeah! I think the issue must be fixed before release. We can start > moving slowly in this direction and split > CascadedAssignmentNode.analyse_types into parts: > - normal analyse_types()/expressions() > - create clone nodes at some late stage At what stage would the stage 2) proper type analysis take place? Basically nodes may be replaced at any point, and I'm not sure you want to wait until just before code generation to do the coercions (e.g. GILCheck won't catch coercions to object, although assignment nodes seem to check manually). I think this problem can trivially be solved by creating a ProxyNode that should never be replaced by any transform, but it's argument may be replaced. So you wrap self.rhs in a ProxyNode and use that to create your CloneNodes. >> The type driven optimisations would then run in between the two. That would >> simplify the optimisations (which would no longer have to unpack wrapped >> nodes) and improve the type analysis because it could work with the >> optimised types, e.g. return types of optimised builtin functions. >> >> I'm not entirely sure where the type inference should run. It may make more >> sense to move it after the tree optimisations to make use of optimised >> function calls. >> >> While we're at it, we should also replace the current type inference >> mechanism with a control flow based one. >> >> Sounds like a good topic for a Cython hacking workshop. >> > > Nice. Any news on that? > > > -- > vitja. > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
Just wanted to bring this issue to your guys' attention in case you knew what was responsible for this: https://github.com/ipython/ipython/issues/1317#issuecomment-3652550 I traced down the problem (with git bisect) to a seemingly innocuous commit referenced in that GitHub thread. The issue seemed to only present itself in IPython, so likely there was some problem with inspecting the Cython frames for giving context around the full traceback. best, Wes ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On 26 January 2012 17:56, Wes McKinney wrote: > Just wanted to bring this issue to your guys' attention in case you > knew what was responsible for this: > > https://github.com/ipython/ipython/issues/1317#issuecomment-3652550 > > I traced down the problem (with git bisect) to a seemingly innocuous > commit referenced in that GitHub thread. The issue seemed to only > present itself in IPython, so likely there was some problem with > inspecting the Cython frames for giving context around the full > traceback. > > best, > Wes > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel So commit 0e579823bd34de5d1d9b4aeac2c8d727415cba2d fixed the problem? I don't see how it could do that. Anyway, when I try to run the code from the example I don't get any traceback at all. Could you perhaps paste the exact code that produces the behaviour? ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On 26 January 2012 18:36, mark florisson wrote: > On 26 January 2012 17:56, Wes McKinney wrote: >> Just wanted to bring this issue to your guys' attention in case you >> knew what was responsible for this: >> >> https://github.com/ipython/ipython/issues/1317#issuecomment-3652550 >> >> I traced down the problem (with git bisect) to a seemingly innocuous >> commit referenced in that GitHub thread. The issue seemed to only >> present itself in IPython, so likely there was some problem with >> inspecting the Cython frames for giving context around the full >> traceback. >> >> best, >> Wes >> ___ >> cython-devel mailing list >> cython-devel@python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > So commit 0e579823bd34de5d1d9b4aeac2c8d727415cba2d fixed the problem? > I don't see how it could do that. Anyway, when I try to run the code > from the example I don't get any traceback at all. Could you perhaps > paste the exact code that produces the behaviour? On a side note, ipython is not something I usually trust to test things out, as it gets things wrong sometimes which can seriously make you question your own sanity. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
2012/1/26 mark florisson : > On 26 January 2012 06:39, Vitja Makarov wrote: >> 2012/1/25 Stefan Behnel : >>> mark florisson, 24.01.2012 14:53: On 24 January 2012 11:37, Konrad Hinsen wrote: > Compiling the attached Cython file produced the attached C file which > has errors in lines 532-534: > > __pyx_v_self->xx = None; > __pyx_v_self->yy = None; > __pyx_v_self->zz = None; > > There is no C symbol "None", so this doesn't compile. > > I first noticed the bug in Cython 0.15, but it's still in the latest > revision from Github. Hm, it seems the problem is that the call to the builtin float results in SimpleCallNode being replaced with PythonCApiNode, which then generates the result code, but the list of coerced nodes are CloneNodes of the original rhs, and CloneNode does not generate the result code of the original rhs (i.e. allocate and assign to a temp), which results in a None result. >>> >>> Back to the old idea of separating the type analysis into 1) a basic >>> typing, inference and entry creation step and 2) a proper type analysis, >>> coercion, etc. step. >>> >> >> Yeah! I think the issue must be fixed before release. We can start >> moving slowly in this direction and split >> CascadedAssignmentNode.analyse_types into parts: >> - normal analyse_types()/expressions() >> - create clone nodes at some late stage > > At what stage would the stage 2) proper type analysis take place? > Basically nodes may be replaced at any point, and I'm not sure you > want to wait until just before code generation to do the coercions > (e.g. GILCheck won't catch coercions to object, although assignment > nodes seem to check manually). > That must be run before GilCheck. Stage 2 is "I believe the tree won't change much later" > I think this problem can trivially be solved by creating a ProxyNode > that should never be replaced by any transform, but it's argument may > be replaced. So you wrap self.rhs in a ProxyNode and use that to > create your CloneNodes. > Do you mean proxy node to be something like this: class ProxyNode(object): def __init__(self, obj): object.__setattr__(self, '_obj', obj) def __getattr__(self, key): return getattr(self._obj, key) def __setattr__(self, key, value): setattr(self._obj, key, value) That might help but I'm not sure how evil that is. It also will require TreeVisitor.find_handler() modification. Node replacement could be avoided by introducing ProxyProperty() I see another problem with proxies: CloneNode or its owner may depend on content of the argument so when it's changed things can be messed up. -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
mark florisson, 26.01.2012 16:20: > On 26 January 2012 06:39, Vitja Makarov wrote: >> 2012/1/25 Stefan Behnel: >>> Back to the old idea of separating the type analysis into 1) a basic >>> typing, inference and entry creation step and 2) a proper type analysis, >>> coercion, etc. step. >>> >> >> Yeah! I think the issue must be fixed before release. We can start >> moving slowly in this direction and split >> CascadedAssignmentNode.analyse_types into parts: >> - normal analyse_types()/expressions() >> - create clone nodes at some late stage > > At what stage would the stage 2) proper type analysis take place? After the structural optimisation phase and before any optimisations or other transforms that require complete type information but do not change types anymore. I don't see it being moved to the end of the pipeline, the results will be needed way before that. Even some optimisations may not be possible without complete type analysis. > Basically nodes may be replaced at any point, and I'm not sure you > want to wait until just before code generation to do the coercions > (e.g. GILCheck won't catch coercions to object, although assignment > nodes seem to check manually). There's a large grey area in between. It'll need some refactoring and rebalancing, just as before. But it should be easier than it is now because the grey area will have more anchors in it. > I think this problem can trivially be solved by creating a ProxyNode > that should never be replaced by any transform, but it's argument may > be replaced. So you wrap self.rhs in a ProxyNode and use that to > create your CloneNodes. I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On Thu, Jan 26, 2012 at 10:37 AM, mark florisson wrote: > On a side note, ipython is not something I usually trust to test > things out, as it gets things wrong sometimes which can seriously make > you question your own sanity. I should note that we'd love to know specifics about problems that severe. Without a concrete report it's impossible for us to fix a problem, unfortunately, and a vague statement like this only serves to spread the notion "ipython is bad, for reasons unspecified". Also, since I know some core Cython devs are also heavy sage users, and the sage command-line is actually ipython, having an issue at the intersection of cython/ipython is problematic for all sage users working at the terminal. So we'd really like to make the cython/ipython combo as robust as possible. I certainly don't expect everyone to use ipython, and obviously when I see something that looks really bizarre, I double-check with plain Python first, as ultimately that is our reference and regressions from plain python are automatically bugs for us. But I have to say that it's been a very long time since I have encountered a situation where ipython produces something completely absurd, and if it happens to you, by all means please let us know, so we can try to fix it. Best, f ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
On 26 January 2012 18:51, Vitja Makarov wrote: > 2012/1/26 mark florisson : >> On 26 January 2012 06:39, Vitja Makarov wrote: >>> 2012/1/25 Stefan Behnel : mark florisson, 24.01.2012 14:53: > On 24 January 2012 11:37, Konrad Hinsen wrote: >> Compiling the attached Cython file produced the attached C file which >> has errors in lines 532-534: >> >> __pyx_v_self->xx = None; >> __pyx_v_self->yy = None; >> __pyx_v_self->zz = None; >> >> There is no C symbol "None", so this doesn't compile. >> >> I first noticed the bug in Cython 0.15, but it's still in the latest >> revision from Github. > > Hm, it seems the problem is that the call to the builtin float results > in SimpleCallNode being replaced with PythonCApiNode, which then > generates the result code, but the list of coerced nodes are > CloneNodes of the original rhs, and CloneNode does not generate the > result code of the original rhs (i.e. allocate and assign to a temp), > which results in a None result. Back to the old idea of separating the type analysis into 1) a basic typing, inference and entry creation step and 2) a proper type analysis, coercion, etc. step. >>> >>> Yeah! I think the issue must be fixed before release. We can start >>> moving slowly in this direction and split >>> CascadedAssignmentNode.analyse_types into parts: >>> - normal analyse_types()/expressions() >>> - create clone nodes at some late stage >> >> At what stage would the stage 2) proper type analysis take place? >> Basically nodes may be replaced at any point, and I'm not sure you >> want to wait until just before code generation to do the coercions >> (e.g. GILCheck won't catch coercions to object, although assignment >> nodes seem to check manually). >> > > That must be run before GilCheck. Stage 2 is "I believe the tree won't > change much later" > > >> I think this problem can trivially be solved by creating a ProxyNode >> that should never be replaced by any transform, but it's argument may >> be replaced. So you wrap self.rhs in a ProxyNode and use that to >> create your CloneNodes. >> > > Do you mean proxy node to be something like this: > > class ProxyNode(object): > def __init__(self, obj): > object.__setattr__(self, '_obj', obj) > > def __getattr__(self, key): > return getattr(self._obj, key) > > def __setattr__(self, key, value): > setattr(self._obj, key, value) > > That might help but I'm not sure how evil that is. > > It also will require TreeVisitor.find_handler() modification. Node > replacement could be avoided by introducing ProxyProperty() > > I see another problem with proxies: CloneNode or its owner may depend > on content of the argument so when it's changed things can be messed > up. Not quite like that. It should be a regular node that is ignored by transforms but delegates a few methods. e.g. class ProxyNode(object): child_attrs = ['arg'] def __init__(self, arg): self.arg = arg def result(self): return self.arg.result() def generate_result_code(self, code): return self.arg.generate_result_code(code) and that's pretty much it (untested). And CloneNode doesn't depend on any specific node, nor should any code wrapping its nodes in ProxyNodes (obviously). The nodes sole purpose is to create an indirection, so that when previously self.rhs got replaced, now self.rhs.arg will be replaced, so self.rhs can be safely shared with other CloneNodes. > -- > vitja. > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
On 26 January 2012 18:53, Stefan Behnel wrote: > mark florisson, 26.01.2012 16:20: >> On 26 January 2012 06:39, Vitja Makarov wrote: >>> 2012/1/25 Stefan Behnel: Back to the old idea of separating the type analysis into 1) a basic typing, inference and entry creation step and 2) a proper type analysis, coercion, etc. step. >>> >>> Yeah! I think the issue must be fixed before release. We can start >>> moving slowly in this direction and split >>> CascadedAssignmentNode.analyse_types into parts: >>> - normal analyse_types()/expressions() >>> - create clone nodes at some late stage >> >> At what stage would the stage 2) proper type analysis take place? > > After the structural optimisation phase and before any optimisations or > other transforms that require complete type information but do not change > types anymore. I don't see it being moved to the end of the pipeline, the > results will be needed way before that. Even some optimisations may not be > possible without complete type analysis. > > >> Basically nodes may be replaced at any point, and I'm not sure you >> want to wait until just before code generation to do the coercions >> (e.g. GILCheck won't catch coercions to object, although assignment >> nodes seem to check manually). > > There's a large grey area in between. It'll need some refactoring and > rebalancing, just as before. But it should be easier than it is now because > the grey area will have more anchors in it. > That's nice. It wouldn't solve the problem at hand though, and I think nothing can unless it will be at the very end of the pipeline. >> I think this problem can trivially be solved by creating a ProxyNode >> that should never be replaced by any transform, but it's argument may >> be replaced. So you wrap self.rhs in a ProxyNode and use that to >> create your CloneNodes. > > I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway. It wouldn't be a replacement, merely an addition (an extra indirection). > Stefan > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On 26 January 2012 19:10, Fernando Perez wrote: > On Thu, Jan 26, 2012 at 10:37 AM, mark florisson > wrote: >> On a side note, ipython is not something I usually trust to test >> things out, as it gets things wrong sometimes which can seriously make >> you question your own sanity. > > I should note that we'd love to know specifics about problems that > severe. Without a concrete report it's impossible for us to fix a > problem, unfortunately, and a vague statement like this only serves to > spread the notion "ipython is bad, for reasons unspecified". Apologies, it was indeed a rather vague comment. I had some issues with pasting unicode characters a few years back that would get incorrect codepoints in ipython but not regular python, but I'm afraid I don't have any concrete report. As such I withdraw my comment, I just wanted to mention that if something smells iffy it might be a good idea to resort to regular python. > Also, since I know some core Cython devs are also heavy sage users, > and the sage command-line is actually ipython, having an issue at the > intersection of cython/ipython is problematic for all sage users > working at the terminal. So we'd really like to make the > cython/ipython combo as robust as possible. > > I certainly don't expect everyone to use ipython, and obviously when I > see something that looks really bizarre, I double-check with plain > Python first, as ultimately that is our reference and regressions from > plain python are automatically bugs for us. > > But I have to say that it's been a very long time since I have > encountered a situation where ipython produces something completely > absurd, and if it happens to you, by all means please let us know, so > we can try to fix it. > > Best, > > f > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
mark florisson, 26.01.2012 20:15: > On 26 January 2012 18:53, Stefan Behnel wrote: >> mark florisson, 26.01.2012 16:20: >>> I think this problem can trivially be solved by creating a ProxyNode >>> that should never be replaced by any transform, but it's argument may >>> be replaced. So you wrap self.rhs in a ProxyNode and use that to >>> create your CloneNodes. >> >> I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway. > > It wouldn't be a replacement, merely an addition (an extra indirection). What I was trying to say was that a ProxyNode would always be required by a CloneNode, but I don't see where a ProxyNode would be needed outside of a CloneNode. So it seems rather redundant and I don't know if we need a separate node for it. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Bug in Cython producing incorrect C code
On 26 January 2012 19:27, Stefan Behnel wrote: > mark florisson, 26.01.2012 20:15: >> On 26 January 2012 18:53, Stefan Behnel wrote: >>> mark florisson, 26.01.2012 16:20: I think this problem can trivially be solved by creating a ProxyNode that should never be replaced by any transform, but it's argument may be replaced. So you wrap self.rhs in a ProxyNode and use that to create your CloneNodes. >>> >>> I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway. >> >> It wouldn't be a replacement, merely an addition (an extra indirection). > > What I was trying to say was that a ProxyNode would always be required by a > CloneNode, but I don't see where a ProxyNode would be needed outside of a > CloneNode. So it seems rather redundant and I don't know if we need a > separate node for it. Yes it would be needed only for that, but I think the only real alternative is to not use CloneNode at all, i.e. make the transformation Dag mentioned, where you create new rhs (NameNode?) references to the temporary result. > Stefan > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On Thu, Jan 26, 2012 at 2:21 PM, mark florisson wrote: > On 26 January 2012 19:10, Fernando Perez wrote: >> On Thu, Jan 26, 2012 at 10:37 AM, mark florisson >> wrote: >>> On a side note, ipython is not something I usually trust to test >>> things out, as it gets things wrong sometimes which can seriously make >>> you question your own sanity. >> >> I should note that we'd love to know specifics about problems that >> severe. Without a concrete report it's impossible for us to fix a >> problem, unfortunately, and a vague statement like this only serves to >> spread the notion "ipython is bad, for reasons unspecified". > > Apologies, it was indeed a rather vague comment. I had some issues > with pasting unicode characters a few years back that would get > incorrect codepoints in ipython but not regular python, but I'm afraid > I don't have any concrete report. As such I withdraw my comment, I > just wanted to mention that if something smells iffy it might be a > good idea to resort to regular python. > >> Also, since I know some core Cython devs are also heavy sage users, >> and the sage command-line is actually ipython, having an issue at the >> intersection of cython/ipython is problematic for all sage users >> working at the terminal. So we'd really like to make the >> cython/ipython combo as robust as possible. >> >> I certainly don't expect everyone to use ipython, and obviously when I >> see something that looks really bizarre, I double-check with plain >> Python first, as ultimately that is our reference and regressions from >> plain python are automatically bugs for us. >> >> But I have to say that it's been a very long time since I have >> encountered a situation where ipython produces something completely >> absurd, and if it happens to you, by all means please let us know, so >> we can try to fix it. >> >> Best, >> >> f >> ___ >> cython-devel mailing list >> cython-devel@python.org >> http://mail.python.org/mailman/listinfo/cython-devel > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel To reproduce the problem you'll want to use the specific pandas SHA (99e2eec) that I referenced there. The pandas bug in the issue has been fixed since then. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] AddTraceback() slows down generators
Robert Bradshaw, 21.01.2012 23:09: > On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote: >> I did some callgrind profiling on Cython's generators and was surprised to >> find that AddTraceback() represents a serious performance penalty for short >> running generators. >> >> I profiled a compiled Python implementation of itertools.groupby(), which >> yields (key, group) tuples where the group is an iterator again. I ran this >> code in Python for benchmarking: >> >> """ >> L = sorted(range(1000)*5) >> >> all(list(g) for k,g in groupby(L)) >> """ >> >> Groups tend to be rather short in real code, often just one or a couple of >> items, so unpacking the group iterator into a list will usually be a quick >> loop and then the generator raises StopIteration on termination and builds >> a traceback for it. According to callgrind (which, I should note, tends to >> overestimate the amount of time spent in memory allocation), the iteration >> during the group unpacking takes about 30% of the overall runtime of the >> all() loop, and the AddTraceback() call at the end of each group traversal >> takes up to 25% (!) on my side. That means that more than 80% of the group >> unpacking time goes into raising StopIteration from the generators. I >> attached the call graph with the relative timings. >> >> About half of the exception raising time is eaten by PyString_FromFormat() >> that builds the function-name + line-position string (which, I may note, is >> basically a convenience feature). This string is a constant for a >> generator's StopIteration exception, at least for each final return point >> in a generator, but here it is being recreated over and over again, for >> each exception that gets raised. >> >> Even if we keep creating a new frame instance each time (which should be ok >> because CPython has a frame instance cache already and we'd only create one >> during the generator lifetime), the whole code object could actually be >> cached after the first creation, preferably bound to the lifetime of the >> generator creator function/method. Or, more generally, one code object per >> generator termination point, which will be a single point in the majority >> of cases. For the specific code above, that should shave off almost 20% of >> the overall runtime of the all() loop. >> >> I think that's totally worth doing. > > Makes sense to me. I did some caching like this for profiling. Here's a ticket for now: http://trac.cython.org/cython_trac/ticket/760 Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On 26 January 2012 19:40, Wes McKinney wrote: > On Thu, Jan 26, 2012 at 2:21 PM, mark florisson > wrote: >> On 26 January 2012 19:10, Fernando Perez wrote: >>> On Thu, Jan 26, 2012 at 10:37 AM, mark florisson >>> wrote: On a side note, ipython is not something I usually trust to test things out, as it gets things wrong sometimes which can seriously make you question your own sanity. >>> >>> I should note that we'd love to know specifics about problems that >>> severe. Without a concrete report it's impossible for us to fix a >>> problem, unfortunately, and a vague statement like this only serves to >>> spread the notion "ipython is bad, for reasons unspecified". >> >> Apologies, it was indeed a rather vague comment. I had some issues >> with pasting unicode characters a few years back that would get >> incorrect codepoints in ipython but not regular python, but I'm afraid >> I don't have any concrete report. As such I withdraw my comment, I >> just wanted to mention that if something smells iffy it might be a >> good idea to resort to regular python. >> >>> Also, since I know some core Cython devs are also heavy sage users, >>> and the sage command-line is actually ipython, having an issue at the >>> intersection of cython/ipython is problematic for all sage users >>> working at the terminal. So we'd really like to make the >>> cython/ipython combo as robust as possible. >>> >>> I certainly don't expect everyone to use ipython, and obviously when I >>> see something that looks really bizarre, I double-check with plain >>> Python first, as ultimately that is our reference and regressions from >>> plain python are automatically bugs for us. >>> >>> But I have to say that it's been a very long time since I have >>> encountered a situation where ipython produces something completely >>> absurd, and if it happens to you, by all means please let us know, so >>> we can try to fix it. >>> >>> Best, >>> >>> f >>> ___ >>> cython-devel mailing list >>> cython-devel@python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> ___ >> cython-devel mailing list >> cython-devel@python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > To reproduce the problem you'll want to use the specific pandas SHA > (99e2eec) that I referenced there. The pandas bug in the issue has > been fixed since then. > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel I get the same as minrk got: TypeError: unhashable type 'dict' . Anyway, if it's working now (and if other tracebacks are fast (enough) as they should be), I don't think we have anything to worry about. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] AddTraceback() slows down generators
2012/1/27 Stefan Behnel : > Robert Bradshaw, 21.01.2012 23:09: >> On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote: >>> I did some callgrind profiling on Cython's generators and was surprised to >>> find that AddTraceback() represents a serious performance penalty for short >>> running generators. >>> >>> I profiled a compiled Python implementation of itertools.groupby(), which >>> yields (key, group) tuples where the group is an iterator again. I ran this >>> code in Python for benchmarking: >>> >>> """ >>> L = sorted(range(1000)*5) >>> >>> all(list(g) for k,g in groupby(L)) >>> """ >>> >>> Groups tend to be rather short in real code, often just one or a couple of >>> items, so unpacking the group iterator into a list will usually be a quick >>> loop and then the generator raises StopIteration on termination and builds >>> a traceback for it. According to callgrind (which, I should note, tends to >>> overestimate the amount of time spent in memory allocation), the iteration >>> during the group unpacking takes about 30% of the overall runtime of the >>> all() loop, and the AddTraceback() call at the end of each group traversal >>> takes up to 25% (!) on my side. That means that more than 80% of the group >>> unpacking time goes into raising StopIteration from the generators. I >>> attached the call graph with the relative timings. >>> >>> About half of the exception raising time is eaten by PyString_FromFormat() >>> that builds the function-name + line-position string (which, I may note, is >>> basically a convenience feature). This string is a constant for a >>> generator's StopIteration exception, at least for each final return point >>> in a generator, but here it is being recreated over and over again, for >>> each exception that gets raised. >>> >>> Even if we keep creating a new frame instance each time (which should be ok >>> because CPython has a frame instance cache already and we'd only create one >>> during the generator lifetime), the whole code object could actually be >>> cached after the first creation, preferably bound to the lifetime of the >>> generator creator function/method. Or, more generally, one code object per >>> generator termination point, which will be a single point in the majority >>> of cases. For the specific code above, that should shave off almost 20% of >>> the overall runtime of the all() loop. >>> >>> I think that's totally worth doing. >> >> Makes sense to me. I did some caching like this for profiling. > > Here's a ticket for now: > > http://trac.cython.org/cython_trac/ticket/760 > I think that could be easily fixed. CPython doesn't add any traceback info for generator's ending. https://github.com/vitek/cython/commit/63620bc2a29f3064bbdf7a49eefffaae4e3c369d -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
On Thu, Jan 26, 2012 at 11:21 AM, mark florisson wrote: > Apologies, it was indeed a rather vague comment. I had some issues No worries. > with pasting unicode characters a few years back that would get > incorrect codepoints in ipython but not regular python, but I'm afraid We had a long and distinguished history of unicode bugs (we even had a specific tag for them, as they were so severe). But the vast majority of them have been fixed, the unicode-filtered list only shows three open ones: https://github.com/ipython/ipython/issues?labels=unicode&sort=created&direction=desc&state=open&page=1 So it's possible your problem has indeed been fixed, since we know that at least the worst horrors are gone now, largely thanks to Thomas Kluyver's insane persistence. > I don't have any concrete report. As such I withdraw my comment, I > just wanted to mention that if something smells iffy it might be a > good idea to resort to regular python. That is most certainly a good policy, and I use it myself. IPython is necessarily a big and complex beast, so if something looks really odd, the first sanity check is to remove that layer from the problem. Cheers, f ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] AddTraceback() slows down generators
Vitja Makarov, 26.01.2012 21:19: > 2012/1/27 Stefan Behnel: >> Robert Bradshaw, 21.01.2012 23:09: >>> On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote: I did some callgrind profiling on Cython's generators and was surprised to find that AddTraceback() represents a serious performance penalty for short running generators. I profiled a compiled Python implementation of itertools.groupby(), which yields (key, group) tuples where the group is an iterator again. I ran this code in Python for benchmarking: """ L = sorted(range(1000)*5) all(list(g) for k,g in groupby(L)) """ Groups tend to be rather short in real code, often just one or a couple of items, so unpacking the group iterator into a list will usually be a quick loop and then the generator raises StopIteration on termination and builds a traceback for it. According to callgrind (which, I should note, tends to overestimate the amount of time spent in memory allocation), the iteration during the group unpacking takes about 30% of the overall runtime of the all() loop, and the AddTraceback() call at the end of each group traversal takes up to 25% (!) on my side. That means that more than 80% of the group unpacking time goes into raising StopIteration from the generators. I attached the call graph with the relative timings. About half of the exception raising time is eaten by PyString_FromFormat() that builds the function-name + line-position string (which, I may note, is basically a convenience feature). This string is a constant for a generator's StopIteration exception, at least for each final return point in a generator, but here it is being recreated over and over again, for each exception that gets raised. Even if we keep creating a new frame instance each time (which should be ok because CPython has a frame instance cache already and we'd only create one during the generator lifetime), the whole code object could actually be cached after the first creation, preferably bound to the lifetime of the generator creator function/method. Or, more generally, one code object per generator termination point, which will be a single point in the majority of cases. For the specific code above, that should shave off almost 20% of the overall runtime of the all() loop. > > I think that could be easily fixed. CPython doesn't add any traceback > info for generator's ending. > > https://github.com/vitek/cython/commit/63620bc2a29f3064bbdf7a49eefffaae4e3c369d Interesting. It doesn't solve the general problem of slow exceptions, but I think that's a beautiful way of fixing this particular performance problem for generators. Here are the timings. Your change made the above code twice as fast! Cython compiled Python implementation of itertools.groupby(), *without* your change: python2.7 -m timeit -s 'from cytertools import groupby; \ L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))' 100 loops, best of 3: 3.76 msec per loop Cython compiled Python implementation of itertools.groupby(), *with* your change: python2.7 -m timeit -s 'from cytertools import groupby; \ L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))' 1000 loops, best of 3: 1.81 msec per loop The real itertools, for comparison: python2.7 -m timeit -s 'from itertools import groupby; \ L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))' 1000 loops, best of 3: 1.31 msec per loop That's close. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master
Wes McKinney, 26.01.2012 18:56: > Just wanted to bring this issue to your guys' attention in case you > knew what was responsible for this: > > https://github.com/ipython/ipython/issues/1317#issuecomment-3652550 > > I traced down the problem (with git bisect) to a seemingly innocuous > commit referenced in that GitHub thread. The issue seemed to only > present itself in IPython, so likely there was some problem with > inspecting the Cython frames for giving context around the full > traceback. That's not impossible. Traceback frames behave differently in Cython for two reasons: a) because they are only being constructed after the fact in an error handling case, not for normal functions, and b) because Cython doesn't have real code objects for inspection and fakes them in a rather ad-hoc way. For example, there is currently no function signature information in them, and line number computation doesn't use the normal CPython way that matches the byte code position with a compressed line table. Instead, Cython creates a new code object for a given line on the fly and just pretends that the function starts there. This usually works well enough for a traceback, but this kind of differences makes it quite possible that IPython makes assumptions about inspection here that Cython doesn't meet. In another thread ("AddTraceback() slows down generators"), I was proposing to cache the code object for a given function. That would then imply providing a (fake) byte code position map as well and would make it easier to provide static signature information, thus improving the overall compatibility with the code objects that CPython creates. Note that it's unclear without further investigation if the problem you ran into really has to do with these internal details. I'm just raising a possible explanation here. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel