Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread mark florisson
On 26 January 2012 06:39, Vitja Makarov  wrote:
> 2012/1/25 Stefan Behnel :
>> mark florisson, 24.01.2012 14:53:
>>> On 24 January 2012 11:37, Konrad Hinsen wrote:
 Compiling the attached Cython file produced the attached C file which
 has errors in lines 532-534:

  __pyx_v_self->xx = None;
  __pyx_v_self->yy = None;
  __pyx_v_self->zz = None;

 There is no C symbol "None", so this doesn't compile.

 I first noticed the bug in Cython 0.15, but it's still in the latest
 revision from Github.
>>>
>>> Hm, it seems the problem is that the call to the builtin float results
>>> in SimpleCallNode being replaced with PythonCApiNode, which then
>>> generates the result code, but the list of coerced nodes are
>>> CloneNodes of the original rhs, and CloneNode does not generate the
>>> result code of the original rhs (i.e. allocate and assign to a temp),
>>> which results in a None result.
>>
>> Back to the old idea of separating the type analysis into 1) a basic
>> typing, inference and entry creation step and 2) a proper type analysis,
>> coercion, etc. step.
>>
>
> Yeah! I think the issue must be fixed before release. We can start
> moving slowly in this direction and split
> CascadedAssignmentNode.analyse_types into parts:
>  - normal analyse_types()/expressions()
>  - create clone nodes at some late stage

At what stage would the stage 2) proper type analysis take place?
Basically nodes may be replaced at any point, and I'm not sure you
want to wait until just before code generation to do the coercions
(e.g.  GILCheck won't catch coercions to object, although assignment
nodes seem to check manually).

I think this problem can trivially be solved by creating a ProxyNode
that should never be replaced by any transform, but it's argument may
be replaced. So you wrap self.rhs in a ProxyNode and use that to
create your CloneNodes.

>> The type driven optimisations would then run in between the two. That would
>> simplify the optimisations (which would no longer have to unpack wrapped
>> nodes) and improve the type analysis because it could work with the
>> optimised types, e.g. return types of optimised builtin functions.
>>
>> I'm not entirely sure where the type inference should run. It may make more
>> sense to move it after the tree optimisations to make use of optimised
>> function calls.
>>
>> While we're at it, we should also replace the current type inference
>> mechanism with a control flow based one.
>>
>> Sounds like a good topic for a Cython hacking workshop.
>>
>
> Nice. Any news on that?
>
>
> --
> vitja.
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


[Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread Wes McKinney
Just wanted to bring this issue to your guys' attention in case you
knew what was responsible for this:

https://github.com/ipython/ipython/issues/1317#issuecomment-3652550

I traced down the problem (with git bisect) to a seemingly innocuous
commit referenced in that GitHub thread. The issue seemed to only
present itself in IPython, so likely there was some problem with
inspecting the Cython frames for giving context around the full
traceback.

best,
Wes
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread mark florisson
On 26 January 2012 17:56, Wes McKinney  wrote:
> Just wanted to bring this issue to your guys' attention in case you
> knew what was responsible for this:
>
> https://github.com/ipython/ipython/issues/1317#issuecomment-3652550
>
> I traced down the problem (with git bisect) to a seemingly innocuous
> commit referenced in that GitHub thread. The issue seemed to only
> present itself in IPython, so likely there was some problem with
> inspecting the Cython frames for giving context around the full
> traceback.
>
> best,
> Wes
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel

So commit 0e579823bd34de5d1d9b4aeac2c8d727415cba2d fixed the problem?
I don't see how it could do that. Anyway, when I try to run the code
from the example I don't get any traceback at all. Could you perhaps
paste the exact code that produces the behaviour?
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread mark florisson
On 26 January 2012 18:36, mark florisson  wrote:
> On 26 January 2012 17:56, Wes McKinney  wrote:
>> Just wanted to bring this issue to your guys' attention in case you
>> knew what was responsible for this:
>>
>> https://github.com/ipython/ipython/issues/1317#issuecomment-3652550
>>
>> I traced down the problem (with git bisect) to a seemingly innocuous
>> commit referenced in that GitHub thread. The issue seemed to only
>> present itself in IPython, so likely there was some problem with
>> inspecting the Cython frames for giving context around the full
>> traceback.
>>
>> best,
>> Wes
>> ___
>> cython-devel mailing list
>> cython-devel@python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
>
> So commit 0e579823bd34de5d1d9b4aeac2c8d727415cba2d fixed the problem?
> I don't see how it could do that. Anyway, when I try to run the code
> from the example I don't get any traceback at all. Could you perhaps
> paste the exact code that produces the behaviour?

On a side note, ipython is not something I usually trust to test
things out, as it gets things wrong sometimes which can seriously make
you question your own sanity.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread Vitja Makarov
2012/1/26 mark florisson :
> On 26 January 2012 06:39, Vitja Makarov  wrote:
>> 2012/1/25 Stefan Behnel :
>>> mark florisson, 24.01.2012 14:53:
 On 24 January 2012 11:37, Konrad Hinsen wrote:
> Compiling the attached Cython file produced the attached C file which
> has errors in lines 532-534:
>
>  __pyx_v_self->xx = None;
>  __pyx_v_self->yy = None;
>  __pyx_v_self->zz = None;
>
> There is no C symbol "None", so this doesn't compile.
>
> I first noticed the bug in Cython 0.15, but it's still in the latest
> revision from Github.

 Hm, it seems the problem is that the call to the builtin float results
 in SimpleCallNode being replaced with PythonCApiNode, which then
 generates the result code, but the list of coerced nodes are
 CloneNodes of the original rhs, and CloneNode does not generate the
 result code of the original rhs (i.e. allocate and assign to a temp),
 which results in a None result.
>>>
>>> Back to the old idea of separating the type analysis into 1) a basic
>>> typing, inference and entry creation step and 2) a proper type analysis,
>>> coercion, etc. step.
>>>
>>
>> Yeah! I think the issue must be fixed before release. We can start
>> moving slowly in this direction and split
>> CascadedAssignmentNode.analyse_types into parts:
>>  - normal analyse_types()/expressions()
>>  - create clone nodes at some late stage
>
> At what stage would the stage 2) proper type analysis take place?
> Basically nodes may be replaced at any point, and I'm not sure you
> want to wait until just before code generation to do the coercions
> (e.g.  GILCheck won't catch coercions to object, although assignment
> nodes seem to check manually).
>

That must be run before GilCheck. Stage 2 is "I believe the tree won't
change much later"


> I think this problem can trivially be solved by creating a ProxyNode
> that should never be replaced by any transform, but it's argument may
> be replaced. So you wrap self.rhs in a ProxyNode and use that to
> create your CloneNodes.
>

Do you mean proxy node to be something like this:

class ProxyNode(object):
def __init__(self, obj):
object.__setattr__(self, '_obj', obj)

def __getattr__(self, key):
return getattr(self._obj, key)

def __setattr__(self, key, value):
setattr(self._obj, key, value)

That might help but I'm not sure how evil that is.

It also will require TreeVisitor.find_handler() modification. Node
replacement could be avoided by introducing ProxyProperty()

I see another problem with proxies: CloneNode or its owner may depend
on content of the argument so when it's changed things can be messed
up.

-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread Stefan Behnel
mark florisson, 26.01.2012 16:20:
> On 26 January 2012 06:39, Vitja Makarov wrote:
>> 2012/1/25 Stefan Behnel:
>>> Back to the old idea of separating the type analysis into 1) a basic
>>> typing, inference and entry creation step and 2) a proper type analysis,
>>> coercion, etc. step.
>>>
>>
>> Yeah! I think the issue must be fixed before release. We can start
>> moving slowly in this direction and split
>> CascadedAssignmentNode.analyse_types into parts:
>>  - normal analyse_types()/expressions()
>>  - create clone nodes at some late stage
> 
> At what stage would the stage 2) proper type analysis take place?

After the structural optimisation phase and before any optimisations or
other transforms that require complete type information but do not change
types anymore. I don't see it being moved to the end of the pipeline, the
results will be needed way before that. Even some optimisations may not be
possible without complete type analysis.


> Basically nodes may be replaced at any point, and I'm not sure you
> want to wait until just before code generation to do the coercions
> (e.g.  GILCheck won't catch coercions to object, although assignment
> nodes seem to check manually).

There's a large grey area in between. It'll need some refactoring and
rebalancing, just as before. But it should be easier than it is now because
the grey area will have more anchors in it.


> I think this problem can trivially be solved by creating a ProxyNode
> that should never be replaced by any transform, but it's argument may
> be replaced. So you wrap self.rhs in a ProxyNode and use that to
> create your CloneNodes.

I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway.

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread Fernando Perez
On Thu, Jan 26, 2012 at 10:37 AM, mark florisson
 wrote:
> On a side note, ipython is not something I usually trust to test
> things out, as it gets things wrong sometimes which can seriously make
> you question your own sanity.

I should note that we'd love to know specifics about problems that
severe.  Without a concrete report it's impossible for us to fix a
problem, unfortunately, and a vague statement like this only serves to
spread the notion "ipython is bad, for reasons unspecified".

Also, since I know some core Cython devs are also heavy sage users,
and the sage command-line is actually ipython, having an issue at the
intersection of cython/ipython is problematic for all sage users
working at the terminal.  So we'd really like to make the
cython/ipython combo as robust as possible.

I certainly don't expect everyone to use ipython, and obviously when I
see something that looks really bizarre, I double-check with plain
Python first, as ultimately that is our reference and regressions from
plain python are automatically bugs for us.

But I have to say that it's been a very long time since I have
encountered a situation where ipython produces something completely
absurd, and if it happens to you, by all means please let us know, so
we can try to fix it.

Best,

f
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread mark florisson
On 26 January 2012 18:51, Vitja Makarov  wrote:
> 2012/1/26 mark florisson :
>> On 26 January 2012 06:39, Vitja Makarov  wrote:
>>> 2012/1/25 Stefan Behnel :
 mark florisson, 24.01.2012 14:53:
> On 24 January 2012 11:37, Konrad Hinsen wrote:
>> Compiling the attached Cython file produced the attached C file which
>> has errors in lines 532-534:
>>
>>  __pyx_v_self->xx = None;
>>  __pyx_v_self->yy = None;
>>  __pyx_v_self->zz = None;
>>
>> There is no C symbol "None", so this doesn't compile.
>>
>> I first noticed the bug in Cython 0.15, but it's still in the latest
>> revision from Github.
>
> Hm, it seems the problem is that the call to the builtin float results
> in SimpleCallNode being replaced with PythonCApiNode, which then
> generates the result code, but the list of coerced nodes are
> CloneNodes of the original rhs, and CloneNode does not generate the
> result code of the original rhs (i.e. allocate and assign to a temp),
> which results in a None result.

 Back to the old idea of separating the type analysis into 1) a basic
 typing, inference and entry creation step and 2) a proper type analysis,
 coercion, etc. step.

>>>
>>> Yeah! I think the issue must be fixed before release. We can start
>>> moving slowly in this direction and split
>>> CascadedAssignmentNode.analyse_types into parts:
>>>  - normal analyse_types()/expressions()
>>>  - create clone nodes at some late stage
>>
>> At what stage would the stage 2) proper type analysis take place?
>> Basically nodes may be replaced at any point, and I'm not sure you
>> want to wait until just before code generation to do the coercions
>> (e.g.  GILCheck won't catch coercions to object, although assignment
>> nodes seem to check manually).
>>
>
> That must be run before GilCheck. Stage 2 is "I believe the tree won't
> change much later"
>
>
>> I think this problem can trivially be solved by creating a ProxyNode
>> that should never be replaced by any transform, but it's argument may
>> be replaced. So you wrap self.rhs in a ProxyNode and use that to
>> create your CloneNodes.
>>
>
> Do you mean proxy node to be something like this:
>
> class ProxyNode(object):
>    def __init__(self, obj):
>        object.__setattr__(self, '_obj', obj)
>
>    def __getattr__(self, key):
>        return getattr(self._obj, key)
>
>    def __setattr__(self, key, value):
>        setattr(self._obj, key, value)
>
> That might help but I'm not sure how evil that is.
>
> It also will require TreeVisitor.find_handler() modification. Node
> replacement could be avoided by introducing ProxyProperty()
>
> I see another problem with proxies: CloneNode or its owner may depend
> on content of the argument so when it's changed things can be messed
> up.

Not quite like that. It should be a regular node that is ignored by
transforms but delegates a few methods. e.g.

class ProxyNode(object):
child_attrs = ['arg']

def __init__(self, arg):
self.arg = arg

def result(self):
return self.arg.result()

def generate_result_code(self, code):
return self.arg.generate_result_code(code)

and that's pretty much it (untested). And CloneNode doesn't depend on
any specific node, nor should any code wrapping its nodes in
ProxyNodes (obviously).

The nodes sole purpose is to create an indirection, so that when
previously self.rhs got replaced, now self.rhs.arg will be replaced,
so self.rhs can be safely shared with other CloneNodes.

> --
> vitja.
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread mark florisson
On 26 January 2012 18:53, Stefan Behnel  wrote:
> mark florisson, 26.01.2012 16:20:
>> On 26 January 2012 06:39, Vitja Makarov wrote:
>>> 2012/1/25 Stefan Behnel:
 Back to the old idea of separating the type analysis into 1) a basic
 typing, inference and entry creation step and 2) a proper type analysis,
 coercion, etc. step.

>>>
>>> Yeah! I think the issue must be fixed before release. We can start
>>> moving slowly in this direction and split
>>> CascadedAssignmentNode.analyse_types into parts:
>>>  - normal analyse_types()/expressions()
>>>  - create clone nodes at some late stage
>>
>> At what stage would the stage 2) proper type analysis take place?
>
> After the structural optimisation phase and before any optimisations or
> other transforms that require complete type information but do not change
> types anymore. I don't see it being moved to the end of the pipeline, the
> results will be needed way before that. Even some optimisations may not be
> possible without complete type analysis.
>
>
>> Basically nodes may be replaced at any point, and I'm not sure you
>> want to wait until just before code generation to do the coercions
>> (e.g.  GILCheck won't catch coercions to object, although assignment
>> nodes seem to check manually).
>
> There's a large grey area in between. It'll need some refactoring and
> rebalancing, just as before. But it should be easier than it is now because
> the grey area will have more anchors in it.
>

That's nice. It wouldn't solve the problem at hand though, and I think
nothing can unless it will be at the very end of the pipeline.

>> I think this problem can trivially be solved by creating a ProxyNode
>> that should never be replaced by any transform, but it's argument may
>> be replaced. So you wrap self.rhs in a ProxyNode and use that to
>> create your CloneNodes.
>
> I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway.

It wouldn't be a replacement, merely an addition (an extra indirection).

> Stefan
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread mark florisson
On 26 January 2012 19:10, Fernando Perez  wrote:
> On Thu, Jan 26, 2012 at 10:37 AM, mark florisson
>  wrote:
>> On a side note, ipython is not something I usually trust to test
>> things out, as it gets things wrong sometimes which can seriously make
>> you question your own sanity.
>
> I should note that we'd love to know specifics about problems that
> severe.  Without a concrete report it's impossible for us to fix a
> problem, unfortunately, and a vague statement like this only serves to
> spread the notion "ipython is bad, for reasons unspecified".

Apologies, it was indeed a rather vague comment. I had some issues
with pasting unicode characters a few years back that would get
incorrect codepoints in ipython but not regular python, but I'm afraid
I don't have any concrete report. As such I withdraw my comment, I
just wanted to mention that if something smells iffy it might be a
good idea to resort to regular python.

> Also, since I know some core Cython devs are also heavy sage users,
> and the sage command-line is actually ipython, having an issue at the
> intersection of cython/ipython is problematic for all sage users
> working at the terminal.  So we'd really like to make the
> cython/ipython combo as robust as possible.
>
> I certainly don't expect everyone to use ipython, and obviously when I
> see something that looks really bizarre, I double-check with plain
> Python first, as ultimately that is our reference and regressions from
> plain python are automatically bugs for us.
>
> But I have to say that it's been a very long time since I have
> encountered a situation where ipython produces something completely
> absurd, and if it happens to you, by all means please let us know, so
> we can try to fix it.
>
> Best,
>
> f
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread Stefan Behnel
mark florisson, 26.01.2012 20:15:
> On 26 January 2012 18:53, Stefan Behnel wrote:
>> mark florisson, 26.01.2012 16:20:
>>> I think this problem can trivially be solved by creating a ProxyNode
>>> that should never be replaced by any transform, but it's argument may
>>> be replaced. So you wrap self.rhs in a ProxyNode and use that to
>>> create your CloneNodes.
>>
>> I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway.
> 
> It wouldn't be a replacement, merely an addition (an extra indirection).

What I was trying to say was that a ProxyNode would always be required by a
CloneNode, but I don't see where a ProxyNode would be needed outside of a
CloneNode. So it seems rather redundant and I don't know if we need a
separate node for it.

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Bug in Cython producing incorrect C code

2012-01-26 Thread mark florisson
On 26 January 2012 19:27, Stefan Behnel  wrote:
> mark florisson, 26.01.2012 20:15:
>> On 26 January 2012 18:53, Stefan Behnel wrote:
>>> mark florisson, 26.01.2012 16:20:
 I think this problem can trivially be solved by creating a ProxyNode
 that should never be replaced by any transform, but it's argument may
 be replaced. So you wrap self.rhs in a ProxyNode and use that to
 create your CloneNodes.
>>>
>>> I can't see what a ProxyNode would do that a CloneNode shouldn't do anyway.
>>
>> It wouldn't be a replacement, merely an addition (an extra indirection).
>
> What I was trying to say was that a ProxyNode would always be required by a
> CloneNode, but I don't see where a ProxyNode would be needed outside of a
> CloneNode. So it seems rather redundant and I don't know if we need a
> separate node for it.

Yes it would be needed only for that, but I think the only real
alternative is to not use CloneNode at all, i.e. make the
transformation Dag mentioned, where you create new rhs (NameNode?)
references to the temporary result.

> Stefan
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread Wes McKinney
On Thu, Jan 26, 2012 at 2:21 PM, mark florisson
 wrote:
> On 26 January 2012 19:10, Fernando Perez  wrote:
>> On Thu, Jan 26, 2012 at 10:37 AM, mark florisson
>>  wrote:
>>> On a side note, ipython is not something I usually trust to test
>>> things out, as it gets things wrong sometimes which can seriously make
>>> you question your own sanity.
>>
>> I should note that we'd love to know specifics about problems that
>> severe.  Without a concrete report it's impossible for us to fix a
>> problem, unfortunately, and a vague statement like this only serves to
>> spread the notion "ipython is bad, for reasons unspecified".
>
> Apologies, it was indeed a rather vague comment. I had some issues
> with pasting unicode characters a few years back that would get
> incorrect codepoints in ipython but not regular python, but I'm afraid
> I don't have any concrete report. As such I withdraw my comment, I
> just wanted to mention that if something smells iffy it might be a
> good idea to resort to regular python.
>
>> Also, since I know some core Cython devs are also heavy sage users,
>> and the sage command-line is actually ipython, having an issue at the
>> intersection of cython/ipython is problematic for all sage users
>> working at the terminal.  So we'd really like to make the
>> cython/ipython combo as robust as possible.
>>
>> I certainly don't expect everyone to use ipython, and obviously when I
>> see something that looks really bizarre, I double-check with plain
>> Python first, as ultimately that is our reference and regressions from
>> plain python are automatically bugs for us.
>>
>> But I have to say that it's been a very long time since I have
>> encountered a situation where ipython produces something completely
>> absurd, and if it happens to you, by all means please let us know, so
>> we can try to fix it.
>>
>> Best,
>>
>> f
>> ___
>> cython-devel mailing list
>> cython-devel@python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel

To reproduce the problem you'll want to use the specific pandas SHA
(99e2eec) that I referenced there. The pandas bug in the issue has
been fixed since then.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] AddTraceback() slows down generators

2012-01-26 Thread Stefan Behnel
Robert Bradshaw, 21.01.2012 23:09:
> On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote:
>> I did some callgrind profiling on Cython's generators and was surprised to
>> find that AddTraceback() represents a serious performance penalty for short
>> running generators.
>>
>> I profiled a compiled Python implementation of itertools.groupby(), which
>> yields (key, group) tuples where the group is an iterator again. I ran this
>> code in Python for benchmarking:
>>
>> """
>> L = sorted(range(1000)*5)
>>
>> all(list(g) for k,g in groupby(L))
>> """
>>
>> Groups tend to be rather short in real code, often just one or a couple of
>> items, so unpacking the group iterator into a list will usually be a quick
>> loop and then the generator raises StopIteration on termination and builds
>> a traceback for it. According to callgrind (which, I should note, tends to
>> overestimate the amount of time spent in memory allocation), the iteration
>> during the group unpacking takes about 30% of the overall runtime of the
>> all() loop, and the AddTraceback() call at the end of each group traversal
>> takes up to 25% (!) on my side. That means that more than 80% of the group
>> unpacking time goes into raising StopIteration from the generators. I
>> attached the call graph with the relative timings.
>>
>> About half of the exception raising time is eaten by PyString_FromFormat()
>> that builds the function-name + line-position string (which, I may note, is
>> basically a convenience feature). This string is a constant for a
>> generator's StopIteration exception, at least for each final return point
>> in a generator, but here it is being recreated over and over again, for
>> each exception that gets raised.
>>
>> Even if we keep creating a new frame instance each time (which should be ok
>> because CPython has a frame instance cache already and we'd only create one
>> during the generator lifetime), the whole code object could actually be
>> cached after the first creation, preferably bound to the lifetime of the
>> generator creator function/method. Or, more generally, one code object per
>> generator termination point, which will be a single point in the majority
>> of cases. For the specific code above, that should shave off almost 20% of
>> the overall runtime of the all() loop.
>>
>> I think that's totally worth doing.
> 
> Makes sense to me. I did some caching like this for profiling.

Here's a ticket for now:

http://trac.cython.org/cython_trac/ticket/760

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread mark florisson
On 26 January 2012 19:40, Wes McKinney  wrote:
> On Thu, Jan 26, 2012 at 2:21 PM, mark florisson
>  wrote:
>> On 26 January 2012 19:10, Fernando Perez  wrote:
>>> On Thu, Jan 26, 2012 at 10:37 AM, mark florisson
>>>  wrote:
 On a side note, ipython is not something I usually trust to test
 things out, as it gets things wrong sometimes which can seriously make
 you question your own sanity.
>>>
>>> I should note that we'd love to know specifics about problems that
>>> severe.  Without a concrete report it's impossible for us to fix a
>>> problem, unfortunately, and a vague statement like this only serves to
>>> spread the notion "ipython is bad, for reasons unspecified".
>>
>> Apologies, it was indeed a rather vague comment. I had some issues
>> with pasting unicode characters a few years back that would get
>> incorrect codepoints in ipython but not regular python, but I'm afraid
>> I don't have any concrete report. As such I withdraw my comment, I
>> just wanted to mention that if something smells iffy it might be a
>> good idea to resort to regular python.
>>
>>> Also, since I know some core Cython devs are also heavy sage users,
>>> and the sage command-line is actually ipython, having an issue at the
>>> intersection of cython/ipython is problematic for all sage users
>>> working at the terminal.  So we'd really like to make the
>>> cython/ipython combo as robust as possible.
>>>
>>> I certainly don't expect everyone to use ipython, and obviously when I
>>> see something that looks really bizarre, I double-check with plain
>>> Python first, as ultimately that is our reference and regressions from
>>> plain python are automatically bugs for us.
>>>
>>> But I have to say that it's been a very long time since I have
>>> encountered a situation where ipython produces something completely
>>> absurd, and if it happens to you, by all means please let us know, so
>>> we can try to fix it.
>>>
>>> Best,
>>>
>>> f
>>> ___
>>> cython-devel mailing list
>>> cython-devel@python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
>> ___
>> cython-devel mailing list
>> cython-devel@python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
>
> To reproduce the problem you'll want to use the specific pandas SHA
> (99e2eec) that I referenced there. The pandas bug in the issue has
> been fixed since then.
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel

I get the same as minrk got: TypeError: unhashable type 'dict' .
Anyway, if it's working now (and if other tracebacks are fast (enough)
as they should be), I don't think we have anything to worry about.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] AddTraceback() slows down generators

2012-01-26 Thread Vitja Makarov
2012/1/27 Stefan Behnel :
> Robert Bradshaw, 21.01.2012 23:09:
>> On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote:
>>> I did some callgrind profiling on Cython's generators and was surprised to
>>> find that AddTraceback() represents a serious performance penalty for short
>>> running generators.
>>>
>>> I profiled a compiled Python implementation of itertools.groupby(), which
>>> yields (key, group) tuples where the group is an iterator again. I ran this
>>> code in Python for benchmarking:
>>>
>>> """
>>> L = sorted(range(1000)*5)
>>>
>>> all(list(g) for k,g in groupby(L))
>>> """
>>>
>>> Groups tend to be rather short in real code, often just one or a couple of
>>> items, so unpacking the group iterator into a list will usually be a quick
>>> loop and then the generator raises StopIteration on termination and builds
>>> a traceback for it. According to callgrind (which, I should note, tends to
>>> overestimate the amount of time spent in memory allocation), the iteration
>>> during the group unpacking takes about 30% of the overall runtime of the
>>> all() loop, and the AddTraceback() call at the end of each group traversal
>>> takes up to 25% (!) on my side. That means that more than 80% of the group
>>> unpacking time goes into raising StopIteration from the generators. I
>>> attached the call graph with the relative timings.
>>>
>>> About half of the exception raising time is eaten by PyString_FromFormat()
>>> that builds the function-name + line-position string (which, I may note, is
>>> basically a convenience feature). This string is a constant for a
>>> generator's StopIteration exception, at least for each final return point
>>> in a generator, but here it is being recreated over and over again, for
>>> each exception that gets raised.
>>>
>>> Even if we keep creating a new frame instance each time (which should be ok
>>> because CPython has a frame instance cache already and we'd only create one
>>> during the generator lifetime), the whole code object could actually be
>>> cached after the first creation, preferably bound to the lifetime of the
>>> generator creator function/method. Or, more generally, one code object per
>>> generator termination point, which will be a single point in the majority
>>> of cases. For the specific code above, that should shave off almost 20% of
>>> the overall runtime of the all() loop.
>>>
>>> I think that's totally worth doing.
>>
>> Makes sense to me. I did some caching like this for profiling.
>
> Here's a ticket for now:
>
> http://trac.cython.org/cython_trac/ticket/760
>

I think that could be easily fixed. CPython doesn't add any traceback
info for generator's ending.

https://github.com/vitek/cython/commit/63620bc2a29f3064bbdf7a49eefffaae4e3c369d

-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread Fernando Perez
On Thu, Jan 26, 2012 at 11:21 AM, mark florisson
 wrote:
> Apologies, it was indeed a rather vague comment. I had some issues

No worries.

> with pasting unicode characters a few years back that would get
> incorrect codepoints in ipython but not regular python, but I'm afraid

We had a long and distinguished history of unicode bugs (we even had a
specific tag for them, as they were so severe).  But the vast majority
of them have been fixed, the unicode-filtered list only shows three
open ones:

https://github.com/ipython/ipython/issues?labels=unicode&sort=created&direction=desc&state=open&page=1

So it's possible your problem has indeed been fixed, since we know
that at least the worst horrors are gone now, largely thanks to Thomas
Kluyver's insane persistence.

> I don't have any concrete report. As such I withdraw my comment, I
> just wanted to mention that if something smells iffy it might be a
> good idea to resort to regular python.

That is most certainly a good policy, and I use it myself.  IPython is
necessarily a big and complex beast, so if something looks really odd,
the first sanity check is to remove that layer from the problem.

Cheers,

f
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] AddTraceback() slows down generators

2012-01-26 Thread Stefan Behnel
Vitja Makarov, 26.01.2012 21:19:
> 2012/1/27 Stefan Behnel:
>> Robert Bradshaw, 21.01.2012 23:09:
>>> On Sat, Jan 21, 2012 at 10:50 AM, Stefan Behnel wrote:
 I did some callgrind profiling on Cython's generators and was surprised to
 find that AddTraceback() represents a serious performance penalty for short
 running generators.

 I profiled a compiled Python implementation of itertools.groupby(), which
 yields (key, group) tuples where the group is an iterator again. I ran this
 code in Python for benchmarking:

 """
 L = sorted(range(1000)*5)

 all(list(g) for k,g in groupby(L))
 """

 Groups tend to be rather short in real code, often just one or a couple of
 items, so unpacking the group iterator into a list will usually be a quick
 loop and then the generator raises StopIteration on termination and builds
 a traceback for it. According to callgrind (which, I should note, tends to
 overestimate the amount of time spent in memory allocation), the iteration
 during the group unpacking takes about 30% of the overall runtime of the
 all() loop, and the AddTraceback() call at the end of each group traversal
 takes up to 25% (!) on my side. That means that more than 80% of the group
 unpacking time goes into raising StopIteration from the generators. I
 attached the call graph with the relative timings.

 About half of the exception raising time is eaten by PyString_FromFormat()
 that builds the function-name + line-position string (which, I may note, is
 basically a convenience feature). This string is a constant for a
 generator's StopIteration exception, at least for each final return point
 in a generator, but here it is being recreated over and over again, for
 each exception that gets raised.

 Even if we keep creating a new frame instance each time (which should be ok
 because CPython has a frame instance cache already and we'd only create one
 during the generator lifetime), the whole code object could actually be
 cached after the first creation, preferably bound to the lifetime of the
 generator creator function/method. Or, more generally, one code object per
 generator termination point, which will be a single point in the majority
 of cases. For the specific code above, that should shave off almost 20% of
 the overall runtime of the all() loop.
> 
> I think that could be easily fixed. CPython doesn't add any traceback
> info for generator's ending.
> 
> https://github.com/vitek/cython/commit/63620bc2a29f3064bbdf7a49eefffaae4e3c369d

Interesting. It doesn't solve the general problem of slow exceptions, but I
think that's a beautiful way of fixing this particular performance problem
for generators. Here are the timings. Your change made the above code twice
as fast!

Cython compiled Python implementation of itertools.groupby(), *without*
your change:

python2.7 -m timeit -s 'from cytertools import groupby; \
L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))'
100 loops, best of 3: 3.76 msec per loop

Cython compiled Python implementation of itertools.groupby(), *with* your
change:

python2.7 -m timeit -s 'from cytertools import groupby; \
L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))'
1000 loops, best of 3: 1.81 msec per loop

The real itertools, for comparison:

python2.7 -m timeit -s 'from itertools import groupby; \
L=sorted(range(1000)*5)' 'all(list(g) for k,g in groupby(L))'
1000 loops, best of 3: 1.31 msec per loop

That's close.

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Slow traceback reconstruction in IPython between 0.15.1 and master

2012-01-26 Thread Stefan Behnel
Wes McKinney, 26.01.2012 18:56:
> Just wanted to bring this issue to your guys' attention in case you
> knew what was responsible for this:
> 
> https://github.com/ipython/ipython/issues/1317#issuecomment-3652550
> 
> I traced down the problem (with git bisect) to a seemingly innocuous
> commit referenced in that GitHub thread. The issue seemed to only
> present itself in IPython, so likely there was some problem with
> inspecting the Cython frames for giving context around the full
> traceback.

That's not impossible. Traceback frames behave differently in Cython for
two reasons: a) because they are only being constructed after the fact in
an error handling case, not for normal functions, and b) because Cython
doesn't have real code objects for inspection and fakes them in a rather
ad-hoc way. For example, there is currently no function signature
information in them, and line number computation doesn't use the normal
CPython way that matches the byte code position with a compressed line
table. Instead, Cython creates a new code object for a given line on the
fly and just pretends that the function starts there. This usually works
well enough for a traceback, but this kind of differences makes it quite
possible that IPython makes assumptions about inspection here that Cython
doesn't meet.

In another thread ("AddTraceback() slows down generators"), I was proposing
to cache the code object for a given function. That would then imply
providing a (fake) byte code position map as well and would make it easier
to provide static signature information, thus improving the overall
compatibility with the code objects that CPython creates.

Note that it's unclear without further investigation if the problem you ran
into really has to do with these internal details. I'm just raising a
possible explanation here.

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel