[Python-Dev] setprofile and settrace inconsistency

2012-06-01 Thread Alon Horev
Hi,

When a trace function is set with settrace, it is called for each new scope
and can return another trace function, or None to indicate that the inner
scope should not be traced.
I used settrace for some time, but calling the trace function for every line
of code is a performance killer.
So I moved on to setprofile, which calls the trace function only on function
entry/exit. Now here's the problem: the return value from the trace
function is ignored (intentionally), which removes the possibility of
skipping tracing of 'hot' or 'not interesting' code.
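
To make the settrace behavior described above concrete, here is a minimal sketch written for Python 3. The names make_tracer, run_traced, hot and interesting are hypothetical and exist only for this illustration. Returning None on a 'call' event tells the interpreter not to trace that scope:

```python
import sys

def make_tracer(skip_names, log):
    """Build a settrace callback that declines to trace scopes in skip_names."""
    def tracer(frame, event, arg):
        name = frame.f_code.co_name
        if event == "call" and name in skip_names:
            # Returning None here means: no local trace function for this
            # scope, so its 'line' and 'return' events are never delivered.
            return None
        log.append((event, name))
        return tracer  # keep tracing this scope
    return tracer

def hot():
    total = 0
    for i in range(3):
        total += i
    return total

def interesting():
    return hot() + 1

def run_traced(func, tracer):
    """Run func with tracer installed, restoring the previous tracer after."""
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func()
    finally:
        sys.settrace(previous)

log = []
result = run_traced(interesting, make_tracer({"hot"}, log))
```

With setprofile there is no equivalent: whatever the callback returns, it keeps being called for every function entry and exit.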

I would like to propose two alternatives:
1. setprofile will not ignore the return value and will mimic settrace's
behavior.
2. setprofile is just a wrapper around settrace that limits
its functionality, so let's make settrace more flexible and setprofile will
be redundant. Here's how: settrace will receive an argument called 'events',
and the trace function will fire only on events contained in that list. For
example: setprofile = partial(settrace, events=['call', 'return'])
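
The second alternative can be approximated in pure Python today. This is only a sketch of the proposed semantics, not an existing API: settrace_events and setprofile_like are hypothetical names, and sys.settrace itself takes no 'events' argument. Because the interpreter still invokes the dispatcher for every event, this version shows the interface but gives none of the performance benefit the proposal is after.

```python
import sys
from functools import partial

ALL_EVENTS = ("call", "line", "return", "exception")

def settrace_events(func, events=ALL_EVENTS):
    """Hypothetical helper: install func so it only sees the listed events."""
    if func is None:
        sys.settrace(None)
        return
    wanted = frozenset(events)

    def dispatcher(frame, event, arg):
        if event not in wanted:
            return dispatcher  # filtered out, but keep tracing the scope
        result = func(frame, event, arg)
        if event == "call" and result is None:
            return None  # honor settrace semantics: skip this scope
        return dispatcher

    sys.settrace(dispatcher)

# The proposal's example, spelled with the hypothetical helper.
setprofile_like = partial(settrace_events, events=["call", "return"])
```

A callback installed through setprofile_like receives only 'call' and 'return' events, and can still return None on 'call' to skip a scope.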

I personally prefer the second.

Some context to this issue:
I'm building a Python tracer - a logger that records each and every
function call. In order for it to run in production systems, the overhead
should be minimal. I would like to allow the user to say which
functions/modules/classes to trace or skip. For example: the user will skip
all math/CPU-intensive operations. Another example: the user will want to
trace their Django app code but not the Django framework.

Your thoughts?

  Thanks, Alon Horev
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] segfault - potential double free when using iterparse

2012-06-12 Thread Alon Horev
Hi All,

First of all, I'm not opening a bug yet as I'm not certain whether this is
a CPython bug or lxml bug.

I'm getting a segfault within Python's GC (garbage collector) module.
Here's the stack trace:

#0  0x7fc7e9f6b76e in gc_list_remove (op=0x7fc79cef3d98) at Modules/gcmodule.c:211
#1  PyObject_GC_Del (op=0x7fc79cef3d98) at Modules/gcmodule.c:1503
#2  0x7fc7e9f2ac0f in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2894
#3  0x7fc7e9ea5b79 in gen_send_ex (arg=None, exc=, gen=) at Objects/genobject.c:84
#4  0x7fc7e9ea6185 in gen_close (self=) at Objects/genobject.c:130
#5  gen_del (self=) at Objects/genobject.c:165
#6  0x7fc7e9ea5a1b in gen_dealloc (gen=0x7fc7c1ba73c0) at Objects/genobject.c:32

In order to see what object the GC is freeing, I tried casting it to a
PyObject (we're freeing an lxml object):
(gdb) p (PyObject*) op
$17 = 

Similar bugs (http://osdir.com/ml/python.bugs/2000-12/msg00214.html) blame
the extension module for calling dealloc explicitly more than once or doing
forbidden things in __del__.

This is how I use lxml:

from lxml.etree import iterparse

def safe_iterparse(*args, **kwargs):
    for event, element in iterparse(*args, **kwargs):
        try:
            yield (event, element)
        finally:
            element.clear()

I don't have the data that caused the crash; hopefully I'll get it after
the next crash.
Is anyone familiar with this kind of bug in C extensions/CPython/lxml? Could
you give pointers to what I should be looking for?

Some version info:
CPython version: 2.7.2 on linux.
lxml: 2.3.3


      thanks, Alon Horev