2011/12/25 Stefan Behnel <stefan...@behnel.de>: > Stefan Behnel, 21.12.2011 21:17: > >> Vitja Makarov, 21.12.2011 19:48: >>> >>> Some time ago we were talking about generators optimization by copying >>> local variables from closure into local scope. >> >> >> Yes, I think that will make it easier for the C compiler to make >> optimistic >> assumptions about external values. >> >>> Now I think that should be a good idea to implement this for both >>> generators and regular closure functions. So local var will be used >>> for reference and assignment should be made to both copies. Of course >>> there are some variables that shouldn't be copied: non-local vars, >>> arrays, C++ classes and structures. >> >> >> Basically, anything that external code can modify. That makes it a bit >> tricky to do it also for 'normal' closure functions - the whole idea is >> that there is more than one function that can refer to a variable. >> >>> Also it may be a good idea to move outer scope pointer into local >>> variable. >>> >>> So I'm wondering what is a good test to measure actual speedup? >> >> >> http://blog.behnel.de/index.php?p=163 >> >> Just take the plain Python versions of the iterparse functions and compare >> them before and after the change. The raw C implementation in CPython >> gives >> a good baseline. >> >> Actually, it would be generally interesting to run the Cython versions >> through callgrind to see where the time is actually being spent. > > > Another useful test is the "nqueens" benchmark from the CPython test suite. > It's regularly run on Jenkins in Py2.7 and 3.3. > > https://sage.math.washington.edu:8091/hudson/view/bench/ > > Note that it mostly uses generator expressions, which could easily benefit > from a couple of further optimisations by specialising them, e.g. by > providing a length hint. > > http://trac.cython.org/cython_trac/ticket/756 >
I have implemented local variable copying, it could be found here: https://github.com/vitek/cython/tree/_copy_closure I didn't noticed significant speedup running nqueens test. Indeed I'm not sure it's speedup. It's all about <+-2%. Perhaps some better test is required. Btw, I got ~8% speedup for this dummy test: def foo(): cdef int i, r cdef list o o = [] def bar(): return len(o) for i in range(10000000): bar() r += len(o) return r What's iterparse()? -- vitja. _______________________________________________ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel