Hey,

Another discussion on lazy evaluation, given the recent activity here: https://github.com/ContinuumIO/numba/pull/6#issuecomment-6117091

A somewhat recent previous thread can be found here: http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060862.html , and a NEP here: https://github.com/numpy/numpy/blob/master/doc/neps/deferred-ufunc-evaluation.rst

I think trying to parse bytecode and build an expression graph for array expressions from it has disadvantages and is harder in general. For instance, it won't be able to deal with branching at execution time, and things like inter-procedural analysis will be harder (not to mention you'd have to parse dtype creation). Instead, what you really want to do is hook into a lazily evaluating version of numpy and generate your own code from the operations it records.

It would be great if we implemented the NEP listed above, but with a few extensions. I think Numpy should handle the lazy evaluation part, determine when expressions should be evaluated, etc. However, for each user operation, Numpy will call back a user-installed hook implementing some interface, so that various packages can provide their own hooks to evaluate vector operations however they want. This will include packages such as Theano, which could run things on the GPU, Numexpr, and eventually https://github.com/markflorisson88/minivect (which will likely get an LLVM backend, and may be integrated with Numba to allow inlining of numba ufuncs). That project tries to bring all the different array expression compilers together in a single framework, to provide efficient array expressions specialized for any data layout (nditer on steroids if you will, with SIMD, threading and inlining capabilities).

We could allow each hook to specify which dtypes it supports, and a minimal data size needed before it should be invoked (to avoid overhead for small arrays, like the openmp 'if' clause). If an operation is not supported, the hook will simply raise NotImplementedError, which means Numpy will evaluate the expression built so far and run its own implementation, resulting in a non-lazy array. E.g. if a library supports adding things together, but doesn't support the 'sin' function, np.sin(a + b) will result in the library executing a + b, and numpy evaluating sin on the result.
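
To make the fallback behaviour concrete, here is a rough sketch in plain Python of how the np.sin(a + b) case could play out. None of this is an existing numpy API: HOOKS, AddOnlyHook, LazyArray, apply_op and the module-level add/sin wrappers are all hypothetical names made up for illustration, and the sketch assumes every operand has the same shape and dtype.

```python
import numpy as np

HOOKS = []  # hypothetical registry; the first willing hook wins


class AddOnlyHook:
    """Hypothetical backend that only knows how to add arrays."""
    dtypes = (np.dtype("float64"),)  # dtypes this hook supports
    min_size = 4                     # smaller arrays are not worth the overhead

    def __init__(self):
        self.calls = []  # operations this hook actually executed

    def record(self, op, args):
        # Called once per user operation; refuse anything but "add".
        if op != "add":
            raise NotImplementedError(op)
        return (op, args)

    def evaluate(self, node):
        op, args = node
        self.calls.append(op)
        left, right = (force(a) for a in args)
        return left + right


class LazyArray:
    """Wraps an expression graph node and the hook that will evaluate it."""

    def __init__(self, node, hook, size, dtype):
        self.node, self.hook, self.size, self.dtype = node, hook, size, dtype


def force(x):
    """Evaluate a LazyArray through its hook; pass plain arrays through."""
    return x.hook.evaluate(x.node) if isinstance(x, LazyArray) else x


def choose_hook(args):
    # Keep using the hook that already started building the expression.
    for a in args:
        if isinstance(a, LazyArray):
            return a.hook
    # Otherwise take the first hook willing to handle this dtype and size.
    first = args[0]
    for hook in HOOKS:
        if first.dtype in hook.dtypes and first.size >= hook.min_size:
            return hook
    return None


def apply_op(op, eager_impl, *args):
    hook = choose_hook(args)
    if hook is not None:
        try:
            node = hook.record(op, args)
            return LazyArray(node, hook, args[0].size, args[0].dtype)
        except NotImplementedError:
            pass  # fall through to the eager numpy implementation
    # Evaluate whatever was built so far, then run numpy's own version.
    return eager_impl(*[force(a) for a in args])


def add(a, b):
    return apply_op("add", np.add, a, b)


def sin(a):
    return apply_op("sin", np.sin, a)


hook = AddOnlyHook()
HOOKS.append(hook)
a = np.arange(8.0)
b = np.ones(8)
lazy_sum = add(a, b)                  # recorded by the hook, not yet computed
result = sin(lazy_sum)                # hook refuses "sin": it runs a + b, numpy runs sin
small = add(np.ones(2), np.ones(2))   # below min_size, so numpy evaluates eagerly
```

In a real implementation the wrappers would of course be numpy's own ufuncs and operators rather than separate add/sin functions; the point is only that the hook sees each operation as it is recorded and can opt out with NotImplementedError.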

So the idea is that the numpy lazy array will wrap an expression graph, which is built as the user performs operations and evaluated when needed (when a result is required or when someone tells numpy to evaluate all lazy arrays). Numpy will simply use the first hook willing to operate on data of the specified size and dtype, and will keep using that hook to build the expression until it is evaluated.

Anyway, this is somewhat of a high-level overview. If there is any interest, we can flesh out the details and extend the NEP.

Mark
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion
