On Wednesday 10 January 2007 22:49, Stefan van der Walt wrote:
> On Wed, Jan 10, 2007 at 08:28:14PM +0100, Francesc Altet wrote:
> > On Tue, 09 Jan 2007 at 23:19 +0900, David Cournapeau wrote:
> >
> > time (putmask)--> 1.38
> > time (where)--> 2.713
> > time (numexpr where)--> 1.291
> > time (fancy+assign)--> 0.967
> > time (numexpr clip)--> 0.596
> >
> > It is interesting to see there how fancy-indexing + assignment is
> > considerably more efficient than putmask.
>
> Not on my machine:
>
> time (putmask)--> 0.181
> time (where)--> 0.783
> time (numexpr where)--> 0.26
> time (fancy+assign)--> 0.202

Yeah, quite a difference indeed. Just for reference, my results above
were obtained on a Duron (an Athlon, but with only 128 KB of secondary
cache) at 0.9 GHz. Now, using my laptop (Intel P4 @ 2 GHz, 512 KB of
secondary cache), I get:

time (putmask)--> 0.244
time (where)--> 2.111
time (numexpr where)--> 0.427
time (fancy+assign)--> 0.316
time (numexpr clip)--> 0.184

so, on my laptop, fancy+assign is about 30% slower than putmask.

It should also be noted that the implementation of clip in numexpr
(i.e. in pure C) is not that much faster than putmask (only about
30%); so perhaps it is not so necessary to come up with a pure C
implementation for clip (or at least, not on Intel P4 machines!).

In any case, it is really striking to see how differently the various
CPU architectures perform on this apparently simple problem.

BTW, I'm attaching a slightly enhanced version of the clip patch for
numexpr that I used for the new benchmark shown here.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

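For reference, the three plain NumPy variants timed above can be
sketched roughly as follows. This is only a minimal sketch, not the
benchmark script behind the numbers in this thread: the function names,
the array size and the clipping bounds are all assumptions, and
"fancy+assign" is assumed to mean boolean-mask indexing followed by
assignment.

```python
import timeit

import numpy as np


def clip_putmask(a, lo, hi):
    """Clip a copy of `a` with two in-place putmask calls."""
    out = a.copy()
    np.putmask(out, out < lo, lo)
    np.putmask(out, out > hi, hi)
    return out


def clip_where(a, lo, hi):
    """Clip with nested where calls (allocates temporaries)."""
    return np.where(a < lo, lo, np.where(a > hi, hi, a))


def clip_fancy_assign(a, lo, hi):
    """Clip a copy of `a` with boolean-mask indexing plus assignment."""
    out = a.copy()
    out[out < lo] = lo
    out[out > hi] = hi
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal(1_000_000)  # array size is an assumption
    lo, hi = -0.5, 0.5                     # bounds are assumptions
    for func in (clip_putmask, clip_where, clip_fancy_assign):
        elapsed = timeit.timeit(lambda: func(data, lo, hi), number=10)
        print(f"time ({func.__name__})--> {elapsed:.3f}")
```

Absolute timings from such a script will depend heavily on the CPU and
cache size, as the figures above already suggest, so only the relative
ordering on a given machine is meaningful.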
Index: interp_body.c
===================================================================
--- interp_body.c  (revision 2535)
+++ interp_body.c  (working copy)
@@ -177,6 +177,10 @@
         case OP_WHERE_FFFF: VEC_ARG3(f_dest = f1 ? f2 : f3);
+        case OP_CLIP_FFFF: VEC_ARG3(if (f_dest <= f2) f_dest = f2;
+                                    else if (f_dest >= f3) f_dest = f3;
+                                    else f_dest = f1);
+
         case OP_FUNC_FF: VEC_ARG1(f_dest = functions_f[arg2](f1));
         case OP_FUNC_FFF: VEC_ARG2(f_dest = functions_ff[arg3](f1, f2));
Index: interpreter.c
===================================================================
--- interpreter.c  (revision 2535)
+++ interpreter.c  (working copy)
@@ -64,6 +64,7 @@
     OP_SQRT_FF,
     OP_ARCTAN2_FFF,
     OP_WHERE_FFFF,
+    OP_CLIP_FFFF,
     OP_FUNC_FF,
     OP_FUNC_FFF,
@@ -181,6 +182,9 @@
         case OP_WHERE_FFFF:
             if (n == 0 || n == 1 || n == 2 || n == 3) return 'f';
             break;
+        case OP_CLIP_FFFF:
+            if (n == 0 || n == 1 || n == 2 || n == 3) return 'f';
+            break;
         case OP_FUNC_FF:
             if (n == 0 || n == 1) return 'f';
             if (n == 2) return 'n';
@@ -1340,6 +1344,7 @@
     add_op("sqrt_ff", OP_SQRT_FF);
     add_op("arctan2_fff", OP_ARCTAN2_FFF);
     add_op("where_ffff", OP_WHERE_FFFF);
+    add_op("clip_ffff", OP_CLIP_FFFF);
     add_op("func_ff", OP_FUNC_FF);
     add_op("func_fff", OP_FUNC_FFF);
Index: expressions.py
===================================================================
--- expressions.py  (revision 2535)
+++ expressions.py  (working copy)
@@ -104,6 +104,12 @@
         return ConstantNode(numpy.where(a, b, c))
     return FuncNode('where', [a,b,c])

[EMAIL PROTECTED]
+def clip_func(a, b, c):
+    if isinstance(a, ConstantNode):
+        raise ValueError("too many dimensions")
+    return FuncNode('clip', [a,b,c])
+
 def encode_axis(axis):
     if isinstance(axis, ConstantNode):
         axis = axis.value
@@ -211,6 +217,7 @@
     'fmod' : func(numpy.fmod, 'float'),
     'where' : where_func,
+    'clip' : clip_func,
     'complex' : func(complex, 'complex'),

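As a reading aid for the patch above, the element-wise rule that a
clip(a, lo, hi) operation is normally expected to apply can be sketched
in plain Python. This assumes the usual clip semantics, where the
comparison is made against the input value (f1 in the patch) and the
bounds correspond to f2 and f3; the function names below are
hypothetical and are not part of numexpr.

```python
def clip_scalar(x, lo, hi):
    """Return x limited to the closed interval [lo, hi]."""
    if x <= lo:
        return lo      # below (or at) the lower bound
    elif x >= hi:
        return hi      # above (or at) the upper bound
    else:
        return x       # already inside the interval


def clip_sequence(values, lo, hi):
    """Apply clip_scalar to every element, as a vectorised op would."""
    return [clip_scalar(v, lo, hi) for v in values]


if __name__ == "__main__":
    print(clip_sequence([-2.0, -0.5, 0.0, 0.25, 3.0], -0.5, 0.5))
```

A plain-Python reference like this can be handy for checking the output
of a compiled implementation on a few small inputs.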
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion