On Wednesday 10 January 2007 22:49, Stefan van der Walt wrote:
> On Wed, Jan 10, 2007 at 08:28:14PM +0100, Francesc Altet wrote:
> > On Tue, 09 Jan 2007 at 23:19 +0900, David Cournapeau wrote:
> > time (putmask)--> 1.38
> > time (where)--> 2.713
> > time (numexpr where)--> 1.291
> > time (fancy+assign)--> 0.967
> > time (numexpr clip)--> 0.596
> >
> > It is interesting to see there how fancy-indexing + assignment is
> > considerably more efficient than putmask.
>
> Not on my machine:
>
> time (putmask)--> 0.181
> time (where)--> 0.783
> time (numexpr where)--> 0.26
> time (fancy+assign)--> 0.202

Yeah, quite a difference indeed. Just for reference, my results above were
obtained on a Duron (an Athlon but with only 128 KB of secondary cache) at
0.9 GHz. Now, using my laptop (Intel Pentium 4 @ 2 GHz, 512 KB of secondary
cache), I get:

time (putmask)--> 0.244
time (where)--> 2.111
time (numexpr where)--> 0.427
time (fancy+assign)--> 0.316
time (numexpr clip)--> 0.184

so, on my laptop fancy+assign is noticeably slower than putmask (0.316 vs.
0.244). It should also be noted that the implementation of clip in numexpr
(i.e. in pure C) is not that much faster than putmask (only about 30%:
0.184 vs. 0.244); so perhaps a pure C implementation of clip is not so
necessary (at least, not on Intel P4 machines!).
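
For reference, the three pure-NumPy approaches named in the timings can be
sketched roughly as below. This is only an illustration, not the benchmark
script itself: the helper names, the array size and the bounds `lo` and `hi`
are assumptions made for the example.

```python
import numpy as np

def clip_putmask(a, lo, hi):
    """Clip a copy of `a` with two in-place numpy.putmask calls."""
    out = a.copy()
    np.putmask(out, out < lo, lo)   # overwrite values below the lower bound
    np.putmask(out, out > hi, hi)   # overwrite values above the upper bound
    return out

def clip_where(a, lo, hi):
    """Clip with nested numpy.where calls (builds new arrays)."""
    return np.where(a < lo, lo, np.where(a > hi, hi, a))

def clip_fancy(a, lo, hi):
    """Clip a copy of `a` with boolean-mask indexing plus assignment."""
    out = a.copy()
    out[out < lo] = lo
    out[out > hi] = hi
    return out

# Hypothetical test data: sizes and bounds are arbitrary choices.
rng = np.random.default_rng(0)
a = rng.standard_normal(1000)
lo, hi = -0.5, 0.5
results = [f(a, lo, hi) for f in (clip_putmask, clip_where, clip_fancy)]
```

All three produce the same clipped array; they differ only in how many
temporaries they create and how they touch memory, which is presumably why
their relative speed depends so much on the CPU and its cache.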

In any case, it is really striking to see how differently the various CPU
architectures perform on this apparently simple problem.

BTW, I'm attaching a slightly enhanced version of the clip patch for numexpr 
that I used for the new benchmark shown here.
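
The per-element behaviour intended for a clip operation of this kind can be
sketched in plain Python as follows. This is a reference for the semantics
only, under the assumption that the first argument is the input value and the
other two are the lower and upper bounds; the function names are made up for
the example and are not part of numexpr.

```python
def clip_scalar(x, lo, hi):
    """Return x limited to the closed interval [lo, hi]."""
    if x <= lo:
        return lo
    elif x >= hi:
        return hi
    return x

def clip_list(values, lo, hi):
    """Apply clip_scalar to every element, like a vectorised opcode would."""
    return [clip_scalar(v, lo, hi) for v in values]

print(clip_list([-2.0, 0.25, 3.0], 0.0, 1.0))  # prints [0.0, 0.25, 1.0]
```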

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
Index: interp_body.c
===================================================================
--- interp_body.c	(revision 2535)
+++ interp_body.c	(working copy)
@@ -177,6 +177,10 @@
 
         case OP_WHERE_FFFF: VEC_ARG3(f_dest = f1 ? f2 : f3);
 
+        case OP_CLIP_FFFF: VEC_ARG3(if (f1 <= f2) f_dest = f2;
+				    else if (f1 >= f3) f_dest = f3;
+				    else f_dest = f1);
+
         case OP_FUNC_FF: VEC_ARG1(f_dest = functions_f[arg2](f1));
         case OP_FUNC_FFF: VEC_ARG2(f_dest = functions_ff[arg3](f1, f2));
 
Index: interpreter.c
===================================================================
--- interpreter.c	(revision 2535)
+++ interpreter.c	(working copy)
@@ -64,6 +64,7 @@
     OP_SQRT_FF,
     OP_ARCTAN2_FFF,
     OP_WHERE_FFFF,
+    OP_CLIP_FFFF,
     OP_FUNC_FF,
     OP_FUNC_FFF,
 
@@ -181,6 +182,9 @@
         case OP_WHERE_FFFF:
             if (n == 0 || n == 1 || n == 2 || n == 3) return 'f';
             break;
+        case OP_CLIP_FFFF:
+            if (n == 0 || n == 1 || n == 2 || n == 3) return 'f';
+            break;
         case OP_FUNC_FF:
             if (n == 0 || n == 1) return 'f';
             if (n == 2) return 'n';
@@ -1340,6 +1344,7 @@
     add_op("sqrt_ff", OP_SQRT_FF);
     add_op("arctan2_fff", OP_ARCTAN2_FFF);
     add_op("where_ffff", OP_WHERE_FFFF);
+    add_op("clip_ffff", OP_CLIP_FFFF);
     add_op("func_ff", OP_FUNC_FF);
     add_op("func_fff", OP_FUNC_FFF);
 
Index: expressions.py
===================================================================
--- expressions.py	(revision 2535)
+++ expressions.py	(working copy)
@@ -104,6 +104,12 @@
         return ConstantNode(numpy.where(a, b, c))
     return FuncNode('where', [a,b,c])
 
+@ophelper
+def clip_func(a, b, c):
+    if isinstance(a, ConstantNode):
+        raise ValueError("too many dimensions")
+    return FuncNode('clip', [a,b,c])
+
 def encode_axis(axis):
     if isinstance(axis, ConstantNode):
         axis = axis.value
@@ -211,6 +217,7 @@
     'fmod' : func(numpy.fmod, 'float'),
 
     'where' : where_func,
+    'clip' : clip_func,
 
     'complex' : func(complex, 'complex'),
 
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
