Thanks, Jon; I shall try this.

Efren A. Serra (Contractor)
DeVine Consulting, Inc.
Naval Research Laboratory
Marine Meteorology Division
7 Grace Hopper Ave., STOP 2
Monterey, CA 93943
Code 7542
Office: 831-656-4650
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jonathan WRIGHT
Sent: Tuesday, April 10, 2012 1:18 AM
To: [email protected]
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

Dear Efren,

Your numpy timings are incredible.

> Array size: 4,000,000
> GPU array time: 0.001961s
> numpy array time: 0.000001s

This 1 microsecond seems to be rather constant.

>> start.record()
>> numpy.sum(a)/a.size
>> end.record()
>> end.synchronize()

Could it be that this timing code only measures asynchronous GPU calls? CUDA events record points on the GPU stream, so wrapping a CPU-side numpy call in start/end events does not measure the numpy work at all. Try this instead:

import timeit
t = timeit.Timer(setup="from __main__ import numpy,a",
                 stmt="numpy.sum(a)/a.size")
print "Numpy timing", t.timeit(1000)/1000, "s"

The same approach could be interesting for your GPU calls if you want to get the Python wall times.

Best,

Jon

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
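[A minimal, self-contained sketch of Jon's timeit approach, updated to Python 3 syntax. The array `a` and the run count are hypothetical stand-ins matching the 4,000,000-element size reported in the thread; Jon's original snippet used the Python 2 print statement and string-based setup/stmt arguments, while this version passes a callable to timeit.Timer, which avoids depending on __main__ globals.]

```python
import timeit

import numpy

# Hypothetical array matching the 4,000,000-element size from the thread.
a = numpy.random.rand(4_000_000).astype(numpy.float32)

# Wall-clock timing of the CPU-side mean; unlike CUDA events, timeit
# actually measures the numpy work, not just points on the GPU stream.
t = timeit.Timer(lambda: numpy.sum(a) / a.size)
n_runs = 100
per_call = t.timeit(n_runs) / n_runs
print("Numpy timing", per_call, "s")
```

With a realistic measurement the per-call time for summing 4M floats should come out in the millisecond range, not the constant 1 microsecond the event-based timing reported.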
