Thanks, Jon; I shall try this.

Efren A. Serra (Contractor)
DeVine Consulting, Inc.
Naval Research Laboratory
Marine Meteorology Division
7 Grace Hopper Ave., STOP 2
Monterey, CA 93943
Code 7542
Office: 831-656-4650

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Jonathan WRIGHT
Sent: Tuesday, April 10, 2012 1:18 AM
To: [email protected]
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

Dear Efren,

Your numpy timings are incredible.

> Array size: 4,000,000
> GPU array time: 0.001961s
> numpy array time: 0.000001s

This 1 microsecond seems to be rather constant.

>> start.record()
>> numpy.sum(a)/a.size
>> end.record()
>> end.synchronize()

Could it be that this timing code is meant for asynchronous GPU calls? CUDA events are recorded on the GPU stream, so they will not capture the time spent in a CPU-side call like numpy.sum - the measured interval is just the gap between the two event records. Try this instead:

import timeit
t = timeit.Timer(setup="from __main__ import numpy, a",
                 stmt="numpy.sum(a)/a.size")
print "Numpy timing", t.timeit(1000)/1000, "s"

The same approach could be interesting for your GPU calls, if you want to
measure the Python wall times.
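If it helps, here is a minimal sketch of plain wall-clock timing with time.time(); the array `a` here is a stand-in for your 4,000,000-element test array, and the synchronize() remark is an assumption about how you would adapt this for a PyCUDA call:

```python
import time

import numpy

# Stand-in array matching the 4,000,000-element case from the thread.
a = numpy.random.rand(4000000)

# Plain wall-clock timing on the CPU side. For a GPU call you would need a
# synchronization (e.g. pycuda.driver.Context.synchronize()) before t1, so
# that the kernel has actually finished when the clock stops.
t0 = time.time()
result = numpy.sum(a) / a.size
t1 = time.time()
print("numpy mean: %.6f computed in %.6f s" % (result, t1 - t0))
```

For a uniform random array the mean should come out close to 0.5, and the wall time should be well above the 1 microsecond reported by the event-based timing.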

Best,

Jon

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda