Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

Pazzula, Dominic J Mon, 09 Apr 2012 15:17:35 -0700

Also, are you using numpy with MKL?  Those numpy times are really fast.
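
For anyone wanting to answer the MKL question on their own machine, one way is to look at NumPy's build configuration. The snippet below is only a sketch: `numpy.show_config()` is a real NumPy call, but searching its output for the substring "mkl" is a heuristic assumed here, and the names `config_text` and `uses_mkl` are illustrative.

```python
import contextlib
import io

import numpy

# Capture what numpy.show_config() prints so it can be inspected as text.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    numpy.show_config()
config_text = buf.getvalue()

print(config_text)

# Heuristic (an assumption, not an official check): an MKL-linked build
# normally mentions "mkl" somewhere in its BLAS/LAPACK sections.
uses_mkl = "mkl" in config_text.lower()
print("MKL mentioned in build config:", uses_mkl)
```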

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Serra, Mr. Efren, Contractor, Code 7542
Sent: Monday, April 09, 2012 4:26 PM
To: 'Eli Stevens (Gmail)'
Cc: [email protected]
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum


Array size: 4,000
GPU array time: 0.000420s
numpy array time: 0.000001s

Array size: 40,000
GPU array time: 0.001648s
numpy array time: 0.000002s

Array size: 400,000
GPU array time: 0.000576s
numpy array time: 0.000002s

Array size: 4,000,000
GPU array time: 0.001961s
numpy array time: 0.000001s

Eli, I have just started to experiment with PyCUDA and was hoping to use it to 
compute the mean and standard deviation of some atmospheric data; however, the 
numbers above don't show much promise.
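
As a rough sketch of the mean and standard deviation use case: both can be built from two reductions, the sum of the values and the sum of their squares. The helper name `mean_std_from_sums` and the sample data are hypothetical. The GPU branch uses `gpuarray.sum` and `gpuarray.dot` and is skipped when PyCUDA or a CUDA device is unavailable. The sum-of-squares formula can lose precision through cancellation, particularly in float32, so treat this as an illustration rather than a recommended method.

```python
import numpy


def mean_std_from_sums(total, total_sq, n):
    """Population mean and standard deviation from sum(x) and sum(x*x)."""
    mean = total / n
    # Clamp at zero: rounding can push the variance slightly negative.
    var = max(total_sq / n - mean * mean, 0.0)
    return mean, var ** 0.5


# Hypothetical sample data standing in for the atmospheric values.
x = numpy.array([1.0, 2.0, 3.0, 4.0], dtype=numpy.float64)

# CPU reference using NumPy reductions.
mean, std = mean_std_from_sums(float(numpy.sum(x)), float(numpy.dot(x, x)), x.size)
print("numpy: mean=%f std=%f" % (mean, std))

# The same two reductions on the GPU, if PyCUDA and a device are available.
try:
    import pycuda.autoinit  # noqa: F401 (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray

    x_gpu = gpuarray.to_gpu(x)
    gpu_mean, gpu_std = mean_std_from_sums(
        float(gpuarray.sum(x_gpu).get()),
        float(gpuarray.dot(x_gpu, x_gpu).get()),
        x.size,
    )
    print("gpu:   mean=%f std=%f" % (gpu_mean, gpu_std))
except Exception:
    gpu_mean = gpu_std = None
    print("PyCUDA or a CUDA device is not available; skipped the GPU path.")
```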

Efren A. Serra (Contractor)
DeVine Consulting, Inc.
Naval Research Laboratory
Marine Meteorology Division
7 Grace Hopper Ave., STOP 2
Monterey, CA 93943
Code 7542
Office: 831-656-4650

-----Original Message-----
From: Eli Stevens (Gmail) [mailto:[email protected]]
Sent: Monday, April 09, 2012 2:11 PM
To: Serra, Mr. Efren, Contractor, Code 7542
Cc: [email protected]
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

There are fixed startup costs that do not amortize well over only 400 elements.

What happens when you vary the size of the array over several orders of 
magnitude?
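
A sweep over array sizes could be sketched as follows. This is not the script from the thread: it times both paths with the host clock (`time.perf_counter`) on the assumption that CUDA events, which timestamp work in the GPU stream, may not be a reliable clock for a host-side `numpy.sum` call. The helper `best_time`, the size list, and the repeat count are illustrative choices. The GPU path is skipped if PyCUDA or a device is unavailable.

```python
import time

import numpy


def best_time(fn, repeats=5):
    """Return the smallest wall-clock time in seconds over several calls."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best


try:
    import pycuda.autoinit  # noqa: F401 (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray
    HAVE_GPU = True
except Exception:
    HAVE_GPU = False

sizes = [4 * 10**k for k in range(2, 7)]  # 400 up to 4,000,000
results = []
for n in sizes:
    a = numpy.arange(n, dtype=numpy.float32)
    cpu = best_time(lambda: numpy.sum(a) / a.size)
    gpu = None
    if HAVE_GPU:
        a_gpu = gpuarray.to_gpu(a)
        # Warm-up call so one-time kernel compilation is not timed.
        gpuarray.sum(a_gpu).get()
        # .get() copies the result back, so the call blocks until it is ready.
        gpu = best_time(lambda: gpuarray.sum(a_gpu).get() / a.size)
    results.append((n, cpu, gpu))
    gpu_text = "n/a" if gpu is None else "%.6fs" % gpu
    print("size %9d  numpy %.6fs  gpu %s" % (n, cpu, gpu_text))
```

Taking the best of several repeats and discarding the first GPU call keeps the fixed startup costs out of the per-size comparison.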

Eli

On Mon, Apr 9, 2012 at 2:05 PM, Serra, Mr. Efren, Contractor, Code
7542 <[email protected]> wrote:
> import numpy
> import pycuda.driver as cuda
> import pycuda.autoinit  # creates a CUDA context on import
> import pycuda.gpuarray as gpuarray
>
> a = numpy.arange(400)
> a_gpu = gpuarray.arange(400, dtype=numpy.float32)
>
> # Time the GPU reduction (including the copy back to the host) with CUDA events.
> start = cuda.Event()
> end = cuda.Event()
> start.record()
> gpuarray.sum(a_gpu).get() / a.size
> end.record()
> end.synchronize()
> print("GPU array time: %fs" % (start.time_till(end) * 1e-3))
>
> # Time the host-side numpy sum with the same CUDA events.
> start.record()
> numpy.sum(a) / a.size
> end.record()
> end.synchronize()
> print("numpy array time: %fs" % (start.time_till(end) * 1e-3))
>
> GPU array time: 0.000377s
> numpy array time: 0.000001s
>
> Efren A. Serra (Contractor)
> DeVine Consulting, Inc.
> Naval Research Laboratory
> Marine Meteorology Division
> 7 Grace Hopper Ave., STOP 2
> Monterey, CA 93943
> Code 7542
> Office: 831-656-4650
>
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
