On Mon, 11 Apr 2011 15:25:22 +0200, Martin Hammerschmied <[email protected]> 
wrote:
> hi everyone.
> 
> i'm having an issue using a kernel i wrote. the goal is to simulate a
> recurrent neural network. the experimental code is attached. it
> contains two sequential simulations, the first one runs on the GPU,
> the second one on the CPU. just for performance comparison.
> 
> everything seems to work just fine as long as:
>  - the number of timesteps (num_samples) is below 20k
>  - the size of the network (res_dim) is below 300 nodes
> 
> both results are the same, everything works as expected (GPU -> crazy
> fast, CPU -> ridiculously slow). but as soon as these numbers get too
> high, something strange happens. the computation seems to work but i
> receive the following exception when copying the results back to
> python via gpuarray.get()
> 
> Traceback (most recent call last):
>   File "<ipython console>", line 1, in <module>
>   File 
> "/usr/local/lib/python2.6/dist-packages/spyderlib/widgets/externalshell/startup.py",
> line 122, in runfile
>     execfile(filename, glbs)
>   File "/home/.../sandbox.py", line 71, in <module>
>     tmp = gpu_x.get()
>   File 
> "/usr/local/lib/python2.6/dist-packages/pycuda-0.94.1-py2.6-linux-x86_64.egg/pycuda/gpuarray.py",
> line 115, in get
>     drv.memcpy_dtoh(ary, self.gpudata)
> LaunchError: cuMemcpyDtoH failed: launch timeout
> 
> when i leave out the get() line everything seems to work. the same
> thing happens without X running.
> 
> does someone have an idea what's going on?

CUDA reports errors belatedly. (In your case, the 'segfault' in your
kernel is reported on the next memory transfer.) If you'd like to
convince yourself that the error is there before the transfer, call
driver.Context.synchronize() right after your kernel call.
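To make the deferral concrete, here is a plain-Python analogy (no GPU needed, and purely illustrative -- the `kernel` function and names below are made up): with concurrent.futures, an exception raised inside an asynchronously submitted task only surfaces once you synchronize on the result, just as a failed kernel launch only surfaces at the next memcpy or at an explicit driver.Context.synchronize().

```python
from concurrent.futures import ThreadPoolExecutor

def kernel():
    # stands in for a GPU kernel that hits a bad memory access
    raise RuntimeError("launch failed")

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(kernel)   # "launch": returns immediately, no error reported yet

deferred_error = None
try:
    future.result()            # "synchronize": the deferred error surfaces only here
except RuntimeError as exc:
    deferred_error = exc

pool.shutdown()
```

In PyCUDA the analogous sync point is the synchronize() call right after the kernel launch, which is why the failure otherwise appears to come from the later gpuarray.get().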

HTH,
Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
