I recently attempted to run the following code with CUDA 3.2 and
PyCUDA 0.94.2 on a Quadro NVS 290 installed in a Linux x86_64 system:

import pycuda.gpuarray as gpuarray
import pycuda.driver as drv
import pycuda.autoinit
import numpy as np
from pycuda.compiler import SourceModule

func_mod = SourceModule("""
#include <pycuda/pycuda-complex.hpp>
#define TYPE pycuda::complex<float>
__global__ void func(TYPE *a, TYPE *b, int N)
{
    int idx = threadIdx.x;
    if (idx < N)
        b[idx] = pow(a[idx], 2);
}
""")
func = func_mod.get_function("func")

N = 10
a = np.complex64(np.random.rand(N)+np.random.rand(N)*1j)
b = np.complex64(np.zeros(N))
func(drv.In(a), drv.Out(b), np.uint32(N), block=(512,1,1))

print 'in: ', a
print 'out (cuda): ', b
print 'out (np): ', a**2

When I did so, the kernel failed to compile with the following error:

pytools.prefork.ExecError: error invoking 'nvcc --cubin -arch sm_11
-I/usr/lib64/python2.6/site-packages/pycuda/../../../../include/pycuda
kernel.cu': status 2 invoking 'nvcc --cubin -arch sm_11
-I/usr/lib64/python2.6/site-packages/pycuda/../../../../include/pycuda
kernel.cu': ./kernel.cu(10): Error: External calls are not supported
(found non-inlined call to _ZN6pycuda3powERKNS_7complexIfEEi)
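
For reference, the symbol in the error looks like an Itanium-ABI mangled name. Reading it by hand, it appears to correspond to pycuda::pow(pycuda::complex<float> const&, int), i.e. the overload taking an int exponent. The snippet below is only a rough sketch that decodes the leading namespace-qualified name; the helper qualified_name is hypothetical and is not a full demangler.

```python
import re


def qualified_name(mangled):
    """Decode only the leading nested name of an Itanium-style symbol.

    Handles the `_ZN<len><id>...E` prefix and returns the still-encoded
    parameter list untouched. Hypothetical helper, not a full demangler.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    parts = []
    while pos < len(mangled) and mangled[pos] != "E":
        m = re.match(r"\d+", mangled[pos:])
        if m is None:
            raise ValueError("unsupported component at offset %d" % pos)
        length = int(m.group())          # each identifier is length-prefixed
        start = pos + m.end()
        parts.append(mangled[start:start + length])
        pos = start + length
    return "::".join(parts), mangled[pos + 1:]


name, params = qualified_name("_ZN6pycuda3powERKNS_7complexIfEEi")
# name is the function; params is the encoded argument list, whose
# trailing "i" is the Itanium code for a plain int argument.
```

If binutils is installed, passing the symbol to c++filt should give the complete signature.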

Casting the exponent to float or pycuda::complex<float> avoids the
error, but casting it to int does not. Is this expected?

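The variants I compared can be sketched as follows. This is a minimal sketch, assuming the same kernel as above with only the exponent expression changed; kernel_source and run_on_gpu are hypothetical helper names, and PyCUDA is imported lazily so the source strings can be inspected on a machine without a GPU.

```python
import numpy as np

KERNEL_TEMPLATE = """
#include <pycuda/pycuda-complex.hpp>
#define TYPE pycuda::complex<float>
__global__ void func(TYPE *a, TYPE *b, int N)
{
    int idx = threadIdx.x;
    if (idx < N)
        b[idx] = pow(a[idx], %(exponent)s);
}
"""


def kernel_source(exponent):
    """Return the kernel text with the given exponent expression."""
    return KERNEL_TEMPLATE % {"exponent": exponent}


def run_on_gpu(a, exponent="(float)2"):
    """Compile and launch the kernel; needs PyCUDA and a CUDA device.

    Assumes a is a contiguous complex64 array with at most 512 elements,
    since a single block of 512 threads is launched.
    """
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    func = SourceModule(kernel_source(exponent)).get_function("func")
    b = np.zeros_like(a)
    # np.int32 matches the kernel's `int N` parameter.
    func(drv.In(a), drv.Out(b), np.int32(a.size), block=(512, 1, 1))
    return b


# Exponent forms that compiled for me: float and complex casts.
float_src = kernel_source("(float)2")
complex_src = kernel_source("(TYPE)2")
# Exponent form that still triggers the error: an explicit int cast.
int_src = kernel_source("(int)2")
```

On a machine with a CUDA device, the result of run_on_gpu(a) can be compared against a**2 within single-precision tolerance.
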
L.G.