I recently attempted to run the following code with CUDA 3.2 and
PyCUDA 0.94.2 on a Quadro NVS 290 installed on a Linux x86_64 system:

import pycuda.gpuarray as gpuarray
import pycuda.driver as drv
import pycuda.autoinit
import numpy as np

from pycuda.compiler import SourceModule
func_mod = SourceModule("""
#include <pycuda/pycuda-complex.hpp>

#define TYPE pycuda::complex<float>
                              

__global__ void func(TYPE *a, TYPE *b, int N)
{
    int idx = threadIdx.x;
    if (idx < N) 
        b[idx] = pow(a[idx], 2); 
}
""")

func = func_mod.get_function("func")

N = 10
a = np.complex64(np.random.rand(N)+np.random.rand(N)*1j)
b = np.complex64(np.zeros(N))

func(drv.In(a), drv.Out(b), np.int32(N), block=(512,1,1))
print 'in: ', a
print 'out (cuda): ', b
print 'out (np): ', a**2

When I did so, I observed the following error:

pytools.prefork.ExecError: error invoking 'nvcc --cubin -arch sm_11
-I/usr/lib64/python2.6/site-packages/pycuda/../../../../include/pycuda
kernel.cu': status 2 invoking 'nvcc --cubin -arch sm_11
-I/usr/lib64/python2.6/site-packages/pycuda/../../../../include/pycuda
kernel.cu': ./kernel.cu(10): Error: External calls are not supported
(found non-inlined call to _ZN6pycuda3powERKNS_7complexIfEEi)

Casting the exponent to a float or pycuda::complex<float> prevents the
error from occurring, but casting it to int does not. Is this expected?
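
For anyone who wants to reproduce this, here is a minimal sketch of the
variants described above. The mangled name in the error demangles to
pycuda::pow(pycuda::complex<float> const&, int), so the integer-exponent
overload appears to be the one that nvcc does not inline. The exact cast
spellings and the helper names (KERNEL_TEMPLATE, EXPONENTS, kernel_source,
try_compile) are assumptions made for illustration; only the fails/compiles
outcomes come from the observations in this message.

```python
# Sketch only: the cast spellings below are illustrative assumptions.
# The fails/compiles labels restate what was observed with CUDA 3.2, sm_11.
KERNEL_TEMPLATE = """
#include <pycuda/pycuda-complex.hpp>

#define TYPE pycuda::complex<float>

__global__ void func(TYPE *a, TYPE *b, int N)
{
    int idx = threadIdx.x;
    if (idx < N)
        b[idx] = pow(a[idx], %(exponent)s);
}
"""

# (exponent expression, outcome reported in this message)
EXPONENTS = [
    ("2", "fails"),            # original code, int exponent
    ("(int)2", "fails"),       # explicit int cast, still the int overload
    ("(float)2", "compiles"),  # float exponent
    ("TYPE(2)", "compiles"),   # pycuda::complex<float> exponent
]


def kernel_source(exponent):
    """Return the kernel source with the given exponent expression."""
    return KERNEL_TEMPLATE % {"exponent": exponent}


def try_compile(exponent):
    """Compile one variant; requires PyCUDA and a CUDA-capable device."""
    import pycuda.autoinit  # creates a context on the default device
    from pycuda.compiler import SourceModule
    return SourceModule(kernel_source(exponent)).get_function("func")
```

Calling try_compile(expr) for each entry in EXPONENTS should show which
variants nvcc accepts on a given toolkit; results may differ on other CUDA
versions or architectures.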

                                                        L.G.


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda