Hello,
 
I have written a CUDA program that performs a large number of Gaussian fits. 
When I run that same program through PyCUDA, the calculations take almost 3x 
as long as they do when it is built with nvcc.
With nvcc it takes 380 ms and with PyCUDA it takes 1110 ms, while the results 
of the calculations are identical.
There is no difference in the device code, because I use the same file for the 
device code in both cases.
 
How is this possible?
Does anybody have an idea?
 
I am not sure, but could it have something to do with the array declarations 
inside a device function?
 
#define lenP 6
#define nPoints 100000
...
 
__device__ void someFunction()
{
    float residu[nPoints], newResidu[nPoints], pNew[lenP], b[lenP], deltaP[lenP];
    float A[lenP*lenP], Jacobian[nPoints*lenP], B[lenP*lenP];
    ...
}
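
To put a number on that suspicion, here is a rough Python tally of the storage 
these declarations ask for. It is only a sketch: it assumes 4-byte floats and 
that every thread calling someFunction() gets its own copy of each array, and 
the helper name per_thread_bytes is made up for illustration. It says nothing 
about why the nvcc and PyCUDA timings differ.

```python
# Rough per-thread storage implied by the declarations in someFunction().
# Assumption: float is 4 bytes (single precision).
FLOAT_BYTES = 4
lenP = 6
nPoints = 100000

# Element counts copied from the array declarations in the device function.
arrays = {
    "residu": nPoints,
    "newResidu": nPoints,
    "pNew": lenP,
    "b": lenP,
    "deltaP": lenP,
    "A": lenP * lenP,
    "Jacobian": nPoints * lenP,
    "B": lenP * lenP,
}


def per_thread_bytes(counts, elem_bytes=FLOAT_BYTES):
    """Sum the element counts and convert to bytes (hypothetical helper)."""
    return sum(counts.values()) * elem_bytes


total = per_thread_bytes(arrays)
print(f"{total} bytes per thread ({total / 2**20:.2f} MiB)")
```

The Jacobian array accounts for most of the total, since it holds 
nPoints*lenP elements.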
 
 
Thanks,
Michiel.
 
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
