On 6 July 2011, 07:17, Алексей Гурин<[email protected]> wrote:
When I try to pass a double-precision argument in this code
##########################
import pycuda.autoinit
import pycuda.driver as drv
import numpy as np
from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void someFunc(double *array, double var, int N)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < N)
    {
        array[i] = var;
    }
}
""",arch="sm_13")
N = 10000
arrayH = np.zeros(N,np.float64)
arrayD = drv.mem_alloc(N*8)
someFunc = mod.get_function("someFunc")
threadsPerBlock = 10
block = (threadsPerBlock,1,1)
grid = ((N+threadsPerBlock-1)//threadsPerBlock,1)  # integer division: grid dimensions must be ints
someFunc(arrayD,np.float64(10),np.int32(N),block=block,grid=grid)
drv.memcpy_dtoh(arrayH,arrayD)
print(arrayH)
##########################
I get incorrect output:
[ 2.12204896e-310 2.12204896e-310 2.12204896e-310 ...,
2.12204896e-310 2.12204896e-310 2.12204896e-310]
when I expected:
[ 10. 10. 10. ..., 10. 10. 10.]
I tried this with PyCUDA 2011.1.1 and PyCUDA 2011.1.2 on a GeForce GTX 275.
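One possible reading of the garbage value, as a sketch only: it assumes a 4-byte device pointer (the size reported later in this thread), arguments packed back to back in little-endian order with no padding, and a kernel that reads its double from the next 8-byte boundary. None of this is confirmed PyCUDA behaviour, and the pointer value `ptr` is hypothetical. Under those assumptions the kernel would see the upper half of 10.0 followed by the integer N, which decodes to the same denormal number shown above.

```python
import struct

N = 10000          # the int argument from the post
var = 10.0         # the double argument from the post
ptr = 0x12345678   # hypothetical 4-byte device pointer; its value does not affect the result

# Assumption: arguments are packed back to back, little-endian, without padding:
# bytes 0-3 pointer, bytes 4-11 double, bytes 12-15 int.
packed = struct.pack("<IdI", ptr, var, N)

# Assumption: the kernel expects the double at the next 8-byte boundary (offset 8),
# so it reads the upper half of 10.0 followed by the bytes of N.
(seen,) = struct.unpack_from("<d", packed, 8)
(bits,) = struct.unpack_from("<Q", packed, 8)

print("%.8e" % seen)      # 2.12204896e-310
print("0x%016x" % bits)   # 0x0000271040240000 (0x2710 is 10000, 0x40240000 is the top half of 10.0)
```

If this reading is right, the wrong output is what a double read from a misaligned offset would look like, and not a precision problem on the device.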
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
I solved it somehow. When I pass arguments of different sizes (double = 8
bytes, int = 4 bytes, pointer to double = 4 bytes), the total size in bytes of
all arguments before the double must be a multiple of 8.
For example:
double*,double,int - incorrect result
double*,int,double - correct
double*,int,int,double - incorrect
If I put all 8-byte arguments before the 4-byte ones, it also works correctly.
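The rule above can be checked with a little offset arithmetic. The sketch below is only an illustration: the `SIZES` table uses the byte sizes quoted in this message (including the 4-byte pointer), `misaligned_args` is a hypothetical helper and not part of PyCUDA, and it assumes arguments are laid out back to back with no padding.

```python
# Sizes in bytes as stated in the message above (4-byte pointers are the poster's setup).
SIZES = {"double*": 4, "int": 4, "double": 8}

def misaligned_args(signature):
    """Return (index, type, offset) for every argument whose packed offset
    is not a multiple of its own size."""
    offset = 0
    bad = []
    for index, type_name in enumerate(signature):
        size = SIZES[type_name]
        if offset % size != 0:
            bad.append((index, type_name, offset))
        offset += size  # assumption: no padding is inserted between arguments
    return bad

for sig in (["double*", "double", "int"],
            ["double*", "int", "double"],
            ["double*", "int", "int", "double"]):
    verdict = "incorrect" if misaligned_args(sig) else "correct"
    print("%s - %s" % (",".join(sig), verdict))
```

With these sizes the three signatures come out as incorrect, correct, incorrect, matching the observations above. Putting the 8-byte arguments first keeps every offset aligned, which fits the last remark.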