On Sat, Jan 21, 2012 at 12:17 PM, Andreas Kloeckner <[email protected]> wrote:
> On Sat, 21 Jan 2012 11:24:33 -0500, Yifei Li <[email protected]> wrote:
> > On Fri, Jan 20, 2012 at 9:58 PM, Anthony LaTorre <[email protected]> wrote:
> >
> > > add the padding field!
> >
> > I know this will work. But without the padding field, the same program
> > written in C does not have any issue.
>
> I'm really not sure what you're trying to do here. Can you explain
> properly:

Sorry for the confusion.

> 0) what are you trying to do?

I'm trying to see whether the struct example in the tutorial still works
without padding. The following is the kernel function I use:

    __global__ void test(DoubleOperation *a)
    {
        a = &a[blockIdx.x];
        printf("block %d: %d\n", blockIdx.x, a->datalen);
    }

The kernel is launched with 2 blocks, each of which has one thread.

> a) what is the problem?

I actually have two questions.

1) The example fails without padding: the second block prints the wrong
'datalen'. However, if I use the CUDA runtime API instead of PyCUDA, the
result is correct even without the padding.

2) Since the size of the struct without padding is 12 bytes, I tried a
different struct of the same size:

    struct DoubleOperation {
        float x;
        float y;
        float z;
    };

The kernel function is changed to

    __global__ void test(DoubleOperation *a)
    {
        a = &a[blockIdx.x];
        printf("block %d: %f %f %f\n", blockIdx.x, a->x, a->y, a->z);
    }

This time the values of x, y and z are printed correctly by both blocks.
So why does it work even though the struct's size is the same as before?

> b) what is this mysterious 'C program' you keep referring to? CUDA C?

I translated the PyCUDA code into an equivalent program that uses the
CUDA runtime API.

> c) what have you tried?

I tried changing the order of the fields in the struct, but the second
block still prints the wrong 'datalen':

    struct DoubleOperation {
        float *ptr;
        int datalen;
    };

    class VecStruct:
        mem_size = 4 + numpy.intp(0).nbytes

        def __init__(self, array, struct_arr_ptr):
            self.data = cuda.to_device(array)
            cuda.memcpy_htod(int(struct_arr_ptr), numpy.intp(int(self.data)))
            cuda.memcpy_htod(int(struct_arr_ptr) + 8, numpy.int32(array.size))

> d) what were the outcomes?
>
> Andreas
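
One thing that may be worth checking is whether the 12-byte figure matches
what the compiler actually uses as the array stride. The sketch below is
not from the tutorial. It uses Python's standard ctypes module to ask the
host C ABI for the size and alignment of both structs. It assumes that the
device compiler lays out these structs the same way as the host ABI, and
the class name ThreeFloats is made up here to stand for the float x, y, z
variant.

```python
import ctypes


class DoubleOperation(ctypes.Structure):
    # Mirrors: struct DoubleOperation { float *ptr; int datalen; };
    _fields_ = [("ptr", ctypes.POINTER(ctypes.c_float)),
                ("datalen", ctypes.c_int)]


class ThreeFloats(ctypes.Structure):
    # Hypothetical name for: struct { float x; float y; float z; };
    _fields_ = [("x", ctypes.c_float),
                ("y", ctypes.c_float),
                ("z", ctypes.c_float)]


def layout(struct_type):
    """Return (sizeof, alignment) as computed by the host C ABI."""
    return ctypes.sizeof(struct_type), ctypes.alignment(struct_type)


# Same arithmetic as VecStruct.mem_size: one int plus one pointer.
pointer_size = ctypes.sizeof(ctypes.c_void_p)
hand_computed = 4 + pointer_size

print("pointer struct (size, align):", layout(DoubleOperation))
print("hand-computed size:          ", hand_computed)
print("float struct (size, align):  ", layout(ThreeFloats))
```

If the reported size of the pointer struct is larger than the hand-computed
value, then a[1] in the kernel starts at that larger offset, while the
Python side wrote the second struct at the smaller one. The float-only
struct has 4-byte alignment, so its size needs no rounding up, which would
be consistent with it working in both blocks.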
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
