Add the padding field back! Or you can just flip the order so that the
pointer comes first, i.e.

struct Vec
{
    float *data;
    int len;
};
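
In case it helps to see the layouts side by side, here is a rough sketch
using Python's ctypes to mirror both field orders on the host. The class
names VecLenFirst and VecPtrFirst and the layout() helper are made up for
illustration and are not part of PyCUDA. It also assumes the host C ABI
uses the same layout as the device compiler, which is worth verifying on
your own setup. Whatever the numbers turn out to be, the bytes you copy to
the GPU have to be written at these same offsets.

```python
import ctypes

# Hypothetical host-side mirrors of the device struct; these names are
# illustrative only and do not come from PyCUDA.
class VecLenFirst(ctypes.Structure):
    # 'len' first: padding may be inserted after it so that the pointer
    # starts on a pointer-aligned offset.
    _fields_ = [("len", ctypes.c_int), ("data", ctypes.c_void_p)]

class VecPtrFirst(ctypes.Structure):
    # Flipped order: the pointer sits at offset 0, so no padding is needed
    # between the two fields.
    _fields_ = [("data", ctypes.c_void_p), ("len", ctypes.c_int)]

def layout(cls):
    """Return ({field name: byte offset}, total size) for a ctypes structure."""
    offsets = {name: getattr(cls, name).offset for name, _ in cls._fields_}
    return offsets, ctypes.sizeof(cls)

ptr_align = ctypes.alignment(ctypes.c_void_p)
for cls in (VecLenFirst, VecPtrFirst):
    offsets, size = layout(cls)
    print(cls.__name__, offsets, "sizeof =", size, "pointer alignment =", ptr_align)
```

If the offset of 'data' printed for VecLenFirst is larger than
sizeof(int), the gap is exactly the space the '__padding' field was
filling by hand.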
On Fri, Jan 20, 2012 at 12:29 PM, Yifei Li <[email protected]> wrote:
> Quoted from Chapter 5 of CUDA 4.0 programming guide, which may be
> relevant.
>
> "Reading non-naturally aligned 8-byte or 16-byte words produces incorrect
> results"
>
> But I still don't know how to fix the problem
>
> On Wed, Jan 18, 2012 at 3:01 AM, Andreas Kloeckner <[email protected]> wrote:
>
>> On Tue, 17 Jan 2012 16:55:22 -0500, Yifei Li <[email protected]> wrote:
>> > Hi all,
>> >
>> > I modified the example at
>> > http://documen.tician.de/pycuda/tutorial.html#advanced-topics by
>> > removing the '__padding' field from the structure definition and got
>> > an incorrect result.
>> > The kernel is launched with 2 blocks and one thread in each block.
>> >
>> > Each thread prints the 'len' field in structure, which should be 3 for
>> > block 0 and 2 for block 1. However, the result I got is:
>> >
>> > block 1: 2097664
>> > block 0: 3
>> >
>> > No such problem if I write the following program using C. Any help is
>> > appreciated.
>>
>> It seems CUDA doesn't automatically align the pointer, without being
>> told to?
>>
>> https://en.wikipedia.org/wiki/Data_structure_alignment
>>
>> Andreas
>>
>>
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
>
>