This quote from Chapter 5 of the CUDA 4.0 Programming Guide may be relevant:

"Reading non-naturally aligned 8-byte or 16-byte words produces incorrect
results"

But I still don't know how to fix the problem.
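
A minimal sketch for checking the layout, assuming the tutorial struct
reduces to an int followed by a pointer once '__padding' is removed. It
uses Python's ctypes, which follows the host C compiler's layout rules;
that the device compiler lays the struct out the same way is an
assumption here. The class name DoubleOperation and the helper layout()
are only illustrative.

```python
import ctypes


class DoubleOperation(ctypes.Structure):
    # Assumed fields: the tutorial struct with '__padding' removed.
    _fields_ = [
        ("datalen", ctypes.c_int),
        ("ptr", ctypes.c_void_p),
    ]


def layout():
    """Report where the compiler-style layout puts the pointer field."""
    return {
        # Offset of 'ptr' after any alignment padding has been inserted.
        "ptr_offset": DoubleOperation.ptr.offset,
        # Stride between consecutive structs in an array.
        "struct_size": ctypes.sizeof(DoubleOperation),
        # Size if the fields were simply placed back to back.
        "packed_size": ctypes.sizeof(ctypes.c_int)
        + ctypes.sizeof(ctypes.c_void_p),
    }


info = layout()
print(info)
```

On a typical 64-bit host this reports a pointer offset of 8 and a struct
size of 16 even without '__padding', while the packed size is 12. If the
host-side code copies element i to byte offset i * 12, or writes the
pointer at offset 4, the kernel would likely read block 1's fields from
the wrong place, which would be consistent with block 0 being correct
and block 1 not. Keeping the host-side offsets in step with the struct
size seen by the kernel, either with explicit padding as in the tutorial
or by computing the offsets as above, is one way to avoid that.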

On Wed, Jan 18, 2012 at 3:01 AM, Andreas Kloeckner
<[email protected]> wrote:

> On Tue, 17 Jan 2012 16:55:22 -0500, Yifei Li <[email protected]> wrote:
> > Hi all,
> >
> > I modified the example
> > http://documen.tician.de/pycuda/tutorial.html#advanced-topics by removing
> > the '__padding' from the structure definition and got incorrect result.
> > The kernel is launched with 2 blocks and one thread in each block.
> >
> > Each thread prints the 'len' field in structure, which should be 3 for
> > block 0 and 2 for block 1. However, the result I got is:
> >
> > block 1: 2097664
> > block 0: 3
> >
> > No such problem if I write the following program using C.  Any help is
> > appreciated.
>
> It seems CUDA doesn't automatically align the pointer, without being
> told to?
>
> https://en.wikipedia.org/wiki/Data_structure_alignment
>
> Andreas
>
>
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
