On Mon, 21 Mar 2005, Richard Guenther wrote:
> I'd like to specify (for vectorization) the alignment of the
> target of a pointer. I.e. I have a vector of floats that I
> know is suitably aligned and that gets passed to a function
> like
>
> typedef ???? afloatp;
>
> void foo(afloatp __restrict__ a, afloatp __restrict__ b,
> afloatp __restrict__ c)
> {
> int i;
> for (i=0; i<4; ++i)
> a[i] = b[i] + c[i];
> }
>
> now, the obvious
>
> typedef float __attribute__((aligned(16))) * afloatp;
>
> doesn't have any effect on the alignment of *a, and specifying
> the alignment in the function argument list like
In fact,
#include <stdio.h>

typedef float __attribute__((aligned(16))) afloat;
typedef float __attribute__((aligned(16))) * afloatp;
typedef float afloata[4] __attribute__((aligned(16)));

void foo2(afloat * __restrict__ a, afloatp __restrict__ b,
          afloata c)
{
    /* __alignof__ yields a size_t, so print it with %zu */
    printf("%zu %zu %zu %zu\n", __alignof__(*a), __alignof__(a[1]),
           __alignof__(a[2]), __alignof__(a[3]));
    printf("%zu %zu %zu %zu\n", __alignof__(*b), __alignof__(b[1]),
           __alignof__(b[2]), __alignof__(b[3]));
    printf("%zu %zu %zu %zu\n", __alignof__(c[0]), __alignof__(c[1]),
           __alignof__(c[2]), __alignof__(c[3]));
}

int main()
{
    float x;
    foo2(&x, &x, &x);
    return 0;
}
compiled with -O2 -fno-inline prints
16 16 16 16
4 4 4 4
4 4 4 4
and the first line is obviously not what we want, although the
element stride still seems to be four in this case.
Ideally, a solution would give
16 4 8 4
though
16 4 4 4
would be acceptable, too.
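
The stride observation above can be checked directly. This is only a
minimal sketch: the helper names stride_bytes and afloat_alignment are
made up for illustration, and it assumes a GCC-compatible compiler that
accepts pointer arithmetic on the over-aligned typedef.

```c
#include <stddef.h>

typedef float __attribute__((aligned(16))) afloat;

/* Hypothetical helper: byte distance between p and p + 1, i.e. the
   stride the compiler uses when indexing through an afloat pointer. */
static size_t stride_bytes(afloat *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Hypothetical helper: the alignment the compiler associates with the
   typedef itself, which is what __alignof__(a[i]) reports for every i. */
static size_t afloat_alignment(void)
{
    return __alignof__(afloat);
}
```

If stride_bytes comes out equal to sizeof(float) while afloat_alignment
is 16, then the 16-byte alignment claimed for a[1] and a[3] cannot hold
whenever a[0] really is 16-byte aligned, which is why the "16 16 16 16"
line is wrong.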
Richard.