On Mon, 26 Sep 2011, Jakub Jelinek wrote:
> Hi!
>
> Adding Joseph and Jason to CC.
>
> On Mon, Sep 26, 2011 at 04:56:20PM +0200, Richard Guenther wrote:
> > Let's see what kind of fallout we get ;) For example, if the
> > following is valid C code I expect we will vectorize the second
> > loop (disambiguating p[i] and q[i]) bogously:
> >
> > void foo (int *p)
> > {
> > int * __restrict p1 = p;
> > int * __restrict p2 = p + 32;
> > int *q;
> > int i;
> > for (i = 0; i < 32; ++i)
> > p1[i] = p2[i];
> > p = p1;
> > q = p2 - 31;
> > for (i = 0; i < 32; ++i)
> > p[i] = q[i];
> > }
> >
> > because p and q base on different restrict qualified pointers
> > (p1 and p2 respective). At the moment we are safe from this
> > because of the TYPE_RESTRICT checks.
> >
> > Any opinion on the above? Is it valid to base non-restrict
> > pointers on restrict ones? It would be sort-of weird at least,
> > but at least I don't think the first loop use is bogus (even
> > though the pointed-to objects are the same).
>
> If the last loop was
> for (i = 0; i < 32; i++)
> q[i] = p[i];
> then I believe the above would be clearly invalid C99, because
> an object X (say incoming p[4]) would be modified in the same block
That can be fixed by adding some {}s and using a different decl
for the 2nd p/q.
> using a pointer based on p1 and using a pointer not based on p1
> (q), which would violate the requirements that if the object is
> modified through lvalue whose address is based on p1, all modifications
> to B in that block should be done through lvalues whose address is
> based on p1. In the above testcase all modifications are made through
> lvalues whose addresses are p1 based though, so it is less clear.
> Joseph?
>
> Anyway, GCC currently makes p1 and p2 use the same restrict
> base actually, and does vectorize both loops before the patch,
> after the patch as well as with -D__restrict=
> (the vectorization of the second loop is guarded with non-overlap
> check). It looks like a gimplification is wrong:
> int * restrict p1 = p;
>
> int * restrict p2 = p + 128;
>
> is gimplified into:
> p1 = (int * restrict) p;
>
> p.0 = (int * restrict) p;
>
> p2 = p.0 + 128;
>
> which IMHO is incorrect, I'd say p.0 shouldn't be int * restrict,
> but plain int *, only the final value should be TYPE_RESTRICT.
Yeah, but maybe it's already fold that transforms
(int * restrict)(p + 128) to (int * restrict)p + 128. I'm pretty
sure it is.
Richard.