On Mon, May 17, 2021 at 1:46 PM Uecker, Martin
<martin.uec...@med.uni-goettingen.de> wrote:
>
>
>
> Am Montag, den 17.05.2021, 09:44 +0200 schrieb Richard Biener:
> > On Mon, May 17, 2021 at 9:32 AM Uecker, Martin
> > <martin.uec...@med.uni-goettingen.de> wrote:
> > >
> > >
> > > Am Montag, den 17.05.2021, 09:08 +0200 schrieb Richard Biener:
> > > > On Sun, May 16, 2021 at 8:57 AM Uecker, Martin
> > > > <martin.uec...@med.uni-goettingen.de> wrote:
> > > > >
> > > > > Hi Richard,
> > > > >
> > > > > I noticed that GCC 11 has different behavior in the following
> > > > > example relative to 10.2 with -O2. I wonder whether this
> > > > > is an intentional change and if yes, what are the rules?
> > > >
> > > > Yes, this is a fix for the long-standing PR57359 where we
> > > > failed to preserve the order of stores when applying store-
> > > > motion.
> > > >
> > > > The basic rule is that stores may not be re-ordered based on
> > > > TBAA since a store changes the dynamic type of a memory
> > > > location.  That notably happens for C/C++ programs re-using
> > > > allocated storage and for C++ programs using placement
> > > > new.
> > > >
> > > > It looked like a not so important bug since all other compilers
> > > > fall into the same trap, nevertheless they are all wrong in
> > > > doing so (and so was GCC).  All work fine at -O0 of course.
> > > >
> > > > Richard.
> > >
> > > This is excellent news! We were already thinking about
> > > how one could change the effective type rules in C, but
> > > it is not clear how this could have been done.
> >
> > Yes, indeed.  It also turned out the solution found for GCC
> > has zero performance impact in practice.  The idea is simply
> > to re-do dependent stores in program order on each loop exit.
> > Without this "trick" fixing the bug has severe impact on SPEC
> > (fixing as in disabling the invalid store-motion optimizations).
>
> Nice!
>
> Was moving these stores out of a loop the only
> case where GCC did this incorrect store-motions?

Those are the ones I'm aware of, yes.

> The other areas where the effective type
> rules appear unclear and/or not matching reality are:
>
> - effective type of aggregates when they are incrementally
> constructed or changed  (not sure whether this is relevant
> for optimization in GCC)

So like

struct S { int i; };
(int *)p = 1;
 ... = *(struct S *)p;
or
 ... = ((struct S*)p)->i;

this and cases with multiple members should work fine with GCC.

> - access paths, i.e. is 'i.p' an access to 'i' or is
> 'a[3]' an access to 'a' with array type (assuming 'a'
> is an lvalue of array type before it decays and not
> a pointer). And if yes, when does this access happen?
> I think GCC relies on this.

Yes, GCC assumes that i.p is an access to 'i' (but not
in address-taking context like &i.p).  Likewise for
((struct S *)p)->i

> - initial sequence rule

We do not honor this, but in practice if you have

struct S1 { int i; float f; };
struct S2 { int i; double g; };

then accessing the initial sequence as *(int *) works,
but

struct S1 s1 = {};
int i = ((struct S2 *)&s1)->i;

does not because we consider it an access as 'S2' due
to the previous point.  If it were just an access to 'i' it
would work again.

> Anything else?

The most recent interesting case from the C++ world is
when they allowed

struct S { int n; std::byte a[16]; }

S s;
new (s.a) T();
S s2;
s2 = s;

thus construct an arbitrary type into an aggregate member
of std::byte (or char type I believe).  With GCCs memory model
copying this like s2.a = s.a would have worked (because std::byte
and char alias everything).  But when you do s2 = s then the
accesses as 'struct S' do not TBAA conflict with say 'double'
(if you make T == double).  For this reason the C++ frontend
now drops TBAA for all std::byte/char array containing aggregeates ...

I do know this kind of stuff is used in practice in C code
as well but usually C people do not use aggregate copying
but instead use memcpy (which would be fine).

So maybe just an anecdote but eventually such situation occurs
in (future) C as well.

Richard.


> Best,
> Martin
>
>
>
> > Richard.
> >
> > > Best,
> > > Martin
> > >
> > >
> > >
> > > > > Thanks!
> > > > > Martin
> > > > >
> > > > > https://godbolt.org/z/57res7ax1
> > > > >
> > > > > #include <stdio.h>
> > > > > #include <stdlib.h>
> > > > >
> > > > >
> > > > > __attribute__((__noinline__, __weak__))
> > > > > void f(long unk, void *pa, void *pa2, void *pb, long *x) {
> > > > >   for (long i = 0; i < unk; ++i) {
> > > > >     int oldy = *(int *)pa;
> > > > >     *(float *)pb = 42;
> > > > >     *(int *)pa2 = oldy ^ x[i];
> > > > >   }
> > > > > }
> > > > >
> > > > > int main(void) {
> > > > >   void *p = malloc(sizeof(int));
> > > > >   *(int *)p = 13;
> > > > >   f(1, p, p, p, (long []){ 0 });
> > > > >   printf("*pa(%d)\n", *(int *)p);
> > > > > }

Reply via email to