https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104800

--- Comment #15 from Martin Uecker <muecker at gwdg dot de> ---
(In reply to rguent...@suse.de from comment #13)
> On Wed, 9 Mar 2022, muecker at gwdg dot de wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104800
> > 
> > --- Comment #11 from Martin Uecker <muecker at gwdg dot de> ---
> > (In reply to Richard Biener from comment #9)
> > > (In reply to Martin Uecker from comment #8)
> > > > The standard specifies in 5.1.2.3p6 that
> > > > 
> > > > "— Volatile accesses to objects are evaluated strictly
> > > > according to the rules of the abstract machine."
> > > > 
> > > > and
> > > > 
> > > > "This is the observable behavior of the program."
> > > > 
> > > > 
> > > > If a trap is moved before a volatile access so that the access never
> > > > happens, then this changes the observable behavior because the volatile
> > > > access was then not evaluated strictly according to the abstract 
> > > > machine.
> > > 
> > > Well, the volatile access _was_ evaluated strictly according to the 
> > > abstract
> > > machine. 
> > 
> > Not if there is a trap.
> > 
> > > Can't your argument be stretched in a way that for
> > > 
> > >  global = 2;
> > >  *volatile = 1;
> > > 
> > > your reasoning says that since the volatile has to be evaluated strictly
> > > according to the abstract machine that the full abstract machine status
> > > has to be reflected at the point of the volatile and thus the write of
> > > the global (non-volatile) memory has to be observable at that point
> > > and so we may not move accesses to global memory across (earlier or later)
> > > volatile accesses?
> > 
> > The state of the global variables is not directly observable.
> > 
> > > IMHO the case with the division is similar, you just introduce the extra
> > > twist of a trap.
> > 
> > The point is that the trap prevents the volatile store from happening.
> > 
> > > The two volatile accesses in your example are still evaluated according
> > > to the abstract machine, just all non-volatile (and non-I/O) statements
> > > are not necessarily.
> > 
> > The problem is that the volatile store might not be evaluated if there is a
> > trap.
> 
> So?  The abstract machine evaluation simply has not reached the
> second volatile store then.  

There is only one volatile store. If store_to_x is false, the
volatile store is reached followed by the division. So according
to the abstract machine evaluation the store is executed in
this case. If the compiler reorders the instructions, it might not be.
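
To make the concern concrete, here is a minimal sketch. It is not the
test case from this PR; the names status and store_then_divide are
made up purely for illustration.

```c
/* Hypothetical illustration; names are not taken from the PR. */
volatile int status;

int store_then_divide(int a, int b)
{
    status = 1;    /* volatile store: part of the observable behaviour */
    return a / b;  /* may trap when b == 0 (undefined behaviour) */
}
```

In the abstract machine the store to status completes before the
division is evaluated. If the division is scheduled ahead of the store
and traps, the store never takes place.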

> All indirect memory accesses might trap
> (again undefined behavior), direct stores might trap if they are
> mapped to readonly memory (again undefined behavior).  General
> FP operations might trap (sNaNs), other general operations might trap
> (certain CPU insns trap on overflow, some CPUs have NaTs that trap, etc.).

Yes, if those may trap, then reordering volatile stores
with those instructions is dangerous. 

> I'm raising these issues because while the case at hand (division
> by zero trap and volatiles) might be an obvious candidate to "fix"
> just for the sake of QOI the question is where to stop?

I would stop once behaviour is fully correct assuming a sensible
interpretation of the C standard.  If the C standard does not make
sense, we should change the standard or introduce explicit
non-standard compilation modes.

> Take
> 
> volatile int flag;
> void foo (int *p)
> {
>   if (p == NULL)
>     {
>       flag = 1;
>       *p = 1;
>     }
> }
> 
> it's an artifact that we currently fail to eliminate the branch
> because it leads to UB *p = 1, and I think it would be valid to
> eliminate it.  Would that be in violation of your desired
> reading of the standard since it elides the volatile store
> (the UB happens later)?

Yes. This is my "desired interpretation" because it 1) matches
the wording and 2) gives useful guarantees about the
behaviour of the program even in case of errors (which we all
know exist).
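
As a rough sketch of what eliminating the branch would amount to
(the function names foo_as_written and foo_branch_removed are
hypothetical, and flag is assumed to be an int):

```c
#include <stddef.h>

volatile int flag;

/* As written: for p == NULL the abstract machine performs the
   volatile store before it reaches the invalid store. */
void foo_as_written(int *p)
{
    if (p == NULL) {
        flag = 1;   /* volatile store */
        *p = 1;     /* undefined behaviour: store through a null pointer */
    }
}

/* What removing the branch would leave behind: the volatile store
   is elided together with the undefined store. */
void foo_branch_removed(int *p)
{
    (void)p;
}
```

For non-null p both versions behave identically. They differ only
for p == NULL, where the first one is expected to set flag before
anything goes wrong.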

I also think that in the presence of volatile stores
reliable and correct behaviour of a program is more important
than optimization.

The quote from Jeff in comment 5 would also imply that this is
not currently considered a valid optimization in GCC.
