Hi Richard,

On Wednesday, 17.04.2019, at 11:41 +0200, Richard Biener wrote:
> On Wed, Apr 17, 2019 at 11:15 AM Peter Sewell <peter.sew...@cl.cam.ac.uk> 
> wrote:
> > 
> > On 17/04/2019, Richard Biener <richard.guent...@gmail.com> wrote:
> > > On Fri, Apr 12, 2019 at 5:31 PM Peter Sewell <peter.sew...@cl.cam.ac.uk>
> > > wrote:

...
> > > So this is not what GCC implements which tracks provenance through
> > > non-pointer types to a limited extent when only copying is taking place.
> > > 
> > > Your proposal makes
> > > 
> > >  int a, b;
> > >  int *p = &a;
> > >  int *q = &b;
> > >  uintptr_t pi = (uintptr_t)p; //expose
> > >  uintptr_t qi = (uintptr_t)q; //expose
> > >  pi += 4;
> > >  if (pi == qi)
> > >    *(int *)pi = 1;
> > > 
> > > well-defined since (int *)pi now has the provenance of &b.
> > 
> > Yes.  (Just to be clear: it's not that we think the above example is
> > desirable in itself, but it's well-defined as a consequence of what
> > we do to make other common idioms, eg pointer bit manipulation,
> > well-defined.)
> > 
> > > Note GCC, when tracking provenance of non-pointer type
> > > adds like in
> > > 
> > >   int *p = &a;
> > >   uintptr_t pi = (uintptr_t)p;
> > >   pi += 4;
> > > 
> > > considers pi to have provenance "anything" (not sure if you
> > > have something like that) since we add 4 which has provenance
> > > "anything" to pi which has provenance &a.
> > 
> > We don't at present have a provenance "anything", but if the gcc
> > "anything" means that it's assumed that it might alias with anything,
> > then it looks like gcc's implementing a sound approximation to
> > the proposal here?
> 
> GCC makes the code well-defined whereas the proposal would make
> dereferencing a pointer based on pi invoke undefined behavior?

No, if there is an exposed object at the address pi points to,
dereferencing the resulting pointer is defined behaviour.

>  Since
> your proposal is based on an abstract machine there isn't anything
> like a pointer with multiple provenances (which "anything" is), just
> pointers with no provenance (pointing outside of any object), right?

This is correct. What the proposal does, though, is put a limit
on where pointers obtained from integers are allowed to point:
they cannot point to non-exposed objects. I assume that pointers
with GCC's "anything" provenance also cannot point to all
possible objects.
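
To make the "exposed" rule concrete, here is a minimal sketch.
The function name is made up for illustration, and it assumes
the usual implementation-defined round trip through uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the proposal text. */
static int roundtrip_store(void)
{
    int x = 0;
    uintptr_t pi = (uintptr_t)&x;  /* pointer-to-integer cast exposes x */
    int *p = (int *)pi;            /* the address lies in the exposed
                                      object x, so p may take the
                                      provenance of x */
    assert(p == &x);
    *p = 42;                       /* defined: stores to x */
    return x;
}
```

If x had never been cast to an integer (or otherwise exposed),
a pointer built from a guessed integer with the same value
would not be allowed to point to it.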

> For points-to analysis we of course have to track all possible
> provenances of a pointer (and if we know it doesn't point inside
> any object we make it point to nothing).

Yes, a compiler should track what it knows (it could also track
if it knows that some pointers point to the same object, etc.)
while the abstract machine knows everything there is to know.

> Btw, GCC changed its behavior here to support optimizing matlab
> generated C code which passes pointers to arrays across functions
> by marshalling them in two float typed halves (yikes!).  GCC is able
> to properly track provenance across the decomposition / recomposition
> when doing points-to analysis ;)

Impressive ;-)  I would have thought that such encoding
happens at ABI boundaries, where you cannot track anyway.
But this seems to occur inside compiled code?

While we do not attach a provenance to integers
in our proposal, this does not necessarily mean that a compiler
is forbidden from tracking such information. Whether that is
valid depends on how the compiler uses it.

For example,

int z;
int x;
uintptr_t pi = (uintptr_t)&x;

// encode in two floats ;-)

// pass floats around

// decode

int* p = (int*)pi;

If the compiler can prove that the address is still
the same, it can also reattach the original provenance
under some conditions.

But there is a caveat: It can only do this if the address cannot
also be a one-after pointer for z (or some other object).
If the address of 'z' is not exposed, it may be able to
assume this.
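
The encode and decode steps elided in the example above could
look roughly like the following. This is only a sketch: the
struct and function names are hypothetical, I do not know how
the Matlab-generated code actually does it, and it assumes the
pointer bits fit into two floats. The bytes are moved with
memcpy so that no float value is ever computed from them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: an address fits into the bytes of two floats. */
_Static_assert(sizeof(uintptr_t) <= 2 * sizeof(float),
               "pointer bits do not fit into two floats");

/* Hypothetical carrier type for the two float-typed halves. */
struct two_floats { float half[2]; };

static struct two_floats encode_ptr(int *p)
{
    struct two_floats f = { { 0.0f, 0.0f } };
    uintptr_t pi = (uintptr_t)p;      /* exposes the pointed-to object */
    memcpy(f.half, &pi, sizeof pi);   /* copy the raw address bytes */
    return f;
}

static int *decode_ptr(struct two_floats f)
{
    uintptr_t pi;
    memcpy(&pi, f.half, sizeof pi);   /* recover the address bytes */
    return (int *)pi;                 /* integer-to-pointer cast */
}
```

A compiler that sees through both functions knows the address
is unchanged. Whether it may reattach the provenance of x to
the decoded pointer still depends on the caveat above.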

> Btw, one thing GCC struggles is when it applies rules that clearly
> apply to pointer dereferences to pointer equality compares where
> the standard has that special casing of comparing two pointers
> where one points one after an object requiring the comparison
> to evaluate to true when the objects are adjacent.  GCC
> currently statically optimizes if (&x + 1 == &y) to false for
> this reason (but not the corresponding integer comparison).

Yes, according to the current rules (and this doesn't change in
the proposal) two pointers comparing equal does not imply that
they are interchangeable. Making the comparison unspecified
(as in C++) would not help. We could make it undefined, which
would make all optimizations based on the assumption that
the pointers are interchangeable valid. But I fear that this
would introduce a corner case that could lead to subtle
and hard-to-detect bugs.
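
A small sketch of why equality does not give interchangeability
(the function name is hypothetical). Whether the comparison is
true depends on how the implementation lays out the two objects,
so the code makes no assumption about the result.

```c
#include <assert.h>

/* one_past_x is expected to be &x + 1 and y_ptr to be &y. */
static int store_if_adjacent(int *one_past_x, int *y_ptr)
{
    if (one_past_x == y_ptr) {
        *y_ptr = 1;   /* fine: y_ptr has the provenance of y */
        /* *one_past_x = 1; would be undefined here: the pointer
           compares equal to y_ptr but still has the provenance
           of x and points one past its end. */
        return 1;
    }
    return 0;
}
```

An optimizer that replaces y_ptr by one_past_x inside the
branch, because the two compared equal, turns the valid store
into the invalid one.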

Martin

> Richard.
> 
> > 
> > best,
> > Peter
> > 
> > 
> > > > The user-disambiguation refinement adds some complexity but supports
> > > > roundtrip casts, from pointer to integer and back, of pointers that
> > > > are one-past a storage instance.
> 
> 
