Hi Richard, Am Mittwoch, den 17.04.2019, 11:41 +0200 schrieb Richard Biener: > On Wed, Apr 17, 2019 at 11:15 AM Peter Sewell <peter.sew...@cl.cam.ac.uk> > wrote: > > > > On 17/04/2019, Richard Biener <richard.guent...@gmail.com> wrote: > > > On Fri, Apr 12, 2019 at 5:31 PM Peter Sewell <peter.sew...@cl.cam.ac.uk> > > > wrote:
... > > > So this is not what GCC implements which tracks provenance through > > > non-pointer types to a limited extent when only copying is taking place. > > > > > > Your proposal makes > > > > > > int a, b; > > > int *p = &a; > > > int *q = &b; > > > uintptr_t pi = (uintptr_t)p; //expose > > > uintptr_t qi = (uintptr_t)q; //expose > > > pi += 4; > > > if (pi == qi) > > > *(int *)pi = 1; > > > > > > well-defined since (int *)pi now has the provenance of &b. > > > > Yes. (Just to be clear: it's not that we think the above example is > > desirable in itself, but it's well-defined as a consequence of what > > we do to make other common idioms, eg pointer bit manipulation, > > well-defined.) > > > > > Note GCC, when tracking provenance of non-pointer type > > > adds like in > > > > > > int *p = &a; > > > uintptr_t pi = (uintptr_t)p; > > > pi += 4; > > > > > > considers pi to have provenance "anything" (not sure if you > > > have something like that) since we add 4 which has provenance > > > "anything" to pi which has provenance &a. > > > > We don't at present have a provenance "anything", but if the gcc > > "anything" means that it's assumed that it might alias with anything, > > then it looks like gcc's implementing a sound approximation to > > the proposal here? > > GCC makes the code well-defined whereas the proposal would make > dereferencing a pointer based on pi invoke undefined behavior? No, if there is an exposed object where pi points to, it is defined behaviour. > Since > your proposal is based on an abstract machine there isn't anything > like a pointer with multiple provenances (which "anything" is), just > pointers with no provenance (pointing outside of any object), right? This is correct. What the proposal does though is put a limit on where pointers obtained from integers are allowed to point to: They cannot point to non-exposed objects. I assume GCC "anything" provenances also cannot point to all possible objects. > For points-to analysis we of course have to track all possible > provenances of a pointer (and if we know it doesn't point inside > any object we make it point to nothing). Yes, a compiler should track what it knows (it could also track if it knows that some pointers point to the same object, etc.) while the abstract machine knows everything there is to know. > Btw, GCC changed its behavior here to support optimizing matlab > generated C code which passes pointers to arrays across functions > by marshalling them in two float typed halves (yikes!). GCC is able > to properly track provenance across the decomposition / recomposition > when doing points-to analysis ;) Impressive ;-) I would have thought that such encoding happens at ABI boundaries, where you cannot track anyway. But this seems to occur inside compiled code? While we do not attach a provenance to integers in our proposal, it does not necessarily imply that a compiler is not allowed to track such information. It then depends on how it uses it. For example, int z; int x; uintptr_t pi = (uintptr_t)&x; // encode in two floats ;-) // pass floats around // decode int* p = (int*)pi; If the compiler can prove that the address is still the same, it can also reattach the original provenance under some conditions. But there is a caveat: It can only do this is it cannot also be a one-after pointer for z (or some other object). If the address of 'z' is not exposed, it may be able to assume this. > Btw, one thing GCC struggles is when it applies rules that clearly > apply to pointer dereferences to pointer equality compares where > the standard has that special casing of comparing two pointers > where one points one after an object requiring the comparison > to evaluate to true when the objects are adjacent. GCC > currently statically optimizes if (&x + 1 == &y) to false for > this reason (but not the corresponding integer comparison). Yes, according to the current rules (and this doesn't change in the proposal) two points comparing equal does not imply that they are interchangable. Making the comparison unspecified (as C++) would not help. We could make it undefined, which would make all optimizations based on the assumption that the pointer are interchangable valid. But I fear that this would introduce a corner case that could lead to subtle and hard-to-detect bugs. Martin > Richard. > > > > > best, > > Peter > > > > > > > > The user-disambiguation refinement adds some complexity but supports > > > > roundtrip casts, from pointer to integer and back, of pointers that > > > > are one-past a storage instance. > >