https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259
--- Comment #27 from Davin McCall <davmac at davmac dot org> ---
(In reply to Martin Sebor from comment #24)
> The code in example #21 has the same bug:
> [...]

... due to provenance, you are claiming, if I understand correctly. But I don't see anything in the current language standard that allows or even supports this reasoning (perhaps I'm missing it). For the other examples you can say that the result of the pointer arithmetic is not defined (because it is not specified by 6.5.6). But in this case, the pointer was cast to an integer type before any arithmetic was performed.

> The strlen call is undefined because (char*)sp_ip is known to point just
> past the last element of u.s.b.

It actually points at the first element of u.s.b: we start with &u.s.b, subtract the offset of that element within the container object (the offset will be 4), then add 4. I don't think this by itself invalidates what you have said, though.

> It wouldn't matter if there happened to be
> a valid string at that address -- there isn't in this case because what's
> there is a char[4] with no terminating NUL.

That is true only if "address" means something more than "pointer value". I can assert that ((char *)sp_ip + 4) and (u.xx + 4) are equal before the strlen, and the compiler optimises away the assert. Furthermore, there is definitely a valid string at u.xx + 4 and therefore at ((char *)&u) + 4.

The provenance rules you're suggesting lead to the conclusion that I can check (via an '==' comparison) whether a pointer refers to a particular object, find that it does, and then still invoke undefined behaviour when dereferencing it [*]. While there may be changes in the committee pipeline that would make this the case, in the language as defined now I don't see how this interpretation can be justified.

[*] Or, if such a pointer comparison would also be undefined, I could instead cast both pointers to an integer type and compare those.

> The pointer wasn't derived from
> that address.  The pointer was derived from u.s.b and points to u.s.b +
> sizeof u.s.b, and there can never be anything valid beyond the end of an
> object.

(It points at u.s.b, actually).

>
> [...] Just like it's not valid to increment a pointer from
> a[0][1] to a[1][0] and dereference the latter in 'char a[2][2];' it's not
> valid to increment a pointer to one struct member to point to another and
> dereference it.

Again, there was no pointer arithmetic (other than on the line containing 'strlen', but in that particular case the pointer has the address of the union object, which has been cast to (char *), and the '+ 4' should then be valid, surely, by 6.3.2.3 paragraph 7, ignoring that it requires 'successive increments' rather than arbitrary addition; or is that supposed to be significant?).

I believe I understand the point of the provenance rules, but I do not think it is right to implement provenance as transferring to integers, on by default, in a compiler for the current language specification.
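
To make the integer round trip concrete, here is a minimal sketch of the kind of computation being discussed. It is not the code from example #21: the struct and union layout, the use of int32_t to get a member offset of 4, and all function names are assumptions chosen for illustration. The string is read through u->xx + 4, so the sketch stays clear of the disputed dereference and only shows that the integer-derived address and the address of u.xx + 4 compare equal (the result of a pointer-to-integer conversion is implementation-defined, so this is expected rather than guaranteed on a typical flat-address target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, chosen so that offsetof(struct S, b) is 4.
   This is an assumption, not the code from comment #21. */
struct S { int32_t a; char b[4]; };
union U { struct S s; char xx[16]; };

/* Take the address of u->s.b, convert it to an integer, subtract the
   member offset, then add 4. All arithmetic is done on the integer,
   not on a pointer. */
static uintptr_t roundtrip_address(union U *u)
{
    uintptr_t sp_ip = (uintptr_t)&u->s.b;
    sp_ip -= offsetof(struct S, b);   /* integer address of the union */
    return sp_ip + 4;                 /* integer address of byte 4 */
}

/* Nonzero if the round-tripped integer equals (uintptr_t)(u->xx + 4). */
static int same_address(union U *u)
{
    return roundtrip_address(u) == (uintptr_t)(u->xx + 4);
}

/* Length of the string starting at byte 4, read through the xx member. */
static size_t string_len_at_4(union U *u)
{
    return strlen(u->xx + 4);
}
```

With a NUL-terminated string stored at u.xx + 4, same_address() reports that both routes name the same byte, which is the equality the comment says it can assert before the strlen call.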