http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49279
--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-10-05 15:48:57 UTC --- Created attachment 25423 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25423 CAST_RESTRICT removal Attaching a test patch that just removed CAST_RESTRICT altogether, plus IRC discussion that lead to it. The only testsuite regressions are Wobjsize-1.c and strlenopt-4gf.c which show an important security related problem - we probably shouldn't be folding builtins if DECL_INITIAL (fndecl) != NULL && DECL_DECLARED_INLINE_P (fndecl) && cfun && !cfun->after_inlining, because then we happily fold e.g. char buf[2]; strcpy (buf, "abcd"); into __builtin_memcpy even when strcpy is always_inline inline wrapper that calls __builtin___strcpy_chk and would complain about the buffer overflow resp. add runtime checking. jakub> IMHO something like RESTRICT_CAST_EXPR gimple stmts or handling as CAST_RESTRICT only non-restrict -> restrict assignments to non-artificial vars would be some form of smaller CAST_RESTRICT removal jakub> with both, even if we forward propagate or CSE the rhs, we wouldn't create the undesirable restrict tags richi> yeah richi> it would also remove that awkward single non-useless pointer cast jakub> while still restrict could be (somewhat) supported for inlined routines and even for cases where one writes { int *__restrict p = something; *p ... in the code richi> of course we'd have to handle CAST_RESTRICT everywhere ... richi> but currently it's very fragile jakub> yes, it would be some work richi> maybe try removing the cast handling first? richi> and see how many people complain? richi> it would be the solution for release branches anyway I guess jakub> would be nice to see SPEC numbers with that removed first richi> not a single restrict in SPEC ... jakub> really? Even in 2k6? richi> really. richi> test coverage is very bad for restrict :/ richi> we get some via fortran of course richi> but those are for globals and params only jakub> but with inlined fortran routines you get the restrict casts, right? richi> but you get them like the case you want to remove - descriptor->data = (restrict *)x; richi> but yes jakub> yeah, those are definitely undesirable richi> though we either get a decl in the caller (even better) or another restrict param passed throuhg in fortran jakub> anyway, if you want, I can try the RESTRICT_CAST_EXPR (or CAST_RESTRICT_EXPR, better suggestion) approach richi> so for fortran I believe we wouldn't lose anything richi> I suppose only the frontends would insert those? How would those be different in the case of inlining two copies? jakub> I'd think the FEs as well as the inliner would insert those jakub> not sure if I understood well the inlining two copies problem though jakub> you have inline with __restrict parameter(s), inline it twice richi> restrict pointers on arguments point to { restrict-tag NONLOCAL }, the disambiguator ignores NONLOCAL, but that's a problem if two restrict pointers are based on one that only points to NONLOCAL (like a non-restrict parameter from the caller) richi> foo (int *p) { bar(p); bar(p); } bar (int * restrict p) { *p = 1; } would disambiguate the two stores to p richi> that's the issue in the other PR richi> needs quite some obfuscation to trigger of course ;) jakub> and the problem was during scheduling? jakub> or something that intermixed the bodies? richi> yes, but I can imagine that you could trigger bogus CSE as well richi> the alias info is clearly wrong richi> my idea was to improve the based-on tracking by giving each pointer a distinct pointed-to tag richi> instead of just NONLOCAL if we don't know richi> of course that will increase the points-to set sizes even more richi> so I don't like it too much jakub> not sure I understand how that would help richi> ISTR that issue is also why we require two restrict-tag pointers and don't disambiguate int * restrict p and int *q richi> you'd get { TAG1 NONLOCAL ptr } { TAG2 NONLOCAL ptr } and thus no disambiguation richi> see the bitmap_intersect_p check in pt_solutions_same_restrict_base jakub> on the other side, restrict in C99 is very tight to the scope in which the cast occurs jakub> so perhaps we could disambiguate through the explicit cast restricts only if in the right BLOCK? richi> ick richi> what's the "right block"? richi> also that doesn't sound like a viable middle-end concept jakub> not sure if it would be enough to just consider TREE_BLOCK of the RESTRICT_CAST_EXPR stmts richi> we don't implement C99 restrict in the middle-end, we implement something that should capture it jakub> "right block" in the sense the same block or one block is subblock of the other (transitively) richi> you'd have to remember which RESTRICT_CAST_EXPR feeds a memory access richi> at least PTA propagation won't be able to avoid leaking the restrict tags out of the blocks jakub> so, if we have baz (int *__restrict p, int *__restrict q) { *p = 1; *q = 1; return *p; } inline, both the cast restricts would be the same block, so can be used to disambiguate richi> but at disambiguation we don't see the casts but two memory accesses jakub> if you inline it twice, we wouldn't disambiguate p' and p'' based accesses jakub> if we can from the CAST_RESTRICT artificial heap var get to the TREE_BLOCK to which it is attached, we could use that... richi> the heap var doesn't exist - it's just a UID ... jakub> perhaps as proof of concept we could start with only testing TREE_BLOCKs for equality, not walking to the parents jakub> but we can somewhere on the side map that UID to TREE_BLOCK, can't we? richi> and we can't go from random UIDs to DECLs (and they don't exist) richi> huh, sure we could richi> it sounds awkward though, tailored to stupid C99 richi> btw, you don't know which bit is the restrict tag, so you'd need to iterate over all and check them all jakub> bitmap which uids are cast restrict tags? richi> if you have p1 { R1 R2 } p2 { R3 R4 } is it enough to have one pair that has matching blocks, say R1 and R4? richi> or do all possible pairs to be useful for disambiguation? jakub> not sure, when exactly you get two or more restrict tags in points-to set? x = y ? p1 : p2 ? richi> OTOH, the simpler patch that just removes the offending four lines that handle restrict casts sounds very very much more appealing to me ;) richi> from PHIs for example, or from context-insensitive parts (when we go through memory or calls) jakub> I can try to bootstrap that, wonder what the fallout in the testsuite would be richi> nothing we can't XFAIL jakub> sure, but the question is if we don't pessimize real-world code that way too much richi> at least for fixing the two bugs on release branches that sounds best richi> well, it opens up optimizing restrict vs. non-restrict qualified pointers jakub> probably jakub> btw, do we create restrict tags for fields of automatic aggregates which we can name? richi> yes, I think so jakub> without CAST_RESTRICT, I guess in my short testcase it would be fine if s.p was given a restrict tag (but apparently it doesn't get it) richi> let me check jakub> even if we take its address, just going through the pointer (t->p) shouldn't have restrict cast and thus not miscompile it richi> ah no, we don't initialize them richi> they don't point to anything initially richi> so we assume we can compute precise points-to sets for them jakub> probably not even with s = { 1, p }; in the source richi> right, because we gimplify that richi> if you make them local static they get restrict tags jakub> well, local statics are like global jakub> they live forever richi> right jakub> though, with the automatic vars after scheduling we might hit the same issue as with cast restrict - if code from different blocks overlaps