https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104754
Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-03-03
                 CC|                            |amacleod at redhat dot com

--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Confirmed on a cross to m68k-unknown-linux-gnu.

Interestingly, this may actually be a regression against GCC11, at least on
this target (and possibly the others mentioned, though I haven't checked).

The test verifies that there are no calls to foo().  On m68k the gate to foo()
flows through here (threadfull2 dump right before vrp2):

  <bb 3> [local count: 715863673]:
  # ivtmp.9_23 = PHI <ivtmp.9_24(11), ivtmp.9_7(9)>
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)
    goto <bb 4>; [20.24%]
  else
    goto <bb 5>; [79.76%]

  <bb 4> [local count: 144890806]:
  foo ();

ivtmp.9_7 has been set previously in BB9 to:

  ivtmp.9_7 = (unsigned int) &b;

VRP2 can't seem to do anything with the above sequence, since it can't figure
out what _1 is.  I suppose it could, since there is enough information to get
at "b" at -O3.

On x86, where the test passes, we have the following before vrp2:

  <bb 3> [local count: 477266310]:
  # c_4 = PHI <c_14(7)>
  bar ();
  _15 = (sizetype) c_4;
  _17 = MEM[(long int *)&b + _15 * 8];
  if (_17 == 1)
    goto <bb 4>; [20.24%]
  else
    goto <bb 5>; [79.76%]

  <bb 4> [local count: 96598701]:
  foo ();
  c_29 = c_4 + 1;
  goto <bb 8>; [100.00%]

which vrp2 can happily optimize to:

  <bb 6> [local count: 477266310]:
  bar ();
  _17 = 0;
  if (_17 == 1)
    goto <bb 3>; [20.24%]
  else
    goto <bb 4>; [79.76%]
...
...
  <bb 3> [local count: 96598701]:
  foo ();
  goto <bb 7>; [100.00%]

Thus leading to foo's demise by ccp4.

I haven't dug deep, but this is likely due to the pointer equivalence tracking
we use in evrp/VRP2 not being able to see that this is all funny talk for b[]:

  ivtmp.9_7 = (unsigned int) &b;
...
...
  # ivtmp.9_23 = PHI <ivtmp.9_24(11), ivtmp.9_7(9)>
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  if (_1 == 1)

We have plans for a proper pointer range class for GCC13, though I wonder
whether we'll be able to handle the above gymnastics.

FWIW, the above transformation seems to be ivopts at play.  Whereas on x86 we
go from:

  <bb 3> [local count: 715863673]:
  # c_19 = PHI <c_14(12), c_20(10)>
  bar ();
  _1 = b[c_19][0];
  if (_1 == 1)

to:

  <bb 3> [local count: 715863673]:
  # c_19 = PHI <c_14(12), c_20(10)>
  bar ();
  _23 = (sizetype) c_19;
  _1 = MEM[(long int *)&b + _23 * 8];
  if (_1 == 1)
    goto <bb 4>; [20.24%]

on m68k we transform the sequence to:

  <bb 3> [local count: 715863673]:
  # ivtmp.9_23 = PHI <ivtmp.9_24(12), ivtmp.9_7(10)>
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)

Perhaps someone with more target-foo can opine.
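
For readers not fluent in GIMPLE, the two loop shapes above can be sketched at the C level roughly as follows. This is only an illustration, not the actual testcase: the array shape `b[N][1]`, the bound `N`, and the function names `count_indexed` and `count_ivtmp` are all assumptions made up for the sketch. The first function keeps the base `&b` visible in every access (the x86 form); the second carries the address in an integer induction variable (the m68k form), which is the shape the pointer tracking apparently fails to see through.

```c
#include <stddef.h>
#include <stdint.h>

#define N 4                 /* hypothetical bound, not from the testcase */
static long b[N][1];        /* hypothetical shape; zero-initialized */

/* Indexed form, analogous to MEM[(long int *)&b + idx * stride]:
   the base object b is visible in each access. */
static int count_indexed(void)
{
    int hits = 0;
    for (size_t c = 0; c < N; c++)
        if (b[c][0] == 1)
            hits++;
    return hits;
}

/* Strength-reduced form, analogous to
     ivtmp = (unsigned int) &b;  ...  MEM[(long int *)(void *) ivtmp];
   the address lives in an integer and is bumped by the element size. */
static int count_ivtmp(void)
{
    int hits = 0;
    uintptr_t iv = (uintptr_t)&b;
    uintptr_t end = iv + sizeof b;
    for (; iv != end; iv += sizeof(long)) {
        long v = *(long *)(void *)iv;
        if (v == 1)
            hits++;
    }
    return hits;
}
```

Both functions compute the same result; they differ only in whether the link back to `b` survives in the address computation.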