https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85598
--- Comment #13 from rguenther at suse dot de <rguenther at suse dot de> --- On Fri, 23 Nov 2018, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85598 > > --- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to [email protected] from comment #11) > > You need to union the PHI argument ranges. The result you can intersect > > with the existing range info of the PHI result of course. You basically > > want to do to the PHI what [E]VRP does ... > > If I were to recompute the range from scratch, yes, I would. > The way it is written, it is able to optimize just the common case, where it > uses the existing computed range and removes a single constant from the range, > either the minimum or maximum, if it can prove the minimum (resp. maximum) > will > never occur. It does so by proving all PHI arguments but one are constants > not > equal to that value and one is from the edge where there is GIMPLE_COND > comparison of the PHI arg SSA_NAME against that single constant. > > Now, I guess we could somewhat generalize it e.g. by accepting SSA_NAMEs with > ranges instead of constants and just prove (perhaps through vr-values.h APIs) > that the chosen constant is not in those ranges. > > > It's phiopt2 where this is important for the testcase? Or phiopt3 > > after loop? Because after loop there's forwprop before and ccp > > after phiopt both of which would be a better fit. Late CCP probably > > doesn't need to be the SSA-propagator-iterating variant but could > > do with a simpler DOM-style approach and thus could be replaced > > by EVRP if it weren't for that lacking the alignment/nonzero bits > > computation... > > For this testcase it needs to be done I believe in between dce2 (vrp1 does the > jump threading and dce2 does the cleanup in which the PHI <0> is propagated > into the PHI argument) and printf-return-value2 which needs it. > In its current form it doesn't work in between cselim and dom2 because in > between those passes (which include phiopt2) there is a forwarder block added Note that dom after vrp1 should be able to adjust the value-ranges given it uses EVRP to track ranges... why doesn't that work? > and the code doesn't handle that (though, especially if it does the > optimization from walking the PHIs rather than from the GIMPLE_COND) it could > skip a single forwarder block too. > SO, before cselim and in dom2 dump we have: > <bb 3> [local count: 1063004407]: > # RANGE [0, 256] NONZERO 511 > # x_10 = PHI <0(2), x_5(3)> > __builtin_snprintf (&temp, 4, "%%%02X", x_10); > # RANGE [1, 256] NONZERO 511 > x_5 = x_10 + 1; > if (x_5 != 256) > goto <bb 3>; [98.99%] > else > goto <bb 4>; [1.01%] > which the patch can optimize, but in between those we have: > <bb 3> [local count: 1063004407]: > # RANGE [0, 256] NONZERO 511 > # x_10 = PHI <0(2), x_5(5)> > __builtin_snprintf (&temp, 4, "%%%02X", x_10); > # RANGE [1, 256] NONZERO 511 > x_5 = x_10 + 1; > if (x_5 != 256) > goto <bb 5>; [98.99%] > else > goto <bb 4>; [1.01%] > > <bb 5> [local count: 1052266994]: > goto <bb 3>; [100.00%] > which has the extra forwarder. So, on this testcase it is in the end > optimized > in phiopt3. There is forwprop3 in between those passes. > > > So I fear any "proper" solution isn't for stage3 but still I'd go > > with factoring (enough of) vr_values::extract_range_from_phi_node > > plus the register_edge_asserts thing if we need to look at > > dominating conditions. > > Yes, I'm afraid this is just GCC 9 hack and I hope somebody (Aldy, Martin) > will > do the right thing for GCC 10 and this hack can be removed. > >
