On Wed, 4 Jan 2017, Jakub Jelinek wrote: > On Wed, Jan 04, 2017 at 10:51:28AM +0100, Richard Biener wrote: > > > The clzl-like integral unary builtins at least from my short testing > > > on x86_64-linux and aarch64-linux usually benefit from > > > factor_out_conditional_conversion not being performed, i.e. the > > > sign-extension (or zero-extension) being performed closer to the actual > > > builtin, because both library calls and inline code usually computes > > > the result in the same mode as the single argument has, so the artificial > > > cast to int for the return value and back can be optimized away if it is > > > adjacent, but not if it happens in some other bb. > > > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > > > Hmm, I don't like special-casing like this. > > > > > Or shall I also check that gimple_bb (SSA_NAME_DEF_STMT (new_arg0)) > > > is equal to gimple_bb (arg0_def_stmt)? > > > > Looking at the original patch and its testcases it seems like > > it wants to deal with the case of the conversion source being > > used in the controlling predicate (I've seen other reports of > > this transform being harmful). That is, it's supposed to be > > an enablement transform for further phiopt cases (for the > > gcc.dg/tree-ssa/pr66726.c case min-max replacement). > > factor_out_conditional_conversion handles 2 IMHO quite different cases. > One is the case where there are 2 SSA_NAMEs and 2 conversions and the > optimization turns those 2 conversions before the PHI into a single one > after the PHI. That is likely beneficial in most cases. > The other case is when there is a single conversion, one SSA_NAME and one > INTEGER_CST. This one might be in rare cases beneficial, in many cases a > wash, in many cases pessimizing. But perhaps it enables some phiopt > optimization in some cases. > So, unless we want to throw away the SSA_NAME + INTEGER_CST handling > of factor_out_conditional_conversion altogether, we need some heuristics > on whether it is beneficial or not. If ignoring the cases where we can get > better code by keeping the conversion closer to the operation that produced > the value (e.g. this clzl case where the operation computes it in wider mode > and then just downcasts it and extension after it can be optimized away > because of it), I guess it is a matter of whether the target is whole > word operations or not, what the modes of the converted type and conversion > result type are etc. > It is not clear what you are proposing.
For the SSA_NAME + INTEGER_CST case restrict it to the case if (x_1 > 5) tem_2 = (char) x_1; # tem_3 = PHI <tem_2, 5> that is, (char) x_1 uses x_1 and that also appears in the controlling GIMPLE_COND. That's what enables followup minmax replacement (gcc.dg/tree-ssa/pr66726.c). Another case where it might be profitable is if the BB is empty after the conversion is sunk (may enable other value-replacement cases). > > Your patch misses a testcase so I can't really see whether such condition > > would fix it. > > A testcase is in the PR, #c8 and #c9. Haven't turned those into a > testsuite/ testcase because it looks very fragile to me (the only thing > I can imagine with that would be gcc.target/i386/ and gcc.target/aarch64/ > tests with a single function per testcase and doing scan-assembler-not > or something similar there). > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)