On Fri, Oct 18, 2013 at 2:21 AM, Zhenqiang Chen <zhenqiang.c...@linaro.org> wrote: > On 18 October 2013 00:58, Jeff Law <l...@redhat.com> wrote: >> On 10/17/13 05:03, Richard Biener wrote: >>>>> >>>>> Is it OK for trunk? >>>> >>>> >>>> I had a much simpler change which did basically the same from 4.7 (I >>>> can update it if people think this is a better approach). >>> >>> >>> I like that more (note you can now use is_gimple_condexpr as predicate >>> for force_gimple_operand). >> >> The obvious question is whether or not Andrew's simpler change picks up as >> many transformations as Zhenqiang's change. If not are the things missed >> important. >> >> Zhenqiang, can you do some testing of your change vs Andrew P.'s change? > > Here is a rough compare: > > 1) Andrew P.'s change can not handle ssa-ifcombine-ccmp-3.c (included > in my patch). Root cause is that it does not skip "LABEL". The guard > to do this opt should be the same the bb_has_overhead_p in my patch.
This should be an easy change, I am working on this right now. > > 2) Andrew P.'s change always generate TRUTH_AND_EXPR, which is not > efficient for "||". e.g. For ssa-ifcombine-ccmp-6.c, it will generate > > _3 = a_2(D) > 0; > _5 = b_4(D) > 0; > _6 = _3 | _5; > _9 = c_7(D) <= 0; > _10 = ~_6; > _11 = _9 & _10; > if (_11 == 0) > > With my patch, it will generate > > _3 = a_2(D) > 0; > _5 = b_4(D) > 0; > _6 = _3 | _5; > _9 = c_7(D) > 0; > _10 = _6 | _9; > if (_10 != 0) As mentioned otherwise, this seems like a missed optimization inside forwprop. When I originally wrote this code there used to be two cases one for & and one for |, but this was removed sometime and I just made the code evolve with that. > > 3) The good thing of Andrew P.'s change is that "Inverse the order of > the basic block walk" so it can do combine recursively. > > But I think we need some heuristic to control the number of ifs. Move > too much compares from > the inner_bb to outer_bb is not good. I think this depends on the target. For MIPS we don't want an upper bound as integer comparisons go directly to GPRs. I wrote this patch with MIPS in mind as that was the target I was working on. > > 4) Another good thing of Andrew P.'s change is that it reuses some > existing functions. So it looks much simple. Thanks, Andrew Pinski > >>> >>> With that we should be able to kill the fold-const.c transform? >> >> That would certainly be nice and an excellent follow-up for Zhenqiang. > > That's my final goal to "kill the fold-const.c transform". I think we > may combine the two changes to make a "simple" and "good" patch. > > Thanks! > -Zhenqiang