On 06/10/2015 02:36 PM, Kyrill Tkachov wrote:

On 02/06/15 17:50, Jeff Law wrote:
On 06/02/2015 09:57 AM, Kyrill Tkachov wrote:
I'm stuck on noce_process_if_block (in ifcvt.c) and what I think is a
restriction that the THEN-block contents have to be only a single set
insn. This fails on aarch64 because we get an extra zero_extend.

In particular, the following check in noce_process_if_block triggers:
   insn_a = first_active_insn (then_bb);
   if (! insn_a
       || insn_a != last_active_insn (then_bb, FALSE)
       || (set_a = single_set (insn_a)) == NULL_RTX)
     return FALSE;

Is there any particular reason why the code shouldn't be able to handle
arbitrarily large contents in then_bb (within a sane limit)?
It's just never been implemented or tested per this comment in
noce_process_if_block.

  /* We're looking for patterns of the form

     (1) if (...) x = a; else x = b;
     (2) x = b; if (...) x = a;
     (3) if (...) x = a;   // as if with an initial x = x.

     The later patterns require jumps to be more expensive.

     ??? For future expansion, look for multiple X in such patterns.  */

I think folks would look favorably upon removing that limitation, obviously
with some kind of cost checking.
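
For readers less familiar with ifcvt.c, here is a rough sketch of the three shapes from that comment written as ordinary C. This is only an illustration, not GCC code: the function names (pattern1, pattern2, pattern3, select_form) are made up for this example, and whether a given compiler actually turns any of them into a conditional move depends on the target and its branch cost.

```c
/* Illustrative only: plain C analogues of the shapes listed in the
   noce_process_if_block comment.  All names here are hypothetical.  */

/* (1) if (...) x = a; else x = b;  */
int pattern1(int cond, int a, int b)
{
    int x;
    if (cond)
        x = a;
    else
        x = b;
    return x;
}

/* (2) x = b; if (...) x = a;  */
int pattern2(int cond, int a, int b)
{
    int x = b;
    if (cond)
        x = a;
    return x;
}

/* (3) if (...) x = a;   as if with an initial x = x.  */
int pattern3(int cond, int x, int a)
{
    if (cond)
        x = a;
    return x;
}

/* The branch-free form all three can be reduced to: a single select,
   which a target with conditional moves may implement without a jump.  */
int select_form(int cond, int a, int b)
{
    return cond ? a : b;
}
```

In each case the THEN block is a single assignment to x, which is the restriction the first_active_insn/last_active_insn check above enforces.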

Thanks, I've made some progress towards making it more aggressive.
A question since I'm in the area...
noce_try_cmove_arith, which I've been messing around with, has this code:

   /* A conditional move from two memory sources is equivalent to a
      conditional on their addresses followed by a load.  Don't do this
      early because it'll screw alias analysis.  Note that we've
      already checked for no side effects.  */
   /* ??? FIXME: Magic number 5.  */
   if (cse_not_expected
       && MEM_P (a) && MEM_P (b)
       && MEM_ADDR_SPACE (a) == MEM_ADDR_SPACE (b)
       && if_info->branch_cost >= 5)


Any ideas on where the rationale for that 5 came from?
I see it's been there since the very introduction of ifcvt.c.
I'd like to replace it with something more sane, maybe even remove it?
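
To make the comment in that snippet concrete, the equivalence it describes can be sketched at the C level roughly as follows. This is an assumption-laden illustration, not what ifcvt.c emits: the function names load_branchy and load_select_addr are hypothetical, and the real transformation happens on RTL and is still gated on the branch_cost check quoted above.

```c
/* Illustrative only; both function names are hypothetical.  */

/* Before: each arm of the branch performs its own load.  */
int load_branchy(int cond, const int *a, const int *b)
{
    int x;
    if (cond)
        x = *a;
    else
        x = *b;
    return x;
}

/* After: select between the two addresses first, then do one load.
   The select on the pointers is what can become a conditional move.  */
int load_select_addr(int cond, const int *a, const int *b)
{
    const int *p = cond ? a : b;
    return *p;
}
```

Both versions read exactly one of the two locations, which fits the comment's note that side effects have already been ruled out.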

I don't recall where the 5 came from. As Jeff mentioned last week, I was working on ia64 at the time. My best guess is that this optimization produced a performance regression on x86, and that I chose a number that triggered this for ia64 but not other targets.


r~
