On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
> The second change we need is an additional simplification.
> 
> If we have
> (subreg:M1 (zero_extend:M2 (x))
> 
> Where M1 > M2 and both are scalar integer modes.  It's advantageous to 
> strip the SUBREG and instead have a wider extension.

Should you also check M1 is not multiple registers?

> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
> Thoughts?

It looks fine to me.  Well, some comments...

> @@ -2643,6 +2644,24 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>                  || GET_CODE (src) == LSHIFTRT)
>           nshift++;
>       }
> +
> +      /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
> +      are likely manipulating its value.  Ideally we'll be able to combine
> +      all four insns into a bitfield insertion of some kind. 
> +
> +      Note the source in I0 might be inside a sign/zero extension and the
> +      memory modes in I0 and I3 might be different.  So extract the address
> +      from the destination of I3 and search for it in the source of I0.
> +
> +      In the event that there's a match but the source/dest do not actually
> +      refer to the same memory, the worst that happens is we try some
> +      combinations that we wouldn't have otherwise.  */
> +      if ((set0 = single_set (i0))
> +       && (set3 = single_set (i3))
> +       && GET_CODE (SET_DEST (set3)) == MEM
> +       && rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
> +     ngood += 2;

I think you should test MEM_P (SET_SRC (set0)), too.  Or even just test
rtx_equal_p (SET_DEST (set3), SET_SRC (set0)) ?

> +
>        if (ngood < 2 && nshift < 2)
>       return 0;
>      }
> @@ -5663,6 +5682,25 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, 
> int in_dest,
>         return CONST0_RTX (mode);
>        }
>  
> +      /* If we have (subreg:M1 (zero_extend:M2 (x))) or 
> +      (subreg:M1 (sign_extend: M2 (x))) where M1 is wider
> +      then M2, then go ahead and just widen the original extension.
> +
> +      While the subreg is useful in saying "I don't care about those
> +      upper bits.  Squashing out the subreg results in simpler RTL that
> +      is more easily matched.  */

Closing quote missing.

> +      if ((GET_CODE (SUBREG_REG (x)) == ZERO_EXTEND
> +        || GET_CODE (SUBREG_REG (x)) == SIGN_EXTEND)
> +       && SCALAR_INT_MODE_P (GET_MODE (x))
> +       && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (x)))
> +       && GET_MODE (x) > GET_MODE (SUBREG_REG (x)))

GET_MODE_SIZE instead?


Does this do anything good for the "dec mem" thing on x86?  That would
be a nice bonus :-)


Segher

Reply via email to