On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
> The second change we need is an additional simplification.
>
> If we have
> (subreg:M1 (zero_extend:M2 (x))
>
> Where M1 > M2 and both are scalar integer modes.  It's advantageous to
> strip the SUBREG and instead have a wider extension.

Should you also check M1 is not multiple registers?

> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
> Thoughts?

It looks fine to me.  Well, some comments...

> @@ -2643,6 +2644,24 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>                || GET_CODE (src) == LSHIFTRT)
>              nshift++;
>          }
> +
> +      /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
> +         are likely manipulating its value.  Ideally we'll be able to combine
> +         all four insns into a bitfield insertion of some kind.
> +
> +         Note the source in I0 might be inside a sign/zero extension and the
> +         memory modes in I0 and I3 might be different.  So extract the address
> +         from the destination of I3 and search for it in the source of I0.
> +
> +         In the event that there's a match but the source/dest do not actually
> +         refer to the same memory, the worst that happens is we try some
> +         combinations that we wouldn't have otherwise.  */
> +      if ((set0 = single_set (i0))
> +          && (set3 = single_set (i3))
> +          && GET_CODE (SET_DEST (set3)) == MEM
> +          && rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
> +        ngood += 2;

I think you should test MEM_P (SET_SRC (set0)), too.  Or even just test
rtx_equal_p (SET_DEST (set3), SET_SRC (set0)) ?

> +
>        if (ngood < 2 && nshift < 2)
>          return 0;
>      }
> @@ -5663,6 +5682,25 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
>            return CONST0_RTX (mode);
>        }
>
> +      /* If we have (subreg:M1 (zero_extend:M2 (x))) or
> +         (subreg:M1 (sign_extend: M2 (x))) where M1 is wider
> +         then M2, then go ahead and just widen the original extension.
> +
> +         While the subreg is useful in saying "I don't care about those
> +         upper bits.  Squashing out the subreg results in simpler RTL that
> +         is more easily matched.  */

Closing quote missing.

> +      if ((GET_CODE (SUBREG_REG (x)) == ZERO_EXTEND
> +           || GET_CODE (SUBREG_REG (x)) == SIGN_EXTEND)
> +          && SCALAR_INT_MODE_P (GET_MODE (x))
> +          && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (x)))
> +          && GET_MODE (x) > GET_MODE (SUBREG_REG (x)))

GET_MODE_SIZE instead?

Does this do anything good for the "dec mem" thing on x86?  That would be
a nice bonus :-)


Segher
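
On the GET_MODE_SIZE question above: comparing two mode enumerators with
`>` compares their positions in the enumeration, not their widths.  The
following is only a minimal standalone sketch of that difference.  The
`toy_mode` enum, its size table, and both helper functions are hypothetical
names made up for illustration; they are not GCC's machine_mode definitions,
and the ordering and sizes are assumptions chosen to make the two
comparisons disagree.

```c
#include <assert.h>

/* Hypothetical stand-in for a mode enumeration; NOT GCC's machine_mode.
   The partial-integer entry is deliberately listed after the full
   integer modes, so enum order and size order disagree.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_PSI, TOY_NUM_MODES };

/* Size in bytes of each toy mode (assumed values for illustration).  */
static const unsigned toy_mode_size_table[TOY_NUM_MODES] = { 1, 2, 4, 8, 4 };

/* Look up the size of a toy mode, checking the index is in range.  */
static unsigned
toy_mode_size (enum toy_mode m)
{
  assert ((unsigned) m < TOY_NUM_MODES);
  return toy_mode_size_table[m];
}

/* Compares enumerator positions, in the style of
   GET_MODE (a) > GET_MODE (b).  */
static int
wider_by_enum (enum toy_mode a, enum toy_mode b)
{
  return a > b;
}

/* Compares sizes, in the style of
   GET_MODE_SIZE (a) > GET_MODE_SIZE (b).  */
static int
wider_by_size (enum toy_mode a, enum toy_mode b)
{
  return toy_mode_size (a) > toy_mode_size (b);
}
```

With this toy table, wider_by_enum (TOY_PSI, TOY_DI) is true while
wider_by_size (TOY_PSI, TOY_DI) is false, so only the size-based test
reflects which mode is actually wider.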