On Wed, Jan 06, 2010 at 10:15:58AM +0000, Andrew Haley wrote:
> On 01/06/2010 09:59 AM, Mark Colby wrote:
> >>>> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
> >>>
> >>> Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
> >>> set's of zero_extract's (insvsi).
> >>
> >> Aha! Yes, that'll probably be it. It should be easy to fix cse to
> >> recognize those too.
>
> > I'm not familiar with the gcc source yet, but just in case I get the
> > time to look at this, could anyone give me a file/line ref to dive
> > into and examine?
>
> Would you believe cse.c? :-)
>
> I can't find the line without investigating further.
>
> Andrew.
>
> P.S. This is a nontrivial task if you don't know gcc, but might be a
> good place for a beginner to start. OTOH, might be hard: no way to
> know without digging.

I've dug a little bit, and the patch below optimizes the testcase on 32-bit
PowerPC. The patch is completely untested though.

On 64-bit PowerPC, which apparently doesn't use ZERO_EXTRACT in this case, I
see a different issue. It generates
        li 3,0
        ori 3,3,32820
        sldi 3,3,16
while IMHO two insns would be completely sufficient to load the constant;
apparently rs6000_emit_set_long_const needs work.
        lis 3,0x8034
        extsw 3,3
or
        li 3,0x401a
        sldi 3,3,17
etc. would IMHO do the same.

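As a sanity check on the arithmetic, here is a minimal C sketch that models
the emitted three-insn sequence and the li/sldi alternative on a 64-bit
register. The function names are made up for illustration, and this only
models the immediates and shifts, not the real insns; the lis/extsw variant
is not modelled here.

```c
#include <stdint.h>

/* Hypothetical model of "li 3,0; ori 3,3,32820; sldi 3,3,16".  */
uint64_t
load_three_insns (void)
{
  uint64_t r3 = 0;   /* li   3,0  */
  r3 |= 32820;       /* ori  3,3,32820 (0x8034, zero-extended immediate)  */
  r3 <<= 16;         /* sldi 3,3,16  */
  return r3;
}

/* Hypothetical model of "li 3,0x401a; sldi 3,3,17".  */
uint64_t
load_two_insns (void)
{
  uint64_t r3 = 0x401a;  /* li 3,0x401a (positive, sign extension is a no-op)  */
  r3 <<= 17;             /* sldi 3,3,17  */
  return r3;
}
```

Both functions return 0x80340000, since 0x401a << 17 is the same as
0x8034 << 16.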
2010-01-06  Jakub Jelinek  <[email protected]>

        * cse.c (cse_insn): Optimize lhs ZERO_EXTRACT if only CONST_INTs are
        involved.

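For anyone who wants to see the folding arithmetic in isolation, the following
is a rough standalone sketch of what the patch computes once both the old
register value and the stored value are known constants. The helper name and
its parameters are hypothetical, not GCC API. It uses unsigned 64-bit
arithmetic and zero-extends the result to the mode width, whereas the patch
works on HOST_WIDE_INT and calls trunc_int_for_mode; it assumes
0 < width and pos + width <= mode_bits <= 64.

```c
#include <stdint.h>

/* Hypothetical helper: return REG_VAL with the WIDTH-bit field at POS
   replaced by the low WIDTH bits of FIELD_VAL.  MODE_BITS is the width of
   the register's mode; BITS_BIG_ENDIAN_P selects the bit numbering, in the
   same way BITS_BIG_ENDIAN does in the patch.  */
uint64_t
fold_zero_extract_store (uint64_t reg_val, uint64_t field_val,
                         unsigned width, unsigned pos,
                         unsigned mode_bits, int bits_big_endian_p)
{
  /* With big-endian bit numbering, POS counts from the most significant
     bit, so convert it to a shift from the least significant bit.  */
  unsigned shift = bits_big_endian_p ? mode_bits - pos - width : pos;

  /* A full 64-bit shift would be undefined, so handle it separately.  */
  uint64_t mask = (width == 64) ? ~(uint64_t) 0
                                : (((uint64_t) 1 << width) - 1);

  reg_val &= ~(mask << shift);             /* clear the field  */
  reg_val |= (field_val & mask) << shift;  /* insert the new bits  */

  /* Keep only MODE_BITS bits (zero-extending, unlike the patch).  */
  if (mode_bits < 64)
    reg_val &= ((uint64_t) 1 << mode_bits) - 1;
  return reg_val;
}
```

For example, storing 0x8034 into a 16-bit field at position 0 of a zeroed
32-bit register gives 0x80340000 with big-endian bit numbering and 0x8034
with little-endian bit numbering.
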
--- gcc/cse.c.jj 2009-11-25 16:47:36.000000000 +0100
+++ gcc/cse.c 2010-01-06 16:00:41.000000000 +0100
@@ -4436,6 +4436,7 @@ cse_insn (rtx insn)
   for (i = 0; i < n_sets; i++)
     {
+      bool repeat = false;
       rtx src, dest;
       rtx src_folded;
       struct table_elt *elt = 0, *p;
@@ -5029,6 +5030,72 @@ cse_insn (rtx insn)
               break;
             }
+          /* Try to optimize
+             (set (reg:M N) (const_int A))
+             (set (reg:M2 O) (const_int B))
+             (set (zero_extract:M2 (reg:M N) (const_int C) (const_int D))
+                  (reg:M2 O)). */
+          if (GET_CODE (SET_DEST (sets[i].rtl)) == ZERO_EXTRACT
+              && CONST_INT_P (trial)
+              && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 1))
+              && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 2))
+              && REG_P (XEXP (SET_DEST (sets[i].rtl), 0))
+              && (GET_MODE_BITSIZE (GET_MODE (SET_DEST (sets[i].rtl)))
+                  >= INTVAL (XEXP (SET_DEST (sets[i].rtl), 1)))
+              && ((unsigned) INTVAL (XEXP (SET_DEST (sets[i].rtl), 1))
+                  + (unsigned) INTVAL (XEXP (SET_DEST (sets[i].rtl), 2))
+                  <= HOST_BITS_PER_WIDE_INT))
+            {
+              rtx dest_reg = XEXP (SET_DEST (sets[i].rtl), 0);
+              rtx width = XEXP (SET_DEST (sets[i].rtl), 1);
+              rtx pos = XEXP (SET_DEST (sets[i].rtl), 2);
+              unsigned int dest_hash = HASH (dest_reg, GET_MODE (dest_reg));
+              struct table_elt *dest_elt
+                = lookup (dest_reg, dest_hash, GET_MODE (dest_reg));
+              rtx dest_cst = NULL;
+
+              if (dest_elt)
+                for (p = dest_elt->first_same_value; p; p = p->next_same_value)
+                  if (p->is_const && CONST_INT_P (p->exp))
+                    {
+                      dest_cst = p->exp;
+                      break;
+                    }
+              if (dest_cst)
+                {
+                  HOST_WIDE_INT val = INTVAL (dest_cst);
+                  HOST_WIDE_INT mask;
+                  unsigned int shift;
+                  if (BITS_BIG_ENDIAN)
+                    shift = GET_MODE_BITSIZE (GET_MODE (dest_reg))
+                            - INTVAL (pos) - INTVAL (width);
+                  else
+                    shift = INTVAL (pos);
+                  if (INTVAL (width) == HOST_BITS_PER_WIDE_INT)
+                    mask = ~(HOST_WIDE_INT) 0;
+                  else
+                    mask = ((HOST_WIDE_INT) 1 << INTVAL (width)) - 1;
+                  val &= ~(mask << shift);
+                  val |= (INTVAL (trial) & mask) << shift;
+                  val = trunc_int_for_mode (val, GET_MODE (dest_reg));
+                  validate_unshare_change (insn, &SET_DEST (sets[i].rtl),
+                                           dest_reg, 1);
+                  validate_unshare_change (insn, &SET_SRC (sets[i].rtl),
+                                           GEN_INT (val), 1);
+                  if (apply_change_group ())
+                    {
+                      rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
+                      if (note)
+                        {
+                          remove_note (insn, note);
+                          df_notes_rescan (insn);
+                        }
+                      repeat = true;
+                      break;
+                    }
+                }
+            }
+
           /* We don't normally have an insn matching (set (pc) (pc)), so
              check for this separately here. We will delete such an
              insn below.
@@ -5104,6 +5171,13 @@ cse_insn (rtx insn)
             }
         }
+      /* If we changed the insn too much, handle this set from scratch. */
+      if (repeat)
+        {
+          i--;
+          continue;
+        }
+
       src = SET_SRC (sets[i].rtl);
       /* In general, it is good to have a SET with SET_SRC == SET_DEST.
Jakub