[cc:ing sh, spu and tilegx maintainers]
Richard Sandiford <[email protected]> writes:
> Andrew Pinski <[email protected]> writes:
>> On Thu, Sep 27, 2012 at 11:13 AM, Uros Bizjak <[email protected]> wrote:
>>> 2012-09-27 Uros Bizjak <[email protected]>
>>>
>>> PR rtl-optimization/54457
>>> * simplify-rtx.c (simplify_subreg):
>>> Simplify (subreg:M (op:N ((x:N) (y:N)), 0)
>>> to (op:M (subreg:M (x:N) 0) (subreg:M (y:N) 0)), where
>>> the outer subreg is effectively a truncation to the original mode M.
>>
>>
>> When I was doing something similar on our internal toolchain at
>> Cavium, I found that doing this caused a regression on MIPS64 n32 in
>> gcc.c-torture/execute/20040709-1.c, where:
>>
>>
>> (insn 15 14 16 2 (set (reg/v:DI 200 [ y ])
>> (reg:DI 2 $2)) t.c:16 301 {*movdi_64bit}
>> (expr_list:REG_DEAD (reg:DI 2 $2)
>> (nil)))
>>
>> (insn 16 15 17 2 (set (reg:DI 210)
>> (zero_extract:DI (reg/v:DI 200 [ y ])
>> (const_int 29 [0x1d])
>> (const_int 0 [0]))) t.c:16 249 {extzvdi}
>> (expr_list:REG_DEAD (reg/v:DI 200 [ y ])
>> (nil)))
>>
>> (insn 17 16 23 2 (set (reg:SI 211)
>> (truncate:SI (reg:DI 210))) t.c:16 175 {truncdisi2}
>> (expr_list:REG_DEAD (reg:DI 210)
>> (nil)))
>>
>> Gets converted to:
>> (insn 23 17 26 2 (set (reg/i:SI 2 $2)
>> (and:SI (reg:SI 2 $2 [+4 ])
>> (const_int 536870911 [0x1fffffff]))) t.c:18 156 {*andsi3}
>> (nil))
>>
>> This is recognized as an ext instruction.
>>
>> With the Octeon simulator, which makes undefined arguments to 32-bit
>> word operations come out as 0xDEADBEEF, this showed up as a regression.
>> I fixed it by changing the simplification to produce a TRUNCATE
>> instead of the subreg.
>>
>> I did the simplification on ior/and rather than plus/minus/mult, so the
>> issue only shows up when extending this to and/ior.
>
> Hmm, hadn't thought of that. I think some of the existing subreg
> optimisations suffer the same problem. I.e. we can't assume that
> subreg truncations of nested operands are OK just because the outer
> subreg is OK.
>
> I've got a patch I'm testing.
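To make the failure mode concrete, here is a stand-alone C sketch of the
point above (illustrative only, not part of the patch; the constants and
helper names are made up, and it assumes the usual two's-complement
conversion behaviour).  On a !TRULY_NOOP_TRUNCATION target such as MIPS64,
SImode values are expected to be held sign-extended in 64-bit registers,
so a TRUNCATE has to re-canonicalize its operand, whereas a lowpart subreg
just reinterprets the low 32 bits and is only equivalent when the operand
already happens to be canonical:

#include <stdint.h>
#include <assert.h>

/* TRUNCATE:SI on a MIPS64-like target: re-canonicalize by sign-extending
   bit 31 into the upper half of the 64-bit register.  */
static int64_t truncate_si (int64_t x) { return (int32_t) x; }

/* (subreg:SI ... 0): reinterpret the low 32 bits without re-extending,
   modelled here as leaving the 64-bit register value unchanged.  */
static int64_t lowpart_si (int64_t x) { return x; }

int main (void)
{
  int64_t canonical = (int32_t) 0xdeadbeef;      /* properly sign-extended */
  int64_t uncanonical = (int64_t) 0xdeadbeefULL; /* high half zero, so not */

  /* When the operand is already canonical, the subreg is as good as
     the TRUNCATE ...  */
  assert (truncate_si (canonical) == lowpart_si (canonical));

  /* ... but when it is not (e.g. the low 32 bits of an arbitrary
     incoming DImode register, as in the MIPS64 testcase), only the
     TRUNCATE restores the invariant that 32-bit instructions such as
     ext rely on.  */
  assert (truncate_si (uncanonical) != lowpart_si (uncanonical));
  return 0;
}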
The idea is to split most of the lowpart subreg handling out of
simplify_subreg and apply it to TRUNCATE too. There are three reasons:
- I wanted to make the !TRULY_NOOP_TRUNCATION truncation simplifications
as similar to subreg truncation simplifications as possible.
- Some of the current lowpart subreg simplifications are also correct
for vector truncations.
- Ideally, using simplify_gen_unary (TRUNCATE, ...) instead of
simplify_gen_subreg shouldn't penalise TRULY_NOOP_TRUNCATION targets.
There is already code to use gen_lowpart_no_emit for truncations that
reduce to subregs, but as things stand, gen_lowpart_no_emit only
passes objects like SUBREG, REG, MEM, etc., to simplify_gen_subreg;
others go through gen_lowpart_SUBREG and get no recursive simplification.
We inherited this code from a 1996 patch (r13058):
if ((TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))
? (num_sign_bit_copies (op, GET_MODE (op))
> (unsigned int) (GET_MODE_PRECISION (GET_MODE (op))
- GET_MODE_PRECISION (mode)))
...
return rtl_hooks.gen_lowpart_no_emit (mode, op);
I don't see any reason for the sign-bit check. If truncations are noops,
we should be able to use a subreg regardless.
Other than removing that check, the patch just moves simplifications around.
I've not tried to match new patterns.
The other !TRULY_NOOP_TRUNCATION targets are sh64, spu and tilegx.
I don't think sh64 has any patterns that would be adversely affected,
although the patch ought to make these patterns redundant:
(define_insn_and_split "*logical_sidisi3"
[(set (match_operand:SI 0 "arith_reg_dest" "=r,r")
(truncate:SI (sign_extend:DI
(match_operator:SI 3 "logical_operator"
[(match_operand:SI 1 "arith_reg_operand" "%r,r")
(match_operand:SI 2 "logical_operand" "r,I10")]))))]
"TARGET_SHMEDIA"
"#"
"&& 1"
[(set (match_dup 0) (match_dup 3))])
(define_insn_and_split "*logical_sidi3_2"
[(set (match_operand:DI 0 "arith_reg_dest" "=r,r")
(sign_extend:DI (truncate:SI (sign_extend:DI
(match_operator:SI 3 "logical_operator"
[(match_operand:SI 1 "arith_reg_operand" "%r,r")
(match_operand:SI 2 "logical_operand" "r,I10")])))))]
"TARGET_SHMEDIA"
"#"
"&& 1"
[(set (match_dup 0) (sign_extend:DI (match_dup 3)))])
combine should now simplify the first to the normal SI logical op
and the second to *logical_sidisi3. I don't think any spu or tilegx
patterns are affected either way.
Tested on x86_64-linux-gnu, mipsisa32-elf and mipsisa64-elf. Also tested
by making sure that there were no code differences for a set of gcc .ii
files on gcc20 (-O2 -march=native). OK to install?
Richard
gcc/
* machmode.h (GET_MODE_UNIT_PRECISION): New macro.
* simplify-rtx.c (simplify_truncation): New function.
(simplify_unary_operation_1): Use it. Remove the sign-bit test
from the TRULY_NOOP_TRUNCATION_MODES_P case.
(simplify_subreg): Use simplify_truncation for lowpart subregs.
* config/mips/mips.c (mips_truncated_op_cost): New function.
(mips_rtx_costs): Adjust test for BADDU.
* config/mips/mips.md (*baddu_di<mode>): Push truncates to operands.
Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h 2012-06-23 08:30:36.000000000 +0100
+++ gcc/machmode.h 2012-10-06 10:03:47.146873855 +0100
@@ -217,6 +217,11 @@ #define GET_MODE_UNIT_SIZE(MODE) \
#define GET_MODE_UNIT_BITSIZE(MODE) \
((unsigned short) (GET_MODE_UNIT_SIZE (MODE) * BITS_PER_UNIT))
+#define GET_MODE_UNIT_PRECISION(MODE) \
+ (GET_MODE_INNER (MODE) == VOIDmode \
+ ? GET_MODE_PRECISION (MODE) \
+ : GET_MODE_PRECISION (GET_MODE_INNER (MODE)))
+
/* Get the number of units in the object. */
extern const unsigned char mode_nunits[NUM_MACHINE_MODES];
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c 2012-10-02 20:34:15.969129966 +0100
+++ gcc/simplify-rtx.c 2012-10-06 10:08:08.349859303 +0100
@@ -564,6 +564,167 @@ simplify_replace_rtx (rtx x, const_rtx o
return simplify_replace_fn_rtx (x, old_rtx, 0, new_rtx);
}
+/* Try to simplify a MODE truncation of OP, which has OP_MODE. */
+
+static rtx
+simplify_truncation (enum machine_mode mode, rtx op,
+ enum machine_mode op_mode)
+{
+ unsigned int precision = GET_MODE_UNIT_PRECISION (mode);
+ unsigned int op_precision = GET_MODE_UNIT_PRECISION (op_mode);
+ gcc_assert (precision <= op_precision);
+
+ /* Optimize truncations of zero and sign extended values. */
+ if (GET_CODE (op) == ZERO_EXTEND
+ || GET_CODE (op) == SIGN_EXTEND)
+ {
+ /* There are three possibilities. If MODE is the same as the
+ origmode, we can omit both the extension and the truncation.
+ If MODE is not larger than the origmode, we can apply the
+ truncation without the extension. Finally, if MODE is larger
+ than the origmode, we can just extend to the appropriate
+ mode. */
+ enum machine_mode origmode = GET_MODE (XEXP (op, 0));
+ if (mode == origmode)
+ return XEXP (op, 0);
+ else if (precision <= GET_MODE_UNIT_PRECISION (origmode))
+ return simplify_gen_unary (TRUNCATE, mode,
+ XEXP (op, 0), origmode);
+ else
+ return simplify_gen_unary (GET_CODE (op), mode,
+ XEXP (op, 0), origmode);
+ }
+
+ /* Simplify (truncate:SI (op:DI (x:DI) (y:DI)))
+ to (op:SI (truncate:SI (x:DI)) (truncate:SI (y:DI))). */
+ if (GET_CODE (op) == PLUS
+ || GET_CODE (op) == MINUS
+ || GET_CODE (op) == MULT)
+ {
+ rtx op0 = simplify_gen_unary (TRUNCATE, mode, XEXP (op, 0), op_mode);
+ if (op0)
+ {
+ rtx op1 = simplify_gen_unary (TRUNCATE, mode, XEXP (op, 1), op_mode);
+ if (op1)
+ return simplify_gen_binary (GET_CODE (op), mode, op0, op1);
+ }
+ }
+
+ /* Simplify (truncate:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C)) into
+ (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
+ the truncation is back to the original mode of X. */
+ if ((GET_CODE (op) == LSHIFTRT
+ || GET_CODE (op) == ASHIFTRT)
+ /* Ensure that OP_MODE is at least twice as wide as MODE
+ to avoid the possibility that an outer LSHIFTRT shifts by more
+ than the sign extension's sign_bit_copies and introduces zeros
+ into the high bits of the result. */
+ && 2 * precision <= op_precision
+ && CONST_INT_P (XEXP (op, 1))
+ && GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
+ && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode
+ && INTVAL (XEXP (op, 1)) < precision)
+ return simplify_gen_binary (ASHIFTRT, mode,
+ XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
+ /* Likewise (truncate:QI (lshiftrt:SI (zero_extend:SI (x:QI)) C)) into
+ (lshiftrt:QI (x:QI) C), where C is a suitable small constant and
+ the truncation is back to the original mode of X. */
+ if ((GET_CODE (op) == LSHIFTRT
+ || GET_CODE (op) == ASHIFTRT)
+ && CONST_INT_P (XEXP (op, 1))
+ && GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
+ && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode
+ && INTVAL (XEXP (op, 1)) < precision)
+ return simplify_gen_binary (LSHIFTRT, mode,
+ XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
+ /* Likewise (truncate:QI (ashift:SI (zero_extend:SI (x:QI)) C)) into
+ (ashift:QI (x:QI) C), where C is a suitable small constant and
+ the truncation is back to the original mode of X. */
+ if (GET_CODE (op) == ASHIFT
+ && CONST_INT_P (XEXP (op, 1))
+ && (GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
+ || GET_CODE (XEXP (op, 0)) == SIGN_EXTEND)
+ && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode
+ && INTVAL (XEXP (op, 1)) < precision)
+ return simplify_gen_binary (ASHIFT, mode,
+ XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
+ /* Recognize a word extraction from a multi-word subreg. */
+ if ((GET_CODE (op) == LSHIFTRT
+ || GET_CODE (op) == ASHIFTRT)
+ && SCALAR_INT_MODE_P (mode)
+ && SCALAR_INT_MODE_P (op_mode)
+ && precision >= BITS_PER_WORD
+ && 2 * precision <= op_precision
+ && CONST_INT_P (XEXP (op, 1))
+ && (INTVAL (XEXP (op, 1)) & (precision - 1)) == 0
+ && INTVAL (XEXP (op, 1)) >= 0
+ && INTVAL (XEXP (op, 1)) < op_precision)
+ {
+ int byte = subreg_lowpart_offset (mode, op_mode);
+ int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
+ return simplify_gen_subreg (mode, XEXP (op, 0), op_mode,
+ (WORDS_BIG_ENDIAN
+ ? byte - shifted_bytes
+ : byte + shifted_bytes));
+ }
+
+ /* If we have a TRUNCATE of a right shift of MEM, make a new MEM
+ and try replacing the TRUNCATE and shift with it. Don't do this
+ if the MEM has a mode-dependent address. */
+ if ((GET_CODE (op) == LSHIFTRT
+ || GET_CODE (op) == ASHIFTRT)
+ && SCALAR_INT_MODE_P (op_mode)
+ && MEM_P (XEXP (op, 0))
+ && CONST_INT_P (XEXP (op, 1))
+ && (INTVAL (XEXP (op, 1)) % GET_MODE_BITSIZE (mode)) == 0
+ && INTVAL (XEXP (op, 1)) > 0
+ && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (op_mode)
+ && ! mode_dependent_address_p (XEXP (XEXP (op, 0), 0),
+ MEM_ADDR_SPACE (XEXP (op, 0)))
+ && ! MEM_VOLATILE_P (XEXP (op, 0))
+ && (GET_MODE_SIZE (mode) >= UNITS_PER_WORD
+ || WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN))
+ {
+ int byte = subreg_lowpart_offset (mode, op_mode);
+ int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
+ return adjust_address_nv (XEXP (op, 0), mode,
+ (WORDS_BIG_ENDIAN
+ ? byte - shifted_bytes
+ : byte + shifted_bytes));
+ }
+
+ /* (truncate:SI (OP:DI ({sign,zero}_extend:DI foo:SI))) is
+ (OP:SI foo:SI) if OP is NEG or ABS. */
+ if ((GET_CODE (op) == ABS
+ || GET_CODE (op) == NEG)
+ && (GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
+ || GET_CODE (XEXP (op, 0)) == ZERO_EXTEND)
+ && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode)
+ return simplify_gen_unary (GET_CODE (op), mode,
+ XEXP (XEXP (op, 0), 0), mode);
+
+ /* (truncate:A (subreg:B (truncate:C X) 0)) is
+ (truncate:A X). */
+ if (GET_CODE (op) == SUBREG
+ && SCALAR_INT_MODE_P (mode)
+ && SCALAR_INT_MODE_P (op_mode)
+ && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (op)))
+ && GET_CODE (SUBREG_REG (op)) == TRUNCATE
+ && subreg_lowpart_p (op))
+ return simplify_gen_unary (TRUNCATE, mode, XEXP (SUBREG_REG (op), 0),
+ GET_MODE (XEXP (SUBREG_REG (op), 0)));
+
+ /* (truncate:A (truncate:B X)) is (truncate:A X). */
+ if (GET_CODE (op) == TRUNCATE)
+ return simplify_gen_unary (TRUNCATE, mode, XEXP (op, 0),
+ GET_MODE (XEXP (op, 0)));
+
+ return NULL_RTX;
+}
+
/* Try to simplify a unary operation CODE whose output mode is to be
MODE with input operand OP whose mode was originally OP_MODE.
Return zero if no simplification can be made. */
@@ -689,12 +850,6 @@ simplify_unary_operation_1 (enum rtx_cod
op_mode = mode;
in2 = simplify_gen_unary (NOT, op_mode, in2, op_mode);
- if (GET_CODE (in2) == NOT && GET_CODE (in1) != NOT)
- {
- rtx tem = in2;
- in2 = in1; in1 = tem;
- }
-
return gen_rtx_fmt_ee (GET_CODE (op) == IOR ? AND : IOR,
mode, in1, in2);
}
@@ -821,44 +976,24 @@ simplify_unary_operation_1 (enum rtx_cod
if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
break;
- /* (truncate:SI ({sign,zero}_extend:DI foo:SI)) == foo:SI. */
- if ((GET_CODE (op) == SIGN_EXTEND
- || GET_CODE (op) == ZERO_EXTEND)
- && GET_MODE (XEXP (op, 0)) == mode)
- return XEXP (op, 0);
-
- /* (truncate:SI (OP:DI ({sign,zero}_extend:DI foo:SI))) is
- (OP:SI foo:SI) if OP is NEG or ABS. */
- if ((GET_CODE (op) == ABS
- || GET_CODE (op) == NEG)
- && (GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
- || GET_CODE (XEXP (op, 0)) == ZERO_EXTEND)
- && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode)
- return simplify_gen_unary (GET_CODE (op), mode,
- XEXP (XEXP (op, 0), 0), mode);
+ /* Don't optimize (lshiftrt (mult ...)) as it would interfere
+ with the umulXi3_highpart patterns. */
+ if (GET_CODE (op) == LSHIFTRT
+ && GET_CODE (XEXP (op, 0)) == MULT)
+ break;
- /* (truncate:A (subreg:B (truncate:C X) 0)) is
- (truncate:A X). */
- if (GET_CODE (op) == SUBREG
- && GET_CODE (SUBREG_REG (op)) == TRUNCATE
- && subreg_lowpart_p (op))
- return simplify_gen_unary (TRUNCATE, mode, XEXP (SUBREG_REG (op), 0),
- GET_MODE (XEXP (SUBREG_REG (op), 0)));
+ if (GET_MODE (op) != VOIDmode)
+ {
+ temp = simplify_truncation (mode, op, GET_MODE (op));
+ if (temp)
+ return temp;
+ }
/* If we know that the value is already truncated, we can
- replace the TRUNCATE with a SUBREG. Note that this is also
- valid if TRULY_NOOP_TRUNCATION is false for the corresponding
- modes we just have to apply a different definition for
- truncation. But don't do this for an (LSHIFTRT (MULT ...))
- since this will cause problems with the umulXi3_highpart
- patterns. */
- if ((TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))
- ? (num_sign_bit_copies (op, GET_MODE (op))
- > (unsigned int) (GET_MODE_PRECISION (GET_MODE (op))
- - GET_MODE_PRECISION (mode)))
- : truncated_to_mode (mode, op))
- && ! (GET_CODE (op) == LSHIFTRT
- && GET_CODE (XEXP (op, 0)) == MULT))
+ replace the TRUNCATE with a SUBREG. */
+ if (GET_MODE_NUNITS (mode) == 1
+ && (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))
+ || truncated_to_mode (mode, op)))
return rtl_hooks.gen_lowpart_no_emit (mode, op);
/* A truncate of a comparison can be replaced with a subreg if
@@ -5595,14 +5730,6 @@ simplify_subreg (enum machine_mode outer
return NULL_RTX;
}
- /* Merge implicit and explicit truncations. */
-
- if (GET_CODE (op) == TRUNCATE
- && GET_MODE_SIZE (outermode) < GET_MODE_SIZE (innermode)
- && subreg_lowpart_offset (outermode, innermode) == byte)
- return simplify_gen_unary (TRUNCATE, outermode, XEXP (op, 0),
- GET_MODE (XEXP (op, 0)));
-
/* SUBREG of a hard register => just change the register number
and/or mode. If the hard register is not valid in that mode,
suppress this simplification. If the hard register is the stack,
@@ -5688,160 +5815,23 @@ simplify_subreg (enum machine_mode outer
return NULL_RTX;
}
- /* Optimize SUBREG truncations of zero and sign extended values. */
- if ((GET_CODE (op) == ZERO_EXTEND
- || GET_CODE (op) == SIGN_EXTEND)
- && SCALAR_INT_MODE_P (innermode)
- && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode))
+ /* A SUBREG resulting from a zero extension may fold to zero if
+ it extracts higher bits than the ZERO_EXTEND's source bits. */
+ if (GET_CODE (op) == ZERO_EXTEND)
{
unsigned int bitpos = subreg_lsb_1 (outermode, innermode, byte);
-
- /* If we're requesting the lowpart of a zero or sign extension,
- there are three possibilities. If the outermode is the same
- as the origmode, we can omit both the extension and the subreg.
- If the outermode is not larger than the origmode, we can apply
- the truncation without the extension. Finally, if the outermode
- is larger than the origmode, but both are integer modes, we
- can just extend to the appropriate mode. */
- if (bitpos == 0)
- {
- enum machine_mode origmode = GET_MODE (XEXP (op, 0));
- if (outermode == origmode)
- return XEXP (op, 0);
- if (GET_MODE_PRECISION (outermode) <= GET_MODE_PRECISION (origmode))
- return simplify_gen_subreg (outermode, XEXP (op, 0), origmode,
- subreg_lowpart_offset (outermode,
- origmode));
- if (SCALAR_INT_MODE_P (outermode))
- return simplify_gen_unary (GET_CODE (op), outermode,
- XEXP (op, 0), origmode);
- }
-
- /* A SUBREG resulting from a zero extension may fold to zero if
- it extracts higher bits that the ZERO_EXTEND's source bits. */
- if (GET_CODE (op) == ZERO_EXTEND
- && bitpos >= GET_MODE_PRECISION (GET_MODE (XEXP (op, 0))))
+ if (bitpos >= GET_MODE_PRECISION (GET_MODE (XEXP (op, 0))))
return CONST0_RTX (outermode);
}
- /* Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
- to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (x:DI) 0)), where
- the outer subreg is effectively a truncation to the original mode. */
- if ((GET_CODE (op) == PLUS
- || GET_CODE (op) == MINUS
- || GET_CODE (op) == MULT)
- && SCALAR_INT_MODE_P (outermode)
+ if (SCALAR_INT_MODE_P (outermode)
&& SCALAR_INT_MODE_P (innermode)
&& GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
&& byte == subreg_lowpart_offset (outermode, innermode))
{
- rtx op0 = simplify_gen_subreg (outermode, XEXP (op, 0),
- innermode, byte);
- if (op0)
- {
- rtx op1 = simplify_gen_subreg (outermode, XEXP (op, 1),
- innermode, byte);
- if (op1)
- return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
- }
- }
-
- /* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
- to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
- the outer subreg is effectively a truncation to the original mode. */
- if ((GET_CODE (op) == LSHIFTRT
- || GET_CODE (op) == ASHIFTRT)
- && SCALAR_INT_MODE_P (outermode)
- && SCALAR_INT_MODE_P (innermode)
- /* Ensure that OUTERMODE is at least twice as wide as the INNERMODE
- to avoid the possibility that an outer LSHIFTRT shifts by more
- than the sign extension's sign_bit_copies and introduces zeros
- into the high bits of the result. */
- && (2 * GET_MODE_PRECISION (outermode)) <= GET_MODE_PRECISION (innermode)
- && CONST_INT_P (XEXP (op, 1))
- && GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
- && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
- && INTVAL (XEXP (op, 1)) < GET_MODE_PRECISION (outermode)
- && subreg_lsb_1 (outermode, innermode, byte) == 0)
- return simplify_gen_binary (ASHIFTRT, outermode,
- XEXP (XEXP (op, 0), 0), XEXP (op, 1));
-
- /* Likewise (subreg:QI (lshiftrt:SI (zero_extend:SI (x:QI)) C), 0) into
- to (lshiftrt:QI (x:QI) C), where C is a suitable small constant and
- the outer subreg is effectively a truncation to the original mode. */
- if ((GET_CODE (op) == LSHIFTRT
- || GET_CODE (op) == ASHIFTRT)
- && SCALAR_INT_MODE_P (outermode)
- && SCALAR_INT_MODE_P (innermode)
- && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
- && CONST_INT_P (XEXP (op, 1))
- && GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
- && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
- && INTVAL (XEXP (op, 1)) < GET_MODE_PRECISION (outermode)
- && subreg_lsb_1 (outermode, innermode, byte) == 0)
- return simplify_gen_binary (LSHIFTRT, outermode,
- XEXP (XEXP (op, 0), 0), XEXP (op, 1));
-
- /* Likewise (subreg:QI (ashift:SI (zero_extend:SI (x:QI)) C), 0) into
- to (ashift:QI (x:QI) C), where C is a suitable small constant and
- the outer subreg is effectively a truncation to the original mode. */
- if (GET_CODE (op) == ASHIFT
- && SCALAR_INT_MODE_P (outermode)
- && SCALAR_INT_MODE_P (innermode)
- && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
- && CONST_INT_P (XEXP (op, 1))
- && (GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
- || GET_CODE (XEXP (op, 0)) == SIGN_EXTEND)
- && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
- && INTVAL (XEXP (op, 1)) < GET_MODE_PRECISION (outermode)
- && subreg_lsb_1 (outermode, innermode, byte) == 0)
- return simplify_gen_binary (ASHIFT, outermode,
- XEXP (XEXP (op, 0), 0), XEXP (op, 1));
-
- /* Recognize a word extraction from a multi-word subreg. */
- if ((GET_CODE (op) == LSHIFTRT
- || GET_CODE (op) == ASHIFTRT)
- && SCALAR_INT_MODE_P (innermode)
- && GET_MODE_PRECISION (outermode) >= BITS_PER_WORD
- && GET_MODE_PRECISION (innermode) >= (2 * GET_MODE_PRECISION (outermode))
- && CONST_INT_P (XEXP (op, 1))
- && (INTVAL (XEXP (op, 1)) & (GET_MODE_PRECISION (outermode) - 1)) == 0
- && INTVAL (XEXP (op, 1)) >= 0
- && INTVAL (XEXP (op, 1)) < GET_MODE_PRECISION (innermode)
- && byte == subreg_lowpart_offset (outermode, innermode))
- {
- int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
- return simplify_gen_subreg (outermode, XEXP (op, 0), innermode,
- (WORDS_BIG_ENDIAN
- ? byte - shifted_bytes
- : byte + shifted_bytes));
- }
-
- /* If we have a lowpart SUBREG of a right shift of MEM, make a new MEM
- and try replacing the SUBREG and shift with it. Don't do this if
- the MEM has a mode-dependent address or if we would be widening it. */
-
- if ((GET_CODE (op) == LSHIFTRT
- || GET_CODE (op) == ASHIFTRT)
- && SCALAR_INT_MODE_P (innermode)
- && MEM_P (XEXP (op, 0))
- && CONST_INT_P (XEXP (op, 1))
- && GET_MODE_SIZE (outermode) < GET_MODE_SIZE (GET_MODE (op))
- && (INTVAL (XEXP (op, 1)) % GET_MODE_BITSIZE (outermode)) == 0
- && INTVAL (XEXP (op, 1)) > 0
- && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (innermode)
- && ! mode_dependent_address_p (XEXP (XEXP (op, 0), 0),
- MEM_ADDR_SPACE (XEXP (op, 0)))
- && ! MEM_VOLATILE_P (XEXP (op, 0))
- && byte == subreg_lowpart_offset (outermode, innermode)
- && (GET_MODE_SIZE (outermode) >= UNITS_PER_WORD
- || WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN))
- {
- int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
- return adjust_address_nv (XEXP (op, 0), outermode,
- (WORDS_BIG_ENDIAN
- ? byte - shifted_bytes
- : byte + shifted_bytes));
+ rtx tem = simplify_truncation (outermode, op, innermode);
+ if (tem)
+ return tem;
}
return NULL_RTX;
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c 2012-10-02 21:02:21.000000000 +0100
+++ gcc/config/mips/mips.c 2012-10-06 10:06:42.617864078 +0100
@@ -3527,6 +3527,17 @@ mips_set_reg_reg_cost (enum machine_mode
}
}
+/* Return the cost of an operand X that can be truncated for free.
+ SPEED says whether we're optimizing for size or speed. */
+
+static int
+mips_truncated_op_cost (rtx x, bool speed)
+{
+ if (GET_CODE (x) == TRUNCATE)
+ x = XEXP (x, 0);
+ return set_src_cost (x, speed);
+}
+
/* Implement TARGET_RTX_COSTS. */
static bool
@@ -3907,12 +3918,13 @@ mips_rtx_costs (rtx x, int code, int out
case ZERO_EXTEND:
if (outer_code == SET
&& ISA_HAS_BADDU
- && (GET_CODE (XEXP (x, 0)) == TRUNCATE
- || GET_CODE (XEXP (x, 0)) == SUBREG)
&& GET_MODE (XEXP (x, 0)) == QImode
- && GET_CODE (XEXP (XEXP (x, 0), 0)) == PLUS)
+ && GET_CODE (XEXP (x, 0)) == PLUS)
{
- *total = set_src_cost (XEXP (XEXP (x, 0), 0), speed);
+ rtx plus = XEXP (x, 0);
+ *total = (COSTS_N_INSNS (1)
+ + mips_truncated_op_cost (XEXP (plus, 0), speed)
+ + mips_truncated_op_cost (XEXP (plus, 1), speed));
return true;
}
*total = mips_zero_extend_cost (mode, XEXP (x, 0));
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md 2012-10-02 20:37:34.000000000 +0100
+++ gcc/config/mips/mips.md 2012-10-06 10:03:47.156873851 +0100
@@ -1305,9 +1305,8 @@ (define_insn "*baddu_si"
(define_insn "*baddu_di<mode>"
[(set (match_operand:GPR 0 "register_operand" "=d")
(zero_extend:GPR
- (truncate:QI
- (plus:DI (match_operand:DI 1 "register_operand" "d")
- (match_operand:DI 2 "register_operand" "d")))))]
+ (plus:QI (truncate:QI (match_operand:DI 1 "register_operand" "d"))
+ (truncate:QI (match_operand:DI 2 "register_operand" "d")))))]
"ISA_HAS_BADDU && TARGET_64BIT"
"baddu\\t%0,%1,%2"
[(set_attr "alu_type" "add")])