https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80754
Bug ID: 80754
Summary: invalid smull instructions generated after r247881
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: amker at gcc dot gnu.org
Target Milestone: ---
Hi,
After r247881, invalid smull instructions like the following are generated:
smull r2, r2, lr, r3
in test gcc.c-torture/execute/pr53645-2.c for arm-none-linux-gnueabi and cortex-a9.
The revision simply changes the rtx cost of TRUNCATE for tieable modes:
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 321363f..d9f57c3 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -4164,6 +4164,13 @@ rtx_cost (rtx x, machine_mode mode, enum rtx_code outer_code,
         return COSTS_N_INSNS (2 + factor);
       break;
+    case TRUNCATE:
+      if (MODES_TIEABLE_P (mode, GET_MODE (XEXP (x, 0))))
+        {
+          total = 0;
+          break;
+        }
+      /* FALLTHRU */
     default:
       if (targetm.rtx_costs (x, mode, outer_code, opno, &total, speed))
         return total;
I noticed that in arm.h/arm.c:
/* Implement MODES_TIEABLE_P.  */
bool
arm_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
  if (GET_MODE_CLASS (mode1) == GET_MODE_CLASS (mode2))
    return true;
  /* We specifically want to allow elements of "structure" modes to
     be tieable to the structure.  This more general condition allows
     other rarer situations too.  */
  if (TARGET_NEON
      && (VALID_NEON_DREG_MODE (mode1)
          || VALID_NEON_QREG_MODE (mode1)
          || VALID_NEON_STRUCT_MODE (mode1))
      && (VALID_NEON_DREG_MODE (mode2)
          || VALID_NEON_QREG_MODE (mode2)
          || VALID_NEON_STRUCT_MODE (mode2)))
    return true;
  return false;
}
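To make the interaction with the rtlanal.c change concrete, here is a minimal standalone sketch. It is not GCC code: the enums, get_mode_class, modes_tieable_sketch and truncate_cost_sketch are hypothetical stand-ins, the NEON clause is left out, and the value -1 is only my placeholder for "fall through to targetm.rtx_costs".

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GCC's machine modes and mode
   classes.  The real ones are generated by GCC; these exist only for
   illustration.  */
enum mode_class { MODE_INT, MODE_FLOAT };
enum machine_mode { SImode, DImode, SFmode };

static enum mode_class
get_mode_class (enum machine_mode mode)
{
  return mode == SFmode ? MODE_FLOAT : MODE_INT;
}

/* Models only the first test of arm_modes_tieable_p (same mode class).
   The NEON-specific clause is omitted.  */
static bool
modes_tieable_sketch (enum machine_mode mode1, enum machine_mode mode2)
{
  return get_mode_class (mode1) == get_mode_class (mode2);
}

/* Models the new TRUNCATE case in rtx_cost: cost 0 when the outer and
   inner modes are tieable.  Returns -1 (an assumed placeholder) where
   the real code falls through to the target cost hook.  */
static int
truncate_cost_sketch (enum machine_mode outer, enum machine_mode inner)
{
  if (modes_tieable_sketch (outer, inner))
    return 0;
  return -1;
}
```

In this model, truncate_cost_sketch (SImode, DImode) is 0 because both modes are in class MODE_INT, which is the situation that arises for the truncate:SI of a DImode value.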
So SImode and DImode are tieable on the ARM target (both are in class MODE_INT), while we have:
(define_insn "*smulsi3_highpart_v6"
  [(set (match_operand:SI 0 "s_register_operand" "=r")
        (truncate:SI
         (lshiftrt:DI
          (mult:DI
           (sign_extend:DI (match_operand:SI 1 "s_register_operand" "r"))
           (sign_extend:DI (match_operand:SI 2 "s_register_operand" "r")))
          (const_int 32))))
   (clobber (match_scratch:SI 3 "=r"))]
  "TARGET_32BIT && arm_arch6"
  "smull%?\\t%3, %0, %2, %1"
  [(set_attr "type" "smull")
   (set_attr "predicable" "yes")
   (set_attr "predicable_short_it" "no")]
)
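It looks like operands 0 and 3 now get allocated to the same register (r2 in
the instruction above, where %3 is the low destination and %0 the high one).
I think this might be a backend issue, either in the MODES_TIEABLE_P
implementation or in the constraints of "*smulsi3_highpart_v6"?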
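
For reference, a small C model of what the pattern computes and of the operand condition that is violated. As far as I know, the ARM architecture treats smull with identical low and high destination registers as UNPREDICTABLE, which is why the instruction above is invalid. The helper names smul_highpart and smull_dest_regs_valid are hypothetical and only serve this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the RTL of the pattern: sign-extend both SImode inputs to
   DImode, multiply, logical-shift right by 32, truncate to SImode.
   The final conversion to int32_t assumes the usual two's complement
   wraparound.  */
static int32_t
smul_highpart (int32_t a, int32_t b)
{
  int64_t product = (int64_t) a * (int64_t) b;
  return (int32_t) (uint32_t) ((uint64_t) product >> 32);
}

/* Hypothetical check: smull writes the low half to rd_lo and the high
   half to rd_hi, so the two destination registers must differ.  */
static bool
smull_dest_regs_valid (unsigned rd_lo, unsigned rd_hi)
{
  return rd_lo != rd_hi;
}
```

For "smull r2, r2, lr, r3" both destinations are register 2, so smull_dest_regs_valid (2, 2) is false in this model.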
Thanks,
bin