On 12/23/23 01:58, YunQiang Su wrote:
On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms,
if 31 or above bits is polluted by an bitops, we will need an
truncate. Let's emit one, and mark let's use the same hardreg
as in and out, the RTL may like:
(insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
(truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
(nil))
We use /s/u flags to mark it as really needed, as in
combine_simplify_rtx, this insn may be considered as truncated,
so let's skip this combination.
gcc/ChangeLog:
PR: 104914.
* combine.cc (try_combine): Skip combine with truncate if
dest is subreg and has /u/s flags on platforms
TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true.
* expr.cc (expand_assignment): Emit a truncate insn, if
31+ bits is polluted for SImode.
gcc/testsuite/ChangeLog:
PR: 104914.
* gcc.target/mips/pr104914.c: New testcase.
I would suggest you show the RTL before/after whatever transformation
has caused problems on your target and explain why you think the
transformation is incorrect.
Focus on the RTL semantics as well as the target specific semantics
because both are critically important here.
I strongly suspect you're just papering over a problem elsewhere.
---
gcc/combine.cc | 23 +++++++++++++++++++++-
gcc/expr.cc | 17 ++++++++++++++++
gcc/testsuite/gcc.target/mips/pr104914.c | 25 ++++++++++++++++++++++++
3 files changed, 64 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 1cda4dd57f2..04b9c414053 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3294,6 +3294,28 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1,
rtx_insn *i0,
n_occurrences = 0; /* `subst' counts here */
subst_low_luid = DF_INSN_LUID (i2);
+ /* Don't try to combine a TRUNCATE INSN, if it's DEST is SUBREG and has
+ FLAG /s/u. We use these 2 flags to mark this INSN as really needed:
+ normally, it means that the bits of 31+ of this variable is polluted
+ by a bitops. The reason of existing of case (subreg:SI (reg:DI)) is
+ that, the same hardreg may act as src and dest. */
+ if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && INSN_P (i2))
+ {
+ rtx i2dest_o = SET_DEST (PATTERN (i2));
+ rtx i2src_o = SET_SRC (PATTERN (i2));
+ if (GET_CODE (i2dest_o) == SUBREG
+ && GET_MODE (i2dest_o) == SImode
+ && GET_MODE (SUBREG_REG (i2dest_o)) == DImode
+ && SUBREG_PROMOTED_VAR_P (i2dest_o)
+ && SUBREG_PROMOTED_GET (i2dest_o) == SRP_SIGNED
+ && GET_CODE (i2src_o) == TRUNCATE
+ && GET_MODE (i2src_o) == SImode
+ && rtx_equal_p (SUBREG_REG (i2dest_o), XEXP (i2src_o, 0))
+ )
+ return 0;
+ }
So checking SI/DI like this is just wrong. There's nothing special
about SI/DI. Checking for equality between the destination and source
also seems wrong -- if the state of the sign bit is wrong, it's wrong
regardless of whether or not the source/destination register is the same.
@@ -5326,7 +5348,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
UNIQUE_COPY is true if each substitution must be unique. We do this
by copying if `n_occurrences' is nonzero. */
-
static rtx
subst (rtx x, rtx from, rtx to, bool in_dest, bool in_cond, bool unique_copy)
{
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..f7236040a34 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6284,6 +6284,23 @@ expand_assignment (tree to, tree from, bool nontemporal)
nontemporal, reversep);
convert_move (SUBREG_REG (to_rtx), to_rtx1,
SUBREG_PROMOTED_SIGN (to_rtx));
+
+ rtx last = get_last_insn ();
+ if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && known_ge (bitregion_end, 31)
+ && SUBREG_PROMOTED_VAR_P (to_rtx)
+ && SUBREG_PROMOTED_SIGN (to_rtx) == SRP_SIGNED
+ && GET_MODE (to_rtx) == SImode
+ && GET_MODE (SUBREG_REG (to_rtx)) == DImode
+ && GET_CODE (SET_SRC (PATTERN (last))) == SIGN_EXTEND
+ )
+ {
+ insn_code icode = convert_optab_handler
+ (trunc_optab, SImode, DImode);
+ if (icode != CODE_FOR_nothing)
+ emit_unop_insn (icode, to_rtx,
+ SUBREG_REG (to_rtx), TRUNCATE);
+ }
Similar comments about the modes apply here.
But again, my sense is there's a higher level problem here and that
these changes are just papering over it.
Jeff