This patch fixes a problem with splitting of complex AVX512 ternlog
instructions on x86_64. A recent change allows the ternlog pattern
to have multiple mem-like operands prior to reload, by emitting any
"reloads" as necessary during split1, before register allocation.
The issue is that this code calls force_reg to place the mem-like
operand into a register, but unfortunately the vec_duplicate (broadcast)
form of operands supported by ternlog isn't considered a "general_operand",
i.e. supported by all instructions. This mismatch triggers an ICE in
the middle-end's force_reg, even though the x86 supports loading these
vec_duplicate operands into a vector register in a single (move)
instruction.
This patch resolves this problem by replacing force_reg with calls
to gen_reg_rtx and emit_move (as the i386 backend, unlike the middle-end,
knows these will be recognized by recog).
I'll admit that I've been unable to reproduce this error without a
testcase, but my assumption when developing the previous patch was
that was safe to call force_reg on a vec_duplicate, which this PR
shows to be wrong (currently). I'll let smarter minds pronounce on
whether changing i386.md's definition of general_operand may be an
alternate solution, but such a change can be independent of this fix.
[I've a related patch to expand the CONST_VECTORs allowed in
ix86_legitimate_constant_p before reload, but keeping everything
happy is tricky].
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures. Ok for mainline?
2024-07-04 Roger Sayle <[email protected]>
gcc/ChangeLog
PR target/115751
* config/i386/i386-expand.c (ix86_expand_ternlog): Avoid use of
force_reg to "reload" non-register operands, as these may contain
vec_duplicate (broadcast) operands that aren't supported by
force_reg. Use (safer) gen_reg_rtx and emit_move instead.
Thanks in advance (sorry for the inconvenience),
Roger
--
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index dd2c3a8..178dd0b 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -26049,14 +26049,25 @@ ix86_expand_ternlog (machine_mode mode, rtx op0, rtx
op1, rtx op2, int idx,
break;
}
- tmp0 = register_operand (op0, mode) ? op0 : force_reg (mode, op0);
+ if (!register_operand (op0, mode))
+ {
+ /* We can't use force_reg (mode, op0). */
+ tmp0 = gen_reg_rtx (GET_MODE (op0));
+ emit_move_insn (tmp0,op0);
+ }
+ else
+ tmp0 = op0;
if (GET_MODE (tmp0) != mode)
tmp0 = gen_lowpart (mode, tmp0);
if (!op1 || rtx_equal_p (op0, op1))
tmp1 = copy_rtx (tmp0);
else if (!register_operand (op1, mode))
- tmp1 = force_reg (mode, op1);
+ {
+ /* We can't use force_reg (mode, op1). */
+ tmp1 = gen_reg_rtx (GET_MODE (op1));
+ emit_move_insn (tmp1, op1);
+ }
else
tmp1 = op1;
if (GET_MODE (tmp1) != mode)