[For some reason this message didn't reach my gmail account]
> 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
> operands to vector broadcast from an integer with AVX2.
> 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
> won't increase stack alignment requirement and blocks transformation by
> the combine pass.
> 3. Update PR 87767 tests to expect integer broadcast instead of broadcast
> from memory.
> 4. Update avx512f_cond_move.c to expect integer broadcast.
+ else if (TARGET_64BIT
+ && ix86_broadcast (val, GET_MODE_BITSIZE (DImode),
+ val_broadcast))
+ {
+ /* NB: MOVQ takes a 32-bit signed immediate operand. */
+ if (trunc_int_for_mode (val_broadcast, SImode) != val_broadcast)
+ return nullptr;
+ broadcast_mode = DImode;
+ }
+ else
+ return nullptr;
We have the MOVABS insn, and movdi_internal knows when to switch between
MOVQ and MOVABS.
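
For reference, the test in the quoted hunk asks whether the 64-bit value survives truncation to a sign-extended 32-bit immediate. A minimal standalone sketch of that check follows; fits_simm32 is a hypothetical name used only for illustration, not a GCC function, and the range comparison is assumed to be equivalent to the trunc_int_for_mode (val, SImode) comparison in the patch.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in (not GCC code) for the quoted
// "trunc_int_for_mode (val, SImode) != val" test: true when VAL can be
// encoded as a sign-extended 32-bit immediate.
static bool
fits_simm32 (std::int64_t val)
{
  return val >= std::numeric_limits<std::int32_t>::min ()
	 && val <= std::numeric_limits<std::int32_t>::max ();
}
```

Values such as 0x80000000 or 0x100000000 fail this test; those are the constants that need the full 64-bit immediate form.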
+ if (!ix86_expand_vector_init_duplicate (false, vector_mode, target,
+ GEN_INT (val_broadcast)))
+ gcc_unreachable ();
We use the following idiom throughout i386/:
bool ok = ix86_expand_vector_init_duplicate (...);
gcc_assert (ok);
Let's keep it this way.
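
A standalone sketch of the same pattern, using plain assert in place of gcc_assert. Both function names are hypothetical stand-ins, not GCC code. One benefit of this shape, at least for standard assert, is that the call with side effects still runs when assertions are compiled out.

```cpp
#include <cassert>

// Hypothetical stand-in for ix86_expand_vector_init_duplicate: fills
// DEST with N copies of VAL and reports success.  Not GCC code.
static bool
expand_duplicate (int *dest, int n, int val)
{
  if (n <= 0)
    return false;
  for (int i = 0; i < n; i++)
    dest[i] = val;
  return true;
}

static void
broadcast_or_die (int *dest, int n, int val)
{
  // The call with side effects sits in its own statement, so it still
  // runs when assertions are compiled out (NDEBUG for plain assert).
  bool ok = expand_duplicate (dest, n, val);
  assert (ok);	// stands in for gcc_assert (ok)
  (void) ok;	// avoid an unused-variable warning under NDEBUG
}
```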
+ if (REGNO (target) < FIRST_PSEUDO_REGISTER)
+ target = gen_rtx_REG (mode, REGNO (target));
+ else
+ target = convert_to_mode (mode, target, 1);
+
This is not needed. lowpart_subreg should do the trick when changing
the mode of hard regs (see also the comment for ix86_gen_scratch_sse_rtx).
+ rtx first;
+
+ if (can_create_pseudo_p ()
+ && GET_MODE_SIZE (mode) >= 16
+ && GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+ && (MEM_P (op1)
+ && SYMBOL_REF_P (XEXP (op1, 0))
+ && CONSTANT_POOL_ADDRESS_P (XEXP (op1, 0)))
+ && (first = ix86_broadcast_from_integer_constant (mode, op1)))
+ {
+ /* Broadcast to XMM/YMM/ZMM register from an integer constant. */
+ op1 = ix86_gen_scratch_sse_rtx (mode, false);
+ if (!ix86_expand_vector_init_duplicate (false, mode, op1, first))
+ gcc_unreachable ();
+ emit_move_insn (op0, op1);
+ return;
Please avoid the assignment inside the condition, and use
"gcc_assert (ok)" here as well.
+/* Return a scratch register in MODE for vector load and store. If
+ CONSTANT_INT_BROADCAST is true, it is used to hold constant integer
+ broadcast result. */
+
+rtx
+ix86_gen_scratch_sse_rtx (machine_mode mode,
+ bool constant_int_broadcast)
This function should always return a hard reg, simply:
return gen_rtx_REG (mode, (TARGET_64BIT
? LAST_REX_SSE_REG : LAST_SSE_REG));
The complications with pseudos do not bring us anything (in the end
we need a hard reg anyway, and I guess reload knows quite well how to
avoid a temporary that is already in use).
The function can then be renamed to ix86_gen_scratch_sse_reg.
* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Expect integer
broadcast.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512f_cond_move.c: Also pass
-mprefer-vector-width=512 and expect integer broadcast.
I have not reviewed the above changes to the AVX512 tests; someone else
should check whether the new code is better here.
Uros.