On Fri, Jul 5, 2024 at 2:54 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch fixes a problem with splitting of complex AVX512 ternlog
> instructions on x86_64.  A recent change allows the ternlog pattern
> to have multiple mem-like operands prior to reload, by emitting any
> "reloads" as necessary during split1, before register allocation.
> The issue is that this code calls force_reg to place the mem-like
> operand into a register, but unfortunately the vec_duplicate (broadcast)
> form of operands supported by ternlog isn't considered a "general_operand",
> i.e. supported by all instructions.  This mismatch triggers an ICE in
> the middle-end's force_reg, even though the x86 supports loading these
> vec_duplicate operands into a vector register in a single (move)
> instruction.
>
> This patch resolves this problem by replacing force_reg with calls
> to gen_reg_rtx and emit_move (as the i386 backend, unlike the middle-end,
> knows these will be recognized by recog).
>
> I'll admit that I've been unable to reproduce this error without a
> testcase, but my assumption when developing the previous patch was
> that it was safe to call force_reg on a vec_duplicate, which this PR
> shows to be wrong (currently).  I'll let smarter minds pronounce on
> whether changing i386.md's definition of general_operand may be an
> alternate solution, but such a change can be independent of this fix.
> [I've a related patch to expand the CONST_VECTORs allowed in
> ix86_legitimate_constant_p before reload, but keeping everything
> happy is tricky].
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
Ok.
>
>
> 2024-07-04  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/115751
>         * config/i386/i386-expand.cc (ix86_expand_ternlog): Avoid use of
>         force_reg to "reload" non-register operands, as these may contain
>         vec_duplicate (broadcast) operands that aren't supported by
>         force_reg.  Use (safer) gen_reg_rtx and emit_move instead.
>
>
> Thanks in advance (sorry for the inconvenience),
> Roger
> --
>


-- 
BR,
Hongtao
