https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121118
--- Comment #2 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:f702b593e7268ab161053bafd097f1b09933b783

commit r16-2731-gf702b593e7268ab161053bafd097f1b09933b783
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Mon Aug 4 11:45:28 2025 +0100

    aarch64: Use VNx16BI for more SVE WHILE* results [PR121118]

    PR121118 is about a case where we try to construct a predicate
    constant using a permutation of a PFALSE and a WHILELO.  The WHILELO
    is a .H operation and its result has mode VNx8BI.  However, the
    permute instruction expects both inputs to be VNx16BI, leading to an
    unrecognisable insn ICE.

    VNx8BI is effectively a form of VNx16BI in which every odd-indexed
    bit is insignificant.  In the PR's testcase that's OK, since those
    bits will be dropped by the permutation.  But if the WHILELO had been
    a VNx4BI, so that only every fourth bit is significant, the input to
    the permutation would have had undefined bits.  The testcase in the
    patch has an example of this.

    This feeds into a related ACLE problem that I'd been meaning to fix
    for a long time: every bit of an svbool_t result is significant, and
    so every ACLE intrinsic that returns an svbool_t should return a
    VNx16BI.  That doesn't currently happen for ACLE svwhile*
    intrinsics.

    This patch fixes both issues together.  We still need to keep the
    current WHILE* patterns for autovectorisation, where the result mode
    should match the element width.  The patch therefore adds a new set
    of patterns that are defined to return VNx16BI instead.  For want of
    a better scheme, it uses an "_acle" suffix to distinguish these new
    patterns from the "normal" ones.

    The formulation used is:

      (and:VNx16BI (subreg:VNx16BI normal-pattern 0) C)

    where C has mode VNx16BI and is a canonical ptrue for
    normal-pattern's element width (so that the low bit of each element
    is set and the upper bits are clear).

    This is a bit clunky, and leads to some repetition.  But it has two
    advantages:

    * After g:965564eafb721f8000013a3112f1bba8d8fae32b, converting the
      above expression back to normal-pattern's mode will reduce to
      normal-pattern, so that the pattern for testing the result using a
      PTEST doesn't change.

    * It gives RTL optimisers a bit more information, as the new tests
      demonstrate.

    In the expression above, C is matched using a new "special" predicate,
    aarch64_ptrue_all_operand, where "special" means that the mode on the
    predicate is not necessarily the mode of the expression.  In this
    case, C always has mode VNx16BI, but the mode on the predicate
    indicates which kind of canonical PTRUE is needed.

    gcc/
        PR testsuite/121118
        * config/aarch64/iterators.md (VNx16BI_ONLY): New mode iterator.
        * config/aarch64/predicates.md (aarch64_ptrue_all_operand): New
        predicate.
        * config/aarch64/aarch64-sve.md
        (@aarch64_sve_while_<while_optab_cmp><GPI:mode><VNx16BI_ONLY:mode>_acle)
        (@aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
        (*aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
        (*while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle_cc): New
        patterns.
        * config/aarch64/aarch64-sve-builtins-functions.h
        (while_comparison::expand): Use the new _acle patterns that always
        return a VNx16BI.
        * config/aarch64/aarch64-sve-builtins-sve2.cc
        (svwhilerw_svwhilewr_impl::expand): Likewise.
        * config/aarch64/aarch64.cc (aarch64_sve_move_pred_via_while):
        Likewise.

    gcc/testsuite/
        PR testsuite/121118
        * gcc.target/aarch64/sve/acle/general/pr121118_1.c: New test.
        * gcc.target/aarch64/sve/acle/general/whilele_13.c: Likewise.
        * gcc.target/aarch64/sve/acle/general/whilelt_6.c: Likewise.
        * gcc.target/aarch64/sve2/acle/general/whilege_1.c: Likewise.
        * gcc.target/aarch64/sve2/acle/general/whilegt_1.c: Likewise.
        * gcc.target/aarch64/sve2/acle/general/whilerw_5.c: Likewise.
        * gcc.target/aarch64/sve2/acle/general/whilewr_5.c: Likewise.
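
To make the svbool_t point above concrete: the ACLE treats every bit of an
svbool_t as significant, so a .S WHILELO result must set the low bit of each
element's 4-bit predicate slot for active elements and clear the upper three
bits.  The sketch below (illustrative only, not the PR's testcase or one of
the new tests) reads such a result at .B granularity, which observes every
predicate bit:

  #include <arm_sve.h>

  /* svwhilelt_b32 is a .S operation: it produces one significant bit
     per 32-bit element.  Counting the result at .B granularity reads
     every predicate bit, so the three upper bits of each element's
     slot must be well-defined zeros.  */
  uint64_t
  count_active (int32_t i, int32_t n)
  {
    svbool_t pg = svwhilelt_b32_s32 (i, n);  /* .S WHILELO */
    return svcntp_b8 (svptrue_b8 (), pg);    /* inspects all bits */
  }

If the compiler modelled pg only as a VNx4BI and left the other bits
undefined, the count could be anything; having the intrinsic expand to a
VNx16BI result makes the full layout explicit.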
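
A scalar model of the (and:VNx16BI (subreg:VNx16BI normal-pattern 0) C)
formulation may also help.  Treating a predicate as one bit per byte packed
into an integer, ANDing with the canonical ptrue C keeps only the low bit of
each element slot, turning a value with undefined upper bits into a fully
defined one.  Everything below is a hand-rolled illustration, not code from
the patch:

  #include <stdint.h>
  #include <stdio.h>

  /* Canonical "ptrue" for a given element size in bytes: the low bit
     of each element slot set, the upper bits clear (the role played by
     aarch64_ptrue_all_operand's operand in the patch).  */
  static uint64_t
  ptrue_all (unsigned elt_bytes)
  {
    uint64_t c = 0;
    for (unsigned i = 0; i < 64; i += elt_bytes)
      c |= 1ULL << i;
    return c;
  }

  int
  main (void)
  {
    /* A .S WHILE result whose insignificant bits are junk (here, ones).  */
    uint64_t raw = 0x00000000ffffffffULL;

    /* Mask with the canonical .S ptrue, as the new _acle patterns do,
       to get a value that is defined in every bit position.  */
    uint64_t defined = raw & ptrue_all (4);

    printf ("%016llx\n", (unsigned long long) defined);  /* 0000000011111111 */
    return 0;
  }

In this model, converting back to the narrower mode and dropping the AND (the
first advantage listed in the commit message) corresponds to noting that the
masked-off bits were insignificant in that mode anyway.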