Christophe Lyon via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> This is v2 of this patch series, addressing the comments I received.
> The changes v1 -> v2 are:
>
> - Patch 3: added an executable test, and updated
>   check_effective_target_arm_mve_hw
> - Patch 4: split into patch 4 and patch 14 (to keep numbering the same
>   for the other patches)
> - Patch 5: updated arm_class_likely_spilled_p as suggested.
> - Patch 7: updated test_vector_ops_duplicate in simplify-rtx.c as
>   suggested.
> - Patch 8: added V2DI -> HI/hi mapping in MVE_VPRED/MVE_vpred
>   iterators, removed now useless mve_vpselq_<supf>v2di, and fixed
>   mov<mode> expander.
> - Patch 9: arm_mode_to_pred_mode now returns opt_machine_mode, removed
>   useless floating-point checks in vec_cmpu.
> - Patch 12: replaced hi with v8bi in v2di load/store instructions
>
> I'll squash patch 2 with patch 9 and patch 3 with patch 8.

This looks good to me apart from the question in 12/14 and the couple
of other (very) minor nits.

Thanks,
Richard

> Original text:
>
> This patch series addresses PR 100757 and 101325 by representing
> vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> rather than using HImode.
>
> As this implies a lot of mostly mechanical changes, I have tried to
> split the patches in a way that should help reviewers, but the split
> is a bit artificial.
>
> Patches 1-3 add new tests.
>
> Patches 4-6 are small independent improvements.
>
> Patch 7 implements the predicate qualifier, but does not change any
> builtin yet.
>
> Patch 8 is the first of the two main patches, and uses the new
> qualifier to describe the vcmp and vpsel builtins that are useful for
> auto-vectorization of comparisons.
>
> Patch 9 is the second main patch, which fixes the vcond_mask expander.
>
> Patches 10-13 convert almost all the remaining builtins with HI
> operands to use the predicate qualifier.  After these, there are still
> a few builtins with HI operands left, about which I am not sure: vctp,
> vpnot, load-gather and store-scatter with v2di operands.  In fact,
> patches 11/12 update some STR/LDR qualifiers in a way that breaks
> these v2di builtins although existing tests still pass.
>
> Christophe Lyon (14):
>   arm: Add new tests for comparison vectorization with Neon and MVE
>   arm: Add tests for PR target/100757
>   arm: Add tests for PR target/101325
>   arm: Add GENERAL_AND_VPR_REGS regclass
>   arm: Add support for VPR_REG in arm_class_likely_spilled_p
>   arm: Fix mve_vmvnq_n_<supf><mode> argument mode
>   arm: Implement MVE predicates as vectors of booleans
>   arm: Implement auto-vectorized MVE comparisons with vectors of boolean
>     predicates
>   arm: Fix vcond_mask expander for MVE (PR target/100757)
>   arm: Convert remaining MVE vcmp builtins to predicate qualifiers
>   arm: Convert more MVE builtins to predicate qualifiers
>   arm: Convert more load/store MVE builtins to predicate qualifiers
>   arm: Convert more MVE/CDE builtins to predicate qualifiers
>   arm: Add VPR_REG to ALL_REGS
>
>  gcc/config/arm/arm-builtins.c                 | 228 +++--
>  gcc/config/arm/arm-modes.def                  |   5 +
>  gcc/config/arm/arm-protos.h                   |   3 +-
>  gcc/config/arm/arm-simd-builtin-types.def     |   4 +
>  gcc/config/arm/arm.c                          | 130 ++-
>  gcc/config/arm/arm.h                          |   5 +-
>  gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
>  gcc/config/arm/iterators.md                   |   5 +
>  gcc/config/arm/mve.md                         | 832 ++++++++++--------
>  gcc/config/arm/neon.md                        |  39 +
>  gcc/config/arm/vec-common.md                  |  52 --
>  gcc/simplify-rtx.c                            |  26 +-
>  .../arm/acle/cde-mve-full-assembly.c          | 264 +++---
>  .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
>  .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
>  .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
>  .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
>  .../arm/simd/neon-compare-scalar-1.c          |  57 ++
>  .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
>  .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
>  .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
>  .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
>  gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
>  .../gcc.target/arm/simd/pr100757-2.c          |  20 +
>  .../gcc.target/arm/simd/pr100757-3.c          |  20 +
>  .../gcc.target/arm/simd/pr100757-4.c          |  19 +
>  gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
>  .../gcc.target/arm/simd/pr101325-2.c          |  19 +
>  gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
>  gcc/testsuite/lib/target-supports.exp         |   3 +-
>  30 files changed, 1611 insertions(+), 1109 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c