This is v3 of this patch series, fixing issues I discovered before
committing v2 (which had been approved).
Thanks a lot to Richard Sandiford for his help.
The changes v2 -> v3 are:
Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support VPR.
Patch 7: Changes to the underlying representation of vectors of
booleans to account for the different expectations between AArch64/SVE
and Arm/MVE.
Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
duplicating them in mve_mov<mode>. This requires the introduction of a
new constraint to match a constant vector of booleans. Add a new RTL
test.
Patch 9: Introduce check_effective_target_arm_mve and skip
gcc.dg/signbit-2.c, because with MVE there is no fallback
architecture, unlike with SVE or AVX512.
Patch 12: Update fewer load/store MVE builtins than in v2. We keep HI
mode for vpr_register_operand in the following patterns:
  mve_vldrdq_gather_base_z_<supf>v2di
  mve_vldrdq_gather_offset_z_<supf>v2di
  mve_vldrdq_gather_shifted_offset_z_<supf>v2di
  mve_vstrdq_scatter_base_p_<supf>v2di
  mve_vstrdq_scatter_offset_p_<supf>v2di
  mve_vstrdq_scatter_offset_p_<supf>v2di_insn
  mve_vstrdq_scatter_shifted_offset_p_<supf>v2di
  mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn
  mve_vstrdq_scatter_base_wb_p_<supf>v2di
  mve_vldrdq_gather_base_wb_z_<supf>v2di
  mve_vldrdq_gather_base_nowb_z_<supf>v2di
  mve_vldrdq_gather_base_wb_z_<supf>v2di_insn
Patch 13: No need to update
gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
the mov pattern that emits '@ movhi' in the assembly.
Patch 15: This is a new patch to fix a problem I noticed during this
v2->v3 update.
I'll squash patch 2 with patch 9 and patch 3 with patch 8.
Original text:
This patch series addresses PR 100757 and 101325 by representing
vectors of predicates (MVE VPR.P0 register) as vectors of booleans
rather than using HImode.
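
To make the difference between the two representations concrete, here
is a minimal sketch in plain C. It is not code from the series:
pack_predicate is a hypothetical helper, and it only models the
architectural layout of VPR.P0, which holds one bit per byte of the
128-bit vector. A vector of N booleans therefore corresponds to N
groups of 16/N bits in the 16-bit mask that HImode exposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: pack NLANES per-lane booleans
   into a 16-bit VPR.P0-style mask.  P0 holds one bit per byte of the
   128-bit vector, so each lane owns 16 / NLANES consecutive bits.
   NLANES is assumed to be 4, 8 or 16.  */
uint16_t
pack_predicate (const bool *lanes, unsigned nlanes)
{
  unsigned bits_per_lane = 16 / nlanes;
  uint16_t lane_mask = (uint16_t) ((1u << bits_per_lane) - 1);
  uint16_t p0 = 0;

  for (unsigned i = 0; i < nlanes; i++)
    if (lanes[i])
      p0 |= (uint16_t) (lane_mask << (i * bits_per_lane));

  return p0;
}
```

With four 32-bit lanes, a true lane sets four bits of the mask, so a
plain 16-bit integer does not say by itself how many lanes it
describes; a vector-of-booleans mode carries that information.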
As this implies a lot of mostly mechanical changes, I have tried to
split the patches in a way that should help reviewers, but the split
is a bit artificial.
Patches 1-3 add new tests.
Patches 4-6 are small independent improvements.
Patch 7 implements the predicate qualifier, but does not change any
builtin yet.
Patch 8 is the first of the two main patches, and uses the new
qualifier to describe the vcmp and vpsel builtins that are useful for
auto-vectorization of comparisons.
Patch 9 is the second main patch, which fixes the vcond_mask expander.
Patches 10-13 convert almost all the remaining builtins with HI
operands to use the predicate qualifier. After these, there are still
a few builtins with HI operands left, about which I am not sure: vctp,
vpnot, load-gather and store-scatter with v2di operands. In fact,
patches 11/12 update some STR/LDR qualifiers in a way that breaks
these v2di builtins, although existing tests still pass.
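
As an illustration of the kind of code that patches 8 and 9 target,
the loop below is a minimal sketch of a comparison feeding a per-lane
select. It is not one of the tests added by this series, and the
function name select_by_compare is hypothetical. Whether a given
compiler version and set of options vectorizes it into a vector
compare followed by a predicated select is an assumption, not
something this sketch demonstrates.

```c
#include <stdint.h>

/* Illustrative only: for each element, pick c[i] when a[i] > b[i] and
   d[i] otherwise.  A vectorizer can express this as a vector compare
   producing a predicate, followed by a select under that predicate.  */
void
select_by_compare (int32_t *restrict dst, const int32_t *restrict a,
                   const int32_t *restrict b, const int32_t *restrict c,
                   const int32_t *restrict d, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = (a[i] > b[i]) ? c[i] : d[i];
}
```

The result of the comparison is exactly the value that the series
represents as a vector of booleans instead of an HImode integer.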
Christophe Lyon (15):
  arm: Add new tests for comparison vectorization with Neon and MVE
  arm: Add tests for PR target/100757
  arm: Add tests for PR target/101325
  arm: Add GENERAL_AND_VPR_REGS regclass
  arm: Add support for VPR_REG in arm_class_likely_spilled_p
  arm: Fix mve_vmvnq_n_<supf><mode> argument mode
  arm: Implement MVE predicates as vectors of booleans
  arm: Implement auto-vectorized MVE comparisons with vectors of boolean
    predicates
  arm: Fix vcond_mask expander for MVE (PR target/100757)
  arm: Convert remaining MVE vcmp builtins to predicate qualifiers
  arm: Convert more MVE builtins to predicate qualifiers
  arm: Convert more load/store MVE builtins to predicate qualifiers
  arm: Convert more MVE/CDE builtins to predicate qualifiers
  arm: Add VPR_REG to ALL_REGS
  arm: Fix constraint check for V8HI in mve_vector_mem_operand
gcc/config/aarch64/aarch64-modes.def | 8 +-
gcc/config/arm/arm-builtins.c | 224 +++--
gcc/config/arm/arm-builtins.h | 4 +-
gcc/config/arm/arm-modes.def | 8 +
gcc/config/arm/arm-protos.h | 4 +-
gcc/config/arm/arm-simd-builtin-types.def | 4 +
gcc/config/arm/arm.c | 169 ++--
gcc/config/arm/arm.h | 9 +-
gcc/config/arm/arm_mve_builtins.def | 746 ++++++++--------
gcc/config/arm/constraints.md | 6 +
gcc/config/arm/iterators.md | 6 +
gcc/config/arm/mve.md | 795 ++++++++++--------
gcc/config/arm/neon.md | 39 +
gcc/config/arm/vec-common.md | 52 --
gcc/config/arm/vfp.md | 34 +-
gcc/doc/sourcebuild.texi | 4 +
gcc/emit-rtl.c | 20 +-
gcc/genmodes.c | 81 +-
gcc/machmode.def | 2 +-
gcc/rtx-vector-builder.c | 4 +-
gcc/simplify-rtx.c | 34 +-
gcc/testsuite/gcc.dg/signbit-2.c | 1 +
.../gcc.target/arm/simd/mve-vcmp-f32-2.c | 32 +
.../gcc.target/arm/simd/neon-compare-1.c | 78 ++
.../gcc.target/arm/simd/neon-compare-2.c | 13 +
.../gcc.target/arm/simd/neon-compare-3.c | 14 +
.../arm/simd/neon-compare-scalar-1.c | 57 ++
.../gcc.target/arm/simd/neon-vcmp-f16.c | 12 +
.../gcc.target/arm/simd/neon-vcmp-f32-2.c | 15 +
.../gcc.target/arm/simd/neon-vcmp-f32-3.c | 12 +
.../gcc.target/arm/simd/neon-vcmp-f32.c | 12 +
gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c | 22 +
.../gcc.target/arm/simd/pr100757-2.c | 20 +
.../gcc.target/arm/simd/pr100757-3.c | 20 +
.../gcc.target/arm/simd/pr100757-4.c | 19 +
gcc/testsuite/gcc.target/arm/simd/pr100757.c | 19 +
.../gcc.target/arm/simd/pr101325-2.c | 19 +
gcc/testsuite/gcc.target/arm/simd/pr101325.c | 14 +
gcc/testsuite/lib/target-supports.exp | 15 +-
gcc/varasm.c | 7 +-
40 files changed, 1635 insertions(+), 1019 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c
--
2.25.1