https://gcc.gnu.org/g:a9ab9b6450fba50d41dcdb2e7531d45e1280ae95
commit r14-10786-ga9ab9b6450fba50d41dcdb2e7531d45e1280ae95 Author: Richard Sandiford <richard.sandif...@arm.com> Date: Mon Oct 14 13:29:34 2024 +0100 aarch64: Fix caller saves of VNx2QI [PR116238] The testcase contains a VNx2QImode pseudo that is live across a call and that cannot be allocated a call-preserved register. LRA quite reasonably tried to save it before the call and restore it afterwards. Unfortunately, the target told it to do that in SImode, even though punning between SImode and VNx2QImode is disallowed by both TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P. The natural class to use for SImode is GENERAL_REGS, so this led to an unsalvageable situation in which we had: (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B)) where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended up in a reload loop. The hooks above should ensure that this situation can never occur for incoming subregs. It only happened here because the target explicitly forced it. The decision to use SImode for modes smaller than 4 bytes dates back to the beginning of the port, before 16-bit floating-point modes existed. I'm not sure whether promoting to SImode really makes sense for any FPR, but that's a separate performance/QoI discussion. For now, this patch just disallows using SImode when it is wrong for correctness reasons, since that should be safer to backport. gcc/ PR testsuite/116238 * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode): Only return SImode if we can convert to and from it. gcc/testsuite/ PR testsuite/116238 * gcc.target/aarch64/sve/pr116238.c: New test. (cherry picked from commit ec9d6d45191f639482344362d048294e74587ca3) Diff: --- gcc/config/aarch64/aarch64.cc | 7 ++++--- gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++ 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 1beec94629d1..e9bca8bbd08b 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2505,10 +2505,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned, unnecessarily significant. */ if (PR_REGNUM_P (regno)) return mode; - if (known_ge (GET_MODE_SIZE (mode), 4)) - return mode; - else + if (known_lt (GET_MODE_SIZE (mode), 4) + && REG_CAN_CHANGE_MODE_P (regno, mode, SImode) + && REG_CAN_CHANGE_MODE_P (regno, SImode, mode)) return SImode; + return mode; } /* Return true if I's bits are consecutive ones from the MSB. */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c new file mode 100644 index 000000000000..fe66b198107f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c @@ -0,0 +1,13 @@ +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */ + +void foo(); +typedef unsigned char v2qi __attribute__((vector_size(2))); +void f(v2qi *ptr) +{ + v2qi x = *ptr; + asm volatile ("" :: "w" (x)); + asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); + foo(); + asm volatile ("" :: "w" (x)); + *ptr = x; +}