https://gcc.gnu.org/g:5cb46d8fff07afee5ca828303544025e4a2e17b7

commit r16-1101-g5cb46d8fff07afee5ca828303544025e4a2e17b7
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Wed Jun 4 13:36:51 2025 +0100

    emit-rtl: Tweak validate_subreg ordered_p condition [PR120447]
    
    In the comment trail for PR119966, I'd said that the validate_subreg
    condition:
    
      /* The outer size must be ordered wrt the register size, otherwise
         we wouldn't know at compile time how many registers the outer
         mode occupies.  */
      if (!ordered_p (osize, regsize))
        return false;
    
    "is also potentially relevant" for paradoxical subregs.  But I'd
    forgotten an important caveat.  If the inner size is no larger than
    a register, we know that the inner value will only occupy a single
    register.  Although the paradoxical subreg might extend that single
    register to multiple registers by padding with undefined bits,
    the register size that matters for the extension is:
    
       REGMODE_NATURAL_SIZE (omode)
    
    rather than regsize, which is:
    
       REGMODE_NATURAL_SIZE (imode)
    
    The ordered check is still relevant if the inner value spans
    multiple registers.
    
    Enabling the check above for paradoxical subregs led to an ICE in the
    testcase, where we tried to generate a VNx4QI paradoxical subreg of a
    QI scalar.  This was previously allowed, and AFAIK worked correctly.
    
    The patch doesn't have the effect of relaxing the condition for
    non-paradoxical subregs, since:
    
      known_le (osize, isize) && known_le (isize, regsize)
        => known_le (osize, regsize)
        => ordered_p (osize, regsize)
    
    So even before the patch for PR119966, the condition only had an
    effect in the maybe_gt (isize, regsize) case.
    
    The term "block" used in the comment is taken from the rtl.texi
    documentation of subregs.
    
    gcc/
            PR rtl-optimization/120447
            * emit-rtl.cc (validate_subreg): Restrict ordered_p test
            between osize and regsize to cases where the inner value
            occupies multiple blocks.
    
    gcc/testsuite/
            PR rtl-optimization/120447
            * gcc.dg/pr120447.c: New test.

Diff:
---
 gcc/emit-rtl.cc                 |  9 +++++----
 gcc/testsuite/gcc.dg/pr120447.c | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3f453cda67ed..50e3bfcb777a 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -998,10 +998,11 @@ validate_subreg (machine_mode omode, machine_mode imode,
           && known_le (osize, isize))
     return false;
 
-  /* The outer size must be ordered wrt the register size, otherwise
-     we wouldn't know at compile time how many registers the outer
-     mode occupies.  */
-  if (!ordered_p (osize, regsize))
+  /* If ISIZE is greater than REGSIZE, the inner value is split into blocks
+     of size REGSIZE.  The outer size must then be ordered wrt REGSIZE,
+     otherwise we wouldn't know at compile time how many blocks the
+     outer mode occupies.  */
+  if (maybe_gt (isize, regsize) && !ordered_p (osize, regsize))
     return false;
 
   /* For normal pseudo registers, we want most of the same checks.  Namely:
diff --git a/gcc/testsuite/gcc.dg/pr120447.c b/gcc/testsuite/gcc.dg/pr120447.c
new file mode 100644
index 000000000000..bd51f9b174d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120447.c
@@ -0,0 +1,24 @@
+/* { dg-options "-Ofast" } */
+/* { dg-additional-options "-mcpu=neoverse-v2" { target aarch64*-*-* } } */
+
+char g;
+long h;
+typedef struct {
+  void *data;
+} i;
+i* a;
+void b(i *j, char *p2);
+void c(char *d) {
+  d = d ? " and " : " or ";
+  b(a, d);
+}
+void b(i *j, char *p2) {
+  h = __builtin_strlen(p2);
+  while (g)
+    ;
+  int *k = j->data;
+  char *l = p2, *m = p2 + h;
+  l += 4;
+  while (l < m)
+    *k++ = *l++;
+}
