On Mon, Jun 2, 2025 at 5:53 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> In the comment trail for PR119966, I'd said that the validate_subreg
> condition:
>
>   /* The outer size must be ordered wrt the register size, otherwise
>      we wouldn't know at compile time how many registers the outer
>      mode occupies.  */
>   if (!ordered_p (osize, regsize))
>     return false;
>
> "is also potentially relevant" for paradoxical subregs.  But I'd
> forgotten an important caveat.  If the inner size is smaller than
> a register, we know that the inner value will only occupy a single
> register.  Although the paradoxical subreg might extend that single
> register to multiple registers by padding with undefined bits,
> the register size that matters for the extension is:
>
>    REGMODE_NATURAL_SIZE (omode)
>
> rather than regsize's:
>
>    REGMODE_NATURAL_SIZE (imode)
>
> The ordered check is still relevant if the inner value spans
> multiple registers.
>
> Enabling the check above for paradoxical subregs led to an ICE in the
> testcase, where we tried to generate a VNx4QI paradoxical subreg of a
> QI scalar.  This was previously allowed, and AFAIK worked correctly.
>
> The patch doesn't have the effect of relaxing the condition for
> non-paradoxical subregs, since:
>
>   known_le (osize, isize) && known_le (isize, regsize)
>     => known_le (osize, regsize)
>     => ordered_p (osize, regsize)
>
> So even before the patch for PR119966, the condition only existed for
> the maybe_gt (isize, regsize) case.
>
> The term "block" used in the comment is taken from the rtl.texi
> documentation of subregs.
>
> Tested on aarch64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
>         PR rtl-optimization/120447
>         * emit-rtl.cc (validate_subreg): Restrict ordered_p test
>         between osize and regsize to cases where the inner value
>         occupies multiple blocks.
>
> gcc/testsuite/
>         PR rtl-optimization/120447
>         * gcc.dg/pr120447.c: New test.
> ---
>  gcc/emit-rtl.cc                 |  9 +++++----
>  gcc/testsuite/gcc.dg/pr120447.c | 24 ++++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr120447.c
>
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index 3f453cda67e..50e3bfcb777 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -998,10 +998,11 @@ validate_subreg (machine_mode omode, machine_mode imode,
>            && known_le (osize, isize))
>      return false;
>
> -  /* The outer size must be ordered wrt the register size, otherwise
> -     we wouldn't know at compile time how many registers the outer
> -     mode occupies.  */
> -  if (!ordered_p (osize, regsize))
> +  /* If ISIZE is greater than REGSIZE, the inner value is split into blocks
> +     of size REGSIZE.  The outer size must then be ordered wrt REGSIZE,
> +     otherwise we wouldn't know at compile time how many blocks the
> +     outer mode occupies.  */
> +  if (maybe_gt (isize, regsize) && !ordered_p (osize, regsize))
>      return false;
>
>    /* For normal pseudo registers, we want most of the same checks.  Namely:
> diff --git a/gcc/testsuite/gcc.dg/pr120447.c b/gcc/testsuite/gcc.dg/pr120447.c
> new file mode 100644
> index 00000000000..bd51f9b174d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr120447.c
> @@ -0,0 +1,24 @@
> +/* { dg-options "-Ofast" } */
> +/* { dg-additional-options "-mcpu=neoverse-v2" { target aarch64*-*-* } } */
> +
> +char g;
> +long h;
> +typedef struct {
> +  void *data;
> +} i;
> +i* a;
> +void b(i *j, char *p2);
> +void c(char *d) {
> +  d = d ? " and " : " or ";
> +  b(a, d);
> +}
> +void b(i *j, char *p2) {
> +  h = __builtin_strlen(p2);
> +  while (g)
> +    ;
> +  int *k = j->data;
> +  char *l = p2, *m = p2 + h;
> +  l += 4;
> +  while (l < m)
> +    *k++ = *l++;
> +}
> --
> 2.43.0
>

Reply via email to