Richard Biener <richard.guent...@gmail.com> writes:
> On Thu, May 22, 2025 at 12:19 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> As the rtl.texi documentation of RTX_AUTOINC expressions says:
>>
>>   If a register used as the operand of these expressions is used in
>>   another address in an insn, the original value of the register is
>>   used.  Uses of the register outside of an address are not permitted
>>   within the same insn as a use in an embedded side effect expression
>>   because such insns behave differently on different machines and hence
>>   must be treated as ambiguous and disallowed.
>>
>> late-combine was failing to follow this rule.  One option would have
>> been to enforce it during the substitution phase, like combine does.
>> This could either be a dedicated condition in the substitution code
>> or, more generally, an extra condition in can_merge_accesses.
>> (The latter would include extending is_pre_post_modify to uses.)
>>
>> However, since the restriction applies to patterns rather than to
>> actions on patterns, the more robust fix seemed to be test and reject
>> this case in (a subroutine of) rtl_ssa::recog.  We already do something
>> similar for hard-coded register clobbers.
>>
>> Using vec_rtx_properties isn't the lightest-weight operation
>> out there.  I did wonder about relying on the is_pre_post_modify
>> flag of the definitions in the new_defs array, but that would
>> require callers that create new autoincs to set the flag before
>> calling recog.  Normally these flags are instead updated
>> automatically based on the final pattern.
>>
>> Besides, recog itself has had to traverse the whole pattern,
>> and it is even less light-weight than vec_rtx_properties.
>> At least the pattern should be in cache.
>>
>> Tested on arm-linux-gnueabihf, aarch64-linux-gnu and
>> x86_64-linux-gnu.  OK for trunk and backports?
>
> LGTM, note the 14 branch is currently frozen.

Thanks.  It turns out that I looked at the wrong results for the
arm-linux-gnueabihf testing :-(, and the Linaro CI flagged up a
regression.  Although I think the rtl-ssa fix is still the right
one, it showed up a mistake (of mine) in the rtl_properties walker:
try_to_add_src would drop all flags except IN_NOTE before recursing
into RTX_AUTOINC addresses.

RTX_AUTOINCs only occur in addresses, and so for them, the flags coming
into try_to_add_src are set by:

  unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS;
  ...
  if (MEM_P (x))
    {
      ...

      unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE;
      if (flags & rtx_obj_flags::IS_READ)
        addr_flags |= rtx_obj_flags::IN_MEM_LOAD;
      try_to_add_src (XEXP (x, 0), addr_flags);
      return;
    }

This means that the only flags that can be set are:

- IN_NOTE (the sole member of STICKY_FLAGS)
- IN_MEM_STORE
- IN_MEM_LOAD

Thus dropping all flags except IN_NOTE had the effect of dropping
IN_MEM_STORE and IN_MEM_LOAD, and nothing else.  But those flags
are the ones that mark something as being part of a mem address.
The exclusion was therefore exactly wrong.

So is the patch OK with the extra rtlanal.cc hunk below?  I was wondering
whether it would count as obvious, but the length of the explanation above
suggests not :)

Richard


gcc/
        PR rtl-optimization/120347
        * rtlanal.cc (rtx_properties::try_to_add_src): Don't drop the
        IN_MEM_LOAD and IN_MEM_STORE flags for autoinc registers.
        * rtl-ssa/changes.cc (recog_level2): Check whether an
        RTX_AUTOINCed register also appears outside of an address.

gcc/testsuite/
        PR rtl-optimization/120347
        * gcc.dg/torture/pr120347.c: New test.
---
 gcc/rtl-ssa/changes.cc                  | 18 ++++++++++++++++++
 gcc/rtlanal.cc                          |  2 +-
 gcc/testsuite/gcc.dg/torture/pr120347.c | 13 +++++++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr120347.c

diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
index eb579ad3ad7..f7aa6a66cdf 100644
--- a/gcc/rtl-ssa/changes.cc
+++ b/gcc/rtl-ssa/changes.cc
@@ -1106,6 +1106,24 @@ recog_level2 (insn_change &change, add_regno_clobber_fn 
add_regno_clobber)
            }
        }
 
+  // Per rtl.texi, registers that are modified using RTX_AUTOINC operations
+  // cannot also appear outside an address.
+  vec_rtx_properties properties;
+  properties.add_pattern (pat);
+  for (rtx_obj_reference def : properties.refs ())
+    if (def.is_pre_post_modify ())
+      for (rtx_obj_reference use : properties.refs ())
+       if (def.regno == use.regno && !use.in_address ())
+         {
+           if (dump_file && (dump_flags & TDF_DETAILS))
+             {
+               fprintf (dump_file, "register %d is both auto-modified"
+                        " and used outside an address:\n", def.regno);
+               print_rtl_single (dump_file, pat);
+             }
+           return false;
+         }
+
   // check_asm_operands checks the constraints after RA, so we don't
   // need to do it again.
   if (reload_completed && !asm_p)
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 86a5e473308..900f53e252a 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -2235,7 +2235,7 @@ rtx_properties::try_to_add_src (const_rtx x, unsigned int 
flags)
        {
          has_pre_post_modify = true;
 
-         unsigned int addr_flags = (base_flags
+         unsigned int addr_flags = (flags
                                     | rtx_obj_flags::IS_PRE_POST_MODIFY
                                     | rtx_obj_flags::IS_READ);
          try_to_add_dest (XEXP (x, 0), addr_flags);
diff --git a/gcc/testsuite/gcc.dg/torture/pr120347.c 
b/gcc/testsuite/gcc.dg/torture/pr120347.c
new file mode 100644
index 00000000000..a2d187bbc5c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr120347.c
@@ -0,0 +1,13 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-march=armv7-a -mthumb" { target { arm_arch_v7a_ok 
&& arm_thumb2_ok } } } */
+
+void *end;
+void **start;
+void main(void)
+{
+  for (; end; start++) {
+    if (*start)
+      return;
+    *start = start;
+  }
+}
-- 
2.43.0

Reply via email to