On Wed, May 28, 2025 at 6:55 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Biener <richard.guent...@gmail.com> writes: > > On Thu, May 22, 2025 at 12:19 PM Richard Sandiford > > <richard.sandif...@arm.com> wrote: > >> > >> As the rtl.texi documentation of RTX_AUTOINC expressions says: > >> > >> If a register used as the operand of these expressions is used in > >> another address in an insn, the original value of the register is > >> used. Uses of the register outside of an address are not permitted > >> within the same insn as a use in an embedded side effect expression > >> because such insns behave differently on different machines and hence > >> must be treated as ambiguous and disallowed. > >> > >> late-combine was failing to follow this rule. One option would have > >> been to enforce it during the substitution phase, like combine does. > >> This could either be a dedicated condition in the substitution code > >> or, more generally, an extra condition in can_merge_accesses. > >> (The latter would include extending is_pre_post_modify to uses.) > >> > >> However, since the restriction applies to patterns rather than to > >> actions on patterns, the more robust fix seemed to be test and reject > >> this case in (a subroutine of) rtl_ssa::recog. We already do something > >> similar for hard-coded register clobbers. > >> > >> Using vec_rtx_properties isn't the lightest-weight operation > >> out there. I did wonder about relying on the is_pre_post_modify > >> flag of the definitions in the new_defs array, but that would > >> require callers that create new autoincs to set the flag before > >> calling recog. Normally these flags are instead updated > >> automatically based on the final pattern. > >> > >> Besides, recog itself has had to traverse the whole pattern, > >> and it is even less light-weight than vec_rtx_properties. > >> At least the pattern should be in cache. > >> > >> Tested on arm-linux-gnueabihf, aarch64-linux-gnu and > >> x86_64-linux-gnu. OK for trunk and backports? > > > > LGTM, note the 14 branch is currently frozen. > > Thanks. It turns out that I looked at the wrong results for the > arm-linux-gnueabihf testing :-(, and the Linaro CI flagged up a > regression. Although I think the rtl-ssa fix is still the right > one, it showed up a mistake (of mine) in the rtl_properties walker: > try_to_add_src would drop all flags except IN_NOTE before recursing > into RTX_AUTOINC addresses. > > RTX_AUTOINCs only occur in addresses, and so for them, the flags coming > into try_to_add_src are set by: > > unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS; > ... > if (MEM_P (x)) > { > ... > > unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE; > if (flags & rtx_obj_flags::IS_READ) > addr_flags |= rtx_obj_flags::IN_MEM_LOAD; > try_to_add_src (XEXP (x, 0), addr_flags); > return; > } > > This means that the only flags that can be set are: > > - IN_NOTE (the sole member of STICKY_FLAGS) > - IN_MEM_STORE > - IN_MEM_LOAD > > Thus dropping all flags except IN_NOTE had the effect of dropping > IN_MEM_STORE and IN_MEM_LOAD, and nothing else. But those flags > are the ones that mark something as being part of a mem address. > The exclusion was therefore exactly wrong. > > So is the patch OK with the extra rtlanal.cc hunk below? I was wondering > whether it would count as obvious, but the length of the explanation above > suggests not :)
Yes, the patch is OK. The 14 branch is unfrozen, the 13 branch is frozen now. Richard. > Richard > > > gcc/ > PR rtl-optimization/120347 > * rtlanal.cc (rtx_properties::try_to_add_src): Don't drop the > IN_MEM_LOAD and IN_MEM_STORE flags for autoinc registers. > * rtl-ssa/changes.cc (recog_level2): Check whether an > RTX_AUTOINCed register also appears outside of an address. > > gcc/testsuite/ > PR rtl-optimization/120347 > * gcc.dg/torture/pr120347.c: New test. > --- > gcc/rtl-ssa/changes.cc | 18 ++++++++++++++++++ > gcc/rtlanal.cc | 2 +- > gcc/testsuite/gcc.dg/torture/pr120347.c | 13 +++++++++++++ > 3 files changed, 32 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr120347.c > > diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc > index eb579ad3ad7..f7aa6a66cdf 100644 > --- a/gcc/rtl-ssa/changes.cc > +++ b/gcc/rtl-ssa/changes.cc > @@ -1106,6 +1106,24 @@ recog_level2 (insn_change &change, > add_regno_clobber_fn add_regno_clobber) > } > } > > + // Per rtl.texi, registers that are modified using RTX_AUTOINC operations > + // cannot also appear outside an address. > + vec_rtx_properties properties; > + properties.add_pattern (pat); > + for (rtx_obj_reference def : properties.refs ()) > + if (def.is_pre_post_modify ()) > + for (rtx_obj_reference use : properties.refs ()) > + if (def.regno == use.regno && !use.in_address ()) > + { > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "register %d is both auto-modified" > + " and used outside an address:\n", def.regno); > + print_rtl_single (dump_file, pat); > + } > + return false; > + } > + > // check_asm_operands checks the constraints after RA, so we don't > // need to do it again. > if (reload_completed && !asm_p) > diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc > index 86a5e473308..900f53e252a 100644 > --- a/gcc/rtlanal.cc > +++ b/gcc/rtlanal.cc > @@ -2235,7 +2235,7 @@ rtx_properties::try_to_add_src (const_rtx x, unsigned > int flags) > { > has_pre_post_modify = true; > > - unsigned int addr_flags = (base_flags > + unsigned int addr_flags = (flags > | rtx_obj_flags::IS_PRE_POST_MODIFY > | rtx_obj_flags::IS_READ); > try_to_add_dest (XEXP (x, 0), addr_flags); > diff --git a/gcc/testsuite/gcc.dg/torture/pr120347.c > b/gcc/testsuite/gcc.dg/torture/pr120347.c > new file mode 100644 > index 00000000000..a2d187bbc5c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr120347.c > @@ -0,0 +1,13 @@ > +/* { dg-do assemble } */ > +/* { dg-additional-options "-march=armv7-a -mthumb" { target { > arm_arch_v7a_ok && arm_thumb2_ok } } } */ > + > +void *end; > +void **start; > +void main(void) > +{ > + for (; end; start++) { > + if (*start) > + return; > + *start = start; > + } > +} > -- > 2.43.0 >