Richard Biener <richard.guent...@gmail.com> writes: > On Thu, May 22, 2025 at 12:19 PM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> As the rtl.texi documentation of RTX_AUTOINC expressions says: >> >> If a register used as the operand of these expressions is used in >> another address in an insn, the original value of the register is >> used. Uses of the register outside of an address are not permitted >> within the same insn as a use in an embedded side effect expression >> because such insns behave differently on different machines and hence >> must be treated as ambiguous and disallowed. >> >> late-combine was failing to follow this rule. One option would have >> been to enforce it during the substitution phase, like combine does. >> This could either be a dedicated condition in the substitution code >> or, more generally, an extra condition in can_merge_accesses. >> (The latter would include extending is_pre_post_modify to uses.) >> >> However, since the restriction applies to patterns rather than to >> actions on patterns, the more robust fix seemed to be test and reject >> this case in (a subroutine of) rtl_ssa::recog. We already do something >> similar for hard-coded register clobbers. >> >> Using vec_rtx_properties isn't the lightest-weight operation >> out there. I did wonder about relying on the is_pre_post_modify >> flag of the definitions in the new_defs array, but that would >> require callers that create new autoincs to set the flag before >> calling recog. Normally these flags are instead updated >> automatically based on the final pattern. >> >> Besides, recog itself has had to traverse the whole pattern, >> and it is even less light-weight than vec_rtx_properties. >> At least the pattern should be in cache. >> >> Tested on arm-linux-gnueabihf, aarch64-linux-gnu and >> x86_64-linux-gnu. OK for trunk and backports? > > LGTM, note the 14 branch is currently frozen.
Thanks. It turns out that I looked at the wrong results for the arm-linux-gnueabihf testing :-(, and the Linaro CI flagged up a regression. Although I think the rtl-ssa fix is still the right one, it showed up a mistake (of mine) in the rtl_properties walker: try_to_add_src would drop all flags except IN_NOTE before recursing into RTX_AUTOINC addresses. RTX_AUTOINCs only occur in addresses, and so for them, the flags coming into try_to_add_src are set by: unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS; ... if (MEM_P (x)) { ... unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE; if (flags & rtx_obj_flags::IS_READ) addr_flags |= rtx_obj_flags::IN_MEM_LOAD; try_to_add_src (XEXP (x, 0), addr_flags); return; } This means that the only flags that can be set are: - IN_NOTE (the sole member of STICKY_FLAGS) - IN_MEM_STORE - IN_MEM_LOAD Thus dropping all flags except IN_NOTE had the effect of dropping IN_MEM_STORE and IN_MEM_LOAD, and nothing else. But those flags are the ones that mark something as being part of a mem address. The exclusion was therefore exactly wrong. So is the patch OK with the extra rtlanal.cc hunk below? I was wondering whether it would count as obvious, but the length of the explanation above suggests not :) Richard gcc/ PR rtl-optimization/120347 * rtlanal.cc (rtx_properties::try_to_add_src): Don't drop the IN_MEM_LOAD and IN_MEM_STORE flags for autoinc registers. * rtl-ssa/changes.cc (recog_level2): Check whether an RTX_AUTOINCed register also appears outside of an address. gcc/testsuite/ PR rtl-optimization/120347 * gcc.dg/torture/pr120347.c: New test. --- gcc/rtl-ssa/changes.cc | 18 ++++++++++++++++++ gcc/rtlanal.cc | 2 +- gcc/testsuite/gcc.dg/torture/pr120347.c | 13 +++++++++++++ 3 files changed, 32 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr120347.c diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc index eb579ad3ad7..f7aa6a66cdf 100644 --- a/gcc/rtl-ssa/changes.cc +++ b/gcc/rtl-ssa/changes.cc @@ -1106,6 +1106,24 @@ recog_level2 (insn_change &change, add_regno_clobber_fn add_regno_clobber) } } + // Per rtl.texi, registers that are modified using RTX_AUTOINC operations + // cannot also appear outside an address. + vec_rtx_properties properties; + properties.add_pattern (pat); + for (rtx_obj_reference def : properties.refs ()) + if (def.is_pre_post_modify ()) + for (rtx_obj_reference use : properties.refs ()) + if (def.regno == use.regno && !use.in_address ()) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "register %d is both auto-modified" + " and used outside an address:\n", def.regno); + print_rtl_single (dump_file, pat); + } + return false; + } + // check_asm_operands checks the constraints after RA, so we don't // need to do it again. if (reload_completed && !asm_p) diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc index 86a5e473308..900f53e252a 100644 --- a/gcc/rtlanal.cc +++ b/gcc/rtlanal.cc @@ -2235,7 +2235,7 @@ rtx_properties::try_to_add_src (const_rtx x, unsigned int flags) { has_pre_post_modify = true; - unsigned int addr_flags = (base_flags + unsigned int addr_flags = (flags | rtx_obj_flags::IS_PRE_POST_MODIFY | rtx_obj_flags::IS_READ); try_to_add_dest (XEXP (x, 0), addr_flags); diff --git a/gcc/testsuite/gcc.dg/torture/pr120347.c b/gcc/testsuite/gcc.dg/torture/pr120347.c new file mode 100644 index 00000000000..a2d187bbc5c --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr120347.c @@ -0,0 +1,13 @@ +/* { dg-do assemble } */ +/* { dg-additional-options "-march=armv7-a -mthumb" { target { arm_arch_v7a_ok && arm_thumb2_ok } } } */ + +void *end; +void **start; +void main(void) +{ + for (; end; start++) { + if (*start) + return; + *start = start; + } +} -- 2.43.0