On Tue, 1 Apr 2025, Jakub Jelinek wrote: > On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote: > > > Looking at this some more today, I think we should special case > > > set_noop_p because that can be put into i2 (except for the JUMP_P > > > violations), currently both modified_between_p (pc_rtx, i2, i3) > > > and reg_used_between_p (pc_rtx, i2, i3) returns false. > > > I'll post a patch incrementally for that (but that feels like > > > new optimization, so probably not something that should be backported). > > > > Trying to review this I noticed that both the comment in combine suggests > > that memory set is important and modified_between_p handles MEM_P > > 'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while > > reg_used_between_p only handles REG_P 'reg'. Also shouldn't > > you use reg_referenced_p? At least that function seems to handle > > SET_SRC/SET_DEST separately? > > > > Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why > > does the comment talk about memory? > > I was worried about making too risky changes this late in stage4 > (and especially also for backports). Most of this code is 1992-ish. > I think many of the functions are just misnamed, the reg_ in there doesn't > match what those functions do (bet they initially supported just REGs > and later on support for other kinds of expressions was added, but haven't > done git archeology to prove that). > > What we know for sure is: > && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT > && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART > && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT > && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART > that is checked earlier in the condition. > Then it calls > && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), > XVECEXP (newpat, 0, 0)) > && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), > XVECEXP (newpat, 0, 1)) > While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p > which is also misnamed, that function handles just fine all of > REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT, > STRICT_LOW_PART, PC and even some further cases. > So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG > of REG, PC (at least the REG and PC cases are triggered on the testcase) > and quite possibly also MEM (SUBREG of MEM not, see below). > > Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that > function for constants just returns false, for PC returns true, for REG > returns reg_set_between_p, for MEM recurses on the address, for > MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code > whether the memory could have been modified in between, for all other > rtxes recurses on the subrtxes. This part didn't change in my patch. > > I've only changed those > - && !modified_between_p (SET_DEST (set{1,0}), i2, i3) > + && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3) > where the former has been described above and clearly handles all of > REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things. > > The replacement reg_used_between_p calls reg_overlap_mentioned_p on each > instruction in between i2 and i3. So, there is clearly a difference > in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p > returns unconditionally true even if there are no instructions in between, > but reg_used_between_p if there are no non-debug insns in between returns > false. Sorry for missing that, guess I should check for that (with the > exception of the noop moves which are often (set (pc) (pc)) and handled > by the incremental patch). In fact not just that, reg_used_between_p > will only return true for PC if it is mentioned anywhere in the insns > in between. > Anyway, except for that, for REG it calls refers_to_regno_p > and so should find any occurrences of any of the REG or parts of it for hard > registers, for MEM returns true if it sees any MEMs in insns in between > (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG > (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the > SUBREG_REG, PC I've already described. > > Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think > already the initial > && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), > XVECEXP (newpat, 0, 0)) > && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), > XVECEXP (newpat, 0, 1)) > calls would have failed --enable-checking=rtl or would have misbehaved, so > I think there is no need to check for it further. > > To your question why I don't use reg_referenced_p, that is because > reg_referenced_p is something to call on one insn pattern, while > reg_used_between_p is pretty much that on all insns in between two > instructions (excluding the boundaries). > > So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx > checks to preserve former behavior, like in the following version.
OK. Richard. > 2025-04-01 Jakub Jelinek <ja...@redhat.com> > > PR rtl-optimization/119291 > * combine.cc (try_combine): For splitting of PARALLEL with > 2 independent SETs into i2 and i3 sets check reg_used_between_p > of the SET_DESTs rather than just modified_between_p. > > * gcc.c-torture/execute/pr119291.c: New test. > > --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100 > +++ gcc/combine.cc 2025-04-01 12:24:25.956711734 +0200 > @@ -4012,18 +4012,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2, > rtx set1 = XVECEXP (newpat, 0, 1); > > /* Normally, it doesn't matter which of the two is done first, but > - one which uses any regs/memory set in between i2 and i3 can't > - be first. The PARALLEL might also have been pre-existing in i3, > - so we need to make sure that we won't wrongly hoist a SET to i2 > - that would conflict with a death note present in there, or would > - have its dest modified between i2 and i3. */ > + one which uses any regs/memory set or used in between i2 and i3 > + can't be first. The PARALLEL might also have been pre-existing > + in i3, so we need to make sure that we won't wrongly hoist a SET > + to i2 that would conflict with a death note present in there, or > + would have its dest modified or used between i2 and i3. */ > if (!modified_between_p (SET_SRC (set1), i2, i3) > && !(REG_P (SET_DEST (set1)) > && find_reg_note (i2, REG_DEAD, SET_DEST (set1))) > && !(GET_CODE (SET_DEST (set1)) == SUBREG > && find_reg_note (i2, REG_DEAD, > SUBREG_REG (SET_DEST (set1)))) > - && !modified_between_p (SET_DEST (set1), i2, i3) > + && SET_DEST (set1) != pc_rtx > + && !reg_used_between_p (SET_DEST (set1), i2, i3) > /* If I3 is a jump, ensure that set0 is a jump so that > we do not create invalid RTL. */ > && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx) > @@ -4038,7 +4039,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, > && !(GET_CODE (SET_DEST (set0)) == SUBREG > && find_reg_note (i2, REG_DEAD, > SUBREG_REG (SET_DEST (set0)))) > - && !modified_between_p (SET_DEST (set0), i2, i3) > + && SET_DEST (set0) != pc_rtx > + && !reg_used_between_p (SET_DEST (set0), i2, i3) > /* If I3 is a jump, ensure that set1 is a jump so that > we do not create invalid RTL. */ > && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx) > --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 > 09:48:01.917407084 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27 > 09:47:48.020598094 +0100 > @@ -0,0 +1,33 @@ > +/* PR rtl-optimization/119291 */ > + > +int a; > +long c; > + > +__attribute__((noipa)) void > +foo (int x) > +{ > + if (x != 0) > + __builtin_abort (); > + a = 42; > +} > + > +int > +main () > +{ > + int e = 1; > +lab: > + if (a < 2) > + { > + int b = e; > + _Bool d = a != 0; > + _Bool f = b != 0; > + unsigned long g = -(d & f); > + unsigned long h = c & g; > + unsigned long i = ~c; > + e = -(i & h); > + c = e != 0; > + a = ~e + b; > + foo (e); > + goto lab; > + } > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)