Jeff Law <jeffreya...@gmail.com> writes: > This resurrects a patch from a bit over 2 years ago that I never wrapped > up. IIRC, I ended up up catching covid, then in the hospital for an > unrelated issue and it just got dropped on the floor in the insanity. > > The basic idea here is to help postreload-cse eliminate more > const/copies by recording a small set of conditional equivalences (as > Richi said in 2022, "Ick"). > > It was originally to help eliminate an unnecessary constant load I saw > in coremark, but as seen in BZ107455 the same issues show up in real > code as well. > > Richard B and Richard S both had good comments last time around and > their requests are reflected in this update: > > - Use rtx_equal_p rather than pointer equality > - Restrict to register "destinations" > - Restrict to integer modes > - Adjust entry block handling > > My own wider scale testing resulted in a few more changes. > > - Robustify extracting the (set (pc) ... ), which then required ... > - Handle if src/dst are clobbered by the conditional branch > - Fix logic error causing too many equivalences to be recorded
Heh. I delayed reviewing this for a bit because I have no memory of seeing it before (sign I'm getting old). I haven't found my previous comments, so sorry if this completely contradicts what I said before... Overall it looks good to me, but: > Bootstrapped & regression tested on x86 and clean in my tester. OK for > the trunk? > > Jeff > > PR rtl-optimization/107455 > gcc/ > * postreload.cc (reload_cse_regs_1): Take advantage of conditional > equivalences. > > gcc/testsuite > * gcc.target/riscv/pr107455-1.c: New test. > * gcc.target/riscv/pr107455-2.c: New test. > > diff --git a/gcc/postreload.cc b/gcc/postreload.cc > index a220001ef16..01f79898a4a 100644 > --- a/gcc/postreload.cc > +++ b/gcc/postreload.cc > @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. If not see > #include "emit-rtl.h" > #include "recog.h" > > +#include "cfghooks.h" > #include "cfgrtl.h" > #include "cfgbuild.h" > #include "cfgcleanup.h" > @@ -221,13 +222,116 @@ reload_cse_regs_1 (void) > init_alias_analysis (); > > FOR_EACH_BB_FN (bb, cfun) > - FOR_BB_INSNS (bb, insn) > - { > - if (INSN_P (insn)) > - cfg_changed |= reload_cse_simplify (insn, testreg); > + { > + /* If BB has a small number of predecessors, see if each of the > + has the same implicit set. If so, record that implicit set so > + that we can add it to the cselib tables. */ > + rtx_insn *implicit_set; > > - cselib_process_insn (insn); > - } > + implicit_set = NULL; > + if (EDGE_COUNT (bb->preds) <= 3) > + { > + edge e; > + edge_iterator ei; > + rtx src = NULL_RTX; > + rtx dest = NULL_RTX; > + > + /* Iterate over each incoming edge and see if they > + all have the same implicit set. */ > + FOR_EACH_EDGE (e, ei, bb->preds) > + { > + /* If the predecessor does not end in a conditional > + jump, then it does not have an implicit set. */ > + if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun) > + || !block_ends_with_condjump_p (e->src)) AFAICT, this is directly equivalent to: any_condjump_p (BB_END (e->src)) as checked below, so I think we should just check that (trusting that BB_END is nonnull if e->src isn't the entry block). In particular... > + break; > + > + /* We know the predecessor ends with a conditional > + jump. Now dig into the actal form of the jump > + to potentially extract an implicit set. */ > + rtx_insn *condjump = BB_END (e->src); > + if (!condjump > + || ! any_condjump_p (condjump) > + || ! onlyjump_p (condjump)) > + break; ...the onlyjump_p doesn't seem necessary given the later reg_set_p. Instead we can use... > + > + /* This predecessor ends with a possible equivalence > + producing conditional branch. Extract the condition > + and try to use it to create an equivalence. */ > + rtx pat = single_set (condjump); ...pc_set instead of single_set here. > + rtx i_t_e = SET_SRC (pat); > + gcc_assert (GET_CODE (i_t_e) == IF_THEN_ELSE); > + rtx cond = XEXP (i_t_e, 0); > + if ((GET_CODE (cond) == EQ > + && GET_CODE (XEXP (i_t_e, 1)) == LABEL_REF > + && XEXP (XEXP (i_t_e, 1), 0) == BB_HEAD (bb)) > + || (GET_CODE (cond) == NE > + && XEXP (i_t_e, 2) == pc_rtx > + && e->src->next_bb == bb)) How about using: if (((e->flags & EDGE_FALLTHRU) != 0) == (XEXP (i_t_e, 1) == pc_rtx) ? GET_CODE (cond) == EQ : GET_CODE (cond) == NE) (borrowing a construct from cfgcleanup.cc). That seems more general, since some targets allow (if_then_else ... (pc) (label_ref)). It's also a bit shorter. > + { > + /* If this is the first time through record > + the source and destination. */ > + if (!dest) > + { > + dest = XEXP (cond, 0); > + src = XEXP (cond, 1); > + } > + /* If this is not the first time through, then > + verify the source and destination match. */ > + else if (REG_P (dest) > + && REG_P (src) This prevents a match in the CONST_INT_P case allowed below. It might be worth dropping these two conditions and leaving rtx_equal_p to do its thing. > + && rtx_equal_p (dest, XEXP (cond, 0)) > + && rtx_equal_p (src, XEXP (cond, 1))) > + ; > + else > + break; > + > + /* A few more checks. First make sure we're dealing with > + integer modes, second make sure the values aren't clobbered > + by the conditional branch itself. Do this for every > + conditional jump participating in creation of the > + equivalence. */ > + if (!REG_P (dest) > + || !(REG_P (src) || CONST_INT_P (src)) > + || GET_MODE_CLASS (GET_MODE (dest)) != MODE_INT This feels unnecessarily restrictive: it looks like it should work for any mode, and for any type of constant. But I agree that reload_cse_simplify probably isn't going to do much for other cases and so it's probably not worth opening an extra avenue for bugs. Thanks, Richard > + || reg_set_p (dest, condjump) > + || reg_set_p (src, condjump)) > + break; > + > + } > + else > + break; > + } > + > + /* If all the incoming edges had the same implicit > + set, then create a dummy insn for that set. > + > + It will be entered into the cselib tables before > + we process the first real insn in this block. */ > + if (dest && ei_end_p (ei)) > + implicit_set = make_insn_raw (gen_rtx_SET (dest, src)); > + } > + > + FOR_BB_INSNS (bb, insn) > + { > + if (INSN_P (insn)) > + { > + /* If we recorded an implicit set, enter it > + into the tables before the first real insn. > + > + We have to do it this way because a CODE_LABEL > + will flush the cselib tables. */ > + if (implicit_set) > + { > + cselib_process_insn (implicit_set); > + implicit_set = NULL; > + } > + cfg_changed |= reload_cse_simplify (insn, testreg); > + } > + > + cselib_process_insn (insn); > + } > + } > > /* Clean up. */ > end_alias_analysis (); > diff --git a/gcc/testsuite/gcc.target/riscv/pr107455-1.c > b/gcc/testsuite/gcc.target/riscv/pr107455-1.c > new file mode 100644 > index 00000000000..59616b89176 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr107455-1.c > @@ -0,0 +1,46 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-dp" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O2" "-O3" "-Og" } } */ > + > + > +typedef struct dllist > +{ > + int i; > + struct dllist *ptr_to_next; > + struct dllist *ptr_to_previous; > +} dllist; > + > +int sglib_dllist_len(dllist *list) { > + int res; > + dllist *_dl_; > + int _r1_, _r2_; > + if (list== ((void *)0)) { > + res = 0; > + } else { > + dllist *_ce_; > + dllist *_ne_; > + _r1_ = 0; > + _ce_ = list; > + while (_ce_!= ((void *)0)) { > + _ne_ = _ce_->ptr_to_previous; > + _r1_++; > + _ce_ = _ne_; > + } > + _dl_ = list->ptr_to_next; > + _r2_ = 0; > + _ce_ = _dl_; > + while (_ce_!= (void *)0) { > + _ne_ = _ce_->ptr_to_next; > + _r2_++; > + _ce_ = _ne_; > + } > + res = _r1_ + _r2_; > + } > + return res; > +} > + > + > +/* There was an unnecessary assignment to the return value until > + recently. Scan for that in the resulting output. */ > +/* { dg-final { scan-assembler-times "li\\ta0,0" 1 } } */ > + > diff --git a/gcc/testsuite/gcc.target/riscv/pr107455-2.c > b/gcc/testsuite/gcc.target/riscv/pr107455-2.c > new file mode 100644 > index 00000000000..91106bb5d80 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr107455-2.c > @@ -0,0 +1,40 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -dp" } */ > +/* This was extracted from coremark. */ > + > + > +typedef signed short ee_s16; > +typedef struct list_data_s > +{ > + ee_s16 data16; > + ee_s16 idx; > +} list_data; > + > +typedef struct list_head_s > +{ > + struct list_head_s *next; > + struct list_data_s *info; > +} list_head; > + > + > +list_head * > +core_list_find(list_head *list, list_data *info) > +{ > + if (info->idx >= 0) > + { > + while (list && (list->info->idx != info->idx)) > + list = list->next; > + return list; > + } > + else > + { > + while (list && ((list->info->data16 & 0xff) != info->data16)) > + list = list->next; > + return list; > + } > +} > + > +/* There was an unnecessary assignment to the return value until > + recently. Scan for that in the resulting output. */ > +/* { dg-final { scan-assembler-not "li\\ta0,0" } } */ > +