On Fri, Oct 20, 2017 at 04:10:15AM +0000, Michael Collison wrote: > This patch fixes an ICE on x86 because we were not constraining the operands > of a recognized insn. Bootstrapped and tested on aarch64-none-linux-gnu and > x86_64. > > Also successfully compiled the failing test cases in 82597 and duplicate > 82592. > > Ok for trunk? > > 2017-10-18 Michael Collison <michael.colli...@arm.com> > > PR rtl-optimization/82597 > * compare-elim.c: (try_validate_parallel): Constrain operands > of recognized insn.
That just duplicates more of insn_invalid_p. I wonder if we don't want to do something like the following instead (untested so far). The insn_uid decrement and ggc_free can be left out if deemed unnecessary, I don't have an idea how many try_validate_parallel calls fail in real-world. More importantly, I still don't really like the df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN); part of the earlier changes, that is very expensive and I doubt it is really necessary. You only use the UD/DU chains to find from the comparison insn the previous setter of the in_a in the same basic block (if any), but you walk before that the basic block from HEAD to END in find_comparison_dom_walker::before_dom_children. So, wouldn't it be cheaper to track during that walk the last setter insn of each hard register e.g. in an array indexed by REGNO (or perhaps track only selected simple single_set arith instructions) and remember for comparisons of a hard reg with const0_rtx in struct comparison next to prev_clobber field also the reg_setter? After all, that is the way prev_clobber is computed, except in that case it is just a single register (CC) we track it for. For many targets that is all we need (if all the arith instructions clobber flags), on say x86_64/i686 there are only very few exceptions (e.g. lea, but that is typically used in a way that makes it impossible to merge). On others such as aarch64 (or arm or both?) not, so we need to track more. The pass has other weirdnesses, e.g. for each insn: /* Compute the set of registers modified by this instruction. */ bitmap_clear (killed); df_simulate_find_defs (insn, killed); ... /* Notice if this instruction kills the flags register. */ if (bitmap_bit_p (killed, targetm.flags_regnum)) { Given df_simulate_find_defs is: df_ref def; FOR_EACH_INSN_DEF (def, insn) bitmap_set_bit (defs, DF_REF_REGNO (def)); this is quite expensive way of doing this with a completely useless bitmap. If we throw away the first two stmts and replace bitmap_bit_p with: df_ref def; FOR_EACH_INSN_DEF (def, insn) if (DF_REF_REGNO (def) == targetm.flags_regnum) { ... break; we'll save all the bitmap handling. 2017-10-31 Jakub Jelinek <ja...@redhat.com> PR rtl-optimization/82778 * compare-elim.c (try_validate_parallel): Use insn_invalid_p instead of recog_memoized. Return insn rather than just the pattern. * g++.dg/opt/pr82778.C: New test. --- gcc/compare-elim.c.jj 2017-10-19 09:08:17.000000000 +0200 +++ gcc/compare-elim.c 2017-10-31 09:39:26.936696234 +0100 @@ -625,13 +625,18 @@ can_merge_compare_into_arith (rtx_insn * static rtx try_validate_parallel (rtx set_a, rtx set_b) { - rtx par - = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set_a, set_b)); + rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set_a, set_b)); + rtx_insn *insn = make_insn_raw (par); - rtx_insn *insn; - insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, par, 0, -1, 0); + if (insn_invalid_p (insn, false)) + { + crtl->emit.x_cur_insn_uid--; + ggc_free (insn); + ggc_free (par); + return NULL_RTX; + } - return recog_memoized (insn) > 0 ? par : NULL_RTX; + return insn; } /* For a comparison instruction described by CMP check if it compares a @@ -711,7 +716,7 @@ try_merge_compare (struct comparison *cm } if (!apply_change_group ()) return false; - emit_insn_after (par, def_insn); + emit_insn_after_setloc (par, def_insn, INSN_LOCATION (def_insn)); delete_insn (def_insn); delete_insn (cmp->insn); return true; --- gcc/testsuite/g++.dg/opt/pr82778.C.jj 2017-10-31 09:01:25.025934660 +0100 +++ gcc/testsuite/g++.dg/opt/pr82778.C 2017-10-31 09:00:08.000000000 +0100 @@ -0,0 +1,37 @@ +// PR rtl-optimization/82778 +// { dg-do compile } +// { dg-options "-O2" } + +template <typename a, int b> struct c { + typedef a d[b]; + static a e(d f, int g) { return f[g]; } +}; +template <typename a, int b> struct B { + typedef c<a, b> h; + typename h::d i; + long j; + a at() { return h::e(i, j); } +}; +int k, m, r, s, t; +char l, n, q; +short o, p, w; +struct C { + int u; +}; +B<C, 4> v; +void x() { + if (((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) + s = ((-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) < 0) / + (-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) ^ + -25 & o) && + p) >> + (0 <= 0 + ? 0 || + (-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) < + 0) / + (-(((p > (q ? v.at().u : k)) >> l - 226) + + !(n ^ r * m)) ^ -25 & o) + : 0); + w = (p > (q ? v.at().u : k)) >> l - 226; + t = !(n ^ r * m); +} Jakub