https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 29 Nov 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712
>
> --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Before the first revision mentioned above *.optimized dump contained just t *
> v, the second one doesn't change anything in *.optimized and is a RTL costing
> matter.
>   _4 = (unsigned int) t_1(D);
>   _10 = _4 + 4294967295;
>   _8 = (int) _10;
>   _13 = v_3(D) * _8;
>   x_5 = v_3(D) + _13;
> and can be seen even on simpler:
> int
> foo (int t, int v)
> {
>   t = t - 1U;
>   v *= t;
>   return v + t;
> }
> which we don't optimize at GIMPLE level.  We don't optimize even:
> int
> bar (int t, int v)
> {
>   t = t - 1;
>   v *= t;
>   return v + t;
> }
> Rather than hoping it is optimized during combine (the change there was that
> while combining b=a-1 into c=b*d we attempted c=a*d-d we now attempt c=(a-1)*d
> and similarly for the 3 insn combination with e=c+d, where we attempted and
> succeeded to combine that into e=a*d while now we attempt and fail
> e=(a-1)*d+d:
> -Successfully matched this instruction:
> +Failed to match this instruction:
>  (parallel [
>          (set (reg/v:SI 91 [ <retval> ])
> -            (mult:SI (reg/v:SI 92 [ t ])
> +            (plus:SI (mult:SI (plus:SI (reg/v:SI 92 [ t ])
> +                        (const_int -1 [0xffffffffffffffff]))
> +                    (reg/v:SI 93 [ v ]))
>                  (reg/v:SI 93 [ v ])))
>          (clobber (reg:CC 17 flags))
>      ])
> ), I think it would be useful to optimize this in match.pd, plus maybe teach
> simplify-rtx.c to handle this.

I think fold_plusminus_mult_expr does this on GENERIC, but it was
dumbed down at some point because of undefined overflow issues.
Because for (t - 1) * v + v, (t - 1) might be zero and with
large v t * v might overflow (or so, need to track down history
of that function).

Richard.
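
For reference, a minimal sketch of the identity under discussion, restricted to wrapping (unsigned) arithmetic, where (t - 1) * v + v and t * v agree modulo 2^32. It says nothing about the signed, undefined-overflow case that the fold_plusminus_mult_expr concern is about. The function names here are hypothetical and are not taken from GCC or from the testcase in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unfolded shape from the report, computed in uint32_t so every step
   wraps modulo 2^32 instead of invoking undefined behavior. */
uint32_t unfolded_wrapping(uint32_t t, uint32_t v)
{
    return (uint32_t)((t - 1u) * v + v);
}

/* Folded shape that combine used to produce: a single multiply. */
uint32_t folded_wrapping(uint32_t t, uint32_t v)
{
    return (uint32_t)(t * v);
}

/* Compare both shapes on a few edge values; returns the number of
   (t, v) pairs on which they disagree. */
int count_fold_mismatches(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 3u, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
    };
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (unfolded_wrapping(samples[i], samples[j])
                != folded_wrapping(samples[i], samples[j]))
                mismatches++;
    return mismatches;
}

/* Convenience wrapper: aborts if any sampled pair disagrees. */
void check_fold_identity(void)
{
    assert(count_fold_mismatches() == 0);
}
```

Calling check_fold_identity() from a test driver is enough to exercise it. The sampled values are only a spot check, not a proof; the identity itself follows from distributivity in arithmetic modulo 2^32.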