https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712

--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 29 Nov 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712
> 
> --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Before the first revision mentioned above, the *.optimized dump contained
> just t * v; the second one doesn't change anything in *.optimized and is an
> RTL costing matter.
>   _4 = (unsigned int) t_1(D);
>   _10 = _4 + 4294967295;
>   _8 = (int) _10;
>   _13 = v_3(D) * _8;
>   x_5 = v_3(D) + _13;
> and can be seen even on simpler:
> int
> foo (int t, int v)
> {
>   t = t - 1U;
>   v *= t;
>   return v + t;
> }
> which we don't optimize at GIMPLE level.  We don't optimize even:
> int
> bar (int t, int v)
> {
>   t = t - 1;
>   v *= t;
>   return v + t;
> }
> Rather than hoping it is optimized during combine (the change there was that
> while combining b=a-1 into c=b*d we attempted c=a*d-d we now attempt c=(a-1)*d
> and similarly for the 3 insn combination with e=c+d, where we attempted and
> succeeded to combine that into e=a*d while now we attempt and fail 
> e=(a-1)*d+d:
> -Successfully matched this instruction:
> +Failed to match this instruction:
>  (parallel [
>          (set (reg/v:SI 91 [ <retval> ])
> -            (mult:SI (reg/v:SI 92 [ t ])
> +            (plus:SI (mult:SI (plus:SI (reg/v:SI 92 [ t ])
> +                        (const_int -1 [0xffffffffffffffff]))
> +                    (reg/v:SI 93 [ v ]))
>                  (reg/v:SI 93 [ v ])))
>          (clobber (reg:CC 17 flags))
>      ])
> ), I think it would be useful to optimize this in match.pd, plus maybe teach
> simplify-rtx.c to handle this.

I think fold_plusminus_mult_expr does this on GENERIC, but it was dumbed
down at some point because of undefined-overflow issues: for
(t - 1) * v + v, (t - 1) might be zero, and with large v, t * v
might overflow (or something like that; I need to track down the history
of that function).
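
To make the overflow concern concrete, here is a minimal sketch (not the
actual fold_plusminus_mult_expr logic). It assumes GCC or Clang for the
__builtin_*_overflow helpers; all function names are hypothetical and exist
only for illustration. It checks that the two forms agree under wrapping
unsigned arithmetic, and reports for a few sample inputs which signed
evaluation steps would overflow. The last pair of helpers covers the related
shape a * b + a -> a * (b + 1), where the rewritten form can overflow on
inputs for which the source form does not.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps, so both forms always agree here. */
static unsigned wrapping_factored(unsigned t, unsigned v) { return (t - 1u) * v + v; }
static unsigned wrapping_folded(unsigned t, unsigned v) { return t * v; }

/* True if any step of the signed evaluation of (t - 1) * v + v overflows. */
static bool original_overflows(int t, int v)
{
  int tm1, prod, sum;
  return __builtin_sub_overflow(t, 1, &tm1)
         || __builtin_mul_overflow(tm1, v, &prod)
         || __builtin_add_overflow(prod, v, &sum);
}

/* True if the signed evaluation of the folded form t * v overflows. */
static bool folded_overflows(int t, int v)
{
  int prod;
  return __builtin_mul_overflow(t, v, &prod);
}

/* Related shape: source form a * b + a. */
static bool ab_plus_a_overflows(int a, int b)
{
  int prod, sum;
  return __builtin_mul_overflow(a, b, &prod)
         || __builtin_add_overflow(prod, a, &sum);
}

/* Related shape: rewritten form a * (b + 1). */
static bool a_times_b_plus_1_overflows(int a, int b)
{
  int bp1, prod;
  return __builtin_add_overflow(b, 1, &bp1)
         || __builtin_mul_overflow(a, bp1, &prod);
}

static void check_examples(void)
{
  /* Wrapping arithmetic: the identity holds. */
  assert(wrapping_factored(5u, 7u) == wrapping_folded(5u, 7u));

  /* t - 1 == 0 with a large v: neither form overflows for this input. */
  assert(!original_overflows(1, INT_MAX));
  assert(!folded_overflows(1, INT_MAX));

  /* a == 0, b == INT_MAX: a * b + a is fine, but b + 1 overflows. */
  assert(!ab_plus_a_overflows(0, INT_MAX));
  assert(a_times_b_plus_1_overflows(0, INT_MAX));
}
```

Calling check_examples() from a test driver runs the sample cases; they are
spot checks on a handful of inputs, not a proof about which folds are safe.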

Richard.
