pr108692.c fails on LoongArch

xry111 at gcc dot gnu.org via Gcc-bugs Mon, 03 Feb 2025 04:21:26 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727


--- Comment #16 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #15)
> (In reply to Xi Ruoyao from comment #13)
> > For example for the original gcc.dg/pr108692.c:
> > 
> >   a.0_4 = (unsigned char) a_14;
> >   _5 = (int) a.0_4;
> >   b.1_6 = (unsigned char) b_16;
> >   _7 = (int) b.1_6;
> >   c_17 = _5 - _7; 
> >   _8 = ABS_EXPR <c_17>;
> >   r_18 = _8 + r_23;
> > 
> > becomes:
> > 
> >   patt_31 = .ABD (a.0_4, b.1_6);
> > 
> > so when we get the ABD we already stripped the redundant (int) conversion
> > here, note that the (unsigned char) conversion is really needed.  And for my
> > twisted test case:
> > 
> >   a.0_4 = (unsigned char) a_14;
> >   _5 = (unsigned int) a.0_4;
> >   _6 = (unsigned int) b_16;
> >   _7 = _5 - _6; 
> >   c_17 = (int) _7; 
> >   _8 = ABS_EXPR <c_17>;
> > 
> > becomes:
> > 
> >   patt_38 = (unsigned short) a.0_4;
> >   patt_35 = (signed short) b_16;
> >   patt_36 = (signed short) patt_38;
> >   patt_33 = .ABD (patt_36, patt_35);
> > 
> > To me this is already the "optimal" way.  Correct me if I'm wrong, but
> > anyway even if there's a redundant conversion it should be stripped in
> > vect_recog_abd_pattern instead of vect_recog_sad_pattern.
> 
> No, that's not correct. SAD_EXPR is the most optimal version here as
> SAD_EXPR performs double widening.  It removes the need for the intermediate
> vector widening.
> 
>   a.0_4 = (unsigned char) a_14;
>   patt_b_16 = (unsigned char) b_16
>   patt_41 = SAD_EXPR <a.0_4, patt_b_16, r_23>;
> 
> is the most optimal form, as it does the operation on bytes rather than
> short.

I don't think it's correct...  In the twisted test case we have something like

r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b)

So if a is 255 and b is 128 (i.e. -128 once viewed as a signed char), we
should add abs(255 - (-128)) = 383 to r, but with this form we'll add
abs(255 - 128) = 127 instead.
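
To make the arithmetic concrete, here is a minimal C sketch of the two
computations for one pair of byte values.  The function names are mine (they
are not from the testcase or from GCC), and the sign extension is spelled out
by hand so the result does not depend on implementation-defined narrowing
conversions.  It only models the per-element value added to r, not the
vectorizer itself.

```c
#include <stdlib.h>

/* Hypothetical helper: per-element value of the twisted test case.
   a is zero-extended (it goes through unsigned char), while b is
   sign-extended (it is used as a signed char).  */
int
twisted_scalar_step (unsigned char a_bits, unsigned char b_bits)
{
  int za = (int) a_bits;                          /* 0 .. 255 */
  int sb = b_bits >= 128 ? (int) b_bits - 256     /* -128 .. -1 */
                         : (int) b_bits;          /* 0 .. 127 */
  return abs (za - sb);
}

/* Hypothetical helper: per-element value if both operands are treated
   as unsigned char, as in the proposed SAD_EXPR form.  */
int
sad_on_unsigned_bytes_step (unsigned char a_bits, unsigned char b_bits)
{
  return abs ((int) a_bits - (int) b_bits);
}
```

Calling both helpers with a = 255 and b = 128 gives 383 for the first and 127
for the second, which is the mismatch described above.  The two agree whenever
b is in 0..127, so the difference only shows up for b values with the top bit
set.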

[Bug tree-optimization/118727] [15 Regression] gcc.dg/pr108692.c fails on LoongArch
