[Bug tree-optimization/103417] [12 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r12-5489

tnfchris at gcc dot gnu.org via Gcc-bugs Wed, 24 Nov 2021 18:46:33 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103417


--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #4)
> Created attachment 51870 [details]
> gcc12-pr103417.patch
> 
> Untested fix.  Handling GE in that simplification is clearly bogus, we
> should just fold it to true elsewhere, not bother with it (it doesn't handle
> LT either,
> which should also fold to false elsewhere).

Indeed, that one is wrong.

> Handling LE and GT there isn't wrong, but makes no sense.  Elsewhere we
> canonicalize x > 0U into x != 0U and x <= 0U into x == 0U and for signed it
> was handling only EQ and NE already before.

Well, the intention is to simplify the bitmask. Most vector ISAs can create the
simple bitmask much more easily than the complex one, i.e. 0xFFFFFF00 is much
harder to create than 0xFF. For scalar code, yes, it doesn't matter much.

But consider, e.g.:

     for (int i = 0; i < (n & -16); i++)
       x[i] = (x[i]&(~255)) <= 0U;

generates worse code when the mask 0xFFFFFF00 has to be used.  The patch mainly
addresses vector code, but we added the scalar case for uniformity.
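
To make the equivalence concrete, here is a minimal C sketch. It is not taken
from the patch, and the helper names are hypothetical. It assumes 32-bit
unsigned values and only checks, on a few sample inputs, that the
mask-and-compare forms for LE and GT agree with a plain compare against 0xFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names; this is not code from GCC or the patch. */

/* Mask-and-compare forms, as in the loop body: these need the wide
   mask 0xFFFFFF00. */
static int masked_le_zero(uint32_t x) { return (x & ~UINT32_C(255)) <= 0u; }
static int masked_gt_zero(uint32_t x) { return (x & ~UINT32_C(255)) > 0u; }

/* Plain compares against the small constant 0xFF. */
static int le_low_mask(uint32_t x) { return x <= UINT32_C(255); }
static int gt_low_mask(uint32_t x) { return x > UINT32_C(255); }

/* Count the sample inputs on which the two forms disagree. */
int count_mismatches(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 255u, 256u, 257u,
        0x7FFFFFFFu, 0x80000000u, 0xFFFFFF00u, 0xFFFFFFFFu
    };
    int bad = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t x = samples[i];
        if (masked_le_zero(x) != le_low_mask(x))
            bad++;
        if (masked_gt_zero(x) != gt_low_mask(x))
            bad++;
    }
    return bad;
}
```

For unsigned x, (x & ~255) is zero exactly when no bit above the low eight is
set, i.e. when x <= 255, so the two forms should agree on every input and not
only on the samples listed.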

So I would like LE and GT to stay, at the very least for vector where it makes
a difference.  It's not something we can fix in the backend because we can't
differentiate between signed and unsigned.
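
As a hedged illustration of why signedness matters for this comparison (a
sketch with hypothetical helper names, not a description of what any GCC
backend does): the same 32-bit pattern can give different answers for
(x & ~255) <= 0 depending on whether it is read as signed or unsigned.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Unsigned reading: true only for x in 0..255. */
static int le_zero_unsigned(uint32_t x) { return (x & ~UINT32_C(255)) <= 0u; }

/* Signed reading of the same bits: negative values also compare <= 0. */
static int le_zero_signed(int32_t x) { return (x & ~INT32_C(255)) <= 0; }

/* Returns 1 when the two readings of the same bit pattern disagree. */
int signedness_matters(int32_t x)
{
    return le_zero_signed(x) != le_zero_unsigned((uint32_t)x);
}
```

For example, with x = -1 the signed form sees -256 <= 0 (true), while the
unsigned form sees 0xFFFFFF00 <= 0 (false).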
