https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33027
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED |NEW
     Ever confirmed|0           |1
   Last reconfirmed|            |2021-07-19
           Keywords|            |missed-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
-O2 produces much better code than -O3 or -O3 -fno-tree-vectorize.
And we even optimize a slightly different case at -O2 where 1 is replaced with
any other value:
unsigned int fn(unsigned int n, unsigned int dmax) throw()
{
  for (unsigned int d = 0; d < dmax; ++d) {
    n += d?d:55;
  }
  return n;
}
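For reference, the loop above has a simple closed form, which is one way to check that whatever code gets emitted is correct. The sketch below is only an illustration and not a claim about what GCC actually generates at any level. The names fn_loop and fn_closed are hypothetical, throw() is swapped for noexcept so it also builds as C++20, constexpr is added so the results can be checked at compile time, and unsigned int is assumed to be 32 bits:

```cpp
#include <cstdint>

// Loop form, same body as the testcase above.
constexpr unsigned int fn_loop(unsigned int n, unsigned int dmax) noexcept
{
  for (unsigned int d = 0; d < dmax; ++d) {
    n += d ? d : 55;
  }
  return n;
}

// Hypothetical closed form: the d == 0 iteration adds 55, the remaining
// iterations add 1 + 2 + ... + (dmax - 1).
constexpr unsigned int fn_closed(unsigned int n, unsigned int dmax) noexcept
{
  if (dmax == 0)
    return n;
  // Computed in 64 bits so the product cannot overflow (assumes a 32-bit
  // unsigned int); truncating afterwards matches the loop's wrap-around.
  std::uint64_t tri = static_cast<std::uint64_t>(dmax) * (dmax - 1) / 2;
  return n + 55u + static_cast<unsigned int>(tri);
}
```

Both forms agree modulo 2^32 because unsigned arithmetic wraps, so the truncation of the 64-bit sum is well defined.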
Note that GCC even seems to produce better code than LLVM for both cases.
In particular, at -O3 with the constant 1 on aarch64, GCC produces a umax
instruction while LLVM produces a cmeq/bsl pair :).
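As a side note on why umax fits the constant-1 case: for an unsigned d, d?d:1 is the same value as the unsigned maximum of d and 1, since only d == 0 is below 1. A minimal sketch of that identity follows; the names select_form and max_form are hypothetical, and this says nothing about how either compiler arrives at its code:

```cpp
#include <algorithm>

// The select as written in the constant-1 variant of the loop body.
constexpr unsigned int select_form(unsigned int d) noexcept
{
  return d ? d : 1u;
}

// The same value expressed as an unsigned maximum; d == 0 is the only
// input smaller than 1, and it maps to 1 in both forms.
constexpr unsigned int max_form(unsigned int d) noexcept
{
  return std::max(d, 1u);
}
```

The identity is specific to the constant 1; with 55 in its place, d?d:55 and the maximum of d and 55 differ for d between 1 and 54.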