https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #47 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The testcase doesn't even have to be floating point; say, on x86 with -O3 -mavx512f,
void
foo (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      int t;
      if (a < 0)
        t = 1;
      else if (a < e)
        t = 1 - a * d;
      else
        t = 0;
      f[i] = t;
    }
}
shows similar problems. Strangely, for
void
foo (int *f, int d, int e)
{
  if (e < 32 || e > 64)
    __builtin_unreachable ();
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      f[i] = (a < 0 ? 1 : 1 - a * d) * (a < e ? 1 : 0);
    }
}
the threader doesn't do what it does for the floating point code, and we use
just 2 comparisons rather than 3 (or more). Still, there is only one
multiplication, not 2. Strangely, in that case the second multiplication is
still there until vrp2, which folds it into oblivion using the
/* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
   unless the target has native support for the former but not the latter. */
match.pd pattern, among others.
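
For reference, a scalar sketch of the identity that the match.pd comment
describes: multiplying by a value known to be 0 or 1 gives the same result as
ANDing with the negated value (0 or -1). The helper names mul_by_flag,
and_with_mask and check_identity are made up for illustration and are not
GCC internals; the real pattern operates on vector constants in GIMPLE, so
treat this only as a check of the arithmetic under two's complement.

```c
#include <assert.h>

/* Hypothetical helper: the original form, x times a 0/1 flag.  */
int
mul_by_flag (int x, int flag)
{
  return x * flag;
}

/* Hypothetical helper: the transformed form.  -flag is 0 or -1 (all bits
   set), so the AND yields either 0 or x unchanged.  */
int
and_with_mask (int x, int flag)
{
  return x & -flag;
}

/* Compare both forms over a small range of x and both flag values.
   Returns the number of mismatches found.  */
int
check_identity (void)
{
  int mismatches = 0;
  for (int x = -1000; x <= 1000; x++)
    for (int flag = 0; flag <= 1; flag++)
      if (mul_by_flag (x, flag) != and_with_mask (x, flag))
        mismatches++;
  return mismatches;
}
```

Calling check_identity () from a small main should report no mismatches. The
identity only holds when the multiplier really is restricted to 0 or 1, as in
the (a < e ? 1 : 0) operand of the second testcase.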