https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #11 from Erik Schnetter <schnetter at gmail dot com> ---
The number of active local variables is likely much larger than the number of
registers, and I expect there to be a lot of spilling. I hope that the compiler
is clever about changing the order in which expressions are evaluated to reduce
spilling as much as possible.

Because the loop is so large, I split it into two, each calculating about half
of the output variables. The code here looks at one of the loops. To simplify
the code, each loop still loads all variables (via masked loads), but may not
use all of them. The unused masked loads do not surprise me per se, but I
expect the compiler to remove them.
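
To make the loop structure described above concrete, here is a minimal sketch. It is not the code from this report: the names (masked_load, masked_store, loop1, loop2, kLanes) are hypothetical, the arithmetic is a placeholder, and the masked loads are emulated in scalar C++ instead of using vector intrinsics. It only shows the pattern: both loops load every input, but each uses only some of them, so one load per loop is dead.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 8;  // hypothetical vector width
using Vec = std::array<double, kLanes>;

// Scalar stand-in for a masked vector load: lanes past the end read as 0.
Vec masked_load(const std::vector<double>& src, std::size_t i) {
    Vec v{};
    for (std::size_t l = 0; l < kLanes; ++l)
        if (i + l < src.size()) v[l] = src[i + l];
    return v;
}

// Scalar stand-in for a masked vector store: lanes past the end are skipped.
void masked_store(std::vector<double>& dst, std::size_t i, const Vec& v) {
    for (std::size_t l = 0; l < kLanes; ++l)
        if (i + l < dst.size()) dst[i + l] = v[l];
}

// First half of the split loop: loads a, b and c, but only uses a and b.
void loop1(const std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, std::vector<double>& out1) {
    for (std::size_t i = 0; i < out1.size(); i += kLanes) {
        Vec va = masked_load(a, i);
        Vec vb = masked_load(b, i);
        Vec vc = masked_load(c, i);  // unused here: a candidate for removal
        (void)vc;
        Vec r{};
        for (std::size_t l = 0; l < kLanes; ++l) r[l] = va[l] + vb[l];
        masked_store(out1, i, r);
    }
}

// Second half of the split loop: loads a, b and c, but only uses a and c.
void loop2(const std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, std::vector<double>& out2) {
    for (std::size_t i = 0; i < out2.size(); i += kLanes) {
        Vec va = masked_load(a, i);
        Vec vb = masked_load(b, i);  // unused here: a candidate for removal
        (void)vb;
        Vec vc = masked_load(c, i);
        Vec r{};
        for (std::size_t l = 0; l < kLanes; ++l) r[l] = va[l] * vc[l];
        masked_store(out2, i, r);
    }
}
```

In this scalar emulation the unused loads have no side effects, so removing them does not change the results. Whether the same holds for the masked load intrinsics in the real code is the question raised in this report.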
