https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98299

--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
At the very least, it seems like a worthwhile pattern to recognize at -O3, even
if only to avoid vectorizing the loop, i.e. to get the same effect as adding
`if (n >= 1000) __builtin_unreachable();` to the start of f1.
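
For illustration, here is a minimal sketch of what that hint could look like.
The body of f1 is not quoted in this comment, so the subtract-in-a-loop shape
and the name f1_hinted below are assumptions, not the original test case.

```c
/* Hypothetical reconstruction: the comment does not show f1 itself. */
unsigned f1_hinted(unsigned n)
{
    /* Promise the compiler that n < 1000 on entry (GCC/Clang builtin).
     * Calling this with n >= 1000 is undefined behaviour. */
    if (n >= 1000)
        __builtin_unreachable();

    /* Assumed loop shape: reduce n by repeated subtraction. */
    while (n >= 1000)
        n -= 1000;
    return n;
}
```

With the hint in place the loop condition can never hold on entry, so the
compiler has no reason to emit a vectorized loop for it.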

Altogether, though, it seems unlikely that the modulo would be costlier than
the loop except in very narrow cases, since the modulo is itself optimized into
a multiplication and a few other cheap operations.
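
As a rough sketch of why the modulo is cheap, a remainder by the constant 1000
can be written as one multiplication, one shift, one more multiplication and a
subtraction. The function names are hypothetical, and the constant 274877907
(the ceiling of 2^38 / 1000) was worked out by hand for this example rather
than copied from GCC's output, which may differ in detail.

```c
#include <stdint.h>

/* Reduction written with the remainder operator. */
uint32_t mod1000(uint32_t n)
{
    return n % 1000u;
}

/* The same reduction with the division spelled out as multiply + shift.
 * 274877907 == ceil(2^38 / 1000); the rounding error is small enough that
 * the quotient is exact for every 32-bit n. */
uint32_t mod1000_mulshift(uint32_t n)
{
    uint32_t q = (uint32_t)(((uint64_t)n * 274877907u) >> 38); /* n / 1000 */
    return n - q * 1000u;                                      /* n % 1000 */
}
```

Both functions should agree for every 32-bit input; the second one only makes
the multiply-and-shift explicit.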

Also, the transformation into a modulo seems to occur in the vectorized version
too, though it is weirdly optimized.
