https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77453
Alexander Monakov <amonakov at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |amonakov at gcc dot gnu.org
--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
You won't gain much speed-wise from inlining scalar evaluation of cbrtf.
However, if the structure of your code allows it, you can get a good speedup
from vectorized evaluation of cbrtf for 4 arguments at once. GCC can emit
vectorized cbrtf calls with -O3 -funsafe-math-optimizations -mveclibabi=svml.