https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77453
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
You won't gain much speed-wise from inlining scalar evaluation of cbrtf.
However, if the structure of your code allows that, you can get a good speedup
from vectorized evaluation of cbrtf for 4 arguments at once. GCC can emit
vectorized cbrtf calls with -O3 -funsafe-math-optimizations -mveclibabi=svml.