Right now, the compiler has no special code to optimize pow (x, 0.75) into something like sqrt (x) * sqrt (sqrt (x)) under -ffast-math, nor pow (x, 0.25) into sqrt (sqrt (x)). On machines with a built-in sqrt instruction, it is often faster to do the calculation using sqrt than to call the pow function. In particular, x**0.75 shows up in the bwaves benchmark from SPEC 2006.
For sqrt (sqrt (x)) vs. pow (x, 0.25), I see:

    IBM power6:          9.2 times faster
    Intel core2 laptop:  5 times faster
    AMD K8 system:       3.2 times faster

For sqrt (x) * sqrt (sqrt (x)) vs. pow (x, 0.75), I see:

    IBM power6:          6.1 times faster
    Intel core2 laptop:  3.4 times faster
    AMD K8 system:       2.3 times faster

In addition, the compiler optimizes sqrt (sqrt (x)) into pow (x, 0.25), and performs similar optimizations. This should be fixed by adding a hook that says what the relative speed of sqrt and cbrt vs. pow is, so that the backend can control whether or not this optimization should be done. By default, the optimization is probably only useful if -Os is used on machines that have a hardware sqrt instruction.

-- 
           Summary: Compiler could optimize pow (x, 0.75) into sqrt (x) * sqrt (sqrt (x))
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: meissner at gcc dot gnu dot org
        ReportedBy: meissner at gcc dot gnu dot org
 GCC build triplet: powerpc64-unknown-linux-gnu
  GCC host triplet: powerpc64-unknown-linux-gnu
GCC target triplet: powerpc64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42694