The current powerpc -mrecip support can be improved in several areas.

1) The compiler generates a conditional test to deal with NaN, etc., even though -mrecip is only used with -ffast-math, where the programmer declares that they are not interested in NaN/infinity/etc. This test forces the code to single-thread the approximation, and does not allow the power7 to issue two streams of FRSQRTES and fixup in parallel.
2) Right now, -mrecip only enables the single precision reciprocal square root estimate. The compiler should be modified so that it also supports the double precision reciprocal square root estimate on newer machines (power6/power7) that have a high enough precision FRSQRTE instruction.

3) In addition to optimizing 1/sqrt(x), the compiler should be able to optimize normal divisions using the FRE/FRES instructions. Machines with higher precision FRE instructions should do fewer passes in the fixup than the older machines.

4) On Altivec and VSX systems, the compiler should enable auto-vectorization of the estimate instructions and fixup.

5) The compiler should provide more builtins to allow the user to access the double precision reciprocal square root estimate, as well as vector versions.

-- 
           Summary: Improve powerpc -mrecip support
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: meissner at gcc dot gnu dot org
        ReportedBy: meissner at gcc dot gnu dot org
 GCC build triplet: powerpc64-unknown-linux-gnu
  GCC host triplet: powerpc64-unknown-linux-gnu
GCC target triplet: powerpc64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44218
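
For reference, the estimate-plus-fixup sequence that items 1) and 3) talk about can be sketched in portable C. This is only an illustration of the usual Newton-Raphson refinement, not the code GCC emits. The function names (rsqrt_fixup, div_fixup, rsqrt_array) are hypothetical, and the y0/est arguments are assumed stand-ins for the value an FRSQRTE/FRE style estimate instruction would return. The "steps" argument models the number of fixup passes, which a more precise estimate would allow to be smaller.

```c
#include <assert.h>
#include <stddef.h>

/* Refine an estimate y0 of 1/sqrt(x) with Newton-Raphson steps:
   y' = y * (1.5 - 0.5 * x * y * y).
   y0 is an assumed stand-in for the result of an estimate instruction.
   There is no NaN/infinity/zero test here, as under -ffast-math. */
double rsqrt_fixup(double x, double y0, int steps)
{
    double y = y0;
    for (int i = 0; i < steps; i++)
        y = y * (1.5 - 0.5 * x * y * y);
    return y;
}

/* Approximate n / d by refining an estimate y0 of 1/d:
   y' = y * (2 - d * y), then multiply by the numerator. */
double div_fixup(double n, double d, double y0, int steps)
{
    double y = y0;
    for (int i = 0; i < steps; i++)
        y = y * (2.0 - d * y);
    return n * y;
}

/* Array form: every element is refined independently and the loop
   body has no conditional, so the iterations can be overlapped or
   vectorized. est[i] is the assumed estimate for in[i]. */
void rsqrt_array(double *out, const double *in, const double *est,
                 size_t n, int steps)
{
    assert(steps >= 0);
    for (size_t i = 0; i < n; i++)
        out[i] = rsqrt_fixup(in[i], est[i], steps);
}
```

Each step roughly squares the relative error of the estimate, so rsqrt_fixup(4.0, 0.51, 4) ends up very close to 0.5. A starting estimate with more good bits needs fewer steps to reach the same accuracy, which is the point of item 3).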