http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56125
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Status    |NEW                         |ASSIGNED
     AssignedTo    |unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-01-28 10:27:19 UTC ---
Created attachment 29292
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29292
gcc48-pr56125.patch

The bug is that the last two pow optimizations (where 2c resp. 3c is a
non-zero integer) silently assume that the earlier optimizations have
already handled the cases where c itself (or 2c, for the last optimization)
is an integer.  That doesn't have to be the case: as the testcase shows,
the integer optimization is guarded by c being in [-1, 2] or by
optimization for speed.

Whether to guard this with && optimize_function_for_speed_p () is up for
discussion (or should that be optimize_insn_for_speed_p ()?  The pow
folding is inconsistent about this, and I don't see e.g. rtl_profile_for_bb
being called during this pass to make it accurate).  For example, on x86_64
with

__attribute__((cold)) double
foo (double x, double n)
{
  double u = __builtin_pow (x, -1.5);
  return u;
}

the guard gives us smaller code:

        movsd   .LC0(%rip), %xmm1
        jmp     pow

compared to the expansion emitted without it:

        sqrtsd  %xmm0, %xmm1
        mulsd   %xmm0, %xmm1
        movsd   .LC0(%rip), %xmm0
        divsd   %xmm1, %xmm0

The guarded version is 7 bytes shorter.
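For reference, here is a minimal standalone sketch of the identity the
sqrt-based expansion above relies on.  This is not code from the patch;
the function name pow_m1_5_expanded and the test value are illustrative
assumptions.  For c = -1.5, 2c = -3 is a non-zero integer, so for finite
x > 0, pow (x, -1.5) can be computed as 1.0 / (x * sqrt (x)):

/* Illustrative sketch, not GCC's folding code: the sqrt-based
   expansion of pow (x, -1.5), valid for finite x > 0.  */
#include <math.h>
#include <stdio.h>

static double
pow_m1_5_expanded (double x)
{
  double s = sqrt (x);     /* sqrt (x) = pow (x, 0.5) */
  return 1.0 / (x * s);    /* x * sqrt (x) = pow (x, 1.5) */
}

int
main (void)
{
  double x = 4.0;
  /* Both print 0.125, i.e. 4^-1.5 = 1/8.  */
  printf ("%g %g\n", pow (x, -1.5), pow_m1_5_expanded (x));
  return 0;
}

The sqrtsd/mulsd/divsd sequence in the second assembly listing is exactly
this computation; the question raised above is only whether emitting it is
worthwhile in cold code.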