Consider the program that follows (you can cut & paste into a shell to get foo.s). Functions A and B are mathematically identical on the reals.
On Mac OS X 10.5, gcc version 4.4.1, with -O2, A and B compile differently. In the assembly, A squares z, multiplies by y, and subtracts from x -- precisely what the code says. B loads a constant from memory, XORs it with y to get -y, multiplies by z, then by z again, and adds to x -- so it computes (-1) * y * z * z + x, which is a bit slower. Worse, reading a constant from memory means futzing with PIC setup.

By comparison, in function C the + (-b) is converted to a plain subtraction, so this doesn't seem to be about IEEE special cases. I've also tested with all the -f options that have to do with assuming special cases won't arise, to no effect. The behaviour is identical on gcc-4.2.1.

cat > foo.c << EOF
double A (double x, double y, double z) { return x - y * z*z ; }
double B (double x, double y, double z) { return x + (-y * z*z); }
double C (double a, double b) { return a + (-b); }
EOF
gcc -O2 -fomit-frame-pointer -S foo.c
cat foo.s

--
Summary: missed optimization: x + (-y * z * z) => x - y * z * z
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: benoit dot hudson at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40921