Consider the program that follows (you can cut & paste it into a shell to get
foo.s).  Functions A and B are mathematically identical on the reals.

On Mac OS X 10.5 with gcc 4.4.1 at -O2, A and B compile differently.  In the
assembly, A squares z, multiplies by y, and subtracts the result from x --
precisely what the code says.  B loads a -1, XORs it with y to get -y, then
multiplies by z, then by z again, and adds the result to x -- so it computes
(-1) * y * z * z + x, which is a bit slower.  Worse, reading a constant from
memory means futzing with PIC setup.

By comparison, in function C the a + (-b) is converted to a plain subtraction,
so this doesn't seem to be about IEEE special cases.  I've also tested with all
the -f options that permit assuming special cases won't arise, to no effect.

The behaviour is identical with gcc-4.2.1.

cat > foo.c << EOF
double A (double x, double y, double z) {
  return x - y * z * z;
}
double B (double x, double y, double z) {
  return x + (-y * z * z);
}
double C (double a, double b) {
  return a + (-b);
}
EOF
gcc -O2 -fomit-frame-pointer -S foo.c
cat foo.s


-- 
           Summary: missed optimization: x + (-y * z * z) => x - y * z * z
           Product: gcc
           Version: 4.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: benoit dot hudson at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40921
