https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80586

            Bug ID: 80586
           Summary: vsqrtss with AVX should avoid a dependency on the
                    destination register.
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: peter at cordes dot ca
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

#include <math.h>
float sqrt_depcheck(float a, float b) {
    return sqrtf(b);
}

compiles to (with gcc 8.0.0 20170429, -march=haswell -O3 -fno-math-errno):

  vsqrtss %xmm1, %xmm0, %xmm0
  ret


recent clang (4.0) avoids the unwanted dependency on %xmm0 by using the source
register as *both* source operands:

  vsqrtss %xmm1, %xmm1, %xmm0
  ret


This of course doesn't work when the source is a different type (e.g. memory,
or an integer register for int->float conversion).  See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80571 for a suggestion to track
cold registers that can safely be used read-only without delaying OOO
execution, so that vxorps-zeroing isn't needed everywhere.
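
For the int->float case mentioned above, a test function in the same style as
sqrt_depcheck might look like the sketch below.  The name cvt_depcheck is
hypothetical and not part of the original report.  The conversion instruction
(vcvtsi2ss) takes its input from an integer register, so that register cannot
also be used as the XMM merge operand.  What a given compiler version actually
emits for it is not shown here and should be checked on the compiler of
interest.

```c
/* Hypothetical test case, modelled on sqrt_depcheck from the report.
 * The first argument is deliberately unused: it only occupies %xmm0 so
 * that any dependency on the destination register is visible in the asm. */
float cvt_depcheck(float a, int i) {
    (void)a;            /* silence unused-parameter warnings */
    return (float)i;    /* int->float conversion from an integer register */
}
```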


float sqrt_from_mem(float *fp) {
    return sqrtf(*fp);
}

ICC17 breaks the dep on xmm0 this way:
        vmovss    (%rdi), %xmm0                                 #8.12
        vsqrtss   %xmm0, %xmm0, %xmm0                           #8.12

gcc and clang both decide to risk it with:
       vsqrtss (%rdi), %xmm0, %xmm0
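
As a rough source-level sketch of the ICC-style sequence (separate scalar load,
then sqrt with the same register as both inputs), the same computation can be
written with SSE intrinsics.  The function name is hypothetical, and this is
only an illustration: the compiler remains free to fold the load back into a
memory operand, so it does not guarantee the asm shown for ICC17.

```c
#include <xmmintrin.h>

/* Hypothetical name; not from the original report. */
float sqrt_from_mem_separate_load(const float *fp) {
    __m128 v = _mm_load_ss(fp);   /* scalar load: low lane = *fp, upper lanes zeroed */
    v = _mm_sqrt_ss(v);           /* sqrt of the low lane; upper lanes copied from v */
    return _mm_cvtss_f32(v);      /* extract the low lane as a float */
}
```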


code on https://godbolt.org/g/mJmjdh.
