https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88189

            Bug ID: 88189
           Summary: ix86_expand_sse_movcc and blend for scalars
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*

double f(double a,double b){
  return (a<0)?a:b;
}
typedef double vec __attribute__((vector_size(16)));
vec g(vec a,vec b){
  return (a<0)?a:b;
}

I am compiling with -O3, and the most interesting pass is ce1 with
noce_try_cmove. With -msse2, both functions generate similar code:

        andpd   %xmm2, %xmm0
        andnpd  %xmm1, %xmm2
        orpd    %xmm2, %xmm0
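
For reference, a minimal scalar model of what this three-instruction sequence computes for (a<0)?a:b. The report does not show the instruction that produces the mask; the sketch assumes it is an all-ones/all-zeros compare result (as a compare such as cmpltsd would give). The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern and back (hypothetical helpers). */
static uint64_t bits_of(double x) { uint64_t u; memcpy(&u, &x, sizeof u); return u; }
static double double_of(uint64_t u) { double x; memcpy(&x, &u, sizeof x); return x; }

/* Bitwise model of the and/andn/or select for (a < 0) ? a : b. */
double select_and_andn_or(double a, double b) {
  /* Assumption: the mask is all ones when a < 0, all zeros otherwise. */
  uint64_t mask = (a < 0) ? ~UINT64_C(0) : UINT64_C(0);
  uint64_t t = bits_of(a) & mask;   /* andpd:  keep a where the mask is set   */
  uint64_t e = ~mask & bits_of(b);  /* andnpd: keep b where the mask is clear */
  return double_of(t | e);          /* orpd:   merge the two halves           */
}
```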

With -mxop they also generate similar code:

        vpcmov  %xmm2, %xmm1, %xmm0, %xmm0

However, with -msse4 they differ: the vector version gets

        blendvpd        %xmm0, %xmm2, %xmm1

while the scalar version is stuck with the SSE2 and+andn+or sequence. Is
there a particular reason for this inconsistency?
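
To illustrate why a blend would serve the scalar case equally well, here is a rough C model of a single blendvpd lane, which selects on the sign bit of the mask element. It is only a sketch: the function names are hypothetical, and the mask is again assumed to be an all-ones/all-zeros compare result, for which the blend gives the same answer as and+andn+or.

```c
#include <stdint.h>

/* Model of one blendvpd lane: take t when the mask's sign bit is set, else e. */
double select_blendv(uint64_t mask, double t, double e) {
  return (mask >> 63) ? t : e;
}

/* (a < 0) ? a : b expressed through the blend model above. */
double f_blend(double a, double b) {
  /* Assumption: the compare yields all ones for true, all zeros for false. */
  uint64_t mask = (a < 0) ? ~UINT64_C(0) : UINT64_C(0);
  return select_blendv(mask, a, b);
}
```

Only the low lane matters for the scalar function, so one blend could stand in for the three logical operations.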
