https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107294

            Bug ID: 107294
           Summary: Missed optimization: multiplying real with complex
                    number in Fortran (only)
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bartoldeman at users dot sourceforge.net
  Target Milestone: ---

This code:

complex function csmul(a, b)
  real, value :: a
  complex, value :: b
  csmul = a * b
end function csmul

produces this assembly on x86-64 (GCC 11.3, -O2):
   0:   66 0f d6 4c 24 f8       movq   %xmm1,-0x8(%rsp)
   6:   f3 0f 10 64 24 fc       movss  -0x4(%rsp),%xmm4
   c:   f3 0f 10 4c 24 f8       movss  -0x8(%rsp),%xmm1
  12:   0f 28 d0                movaps %xmm0,%xmm2
  15:   66 0f ef db             pxor   %xmm3,%xmm3 # xmm3 = 0
  19:   f3 0f 59 d1             mulss  %xmm1,%xmm2
  1d:   0f 28 ec                movaps %xmm4,%xmm5
  20:   f3 0f 59 eb             mulss  %xmm3,%xmm5 # xmm5 = 0
  24:   f3 0f 59 c4             mulss  %xmm4,%xmm0
  28:   f3 0f 59 cb             mulss  %xmm3,%xmm1 # xmm1 = 0
  2c:   f3 0f 5c d5             subss  %xmm5,%xmm2 # xmm2 unchanged
  30:   f3 0f 58 c1             addss  %xmm1,%xmm0 # xmm0 unchanged
  34:   f3 0f 11 54 24 f0       movss  %xmm2,-0x10(%rsp)
  3a:   f3 0f 11 44 24 f4       movss  %xmm0,-0xc(%rsp)
  40:   f3 0f 7e 44 24 f0       movq   -0x10(%rsp),%xmm0
  46:   c3                      retq    

Here xmm3 (the imaginary part of a after promotion to complex) is set to 0,
but the rest of the code does not exploit this: both multiplications by xmm3
and the subsequent subss/addss are still emitted.
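To make the redundant work explicit, here is a small C sketch of the two
computations. The helper names mul_promoted and mul_scaled are hypothetical
and only illustrate the arithmetic; they are not taken from GCC. The first
mirrors what the Fortran assembly appears to do (full complex product with
the imaginary part of a fixed at 0), the second is the two-multiply form.

```c
#include <complex.h>

/* Hypothetical helper: full complex product after promoting a to (a + 0i).
   This performs four multiplies, one subtract and one add. */
static float complex mul_promoted(float a, float complex b)
{
    float ar = a, ai = 0.0f;                 /* promoted operand */
    float br = crealf(b), bi = cimagf(b);
    return CMPLXF(ar * br - ai * bi,         /* real part */
                  ar * bi + ai * br);        /* imaginary part */
}

/* Hypothetical helper: scale each component of b by a (two multiplies). */
static float complex mul_scaled(float a, float complex b)
{
    return CMPLXF(a * crealf(b), a * cimagf(b));
}
```

For finite inputs the two forms give the same value. They can differ when a
component of b is infinite or NaN (0 * inf is NaN under IEEE 754), which may
be part of why the zero products are not simply folded away at -O2; that is
my guess, not something I have verified in the compiler.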

On the other hand the assembly for the corresponding C code looks good, with
two mul instructions, as expected:

float _Complex csmul(float a, float _Complex b)
{
  return a * b;
}

0000000000000000 <csmul>:
   0:   66 0f d6 4c 24 f8       movq   %xmm1,-0x8(%rsp)
   6:   f3 0f 10 4c 24 f8       movss  -0x8(%rsp),%xmm1
   c:   f3 0f 59 c8             mulss  %xmm0,%xmm1
  10:   f3 0f 59 44 24 fc       mulss  -0x4(%rsp),%xmm0
  16:   f3 0f 11 4c 24 f0       movss  %xmm1,-0x10(%rsp)
  1c:   f3 0f 11 44 24 f4       movss  %xmm0,-0xc(%rsp)
  22:   f3 0f 7e 44 24 f0       movq   -0x10(%rsp),%xmm0
  28:   c3                      retq   

The same issue is still present in trunk, according to godbolt.org.
