https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |target

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
We produce:
Trying 5, 7 -> 11:
    5: r86:V4SF=[`*.LC0']
      REG_EQUAL const_vector
    7: r85:V4SF=vec_select(vec_concat(r86:V4SF,r86:V4SF),parallel)
      REG_DEAD r86:V4SF
      REG_EQUAL const_vector
   11: r88:V4SF=vec_select(vec_concat(r85:V4SF,r85:V4SF),parallel)
      REG_DEAD r85:V4SF
      REG_EQUAL const_vector
Failed to match this instruction:
(set (reg:V4SF 88)
    (const_vector:V4SF [
            (const_double:SF 2.0e+0 [0x0.8p+2])
            (const_double:SF 1.0e+0 [0x0.8p+1])
            (const_double:SF 4.0e+0 [0x0.8p+3])
            (const_double:SF 3.0e+0 [0x0.cp+2])
        ]))

So the vec_selects are merging at the RTL level just fine: combine folds them
into a single const_vector, and only the final instruction match fails.

Anyway, if the target expanded __builtin_ia32_shufps to VEC_PERM_EXPR, this
would have been optimized at the gimple level.  So this is a target issue.
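
To illustrate why two back-to-back shuffles of the same register can be folded
into one permutation, here is a minimal scalar sketch of the shufps lane
selection in plain C. It is not GCC code: the helper names (shufps_model,
compose_imm, shuffles_merge) and the immediates used with them are
hypothetical and are not taken from the testcase in this bug.

```c
#include <assert.h>

/* Scalar model of SHUFPS lane selection: the low two result lanes are
   picked from a, the high two from b, two selector bits per lane.  */
static void shufps_model(const float a[4], const float b[4],
                         unsigned imm, float out[4])
{
    float tmp[4];
    tmp[0] = a[imm & 3];
    tmp[1] = a[(imm >> 2) & 3];
    tmp[2] = b[(imm >> 4) & 3];
    tmp[3] = b[(imm >> 6) & 3];
    for (int i = 0; i < 4; i++)
        out[i] = tmp[i];            /* copy last so out may alias a or b */
}

/* Immediate of the single shuffle equivalent to applying `first` and then
   `second`, when both shuffles use the same register for both operands.  */
static unsigned compose_imm(unsigned first, unsigned second)
{
    unsigned merged = 0;
    for (int i = 0; i < 4; i++) {
        unsigned s2 = (second >> (2 * i)) & 3;  /* lane read by 2nd shuffle */
        unsigned s1 = (first >> (2 * s2)) & 3;  /* lane that came from */
        merged |= s1 << (2 * i);
    }
    return merged;
}

/* Returns 1 if the two-step shuffle and the merged shuffle agree on a
   sample vector (hypothetical values), 0 otherwise.  */
static int shuffles_merge(unsigned first, unsigned second)
{
    const float v[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float step[4], twice[4], once[4];

    shufps_model(v, v, first, step);
    shufps_model(step, step, second, twice);
    shufps_model(v, v, compose_imm(first, second), once);

    for (int i = 0; i < 4; i++)
        if (twice[i] != once[i])
            return 0;
    return 1;
}
```

For example, shuffles_merge(0x1B, 0xB1) compares a lane reversal followed by a
pairwise swap against the single merged shuffle.  This is the same kind of
folding that combine does on the vec_selects above, and that the gimple
optimizers could do if they saw a VEC_PERM_EXPR instead of an opaque builtin
call.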
