https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98532

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|pinskia at gcc dot gnu.org  |unassigned at gcc dot gnu.org
      Known to work|                            |12.1.0
             Status|ASSIGNED                    |NEW

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Starting in GCC 12 we produce the following GIMPLE, followed by the
resulting aarch64 assembly:
  vect__1.5_10 = *a_4(D);
  vect__2.6_11 = VEC_PERM_EXPR <vect__1.5_10, vect__1.5_10, { 1, 0 }>;
  *b_6(D) = vect__2.6_11;


        ldr     q0, [x0]
        ext     v0.16b, v0.16b, v0.16b, #8
        str     q0, [x1]

At the RTL level, combine tries the following and fails to match both:
Trying 8 -> 9:
    8: r96:V2DI=unspec[r92:V2DI,r92:V2DI,0x1] 237
      REG_DEAD r92:V2DI
    9: [r98:DI]=r96:V2DI
      REG_DEAD r98:DI
      REG_DEAD r96:V2DI
Failed to match this instruction:
(set (mem:V2DI (reg:DI 98) [1 *b_6(D)+0 S16 A128])
    (unspec:V2DI [
            (reg:V2DI 92 [ vect__1.5 ]) repeated x2
            (const_int 1 [0x1])
        ] UNSPEC_EXT))

Trying 7, 8 -> 9:
    7: r92:V2DI=[r97:DI]
      REG_DEAD r97:DI
    8: r96:V2DI=unspec[r92:V2DI,r92:V2DI,0x1] 237
      REG_DEAD r92:V2DI
    9: [r98:DI]=r96:V2DI
      REG_DEAD r98:DI
      REG_DEAD r96:V2DI
Failed to match this instruction:
(set (mem:V2DI (reg:DI 98) [1 *b_6(D)+0 S16 A128])
    (unspec:V2DI [
            (mem:V2DI (reg:DI 97) [1 *a_4(D)+0 S16 A128]) repeated x2
            (const_int 1 [0x1])
        ] UNSPEC_EXT))


Maybe the aarch64 backend could have a pattern that matches the combined
RTL from the last attempt (7, 8 -> 9) and then expands it into a load
pair/store pair with the registers reversed.
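
To illustrate why a load pair/store pair with reversed registers would be
equivalent to the ldr/ext/str sequence, here is a minimal C sketch. It is not
the testcase from this bug (which is not quoted in this comment); the function
names swap_scalar, ext_16b and swap_ext are hypothetical, and ext_16b is only a
byte-level model of the EXT instruction, assuming a little-endian layout where
lane 0 is at the lowest address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar form: load both 64-bit elements, then store them in
   swapped order.  This mirrors what "ldp x2, x3" followed by "stp x3, x2"
   would do. */
void swap_scalar(const int64_t *a, int64_t *b)
{
    int64_t lo = a[0];
    int64_t hi = a[1];
    b[0] = hi;
    b[1] = lo;
}

/* Byte-level model of "ext vd.16b, vn.16b, vm.16b, #imm": take 16
   consecutive bytes starting at byte imm of the 32-byte concatenation
   with vn in the low half and vm in the high half. */
void ext_16b(const uint8_t vn[16], const uint8_t vm[16], unsigned imm,
             uint8_t vd[16])
{
    uint8_t cat[32];
    assert(imm < 16);
    memcpy(cat, vn, 16);
    memcpy(cat + 16, vm, 16);
    memcpy(vd, cat + imm, 16);
}

/* Model of the current code: ldr q0; ext v0, v0, v0, #8; str q0. */
void swap_ext(const int64_t *a, int64_t *b)
{
    uint8_t q0[16], out[16];
    memcpy(q0, a, 16);
    ext_16b(q0, q0, 8, out);   /* both inputs are the same register */
    memcpy(b, out, 16);
}
```

With both EXT inputs being the same register and an offset of 8 bytes, the
two 64-bit halves trade places, which is the same result as the
{ 1, 0 } permute and as the scalar swap.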
