[Bug tree-optimization/113678] New: SLP misses up vec_concat

pinskia at gcc dot gnu.org via Gcc-bugs Tue, 30 Jan 2024 18:33:20 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113678


            Bug ID: 113678
           Summary: SLP misses up vec_concat
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64

Take:
```
void f(char *a, char *b)
{
        int b0 = b[0];
        int b1 = b[1];
        int b2 = b[2];
        int b3 = b[3];
        int b4 = 0;
        int b5 = 0;
        int b6 = 0;
        int b7 = 0;
        a[0] = b0;
        a[1] = b1;
        a[2] = b2;
        a[3] = b3;
#if 0
        asm("":::"memory");
#endif
        a[4] = b0;
        a[5] = b1;
        a[6] = b2;
        a[7] = b3;
}
```

On x86_64 the generated code is a mess because SLP decides to build one 8-element vector from the four scalar loads:
```
  _1 = *b_6(D);
  _2 = MEM[(char *)b_6(D) + 1B];
  _3 = MEM[(char *)b_6(D) + 2B];
  _4 = MEM[(char *)b_6(D) + 3B];
  _16 = {_1, _2, _3, _4, _1, _2, _3, _4};
```

But this could be done as 2 stores (changing the `#if 0` to `#if 1` gives
the better code):
```
  vect__1.5_18 = MEM <vector(4) char> [(char *)b_6(D)];
  MEM <vector(4) char> [(char *)a_7(D)] = vect__1.5_18;
  MEM <vector(4) char> [(char *)a_7(D) + 4B] = vect__1.5_18;
```
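
For illustration only, here is a hand-written C sketch of the shape the two-store form above corresponds to. The name `f_two_stores` is hypothetical and not part of the testcase; `memcpy` is used for the 4-byte accesses so the sketch makes no alignment or aliasing assumptions. As in the original function, all of b[0..3] is read before anything is written, so overlapping `a` and `b` should behave the same way.
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical hand-written equivalent of f(): one 4-byte load of
   b[0..3], then two 4-byte stores of the same value. */
void f_two_stores(char *a, const char *b)
{
        uint32_t v;
        memcpy(&v, b, sizeof v);        /* load b[0..3] once */
        memcpy(a, &v, sizeof v);        /* store to a[0..3] */
        memcpy(a + 4, &v, sizeof v);    /* store the same bytes to a[4..7] */
}
```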

Or we could even get a single store, as LLVM does:
```
        movd    xmm0, dword ptr [rsi]           # xmm0 = mem[0],zero,zero,zero
        pshufd  xmm0, xmm0, 0                   # xmm0 = xmm0[0,0,0,0]
        movq    qword ptr [rdi], xmm0
        ret
```
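
As a rough source-level sketch of that single-store form (not what either compiler emits, just the same idea written by hand): duplicate the 4 loaded bytes into both halves of a 64-bit value and store it once. The name `f_one_store` is hypothetical.
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical hand-written equivalent of f(): one 4-byte load,
   a duplicate into both 32-bit halves, then one 8-byte store. */
void f_one_store(char *a, const char *b)
{
        uint32_t v;
        uint64_t w;
        memcpy(&v, b, sizeof v);                /* load b[0..3] once */
        /* Both halves hold the same 32-bit value, so the memory image is
           b0 b1 b2 b3 b0 b1 b2 b3 regardless of endianness. */
        w = (uint64_t)v | ((uint64_t)v << 32);
        memcpy(a, &w, sizeof w);                /* single 8-byte store to a[0..7] */
}
```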
