https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
            Bug ID: 109690
           Summary: bad SLP vectorization on zen
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

model name      : AMD Ryzen 7 5800X 8-Core Processor

reproduces on my znver1 laptop too.

jh@ryzen3:~/gcc-kub/build/gcc> cat tt.c
int a[100];

[[gnu::noipa]]
void loop()
{
  for (int i = 0; i < 3; i++)
    a[i]+=a[i];
}
int main()
{
  for (int j = 0; j < 1000000000; j++)
    loop ();
  return 0;
}
jh@ryzen3:~/gcc-kub/build/gcc> ./xgcc -B ./ -O2 -march=native tt.c ; perf stat ./a.out

 Performance counter stats for './a.out':

           2683.95 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
                52      page-faults:u             #   19.374 /sec
       13001141361      cycles:u                  #    4.844 GHz                      (83.31%)
            691180      stalled-cycles-frontend:u #    0.01% frontend cycles idle     (83.31%)
            101980      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.31%)
       12999928665      instructions:u            #    1.00  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.31%)
        3000013809      branches:u                #    1.118 G/sec                    (83.41%)
              1525      branch-misses:u           #    0.00% of all branches          (83.36%)

       2.684376360 seconds time elapsed

       2.684369000 seconds user
       0.000000000 seconds sys

jh@ryzen3:~/gcc-kub/build/gcc> ./xgcc -B ./ -O2 -march=native tt.c -fno-tree-vectorize ; perf stat ./a.out

 Performance counter stats for './a.out':

           1238.92 msec task-clock:u              #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
                52      page-faults:u             #   41.972 /sec
        6000338140      cycles:u                  #    4.843 GHz                      (83.21%)
            314660      stalled-cycles-frontend:u #    0.01% frontend cycles idle     (83.21%)
                 0      stalled-cycles-backend:u  #    0.00% backend cycles idle      (83.23%)
        7999796562      instructions:u            #    1.33  insn per cycle
                                                  #    0.00  stalled cycles per insn  (83.53%)
        2999887795      branches:u                #    2.421 G/sec                    (83.53%)
               698      branch-misses:u           #    0.00% of all branches          (83.28%)

       1.239116606 seconds time elapsed

       1.239121000 seconds user
       0.000000000 seconds sys
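
To make the two perf stat runs above easier to compare, the raw counter totals can be divided by the 1000000000 calls that main() makes to loop(). The following is only a sketch of that arithmetic. The helper names per_call and slowdown are made up for illustration and are not part of the report or of GCC, and the per-call figures also include the overhead of the calling loop in main().

```c
#include <stdint.h>

/* Number of calls to loop() made by main() in tt.c. */
#define CALLS 1000000000ULL

/* Hypothetical helper: average number of events per call of loop(),
   i.e. a raw perf counter total divided by the number of calls.
   The result includes the cost of the call loop in main(). */
double per_call(uint64_t counter_total, uint64_t calls)
{
    return (double)counter_total / (double)calls;
}

/* Hypothetical helper: how many times larger the counter of the
   vectorized build is compared with the -fno-tree-vectorize build. */
double slowdown(uint64_t vectorized_total, uint64_t scalar_total)
{
    return (double)vectorized_total / (double)scalar_total;
}
```

With the counters reported above, per_call(13001141361ULL, CALLS) is about 13.0 cycles per call for the default -O2 build and per_call(6000338140ULL, CALLS) is about 6.0 cycles for the -fno-tree-vectorize build, so slowdown(13001141361ULL, 6000338140ULL) is roughly 2.17. The instruction counts work out to about 13 versus 8 instructions per call.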