https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088

            Bug ID: 117088
           Summary: [15 regression] 548.exchange2_r regressed by 10% with
                    -O2 -march=x86-64-v3 after enhance O2 vectorization
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

548.exchange2_r     -11.54%

The only regression is in 548.exchange2_r: vectorizing the inner loop at each
level of the 9-deep loop nest increases register pressure and causes more
spills.
- block(rnext:9, 1, i1) = block(rnext:9, 1, i1) + 10
  - block(rnext:9, 2, i2) = block(rnext:9, 2, i2) + 10
    .....
        - block(rnext:9, 9, i9) = block(rnext:9, 9, i9) + 10
    ...
- block(rnext:9, 2, i2) = block(rnext:9, 2, i2) + 10
- block(rnext:9, 1, i1) = block(rnext:9, 1, i1) + 10
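
To make the shape of the nest concrete, here is a minimal C sketch of the
listing above. It is only an analogue, not the benchmark source: the Fortran
array sections are written as short unit-stride loops, the nest is cut from 9
levels to 3, rnext is 0-based, and the names (bump, walk_nest, DEPTH) are
hypothetical. As in the listing, each level applies the same +10 update on the
way in and on the way out.

```c
#include <assert.h>

enum { N = 9, DEPTH = 3 };   /* the real nest is 9 deep; 3 keeps the sketch short */

/* block[i][layer][row]: rows are contiguous, like the first Fortran dimension. */
static int block[N][DEPTH][N];

/* C analogue of block(rnext:9, layer, i) = block(rnext:9, layer, i) + 10.
   This short unit-stride loop is the kind of statement a vectorizer can
   turn into vector loads, adds, and stores. */
static inline void bump(int rnext, int layer, int i)
{
    assert(rnext >= 0 && rnext < N);
    for (int row = rnext; row < N; row++)
        block[i][layer][row] += 10;
}

/* Each level updates its slice, descends, then updates it again. */
static long walk_nest(int rnext)
{
    long innermost_visits = 0;
    for (int i1 = 0; i1 < N; i1++) {
        bump(rnext, 0, i1);
        for (int i2 = 0; i2 < N; i2++) {
            bump(rnext, 1, i2);
            for (int i3 = 0; i3 < N; i3++) {
                bump(rnext, 2, i3);
                innermost_visits++;
                bump(rnext, 2, i3);
            }
            bump(rnext, 1, i2);
        }
        bump(rnext, 0, i1);
    }
    return innermost_visits;
}
```

Compiling a file like this with -O2 -march=x86-64-v3 -S and comparing the
assembly before and after the vectorization change is one way to look for
extra spills, though a reduced sketch is not guaranteed to reproduce the
regression seen in the benchmark.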

It looks like aarch64 doesn't have this issue because it has 32 GPRs, while
x86-64 has only 16.
I have an extra patch that prevents loop vectorization in deeply nested loops
for the x86 backend, which brings the performance back.
