https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117957

            Bug ID: 117957
           Summary: vectorization pessimizes std::vector push/pop test
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

My synthetic push/pop minibenchmark regressed due to vectorization:
jh@ryzen3:~> cat t.C
#include <vector>
typedef unsigned int uint32_t;
std::pair<uint32_t, uint32_t> pair;
void
test()
{
        std::vector<std::pair<uint32_t, uint32_t>> stack;
        stack.push_back (pair);
        while (!stack.empty()) {
                std::pair<uint32_t, uint32_t> cur = stack.back();
                stack.pop_back();
                if (!cur.first)
                {
                        cur.second++;
                        stack.push_back (cur);
                }
                if (cur.second > 10000)
                        break;
        }
}
int
main()
{
        for (int i = 0; i < 10000; i++)
          test();
}

jh@ryzen3:~> ~/trunk-install2/bin/g++ -O3 t.C ; time ./a.out

real    0m0.250s
user    0m0.250s
sys     0m0.000s
jh@ryzen3:~> ~/trunk-install2/bin/g++ -O3 t.C -fno-tree-vectorize ; time ./a.out

real    0m0.044s
user    0m0.044s
sys     0m0.000s

This is a regression since GCC 14, which does not show the slowdown:

jh@ryzen3:~> g++ -O3 t.C ; time ./a.out

real    0m0.044s
user    0m0.044s
sys     0m0.000s
jh@ryzen3:~> g++ --version
g++ (SUSE Linux) 14.2.1 20241007 [revision 4af44f2cf7d281f3e4f3957efce10e8b2ccb2ad3]
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

With clang I get 0.066s.

perf shows:
Percent│       nop
       │ 30:┌─→add       $0x1,%ebx
  0.00 │    │  cmp       %rsi,%rdx
       │    │↓ je        90
       │    │  movdqa    %xmm1,%xmm0
       │    │  movd      %ebx,%xmm2
  8.23 │    │  punpckldq %xmm2,%xmm0
 24.84 │    │  movq      %xmm0,-0x8(%rax)
       │ 49:│  cmp       $0x2710,%ebx
       │    │↓ ja        73
       │ 51:│  cmp       %rbp,%rax
       │    │↓ je        150
       │ 5a:│  mov       -0x8(%rax),%ecx
 58.52 │    │  mov       -0x4(%rax),%ebx
  0.01 │    │  lea       -0x8(%rax),%rdx
       │    ├──test      %ecx,%ecx
  8.39 │    └──je        30
