http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46228

           Summary: code produced for STL container is worse in 4.5.1 than
                    in 4.4.5
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: zeev.taran...@gmail.com


First of all, I don't know if this is the right component. If not, I apologize;
please reroute as appropriate.

This code:

#include <set>
#include <stdio.h>

int main()
{
  static const int array[] = { 1,2,3,4,5,6,7,8,9,10,6 };
  std::set<int> the_set;
  int count = 0;
  for (unsigned i = 0; i < sizeof(array)/sizeof(*array); i++)
  {
    std::pair<std::set<int>::iterator, bool> result =
      the_set.insert(array[i]);
    if (result.second)
      count++;
  }
  printf("%d unique items in array.\n", count);
  return 0;
}

When compiled with g++ -Os -fno-exceptions using gcc 4.5.1+ rev. 165881, it
produces code that includes this loop in main:

  40076d: mov    %ebx,%eax
  40076f: mov    %r12,%rdi
  400772: lea    0x400a60(,%rax,4),%rsi
  40077a: callq  400930 <std::set<int, std::less<int>, std::allocator<int> >::insert(int const&)>
  40077f: mov    %rax,(%rsp)
  400783: mov    %edx,0x8(%rsp)
  400787: mov    %rax,0x40(%rsp)
  40078c: mov    0x8(%rsp),%rax
  400791: cmp    $0x1,%al
  400793: mov    %rax,0x48(%rsp)
  400798: sbb    $0xffffffffffffffff,%ebp
  40079b: inc    %ebx
  40079d: cmp    $0xb,%ebx
  4007a0: jne    40076d <main+0x19>

and this function:

<std::set<int, std::less<int>, std::allocator<int> >::insert(int const&)>:
  400930: sub    $0x48,%rsp
  400934: callq  400898 <std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_unique(int const&)>
  400939: mov    %edx,0x18(%rsp)
  40093d: mov    0x18(%rsp),%dl
  400941: mov    %dl,0x28(%rsp)
  400945: mov    0x28(%rsp),%edx
  400949: add    $0x48,%rsp
  40094d: retq

The same source code compiled with clang 2.8 produces this:

  4007b3: lea    0x400a20(%r14),%rsi
  4007ba: mov    %r15,%rdi
  4007bd: callq  4007fc <std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_unique(int const&)>
  4007c2: and    $0x1,%dl
  4007c5: movzbl %dl,%eax
  4007c8: add    %eax,%ebx
  4007ca: add    $0x4,%r14
  4007ce: cmp    $0x2c,%r14
  4007d2: jne    4007b3 <main+0x5f>

Here set::insert has been inlined away completely, and the pair<iterator, bool>
doesn't get any space on the stack.
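
In case it helps with reducing the test case, here is a rough sketch of the
pattern I believe is involved: a thin wrapper that calls an inner insert and
rebuilds the returned pair by value. The names insert_result, forward_insert
and count_unique are made up for illustration, and I am only assuming that
set::insert has roughly this shape. I have not checked that this sketch
reproduces the extra stack shuffles on 4.5.1.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical alias for the pair<iterator, bool> that insert() returns.
typedef std::pair<std::set<int>::iterator, bool> insert_result;

// Hypothetical thin wrapper (an assumption about the shape of set::insert):
// forwards to an inner call and rebuilds the pair by value before returning.
static insert_result forward_insert(std::set<int>& s, const int& value)
{
  insert_result inner = s.insert(value);
  return insert_result(inner.first, inner.second);
}

// Same counting loop as in main() of the report, but going through the wrapper.
int count_unique(const int* items, std::size_t n)
{
  std::set<int> the_set;
  int count = 0;
  for (std::size_t i = 0; i < n; i++)
    if (forward_insert(the_set, items[i]).second)
      count++;
  return count;
}
```

Calling count_unique on the array from the report should give the same count
that the original program prints. Ideally the wrapper disappears entirely, so
that only the bool half of the pair is looked at after the inner call.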

gcc 4.4.5 didn't produce those shuffles, so this seems like a regression from
4.4.

gcc 4.5.1 also didn't inline _Rb_tree_impl(), ~_Rb_tree (two instructions long),
or the above set::insert, which should have been maybe two instructions long.
Could this be fixed?
