http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46228
           Summary: code produced for STL container is worse in 4.5.1 than in 4.4.5
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: zeev.taran...@gmail.com

First of all, I don't know if this is the right component. If not, I apologize
and ask to please reroute appropriately.

This code:

#include <set>
#include <stdio.h>

int main()
{
    static const int array[] = { 1,2,3,4,5,6,7,8,9,10,6 };
    std::set<int> the_set;
    int count = 0;
    for (unsigned i = 0; i < sizeof(array)/sizeof(*array); i++)
    {
        std::pair<std::set<int>::iterator, bool> result =
            the_set.insert(array[i]);
        if (result.second)
            count++;
    }
    printf("%d unique items in array.\n", count);
    return 0;
}

Compiled with g++ -Os -fno-exceptions using gcc 4.5.1+ rev. 165881 produced
code including this loop in main:

  40076d: mov    %ebx,%eax
  40076f: mov    %r12,%rdi
  400772: lea    0x400a60(,%rax,4),%rsi
  40077a: callq  400930 <std::set<int, std::less<int>, std::allocator<int> >::insert(int const&)>
  40077f: mov    %rax,(%rsp)
  400783: mov    %edx,0x8(%rsp)
  400787: mov    %rax,0x40(%rsp)
  40078c: mov    0x8(%rsp),%rax
  400791: cmp    $0x1,%al
  400793: mov    %rax,0x48(%rsp)
  400798: sbb    $0xffffffffffffffff,%ebp
  40079b: inc    %ebx
  40079d: cmp    $0xb,%ebx
  4007a0: jne    40076d <main+0x19>

and this function:

<std::set<int, std::less<int>, std::allocator<int> >::insert(int const&)>:
  400930: sub    $0x48,%rsp
  400934: callq  400898 <std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_unique(int const&)>
  400939: mov    %edx,0x18(%rsp)
  40093d: mov    0x18(%rsp),%dl
  400941: mov    %dl,0x28(%rsp)
  400945: mov    0x28(%rsp),%edx
  400949: add    $0x48,%rsp
  40094d: retq

Same source code compiled on clang 2.8 into this:

  4007b3: lea    0x400a20(%r14),%rsi
  4007ba: mov    %r15,%rdi
  4007bd: callq  4007fc <std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_unique(int const&)>
  4007c2: and    $0x1,%dl
  4007c5: movzbl %dl,%eax
  4007c8: add    %eax,%ebx
  4007ca: add    $0x4,%r14
  4007ce: cmp    $0x2c,%r14
  4007d2: jne    4007b3 <main+0x5f>

set::insert has been inlined into nothing, and the pair<iterator, bool> doesn't
get space on the stack.

gcc 4.4.5 didn't produce those shuffles, so this seems like a regression from
4.4.

gcc 4.5.1 also didn't inline _Rb_tree_impl(), ~_Rb_tree (two instructions long)
and the above set::insert that should have been maybe two instructions long.

Could this be fixed?
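
For anyone trying to reduce this further, here is a minimal sketch of the pattern that the set::insert listing above seems to exercise: a thin forwarding function that receives a pair from an inner container and repacks it into a pair with a different first type. The names WrappedIter, ThinSet and count_unique are hypothetical and are not part of libstdc++. Whether this reduced form reproduces the same stack shuffles is an assumption that would need to be checked by compiling it with the same flags (g++ -Os -fno-exceptions) and inspecting the disassembly.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical iterator wrapper; it plays the role of the outer
// container's own iterator type.
struct WrappedIter {
    std::set<int>::const_iterator it;
    explicit WrappedIter(std::set<int>::const_iterator i) : it(i) {}
};

// Hypothetical thin container: insert() does no work of its own, it only
// repacks the pair returned by the inner container into a new pair.
struct ThinSet {
    std::set<int> impl;

    std::pair<WrappedIter, bool> insert(const int& v) {
        std::pair<std::set<int>::iterator, bool> p = impl.insert(v);
        return std::pair<WrappedIter, bool>(WrappedIter(p.first), p.second);
    }
};

// The counting loop from the report, written as a function so that the
// result can be checked independently of the generated code.
int count_unique(const int* data, std::size_t n) {
    ThinSet s;
    int count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (s.insert(data[i]).second)
            ++count;
    }
    return count;
}
```

Called on the array from the report (eleven elements, with 6 repeated), count_unique returns 10. The interesting part is not the result but whether ThinSet::insert is inlined away and whether the returned pair stays in registers.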