http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439
--- Comment #4 from Ben Maurer <ben.maurer at gmail dot com> ---
Also, here's where perf says time is being spent. While only 25% is shown as
being in the locale constructor/destructor, I suspect that the time spent in
other methods is actually related to the ping-ponging of cachelines caused by
the constructor -- all the time is being spent in memory access which should be
very hot in the CPU cache.

 16.77%  benchmark  libstdc++.so.6.0.18  [.] std::locale::locale()
 10.42%  benchmark  libstdc++.so.6.0.18  [.] std::locale::~locale()
  9.93%  benchmark  libstdc++.so.6.0.18  [.] bool std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)
  9.84%  benchmark  libstdc++.so.6.0.18  [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > const& std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)
  9.26%  benchmark  libstdc++.so.6.0.18  [.] bool std::has_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)
  9.10%  benchmark  libstdc++.so.6.0.18  [.] std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)
  8.78%  benchmark  libstdc++.so.6.0.18  [.] std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale const&)
  6.14%  benchmark  libstdc++.so.6.0.18  [.] std::locale::operator=(std::locale const&)
  3.71%  benchmark  libstdc++.so.6.0.18  [.] std::__use_cache<std::__numpunct_cache<char> >::operator()(std::locale const&) const
  3.66%  benchmark  libstdc++.so.6.0.18  [.] bool std::has_facet<std::ctype<char> >(std::locale const&)
  3.42%  benchmark  libstdc++.so.6.0.18  [.] __dynamic_cast
  1.96%  benchmark  libstdc++.so.6.0.18  [.] std::locale::id::_M_id() const
  0.89%  benchmark  benchmark            [.] doIoStream()
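
For readers who want to see the kind of workload that could produce a profile like this, here is a minimal sketch. The body of doIoStream() is not shown in this comment, so doIoStreamSketch() and runSketch() below are hypothetical names, and the loop is only an assumption about what such a benchmark might do: each call builds and tears down string streams, which constructs, copies, and destroys std::locale objects, and several threads do so at once.

```cpp
#include <atomic>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the doIoStream() symbol in the profile.
// Formats an int and parses it back; every stream constructed here
// also constructs and destroys std::locale objects internally.
int doIoStreamSketch(int value) {
    std::ostringstream out;
    out << value;                      // exercises num_put<char>
    std::istringstream in(out.str());
    int parsed = 0;
    in >> parsed;                      // exercises num_get<char> and ctype<char>
    return parsed;
}

// Runs the round trip on `threads` threads, `itersPerThread` times each,
// and returns how many round trips produced the original value.
std::size_t runSketch(unsigned threads, std::size_t itersPerThread) {
    std::atomic<std::size_t> ok{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&ok, itersPerThread] {
            std::size_t local = 0;     // per-thread counter, merged once at the end
            for (std::size_t i = 0; i < itersPerThread; ++i) {
                const int v = static_cast<int>(i);
                if (doIoStreamSketch(v) == v) ++local;
            }
            ok += local;
        });
    }
    for (auto& th : pool) th.join();
    return ok.load();
}
```

Calling runSketch() with one thread and then with one thread per core, and comparing wall time per iteration, is one way to check whether the per-call cost grows with thread count, as the cacheline ping-pong hypothesis would predict. The thread and iteration counts are left to the caller because the original benchmark's parameters are not given here.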