N.B. All libstdc++ patches should be sent to the libstdc++ list (not just to the names in the MAINTAINERS file).
On Sat, 14 Dec 2024 at 01:01, Jonathan Wakely <jwak...@redhat.com> wrote: > > On Fri, 13 Dec 2024 at 17:12, Tamar Christina <tamar.christ...@arm.com> wrote: > > > > Hi All, > > > > In GCC 12 there was a ~40% regression in the performance of hashmap->find. > > > > This regression came about accidentally: > > > > Before GCC 12 the find function was small enough that IPA would inline it > > even > > though it wasn't marked inline. In GCC-12 an optimization was added to > > perform > > a linear search when the entries in the hashmap are small. > > > > This increased the size of the function enough that IPA would no longer > > inline. > > Inlining had two benefits: > > > > 1. The return value is a reference. so it has to be returned and > > dereferenced > > even though the search loop may have already dereference it. > > 2. The pattern is a hard pattern to track for branch predictors. This > > causes > > a large number of branch misses if the value is immediately checked and > > branched on. i.e. if (a != m.end()) which is a common pattern. > > > > The patch fixes both these issues by adding the inline keyword to _M_locate > > to allow the inliner to consider inlining again. > > > > This and the other patches have been ran through serveral benchmarks where > > the size, number of elements searched for and type (reference vs value) etc > > were tested. > > > > The change shows no statistical regression, but an average find improvement > > of > > ~27% and a range between ~10-60% improvements. A selection of the results: > > > > +-----------+--------------------+-------+----------+ > > | Group | Benchmark | Size | % Inline | > > +-----------+--------------------+-------+----------+ > > | Find | unord<uint64_t | 11264 | 53.52% | > > | Find | unord<uint64_t | 11264 | 47.98% | > > | Find Many | unord<uint64_t | 11 | 47.62% | > > | Find Many | unord<std::string | 11 | 44.94% | > > | Find Many | unord<std::string | 11 | 44.89% | > > | Find Many | unord<uint64_t | 11 | 40.90% | > > | Find | unord<NonSSOString | 11 | 31.80% | > > | Find | unord<NonSSOString | 11 | 31.17% | > > | Find Many | unord<uint64_t | 352 | 30.57% | > > | Find | unord<uint64_t | 352 | 28.27% | > > | Find Many | unord<uint64_t | 352 | 26.80% | > > | Find Many | unord<NonSSOString | 352 | 26.59% | > > | Find Many | unord<NonSSOString | 352 | 25.93% | > > | Find | unord<std::string | 11 | 25.66% | > > | Find Many | unord<NonSSOString | 11 | 23.21% | > > | Find Many | unord<std::string | 352 | 23.12% | > > | Find Many | unord<NonSSOString | 11 | 22.04% | > > | Find Many | unord<uint64_t | 11264 | 21.33% | > > | Find Many | unord<uint64_t | 11264 | 20.71% | > > | Find | unord<std::string | 11 | 20.36% | > > | Find Many | unord<std::string | 352 | 19.23% | > > | Find | unord<std::string | 352 | 18.59% | > > | Find | unord<uint64_t | 352 | 15.43% | > > | Find | unord<NonSSOString | 352 | 13.79% | > > | Find | unord<std::string | 11264 | 11.80% | > > | Find | unord<std::string | 352 | 11.12% | > > | Find | val<uint64_t | 11 | 9.98% | > > | Find | unord<std::string | 11264 | 9.97% | > > | Find | unord<NonSSOString | 352 | 9.69% | > > | Find | unord<NonSSOString | 11264 | 9.19% | > > +-----------+--------------------+-------+----------+ > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > Ok for master? and possible backports? > > Thanks for the thorough benchmarking. > > OK for trunk. > > A different patch will be needed for the release branches because > _M_locate only exists on trunk. > > > > > Thanks, > > Tamar > > > > libstdc++-v3/ChangeLog: > > > > * include/bits/hashtable.h: Inline _M_locate. > > > > --- > > diff --git a/libstdc++-v3/include/bits/hashtable.h > > b/libstdc++-v3/include/bits/hashtable.h > > index > > b8bd8c2f41816a800bab9b3589fe609b16285ad1..e791e52ec329277474f3218d8a44cd37ded14ac3 > > 100644 > > --- a/libstdc++-v3/include/bits/hashtable.h > > +++ b/libstdc++-v3/include/bits/hashtable.h > > @@ -2213,7 +2213,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > typename _ExtractKey, typename _Equal, > > typename _Hash, typename _RangeHash, typename _Unused, > > typename _RehashPolicy, typename _Traits> > > - auto > > + inline auto > > _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > > _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > > _M_locate(const key_type& __k) const > > > > > > > > > > --