Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2

Paolo Carlini Tue, 13 Nov 2012 14:53:53 -0800

Hi,

On 11/13/2012 10:40 PM, François Dumont wrote:

Here is the proposal to remove shrinking feature from hash policy. I have also considered your remark regarding usage of lower_bound so _M_bkt_for_elements doesn't call _M_next_bkt (calling lower_bound) anymore. For 2 of the 3 calls it was only a source of redundant lower_bound invocations, in the last case I call _M_next_bkt explicitly.
2012-11-13  François Dumont  <fdum...@gcc.gnu.org>

    * include/bits/hashtable_policy.h (_Prime_rehash_policy): Remove
    automatic shrink.
    (_Prime_rehash_policy::_M_bkt_for_elements): Do not call
    _M_next_bkt anymore.
    (_Prime_rehash_policy::_M_next_bkt): Move usage of
    _S_growth_factor ...
    (_Prime_rehash_policy::_M_need_rehash): ... here.
    * include/bits/hashtable.h (_Hashtable<>): Adapt.

Tested under linux x86_64, normal and debug modes.

Thanks. At first blush the patch looks good, but please give us a few days to analyze the details of it; we don't want to make mistakes for 4.8.

Regarding performance, I have done a small evolution of the 54075.cc test proposed last time. It is now checking performance with and without cache of hash code. Result is:
54075.cc std::unordered_set 300000 Foo insertions without cache         9r    9u    0s 13765616mem    0pf
54075.cc std::unordered_set 300000 Foo insertions with cache           14r   13u    0s 18562064mem    0pf
54075.cc std::tr1::unordered_set 300000 Foo insertions without cache    9r    8u    1s 13765616mem    0pf
54075.cc std::tr1::unordered_set 300000 Foo insertions with cache      14r   13u    0s 18561952mem    0pf
So the difference of performance in this case only seems to come from caching the hash code or not. In reported use case default behavior of std::unordered_set is to cache hash codes and std::tr1::unordered_set not to cache it. We should perhaps review default behavior regarding caching the hash code. Perhaps cache it if the hash functor can throw and not cache it otherwise, not easy to find out what's best to do.

Ah good. I think we finally have nailed the core performance issue. And, as it turns out, I'm a bit confused about the logic we have in place now for the defaults: can you please summarize what we are doing and which are the trade-offs (leaving out the technicalities having to do with the final types)? I think the most interesting are three:


    1- std::hash<int>
    2- std::hash<std::string>
    3- user_defined_hash<xxx> which cannot throw

In the first we should normally not cache; in the second, from a performance point of view (from the exception safety point of view we could do both, because std::hash<std::string> doesn't throw anyway) it would be better to cache; the third case is rather tricky, because, like the case of std::string, from the exception safety point of view we could do both, thus it's purely a performance issue. Do I understand correctly that currently we handle 2- and 3- above in the same way, thus we cache? It seems to me that whereas that kind of default makes a lot of sense for std::string, it doesn't necessarily make sense for everything else, and it seems to me that such kind of default makes a suboptimal use of the knowledge we have via __is_noexcept_hash that the functor doesn't throw. That seems instead a sort of user-hint to not cache! Given the unfortunate situation that the user has no way to explicitly pick a behavior when instantiating the container, we can imagine that he can anyway provide a strong if indirect hint by decorating or not with noexcept the call operator. We could even document that as part of our implementation defined behavior. How does it sound? Do we have a way to figure out what other implementations are doing? Outside std::hash, it should be pretty easy to instantiate with a special functor which internally keeps a counter... if we have evidence that the other best implementations don't cache for 3- we should definitely do the same.
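
The counter-based probe mentioned above could look roughly like the following. This is only a sketch: CountingHash, ProbeResult and probe_hash_caching are hypothetical names invented for illustration, not part of libstdc++ or of the 54075.cc test. The idea is to count hash evaluations while filling the container, then force a rehash and count again. An implementation that stores hash codes in its nodes has no need to call the functor during the rehash; one that does not must recompute them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical probe functor: forwards to std::hash<int> and counts calls.
// The call operator is noexcept, i.e. case 3- in the list above.
struct CountingHash {
    std::size_t* calls;  // counter owned by the caller, shared by all copies

    std::size_t operator()(int v) const noexcept {
        ++*calls;
        return std::hash<int>()(v);
    }
};

// Hypothetical result type: hash evaluations seen in each phase.
struct ProbeResult {
    std::size_t calls_during_insert;
    std::size_t calls_during_rehash;
};

// Inserts n distinct ints, then forces a rehash and reports how often
// the functor was invoked in each phase.
inline ProbeResult probe_hash_caching(int n) {
    std::size_t calls = 0;
    std::unordered_set<int, CountingHash> s(16, CountingHash{&calls});

    for (int i = 0; i < n; ++i)
        s.insert(i);
    assert(s.size() == static_cast<std::size_t>(n));

    ProbeResult r;
    r.calls_during_insert = calls;

    calls = 0;
    s.rehash(s.bucket_count() * 4);  // more buckets requested, so nodes must be redistributed
    r.calls_during_rehash = calls;
    return r;
}
```

Calling probe_hash_caching(300000) and looking at calls_during_rehash gives the hint: a value of zero suggests the hash codes are cached, a value close to the element count suggests they are recomputed. Dropping noexcept from the call operator and running it again would show whether the default depends on it.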


To summarize, my intuitions are (again, leaving out the final technicalities):

    a- std::hash specializations for scalar types -> no cache

    b- std::hash specialization for std::string (or maybe everything else, for simplicity) -> cache

    c- user defined functor -> cache or not based on __is_noexcept_hash
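
As a rough illustration of a- to c-, the selection could be expressed as a compile-time trait along these lines. Everything here is an assumption made for the sketch: hash_is_noexcept and should_cache_hash are hypothetical names, std::is_scalar is only one possible reading of "scalar types", and this is not the logic currently in hashtable_policy.h.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for __is_noexcept_hash: true when calling the
// functor on a const Key& is declared not to throw.
template <typename Key, typename Hash>
struct hash_is_noexcept
    : std::integral_constant<bool,
          noexcept(std::declval<const Hash&>()(std::declval<const Key&>()))> {};

// c- user defined functor: cache only when the functor may throw, i.e.
// treat noexcept on the call operator as a hint not to cache.
template <typename Key, typename Hash>
struct should_cache_hash
    : std::integral_constant<bool, !hash_is_noexcept<Key, Hash>::value> {};

// a- and b- std::hash specializations: no cache for scalar keys,
// cache for std::string and, for simplicity, everything else.
template <typename Key>
struct should_cache_hash<Key, std::hash<Key>>
    : std::integral_constant<bool, !std::is_scalar<Key>::value> {};

// Two hypothetical user functors that differ only in the noexcept decoration.
struct NoexceptHash {
    std::size_t operator()(int v) const noexcept {
        return static_cast<std::size_t>(v);
    }
};

struct MayThrowHash {
    std::size_t operator()(int v) const {
        return static_cast<std::size_t>(v);
    }
};
```

With this sketch, should_cache_hash<int, std::hash<int>>::value is false, should_cache_hash<std::string, std::hash<std::string>>::value is true, and the two user functors give false and true respectively, so the user's choice of noexcept alone decides case c-.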

Jon?

Thanks!
Paolo.
