On Thu, 26 Oct 2023 at 21:52, François Dumont <frs.dum...@gmail.com> wrote: > > > On 26/10/2023 12:43, Jonathan Wakely wrote: > > On 26/10/23 07:18 +0200, François Dumont wrote: > >> libstdc++: [_Hashtable] Use RAII type to manage rehash functor state > >> > >> Replace usage of __try/__catch with a RAII type to restore rehash > >> functor > >> state when needed. > > > > Generally I really like replacing try-catch with RAII but I have some > > questions below. > > > >> libstdc++-v3/ChangeLog: > >> > >> * include/bits/hashtable_policy.h (_RehashStateGuard): New. > >> (_Insert_base<>::_M_insert_range(_IIt, _IIt, const > >> _NodeGet&, false_type)): > >> Adapt. > >> * include/bits/hashtable.h (__rehash_guard_t): New. > >> (__rehash_state): Remove. > >> (_M_rehash): Remove. > >> (_M_rehash_aux): Rename into _M_rehash. > >> (_M_assign_elements, _M_insert_unique_node, > >> _M_insert_multi_node): Adapt. > >> (rehash): Adapt. > >> > >> > >> Tested under Linux x64. > >> > >> Ok to commit ? > >> > >> François > > > >> diff --git a/libstdc++-v3/include/bits/hashtable.h > >> b/libstdc++-v3/include/bits/hashtable.h > >> index 0857448f7ed..64071ac1fb2 100644 > >> --- a/libstdc++-v3/include/bits/hashtable.h > >> +++ b/libstdc++-v3/include/bits/hashtable.h > >> @@ -234,6 +234,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _RehashPolicy, _Traits>; > >> using __enable_default_ctor > >> = _Hashtable_enable_default_ctor<_Equal, _Hash, _Alloc>; > >> + using __rehash_guard_t > >> + = __detail::_RehashStateGuard<_RehashPolicy>; > >> > >> public: > >> typedef _Key key_type; > >> @@ -264,7 +266,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> > >> private: > >> using __rehash_type = _RehashPolicy; > >> - using __rehash_state = typename __rehash_type::_State; > >> > >> using __unique_keys = typename __traits_type::__unique_keys; > >> > >> @@ -1200,14 +1201,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> > >> private: > >> // Helper rehash method used when keys are unique. > >> - void _M_rehash_aux(size_type __bkt_count, true_type __uks); > >> + void _M_rehash(size_type __bkt_count, true_type __uks); > >> > >> // Helper rehash method used when keys can be non-unique. > >> - void _M_rehash_aux(size_type __bkt_count, false_type __uks); > >> - > >> - // Unconditionally change size of bucket array to n, restore > >> - // hash policy state to __state on exception. > >> - void _M_rehash(size_type __bkt_count, const __rehash_state& > >> __state); > >> + void _M_rehash(size_type __bkt_count, false_type __uks); > >> }; > >> > >> // Definitions of class template _Hashtable's out-of-line member > >> functions. > >> @@ -1337,7 +1334,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> { > >> __buckets_ptr __former_buckets = nullptr; > >> std::size_t __former_bucket_count = _M_bucket_count; > >> - const __rehash_state& __former_state = _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> > >> if (_M_bucket_count != __ht._M_bucket_count) > >> { > >> @@ -1359,6 +1356,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _M_assign(std::forward<_Ht>(__ht), __roan); > >> if (__former_buckets) > >> _M_deallocate_buckets(__former_buckets, > >> __former_bucket_count); > >> + __rehash_guard._M_reset = false; > > > > I find this confusing. Usually "reset" means that something is > > cleared, so won't take an action in the destructor. e.g. if you use > > std::unique_ptr::reset() then the object is destroyed immediately, and > > then nothing happens in the destructor. Here it's the opposite, > > _M_reset=true means that it _sould_ do something in the destructor. > > > > The problem is the ambiguity between "reset the state in the > > destructor later" and "reset the object to an empty state now". > > > > If the member was called _M_guarded then it might be clearer. > > _M_guarded=true means the guard is active, and will restore the state > > later. Or _M_active, or _M_armed, or even _M_reset_in_dtor. Any of > > those names avoids the confusion with the semantics of > > std::unique_ptr::reset() and similar well-known APIs. > > > > Or what I usually do is store a pointer to the guarded object in the > > RAII guard type, and then just null the pointer to disarm the guard. > > That means you don't need a separate bool member variable. If > > _RehashStateGuard::_M_rehash_policy was called _M_guarded_obj and was > > a _RehashPolicy* instead of _RehashPolicy& then disarming it would be: > > > > __rehash_guard._M_guarded_obj = nullptr; > > > > This seems clear to me, as it says that the guard no longer has > > anything to guard, so won't do anything in the destructor. > > > Looks clearer to me too. > > I started with a _RehashStateScope and a _M_commit() method, it could > have been worst :-) > > > > > >> } > >> __catch(...) > >> { > >> @@ -1366,7 +1364,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> { > >> // Restore previous buckets. > >> _M_deallocate_buckets(); > >> - _M_rehash_policy._M_reset(__former_state); > >> _M_buckets = __former_buckets; > >> _M_bucket_count = __former_bucket_count; > >> } > >> @@ -2142,17 +2139,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> __node_ptr __node, size_type __n_elt) > >> -> iterator > >> { > >> - const __rehash_state& __saved_state = > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> std::pair<bool, std::size_t> __do_rehash > >> = _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, > >> __n_elt); > >> > >> if (__do_rehash.first) > >> { > >> - _M_rehash(__do_rehash.second, __saved_state); > >> + _M_rehash(__do_rehash.second, true_type{}); > >> __bkt = _M_bucket_index(__code); > >> } > >> > >> + __rehash_guard._M_reset = false; > >> this->_M_store_code(*__node, __code); > > > > This changes behaviour. Previously the try-catch was inside the call to > > _M_rehash. Now we're starting to guard before calling _M_need_rehash, > > We were capturing _M_rehash_policy state before the call to > _M_need_rehash so it's not different. > > We really need to instantiate the guard before we call _M_need_rehash > cause it is the method changing the _M_rehash_policy state. > > > and we run the destructor unconditionally even if we never call > > _M_rehash. > > Here we only want to reset rehash state on exception. > > Note that with the 2 rehash policy implementations we provide it doesn't > matter if we reset or not when we do not call _M_rehash because their > state changes only when rehash is needed. But some user's fancy rehash > policy might update their state even if no rehash is requested and in > this case it is better to reset this state only on exception. > > > > I think that's OK, because _M_need_rehash is noexcept (but > > it's confusing because we have a const overload of _M_need_rehash > > which is noexcept(false) and a non-const overload which is > > noexcept(true) ... should they both be noexcept?) > > That's another subject but yes, _M_need_rehash should be non-const and > noexcept on both rehash policy implementations. I considered doing such > a thing but won't it impact abi as _Prime_rehash_policy._M_need_rehash > symbols is coming from the lib ? > > > > > > Can we do this instead: > > > > > > - const __rehash_state& __saved_state = _M_rehash_policy._M_state(); > > std::pair<bool, std::size_t> __do_rehash > > = _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, > > __n_elt); > > > > if (__do_rehash.first) > > { > > + __rehash_guard_t __rehash_guard(_M_rehash_policy); > > Clearly no, we need to capture state before _M_need_rehash. Like it was > done with the __saved_state above. > > > > - _M_rehash(__do_rehash.second, __saved_state); > > + _M_rehash(__do_rehash.second, true_type{}); > > + __rehash_guard._M_reset = false; > > __bkt = _M_bucket_index(__code); > > } > > > > this->_M_store_code(*__node, __code); > > > > This only creates a guard object if we're actually going to rehash, so > > we don't run its destructor (and check if we need t restore anything) > > when no rehash is needed. > > > > N.B. _M_bucket_index is also confusing. There are two overloads of it, > > one is noexcept and one isn't. The one that we call here is > > noexcept(false), so that call could throw ... is it OK that we don't > > restore the old state if that happens? Should that be guarded too? (I > > have no idea). > > > > Also, I see that the noexcept(true) overload of _M_bucket_index calls > > __hash_code_base::_M_bucket_index, which is noexcept(/* condition */), > > so the caller says it cannot throw, but it calls something that > > sometimes throws. Is that correct? > > It can be noexcept because usage of hash code cache depends on either > hash functor is noexcept or not. > > > > >> > >> // Always insert at the beginning of the bucket. > >> @@ -2172,13 +2170,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> __hash_code __code, __node_ptr __node) > >> -> iterator > >> { > >> - const __rehash_state& __saved_state = > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> std::pair<bool, std::size_t> __do_rehash > >> = _M_rehash_policy._M_need_rehash(_M_bucket_count, > >> _M_element_count, 1); > >> > >> if (__do_rehash.first) > >> - _M_rehash(__do_rehash.second, __saved_state); > >> + _M_rehash(__do_rehash.second, false_type{}); > >> > >> + __rehash_guard._M_reset = false; > > > > Again, we're creating the guard object in a larger scope than needed. > > > >> this->_M_store_code(*__node, __code); > >> const key_type& __k = _ExtractKey{}(__node->_M_v()); > >> size_type __bkt = _M_bucket_index(__code); > >> @@ -2509,39 +2508,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> rehash(size_type __bkt_count) > >> { > >> - const __rehash_state& __saved_state = > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> __bkt_count > >> = std::max(_M_rehash_policy._M_bkt_for_elements(_M_element_count > >> + 1), > >> __bkt_count); > >> __bkt_count = _M_rehash_policy._M_next_bkt(__bkt_count); > >> > >> if (__bkt_count != _M_bucket_count) > >> - _M_rehash(__bkt_count, __saved_state); > >> - else > >> - // No rehash, restore previous state to keep it consistent with > >> - // container state. > >> - _M_rehash_policy._M_reset(__saved_state); > >> - } > >> - > >> - template<typename _Key, typename _Value, typename _Alloc, > >> - typename _ExtractKey, typename _Equal, > >> - typename _Hash, typename _RangeHash, typename _Unused, > >> - typename _RehashPolicy, typename _Traits> > >> - void > >> - _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> - _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash(size_type __bkt_count, const __rehash_state& __state) > >> - { > >> - __try > >> - { > >> - _M_rehash_aux(__bkt_count, __unique_keys{}); > >> - } > >> - __catch(...) > >> { > >> - // A failure here means that buckets allocation failed. We only > >> - // have to restore hash policy previous state. > >> - _M_rehash_policy._M_reset(__state); > >> - __throw_exception_again; > >> + _M_rehash(__bkt_count, __unique_keys{}); > >> + __rehash_guard._M_reset = false; > > > > And a larger scope again. > > > >> } > >> } > >> > >> @@ -2553,7 +2529,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> void > >> _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash_aux(size_type __bkt_count, true_type /* __uks */) > >> + _M_rehash(size_type __bkt_count, true_type /* __uks */) > >> { > >> __buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count); > >> __node_ptr __p = _M_begin(); > >> @@ -2596,7 +2572,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> void > >> _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash_aux(size_type __bkt_count, false_type /* __uks */) > >> + _M_rehash(size_type __bkt_count, false_type /* __uks */) > >> { > >> __buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count); > >> __node_ptr __p = _M_begin(); > >> diff --git a/libstdc++-v3/include/bits/hashtable_policy.h > >> b/libstdc++-v3/include/bits/hashtable_policy.h > >> index 5d162463dc3..8b9626b1575 100644 > >> --- a/libstdc++-v3/include/bits/hashtable_policy.h > >> +++ b/libstdc++-v3/include/bits/hashtable_policy.h > >> @@ -715,6 +715,26 @@ namespace __detail > >> std::size_t _M_next_resize; > >> }; > >> > >> + template<typename _RehashPolicy> > >> + struct _RehashStateGuard > >> + { > >> + _RehashPolicy& _M_rehash_policy; > >> + typename _RehashPolicy::_State _M_prev_state; > >> + bool _M_reset = true; > > > > This could be: > > > > + _RehashPolicy* _M_guarded_obj; > > + typename _RehashPolicy::_State _M_prev_state; > > > >> + > >> + _RehashStateGuard(_RehashPolicy& __policy) > >> + : _M_rehash_policy(__policy) > >> + , _M_prev_state(__policy._M_state()) > > > > + _RehashStateGuard(_RehashPolicy& __policy) > > + : _M_guarded_obj(std::__addressof(__policy)) > > + , _M_prev_state(__policy._M_state()) > > > >> + { } > >> + _RehashStateGuard(const _RehashStateGuard&) = delete; > >> + > >> + ~_RehashStateGuard() > >> + { > >> + if (_M_reset) > >> + _M_rehash_policy._M_reset(_M_prev_state); > > > > + if (_M_guarded_obj) > > + _M_guarded_obj->_M_reset(_M_prev_state); > > > >> + } > >> + }; > >> + > >> // Base classes for std::_Hashtable. We define these base classes > >> // because in some cases we want to do different things depending on > >> // the value of a policy class. In some cases the policy class > >> @@ -1007,7 +1027,7 @@ namespace __detail > >> const _NodeGetter& __node_gen, false_type __uks) > >> { > >> using __rehash_type = typename __hashtable::__rehash_type; > >> - using __rehash_state = typename __hashtable::__rehash_state; > >> + using __rehash_guard_t = typename __hashtable::__rehash_guard_t; > >> using pair_type = std::pair<bool, std::size_t>; > >> > >> size_type __n_elt = __detail::__distance_fw(__first, __last); > >> @@ -1016,14 +1036,15 @@ namespace __detail > >> > >> __hashtable& __h = _M_conjure_hashtable(); > >> __rehash_type& __rehash = __h._M_rehash_policy; > >> - const __rehash_state& __saved_state = __rehash._M_state(); > >> + __rehash_guard_t __rehash_guard(__rehash); > >> pair_type __do_rehash = __rehash._M_need_rehash(__h._M_bucket_count, > >> __h._M_element_count, > >> __n_elt); > >> > >> if (__do_rehash.first) > >> - __h._M_rehash(__do_rehash.second, __saved_state); > >> + __h._M_rehash(__do_rehash.second, __uks); > >> > >> + __rehash_guard._M_reset = false; > > > > A larger scope again. > > Note that there is only one place where we reset rehash policy state > even if no exception took place. It is in the std rehash method when the > user requests a rehash that has no effect. > > Here is the updated patch, ok to commit ?
Thanks for the explanations, OK for trunk.