On 16/01/20 13:25 +0000, Jonathan Wakely wrote:
On 16/01/20 07:42 +0100, François Dumont wrote:
On 1/15/20 10:52 PM, Jonathan Wakely wrote:
On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
On 14/01/20 22:25 +0100, François Dumont wrote:
On 1/13/20 10:53 PM, Jonathan Wakely wrote:
On 13/01/20 22:41 +0100, François Dumont wrote:
For the multi-keys we could still avoid redundant
comparisons when _Equal is just doing == on the key type.
On unordered_multiset we could just avoids the call to
is_permuation and on the unordered_multimap we could check
the is_permutation only on the associated value rather
than on the std::pair.
I don't think that's necessary, or helpful.
The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.
And it was indeed possible.
Nice!
PR libstdc++/91223
* include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
* include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
(_Equality_base): Remove.
(_Equality<>::_M_equal): Review implementation. Use
std::is_permutation.
* testsuite/23_containers/unordered_multiset/operators/1.cc
(Hash, Equal, test02, test03): New.
* testsuite/23_containers/unordered_set/operators/1.cc
(Hash, Equal, test02, test03): New.
Tested under Linux x86_64.
Ok to commit ?
Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
and fixes a pretty nasty performance bug, so is OK now).
N.B. GCC has moved to Git instead of Subversion. If you don't have Git
access set up let me know and I can commit the patch for you.
I haven't done the move yet and won't be able to do it before the
week-end. So please proceed to the commit for me, thanks.
No problem, I can do that.
Your patch is now committed to trunk. Thanks for the major
improvement.
I had a look at std::is_permutation and I think we can make some
simplifications to the 4-argument overload, and we can share most of
the code between the 3-arg and 4-arg overloads (once they've confirmed
the lengths are the same they do exactly the same thing). See the
attached patch. This should probably wait for stage1 though.
I also wanted to add some comments to the _Equality::_M_equal
specialiation for unordered_multimap/multiset to explain what the code
was doing, and had some more ideas. See patch again.
It looks like this loop can potentially visit every element of
__other, instead of stopping at the end of the bucket:
typename __hashtable::const_iterator __ity(__y_n);
for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
if (--__x_count == 0)
break;
Consider a case like this:
unordered_multiset<int> a{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
a.insert(1);
unordered_multiset<int> b{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
b.insert(2);
When doing a == b we'll find 10000 elements in a with key '1',
and find one element in b with that key. But then we iterate through
every element in b after that one, even though they have different
keys and are probably in different buckets.
Instead of just iterating from __ity to __other.end(), can we use a
local iterator so we stop at the end of the bucket?
This seems to make the PR91263 example *very* slightly slower, but
makes the example above significantly faster.
What do you think?
diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 6503d1518d3..6d5d3b2fced 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -3588,6 +3588,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return std::make_pair(*__p.first, *__p.second);
}
+ // precondition: distance(__first1, __last1) == distance(__first2, __last2).
+ template<typename _FwdIter1, typename _FwdIter2,
+ typename _BinaryPredicate = __gnu_cxx::__ops::_Iter_equal_to_iter>
+ _GLIBCXX20_CONSTEXPR
+ bool
+ __is_permutation_impl(_FwdIter1 __first1, _FwdIter1 __last1,
+ _FwdIter2 __first2, _FwdIter2 __last2,
+ _BinaryPredicate __pred = {})
+ {
+ for (auto __scan = __first1; __scan != __last1; ++__scan)
+ {
+ if (__scan != std::__find_if(__first1, __scan,
+ __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
+ continue; // We've seen this one before.
+
+ auto __matches
+ = std::__count_if(__first2, __last2,
+ __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
+ if (0 == __matches ||
+ std::__count_if(__scan, __last1,
+ __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
+ != __matches)
+ return false;
+ }
+ return true;
+ }
+
template<typename _ForwardIterator1, typename _ForwardIterator2,
typename _BinaryPredicate>
_GLIBCXX20_CONSTEXPR
@@ -3604,26 +3631,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (__first1 == __last1)
return true;
- // Establish __last2 assuming equal ranges by iterating over the
- // rest of the list.
- _ForwardIterator2 __last2 = __first2;
- std::advance(__last2, std::distance(__first1, __last1));
- for (_ForwardIterator1 __scan = __first1; __scan != __last1; ++__scan)
- {
- if (__scan != std::__find_if(__first1, __scan,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
- continue; // We've seen this one before.
-
- auto __matches
- = std::__count_if(__first2, __last2,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
- if (0 == __matches ||
- std::__count_if(__scan, __last1,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
- != __matches)
- return false;
- }
- return true;
+ // Establish __last2 by iterating over the rest of the list.
+ auto __last2 = std::next(__first2, std::distance(__first1, __last1));
+ return std::__is_permutation_impl(__first1, __last1, __first2, __last2,
+ __pred);
}
/**
@@ -3720,36 +3731,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (!__pred(__first1, __first2))
break;
- if (__ra_iters)
+ if (__first1 == __last1)
{
- if (__first1 == __last1)
+ if (__ra_iters)
return true;
- }
- else
- {
- auto __d1 = std::distance(__first1, __last1);
- auto __d2 = std::distance(__first2, __last2);
- if (__d1 == 0 && __d2 == 0)
- return true;
- if (__d1 != __d2)
- return false;
+ return __first2 == __last2;
}
- for (_ForwardIterator1 __scan = __first1; __scan != __last1; ++__scan)
+ auto __d1 = std::distance(__first1, __last1);
+ while (__first2 != __last2 && __d1 > 0)
{
- if (__scan != std::__find_if(__first1, __scan,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
- continue; // We've seen this one before.
-
- auto __matches = std::__count_if(__first2, __last2,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
- if (0 == __matches
- || std::__count_if(__scan, __last1,
- __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
- != __matches)
- return false;
+ ++__first2;
+ --__d1;
}
- return true;
+ if (__d1 > 0 || __first2 != __last2)
+ return false;
+
+ return std::__is_permutation_impl(__first1, __last1, __first2, __last2,
+ __pred);
}
/**
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 4024e6c37fa..c9188920cd8 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -1915,6 +1915,7 @@ namespace __detail
for (auto __itx = __this->begin(); __itx != __this->end();)
{
+ // Find all the elements in *this with keys equivalent to *itx.
std::size_t __x_count = 1;
auto __itx_end = __itx;
for (++__itx_end; __itx_end != __this->end()
@@ -1922,32 +1923,21 @@ namespace __detail
_ExtractKey()(*__itx_end));
++__itx_end)
++__x_count;
+ // Now [itx, itx_end) is the result of equal_range(key-of(*itx))
+ // and xcount is the result of std::distance(itx, itx_end).
- std::size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
- __node_base* __y_prev_n = __other._M_buckets[__ybkt];
- if (!__y_prev_n)
- return false;
-
- __node_type* __y_n = static_cast<__node_type*>(__y_prev_n->_M_nxt);
- for (;; __y_n = __y_n->_M_next())
- {
- if (__this->key_eq()(_ExtractKey()(__y_n->_M_v()),
- _ExtractKey()(*__itx)))
- break;
-
- if (!__y_n->_M_nxt
- || __other._M_bucket_index(__y_n->_M_next()) != __ybkt)
- return false;
- }
-
- typename __hashtable::const_iterator __ity(__y_n);
- for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
- if (--__x_count == 0)
- break;
-
- if (__x_count != 0)
+ // Find the first element in __other with key equivalent to *itx.
+ size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
+ auto __ity = __other.begin(__ybkt), __ity_end = __other.end(__ybkt);
+ while (__ity != __ity_end
+ && !__this->key_eq()(_ExtractKey()(*__ity),
+ _ExtractKey()(*__itx)))
+ ++__ity;
+
+ if (std::distance(__ity, __ity_end) < __x_count)
return false;
+ using const_iterator = typename __hashtable::const_iterator;
if (!std::is_permutation(__itx, __itx_end, __ity))
return false;