On Thu, May 22, 2025 at 5:15 PM Tomasz Kaminski <tkami...@redhat.com> wrote:
> > > On Thu, May 22, 2025 at 5:04 PM Jonathan Wakely <jwak...@redhat.com> > wrote: > >> On Thu, 22 May 2025 at 15:50, Tomasz Kaminski <tkami...@redhat.com> >> wrote: >> > >> > >> > >> > On Thu, May 22, 2025 at 1:42 PM Jonathan Wakely <jwak...@redhat.com> >> wrote: >> >> >> >> The current overload set for __unique_copy handles three cases: >> >> >> >> - The input range uses forward iterators, the output range does not. >> >> This is the simplest case, and can just compare adjacent elements of >> >> the input range. >> >> >> >> - Neither the input range nor output range use forward iterators. >> >> This requires a local variable copied from the input range and >> updated >> >> by assigning each element to the local variable. >> >> >> >> - The output range uses forward iterators. >> >> For this case we compare the current element from the input range >> with >> >> the element just written to the output range. >> >> >> >> There are two problems with this implementation. Firstly, the third >> case >> >> assumes that the value type of the output range can be compared to the >> >> value type of the input range, which might not be possible at all, or >> >> might be possible but give different results to comparing elements of >> >> the input range. This is the problem identified in LWG 2439. >> >> >> >> Secondly, the third case is used when both ranges use forward >> iterators, >> >> even though the first case could (and should) be used. This means that >> >> we compare elements from the output range instead of the input range, >> >> with the problems described above (either not well-formed, or might >> give >> >> the wrong results). >> >> >> >> The cause of the second problem is that the overload for the first case >> >> looks like: >> >> >> >> OutputIterator >> >> __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred, >> >> forward_iterator_tag, output_iterator_tag); >> >> >> >> When the output range uses forward iterators this overload cannot be >> >> used, because forward_iterator_tag does not inherit from >> >> output_iterator_tag, so is not convertible to it. >> >> >> >> To fix these problems we need to implement the resolution of LWG 2439 >> so >> >> that the third case is only used when the value types of the two ranges >> >> are the same. This ensures that the comparisons are well behaved. We >> >> also need to ensure that the first case is used when both ranges use >> >> forward iterators. >> >> >> >> This change replaces a single step of tag dispatching to choose between >> >> three overloads with two step of tag dispatching, choosing between two >> >> overloads at each step. The first step dispatches based on the iterator >> >> category of the input range, ignoring the category of the output range. >> >> The second step only happens when the input range uses non-forward >> >> iterators, and dispatches based on the category of the output range and >> >> whether the value type of the two ranges is the same. So now the cases >> >> that are handled are: >> >> >> >> - The input range uses forward iterators. >> >> - The output range uses non-forward iterators or a different value >> type. >> >> - The output range uses forward iterators and has the same value type. >> >> >> >> For the second case, the old code used >> __gnu_cxx::__ops::__iter_comp_val >> >> to wrap the predicate in another level of indirection. That seems >> >> unnecessary, as we can just use a pointer to the local variable instead >> >> of an iterator referring to it. >> >> >> >> libstdc++-v3/ChangeLog: >> >> >> >> PR libstdc++/120386 >> >> * include/bits/stl_algo.h (__unique_copy_1): New overloads for >> >> the case where the input range uses non-forward iterators. >> >> (__unique_copy): Replace three overloads with two, depending >> >> only on the iterator category of the input range. Dispatch to >> >> __unique_copy_1 for the non-forward case. >> >> (unique_copy): Only pass the input range category to >> >> __unique_copy. >> >> --- >> >> >> >> Tested x86_64-linux. >> > >> > LGTM. Only small suggestion, regarding the change of order of arguments. >> >> I forgot to say that I need to add tests for each of the cases, >> especially the case that fails with the existing code! >> >> >> >> >> >> >> libstdc++-v3/include/bits/stl_algo.h | 80 +++++++++++++++------------- >> >> 1 file changed, 44 insertions(+), 36 deletions(-) >> >> >> >> diff --git a/libstdc++-v3/include/bits/stl_algo.h >> b/libstdc++-v3/include/bits/stl_algo.h >> >> index f5361aeab7e2..c0bb17f9c8b2 100644 >> >> --- a/libstdc++-v3/include/bits/stl_algo.h >> >> +++ b/libstdc++-v3/include/bits/stl_algo.h >> >> @@ -918,24 +918,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> >> __gnu_cxx::__ops::__iter_comp_iter(__binary_pred)); >> >> } >> >> >> >> - /** >> >> - * This is an uglified >> >> - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, >> >> - * _BinaryPredicate) >> >> - * overloaded for forward iterators and output iterator as result. >> >> - */ >> >> + // Implementation of std::unique_copy for forward iterators. >> >> + // This case is easy, just compare *i with *(i-1). >> >> template<typename _ForwardIterator, typename _OutputIterator, >> >> typename _BinaryPredicate> >> >> _GLIBCXX20_CONSTEXPR >> >> _OutputIterator >> >> __unique_copy(_ForwardIterator __first, _ForwardIterator __last, >> >> _OutputIterator __result, _BinaryPredicate >> __binary_pred, >> >> - forward_iterator_tag, output_iterator_tag) >> >> + forward_iterator_tag) >> >> { >> >> _ForwardIterator __next = __first; >> >> *__result = *__first; >> >> while (++__next != __last) >> >> - if (!__binary_pred(__first, __next)) >> >> + if (!__binary_pred(__next, __first)) >> > >> > I would prefer if you will not do this change, and pass iterators that >> were already seen as the first argument. >> >> The standard seems clear that it should be bool(pred(*i, *(i - 1))) >> In theory a predicate could depend on that. >> > Oh, indeed. > >> >> > Note that the forward-output overload, preserves this order: >> > *__result = *__first; >> > while (++__first != __last) >> > if (!__binary_pred(__result, __first)) >> >> Ah yes, well I should have changed that too ;-) >> >> What's your reason for preferring the current order? >> > That what I would intuitively expect, that left argument is an element > that is left to right argument. > And if you sorted range with predicate lt, then passing not_fn(lt) is > equivalent to checking equality. > It seems that I have built my expectation based on unique, that check i-1, i. https://eel.is/c++draft/alg.unique#1 > > >> > *++__result = *__first; >> > >> >> { >> >> __first = __next; >> >> *++__result = *__first; >> >> @@ -943,27 +939,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> return ++__result; >> >> } >> >> >> >> - /** >> >> - * This is an uglified >> >> - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, >> >> - * _BinaryPredicate) >> >> - * overloaded for input iterators and output iterator as result. >> >> - */ >> >> + // Implementation of std::unique_copy for non-forward iterators, >> >> + // where we cannot compare with elements written to the output. >> >> template<typename _InputIterator, typename _OutputIterator, >> >> typename _BinaryPredicate> >> >> _GLIBCXX20_CONSTEXPR >> >> _OutputIterator >> >> - __unique_copy(_InputIterator __first, _InputIterator __last, >> >> - _OutputIterator __result, _BinaryPredicate >> __binary_pred, >> >> - input_iterator_tag, output_iterator_tag) >> >> + __unique_copy_1(_InputIterator __first, _InputIterator __last, >> >> + _OutputIterator __result, _BinaryPredicate >> __binary_pred, >> >> + __false_type) >> >> { >> >> - typename iterator_traits<_InputIterator>::value_type __value = >> *__first; >> >> - __decltype(__gnu_cxx::__ops::__iter_comp_val(__binary_pred)) >> >> - __rebound_pred >> >> - = __gnu_cxx::__ops::__iter_comp_val(__binary_pred); >> >> + typedef typename iterator_traits<_InputIterator>::value_type >> _Val; >> >> + _Val __value = *__first; >> >> *__result = __value; >> >> while (++__first != __last) >> >> - if (!__rebound_pred(__first, __value)) >> >> + if (!__binary_pred(__first, std::__addressof(__value))) >> > >> > I would instead change the order here to std::__addressof(__value), >> __first. >> >> >> >> { >> >> __value = *__first; >> >> *++__result = __value; >> >> @@ -971,19 +961,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> return ++__result; >> >> } >> >> >> >> - /** >> >> - * This is an uglified >> >> - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, >> >> - * _BinaryPredicate) >> >> - * overloaded for input iterators and forward iterator as result. >> >> - */ >> >> + // Implementation of std::unique_copy for non-forward iterators, >> >> + // where we can compare with the last element written to the output. >> >> template<typename _InputIterator, typename _ForwardIterator, >> >> typename _BinaryPredicate> >> >> - _GLIBCXX20_CONSTEXPR >> >> _ForwardIterator >> >> - __unique_copy(_InputIterator __first, _InputIterator __last, >> >> - _ForwardIterator __result, _BinaryPredicate >> __binary_pred, >> >> - input_iterator_tag, forward_iterator_tag) >> >> + __unique_copy_1(_InputIterator __first, _InputIterator __last, >> >> + _ForwardIterator __result, _BinaryPredicate >> __binary_pred, >> >> + __true_type) >> >> { >> >> *__result = *__first; >> >> while (++__first != __last) >> >> @@ -992,6 +977,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> return ++__result; >> >> } >> >> >> >> + // Implementation of std::unique_copy for non-forward iterators. >> >> + // We cannot compare *i to *(i-1) so we need to either make a copy >> >> + // or compare with the last element written to the output range. >> >> + template<typename _InputIterator, typename _OutputIterator, >> >> + typename _BinaryPredicate> >> >> + _GLIBCXX20_CONSTEXPR >> >> + _OutputIterator >> >> + __unique_copy(_InputIterator __first, _InputIterator __last, >> >> + _OutputIterator __result, _BinaryPredicate >> __binary_pred, >> >> + input_iterator_tag) >> >> + { >> >> + // _GLIBCXX_RESOLVE_LIB_DEFECTS >> >> + // 2439. unique_copy() sometimes can't fall back to reading its >> output >> >> + typedef iterator_traits<_InputIterator> _InItTraits; >> >> + typedef iterator_traits<_OutputIterator> _OutItTraits; >> >> + typedef typename _OutItTraits::iterator_category _Cat; >> >> + const bool __output_is_fwd = __is_base_of(forward_iterator_tag, >> _Cat); >> >> + const bool __same_type = __is_same(typename >> _OutItTraits::value_type, >> >> + typename >> _InItTraits::value_type); >> >> >> >> + typedef __truth_type<__output_is_fwd && __same_type> >> __cmp_with_output; >> > >> > No change needed, but I was wondering of output only iterator, can have >> non-void value type, >> > but making sure that they are also forwards mitigates this question. >> >> >> >> + return std::__unique_copy_1(__first, __last, __result, >> __binary_pred, >> >> + typename >> __cmp_with_output::__type()); >> >> + } >> >> + >> >> + >> >> /** >> >> * This is an uglified reverse(_BidirectionalIterator, >> >> * _BidirectionalIterator) >> >> @@ -4456,8 +4466,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO >> >> return __result; >> >> return std::__unique_copy(__first, __last, __result, >> >> >> __gnu_cxx::__ops::__iter_equal_to_iter(), >> >> - std::__iterator_category(__first), >> >> - std::__iterator_category(__result)); >> >> + std::__iterator_category(__first)); >> >> } >> >> >> >> /** >> >> @@ -4499,8 +4508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO >> >> return __result; >> >> return std::__unique_copy(__first, __last, __result, >> >> >> __gnu_cxx::__ops::__iter_comp_iter(__binary_pred), >> >> - std::__iterator_category(__first), >> >> - std::__iterator_category(__result)); >> >> + std::__iterator_category(__first)); >> >> } >> >> >> >> #if __cplusplus <= 201103L || _GLIBCXX_USE_DEPRECATED >> >> -- >> >> 2.49.0 >> >> >> >>