On Thu, May 22, 2025 at 1:42 PM Jonathan Wakely <jwak...@redhat.com> wrote:
> The current overload set for __unique_copy handles three cases: > > - The input range uses forward iterators, the output range does not. > This is the simplest case, and can just compare adjacent elements of > the input range. > > - Neither the input range nor output range use forward iterators. > This requires a local variable copied from the input range and updated > by assigning each element to the local variable. > > - The output range uses forward iterators. > For this case we compare the current element from the input range with > the element just written to the output range. > > There are two problems with this implementation. Firstly, the third case > assumes that the value type of the output range can be compared to the > value type of the input range, which might not be possible at all, or > might be possible but give different results to comparing elements of > the input range. This is the problem identified in LWG 2439. > > Secondly, the third case is used when both ranges use forward iterators, > even though the first case could (and should) be used. This means that > we compare elements from the output range instead of the input range, > with the problems described above (either not well-formed, or might give > the wrong results). > > The cause of the second problem is that the overload for the first case > looks like: > > OutputIterator > __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred, > forward_iterator_tag, output_iterator_tag); > > When the output range uses forward iterators this overload cannot be > used, because forward_iterator_tag does not inherit from > output_iterator_tag, so is not convertible to it. > > To fix these problems we need to implement the resolution of LWG 2439 so > that the third case is only used when the value types of the two ranges > are the same. This ensures that the comparisons are well behaved. We > also need to ensure that the first case is used when both ranges use > forward iterators. > > This change replaces a single step of tag dispatching to choose between > three overloads with two step of tag dispatching, choosing between two > overloads at each step. The first step dispatches based on the iterator > category of the input range, ignoring the category of the output range. > The second step only happens when the input range uses non-forward > iterators, and dispatches based on the category of the output range and > whether the value type of the two ranges is the same. So now the cases > that are handled are: > > - The input range uses forward iterators. > - The output range uses non-forward iterators or a different value type. > - The output range uses forward iterators and has the same value type. > > For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val > to wrap the predicate in another level of indirection. That seems > unnecessary, as we can just use a pointer to the local variable instead > of an iterator referring to it. > > libstdc++-v3/ChangeLog: > > PR libstdc++/120386 > * include/bits/stl_algo.h (__unique_copy_1): New overloads for > the case where the input range uses non-forward iterators. > (__unique_copy): Replace three overloads with two, depending > only on the iterator category of the input range. Dispatch to > __unique_copy_1 for the non-forward case. > (unique_copy): Only pass the input range category to > __unique_copy. > --- > > Tested x86_64-linux. > LGTM. Only small suggestion, regarding the change of order of arguments. > > libstdc++-v3/include/bits/stl_algo.h | 80 +++++++++++++++------------- > 1 file changed, 44 insertions(+), 36 deletions(-) > > diff --git a/libstdc++-v3/include/bits/stl_algo.h > b/libstdc++-v3/include/bits/stl_algo.h > index f5361aeab7e2..c0bb17f9c8b2 100644 > --- a/libstdc++-v3/include/bits/stl_algo.h > +++ b/libstdc++-v3/include/bits/stl_algo.h > @@ -918,24 +918,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > __gnu_cxx::__ops::__iter_comp_iter(__binary_pred)); > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for forward iterators and output iterator as result. > - */ > + // Implementation of std::unique_copy for forward iterators. > + // This case is easy, just compare *i with *(i-1). > template<typename _ForwardIterator, typename _OutputIterator, > typename _BinaryPredicate> > _GLIBCXX20_CONSTEXPR > _OutputIterator > __unique_copy(_ForwardIterator __first, _ForwardIterator __last, > _OutputIterator __result, _BinaryPredicate __binary_pred, > - forward_iterator_tag, output_iterator_tag) > + forward_iterator_tag) > { > _ForwardIterator __next = __first; > *__result = *__first; > while (++__next != __last) > - if (!__binary_pred(__first, __next)) > + if (!__binary_pred(__next, __first)) > I would prefer if you will not do this change, and pass iterators that were already seen as the first argument. Note that the forward-output overload, preserves this order: *__result = *__first; while (++__first != __last) if (!__binary_pred(__result, __first)) *++__result = *__first; { > __first = __next; > *++__result = *__first; > @@ -943,27 +939,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > return ++__result; > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for input iterators and output iterator as result. > - */ > + // Implementation of std::unique_copy for non-forward iterators, > + // where we cannot compare with elements written to the output. > template<typename _InputIterator, typename _OutputIterator, > typename _BinaryPredicate> > _GLIBCXX20_CONSTEXPR > _OutputIterator > - __unique_copy(_InputIterator __first, _InputIterator __last, > - _OutputIterator __result, _BinaryPredicate __binary_pred, > - input_iterator_tag, output_iterator_tag) > + __unique_copy_1(_InputIterator __first, _InputIterator __last, > + _OutputIterator __result, _BinaryPredicate > __binary_pred, > + __false_type) > { > - typename iterator_traits<_InputIterator>::value_type __value = > *__first; > - __decltype(__gnu_cxx::__ops::__iter_comp_val(__binary_pred)) > - __rebound_pred > - = __gnu_cxx::__ops::__iter_comp_val(__binary_pred); > + typedef typename iterator_traits<_InputIterator>::value_type _Val; > + _Val __value = *__first; > *__result = __value; > while (++__first != __last) > - if (!__rebound_pred(__first, __value)) > + if (!__binary_pred(__first, std::__addressof(__value))) > I would instead change the order here to std::__addressof(__value), __first. > { > __value = *__first; > *++__result = __value; > @@ -971,19 +961,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > return ++__result; > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for input iterators and forward iterator as result. > - */ > + // Implementation of std::unique_copy for non-forward iterators, > + // where we can compare with the last element written to the output. > template<typename _InputIterator, typename _ForwardIterator, > typename _BinaryPredicate> > - _GLIBCXX20_CONSTEXPR > _ForwardIterator > - __unique_copy(_InputIterator __first, _InputIterator __last, > - _ForwardIterator __result, _BinaryPredicate > __binary_pred, > - input_iterator_tag, forward_iterator_tag) > + __unique_copy_1(_InputIterator __first, _InputIterator __last, > + _ForwardIterator __result, _BinaryPredicate > __binary_pred, > + __true_type) > { > *__result = *__first; > while (++__first != __last) > @@ -992,6 +977,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > return ++__result; > } > > + // Implementation of std::unique_copy for non-forward iterators. > + // We cannot compare *i to *(i-1) so we need to either make a copy > + // or compare with the last element written to the output range. > + template<typename _InputIterator, typename _OutputIterator, > + typename _BinaryPredicate> > + _GLIBCXX20_CONSTEXPR > + _OutputIterator > + __unique_copy(_InputIterator __first, _InputIterator __last, > + _OutputIterator __result, _BinaryPredicate __binary_pred, > + input_iterator_tag) > + { > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 2439. unique_copy() sometimes can't fall back to reading its > output > + typedef iterator_traits<_InputIterator> _InItTraits; > + typedef iterator_traits<_OutputIterator> _OutItTraits; > + typedef typename _OutItTraits::iterator_category _Cat; > + const bool __output_is_fwd = __is_base_of(forward_iterator_tag, > _Cat); > + const bool __same_type = __is_same(typename > _OutItTraits::value_type, > + typename _InItTraits::value_type); > + typedef __truth_type<__output_is_fwd && __same_type> > __cmp_with_output; > No change needed, but I was wondering of output only iterator, can have non-void value type, but making sure that they are also forwards mitigates this question. > + return std::__unique_copy_1(__first, __last, __result, > __binary_pred, > + typename __cmp_with_output::__type()); > + } > + > + > /** > * This is an uglified reverse(_BidirectionalIterator, > * _BidirectionalIterator) > @@ -4456,8 +4466,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO > return __result; > return std::__unique_copy(__first, __last, __result, > __gnu_cxx::__ops::__iter_equal_to_iter(), > - std::__iterator_category(__first), > - std::__iterator_category(__result)); > + std::__iterator_category(__first)); > } > > /** > @@ -4499,8 +4508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO > return __result; > return std::__unique_copy(__first, __last, __result, > __gnu_cxx::__ops::__iter_comp_iter(__binary_pred), > - std::__iterator_category(__first), > - std::__iterator_category(__result)); > + std::__iterator_category(__first)); > } > > #if __cplusplus <= 201103L || _GLIBCXX_USE_DEPRECATED > -- > 2.49.0 > >