On Thu, May 29, 2025 at 10:38 PM Jonathan Wakely <jwak...@redhat.com> wrote:
> The current overload set for __unique_copy handles three cases: > > - The input range uses forward iterators, the output range does not. > This is the simplest case, and can just compare adjacent elements of > the input range. > > - Neither the input range nor output range use forward iterators. > This requires a local variable copied from the input range and updated > by assigning each element to the local variable. > > - The output range uses forward iterators. > For this case we compare the current element from the input range with > the element just written to the output range. > > There are two problems with this implementation. Firstly, the third case > assumes that the value type of the output range can be compared to the > value type of the input range, which might not be possible at all, or > might be possible but give different results to comparing elements of > the input range. This is the problem identified in LWG 2439. > > Secondly, the third case is used when both ranges use forward iterators, > even though the first case could (and should) be used. This means that > we compare elements from the output range instead of the input range, > with the problems described above (either not well-formed, or might give > the wrong results). > > The cause of the second problem is that the overload for the first case > looks like: > > OutputIterator > __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred, > forward_iterator_tag, output_iterator_tag); > > When the output range uses forward iterators this overload cannot be > used, because forward_iterator_tag does not inherit from > output_iterator_tag, so is not convertible to it. > > To fix these problems we need to implement the resolution of LWG 2439 so > that the third case is only used when the value types of the two ranges > are the same. This ensures that the comparisons are well behaved. We > also need to ensure that the first case is used when both ranges use > forward iterators. > > This change replaces a single step of tag dispatching to choose between > three overloads with two step of tag dispatching, choosing between two > overloads at each step. The first step dispatches based on the iterator > category of the input range, ignoring the category of the output range. > The second step only happens when the input range uses non-forward > iterators, and dispatches based on the category of the output range and > whether the value type of the two ranges is the same. So now the cases > that are handled are: > > - The input range uses forward iterators. > - The output range uses non-forward iterators or a different value type. > - The output range uses forward iterators and has the same value type. > > For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val > to wrap the predicate in another level of indirection. That seems > unnecessary, as we can just use a pointer to the local variable instead > of an iterator referring to it. > > During review of this patch, it was discovered that all known > implementations of std::unique_copy and ranges::unique_copy (except > cmcstl2) disagree with the specification. The standard (and the SGI STL > documentation) say that it uses pred(*i, *(i-1)) but everybody uses > pred(*(i-1), *i) instead, and apparently always has done. This patch > adjusts ranges::unique_copy to be consistent. > > In the first __unique_copy overload, the local copy of the iterator is > changed to be the previous position not the next one, so that we use > ++first as the "next" iterator, consistent with the logic used in the > other overloads. This makes it easier to compare them, because we aren't > using pred(*first, *next) in one and pred(something, *first) in the > others. Instead it's always pred(something, *first). > > libstdc++-v3/ChangeLog: > > PR libstdc++/120386 > * include/bits/ranges_algo.h (__unique_copy_fn): Reorder > arguments for third case to match the first two cases. > * include/bits/stl_algo.h (__unique_copy): Replace three > overloads with two, depending only on the iterator category of > the input range. Dispatch to __unique_copy_1 for the > non-forward case. > (__unique_copy_1): New overloads for the case where the input > range uses non-forward iterators. > (unique_copy): Only pass the input range category to > __unique_copy. > * testsuite/25_algorithms/unique_copy/lwg2439.cc: New test. > --- > > Patch v2 adds a new test, and doesn't reorder the predicate arguments to > match the standard, instead reorder the one inconsistent case in > ranges::unique_copy. And I've submitted a library issue to change the > standard: https://cplusplus.github.io/LWG/issue4269 > > Tested x86_64-linux. > LGTM. Please add the __GLIBCXX_RESOLVE_ISSUE for 4269 comment in the ranges_algo. > libstdc++-v3/include/bits/ranges_algo.h | 4 +- > libstdc++-v3/include/bits/stl_algo.h | 86 ++++++------ > .../25_algorithms/unique_copy/lwg2439.cc | 127 ++++++++++++++++++ > 3 files changed, 176 insertions(+), 41 deletions(-) > create mode 100644 > libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc > > diff --git a/libstdc++-v3/include/bits/ranges_algo.h > b/libstdc++-v3/include/bits/ranges_algo.h > index f36e7dd59911..0d9639533db4 100644 > --- a/libstdc++-v3/include/bits/ranges_algo.h > +++ b/libstdc++-v3/include/bits/ranges_algo.h > @@ -1250,8 +1250,8 @@ namespace ranges > while (++__first != __last) > { > if (!(bool)std::__invoke(__comp, > - std::__invoke(__proj, *__first), > - std::__invoke(__proj, __value))) > + std::__invoke(__proj, __value), > + std::__invoke(__proj, *__first))) > { > __value = *__first; > *++__result = __value; > diff --git a/libstdc++-v3/include/bits/stl_algo.h > b/libstdc++-v3/include/bits/stl_algo.h > index f5361aeab7e2..b984cfaa5878 100644 > --- a/libstdc++-v3/include/bits/stl_algo.h > +++ b/libstdc++-v3/include/bits/stl_algo.h > @@ -918,52 +918,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > __gnu_cxx::__ops::__iter_comp_iter(__binary_pred)); > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for forward iterators and output iterator as result. > - */ > + // Implementation of std::unique_copy for forward iterators. > + // This case is easy, just compare *i with *(i-1). > template<typename _ForwardIterator, typename _OutputIterator, > typename _BinaryPredicate> > _GLIBCXX20_CONSTEXPR > _OutputIterator > __unique_copy(_ForwardIterator __first, _ForwardIterator __last, > _OutputIterator __result, _BinaryPredicate __binary_pred, > - forward_iterator_tag, output_iterator_tag) > + forward_iterator_tag) > { > - _ForwardIterator __next = __first; > + _ForwardIterator __prev = __first; > *__result = *__first; > - while (++__next != __last) > - if (!__binary_pred(__first, __next)) > + while (++__first != __last) > + if (!__binary_pred(__prev, __first)) > { > - __first = __next; > *++__result = *__first; > + __prev = __first; > } > return ++__result; > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for input iterators and output iterator as result. > - */ > + // Implementation of std::unique_copy for non-forward iterators, > + // where we cannot compare with elements written to the output. > template<typename _InputIterator, typename _OutputIterator, > typename _BinaryPredicate> > _GLIBCXX20_CONSTEXPR > _OutputIterator > - __unique_copy(_InputIterator __first, _InputIterator __last, > - _OutputIterator __result, _BinaryPredicate __binary_pred, > - input_iterator_tag, output_iterator_tag) > + __unique_copy_1(_InputIterator __first, _InputIterator __last, > + _OutputIterator __result, _BinaryPredicate > __binary_pred, > + __false_type) > { > - typename iterator_traits<_InputIterator>::value_type __value = > *__first; > - __decltype(__gnu_cxx::__ops::__iter_comp_val(__binary_pred)) > - __rebound_pred > - = __gnu_cxx::__ops::__iter_comp_val(__binary_pred); > + typedef typename iterator_traits<_InputIterator>::value_type _Val; > + _Val __value = *__first; > *__result = __value; > while (++__first != __last) > - if (!__rebound_pred(__first, __value)) > + if (!__binary_pred(std::__addressof(__value), __first)) > { > __value = *__first; > *++__result = __value; > @@ -971,19 +961,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > return ++__result; > } > > - /** > - * This is an uglified > - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, > - * _BinaryPredicate) > - * overloaded for input iterators and forward iterator as result. > - */ > + // Implementation of std::unique_copy for non-forward iterators, > + // where we can compare with the last element written to the output. > template<typename _InputIterator, typename _ForwardIterator, > typename _BinaryPredicate> > - _GLIBCXX20_CONSTEXPR > _ForwardIterator > - __unique_copy(_InputIterator __first, _InputIterator __last, > - _ForwardIterator __result, _BinaryPredicate > __binary_pred, > - input_iterator_tag, forward_iterator_tag) > + __unique_copy_1(_InputIterator __first, _InputIterator __last, > + _ForwardIterator __result, _BinaryPredicate > __binary_pred, > + __true_type) > { > *__result = *__first; > while (++__first != __last) > @@ -992,6 +977,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > return ++__result; > } > > + // Implementation of std::unique_copy for non-forward iterators. > + // We cannot compare *i to *(i-1) so we need to either make a copy > + // or compare with the last element written to the output range. > + template<typename _InputIterator, typename _OutputIterator, > + typename _BinaryPredicate> > + _GLIBCXX20_CONSTEXPR > + _OutputIterator > + __unique_copy(_InputIterator __first, _InputIterator __last, > + _OutputIterator __result, _BinaryPredicate __binary_pred, > + input_iterator_tag) > + { > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 2439. unique_copy() sometimes can't fall back to reading its > output > + typedef iterator_traits<_InputIterator> _InItTraits; > + typedef iterator_traits<_OutputIterator> _OutItTraits; > + typedef typename _OutItTraits::iterator_category _Cat; > + const bool __output_is_fwd = __is_base_of(forward_iterator_tag, > _Cat); > + const bool __same_type = __is_same(typename > _OutItTraits::value_type, > + typename _InItTraits::value_type); > + typedef __truth_type<__output_is_fwd && __same_type> > __cmp_with_output; > + return std::__unique_copy_1(__first, __last, __result, > __binary_pred, > + typename __cmp_with_output::__type()); > + } > + > + > /** > * This is an uglified reverse(_BidirectionalIterator, > * _BidirectionalIterator) > @@ -4456,8 +4466,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO > return __result; > return std::__unique_copy(__first, __last, __result, > __gnu_cxx::__ops::__iter_equal_to_iter(), > - std::__iterator_category(__first), > - std::__iterator_category(__result)); > + std::__iterator_category(__first)); > } > > /** > @@ -4499,8 +4508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO > return __result; > return std::__unique_copy(__first, __last, __result, > __gnu_cxx::__ops::__iter_comp_iter(__binary_pred), > - std::__iterator_category(__first), > - std::__iterator_category(__result)); > + std::__iterator_category(__first)); > } > > #if __cplusplus <= 201103L || _GLIBCXX_USE_DEPRECATED > diff --git a/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc > b/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc > new file mode 100644 > index 000000000000..f2ec3e368099 > --- /dev/null > +++ b/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc > @@ -0,0 +1,127 @@ > +// { dg-do run } > + > +#include <algorithm> > +#include <testsuite_iterators.h> > + > +using namespace __gnu_test; > + > +int out[4]; > +short out_shrt[4]; > +short in[7] = { 1, 2, 2, 2, 3, 4, 4 }; > + > +template<typename T> > +void > +check_and_reset(T* p) > +{ > + VERIFY( p[0] == 1 ); > + VERIFY( p[1] == 2 ); > + VERIFY( p[2] == 3 ); > + VERIFY( p[3] == 4 ); > + std::fill_n(p, 4, 0); > +} > + > +struct Eq > +{ > + bool operator()(short i, short j) const { return i == j; } > + bool operator()(short, int) const { VERIFY(false); } > + bool operator()(int, short) const { VERIFY(false); } > +}; > + > +struct Eq2 > +{ > + bool operator()(const short& i, const short& j) const > + { > + // Both arguments should be elements of the 'in' array. > + VERIFY( in+0 <= &i && &i < in+7 ); > + VERIFY( in+0 <= &j && &j < in+7 ); > + VERIFY( &i < &j ); > + return i == j; > + } > + bool operator()(short, int) const { VERIFY(false); } > + bool operator()(int, short) const { VERIFY(false); } > +}; > + > +struct Eq3 > +{ > + bool operator()(const short& i, const short& j) const > + { > + // First argument should be element of the 'out' array. > + // Second argument should be element of the 'in' array. > + VERIFY( out_shrt+0 <= &i && &i < out_shrt+4 ); > + VERIFY( in+0 <= &j && &j < in+7 ); > + return i == j; > + } > + bool operator()(short, int) const { VERIFY(false); } > + bool operator()(int, short) const { VERIFY(false); } > +}; > + > +void > +test_forward_source() > +{ > + // The input range uses forward iterators > + test_container<short, forward_iterator_wrapper> inc(in); > + test_container<int, output_iterator_wrapper> outc(out); > + std::unique_copy(inc.begin(), inc.end(), outc.begin()); > + check_and_reset(out); > + > + test_container<short, forward_iterator_wrapper> inc2(in); > + test_container<int, output_iterator_wrapper> outc2(out); > + std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq2()); > + check_and_reset(out); > +} > + > +void > +test_output_dest() > +{ > + // The input range uses input iterators. > + // The output range uses output iterators. > + test_container<short, input_iterator_wrapper> inc(in); > + test_container<int, output_iterator_wrapper> outc(out); > + std::unique_copy(inc.begin(), inc.end(), outc.begin()); > + check_and_reset(out); > + > + test_container<short, input_iterator_wrapper> inc2(in); > + test_container<int, output_iterator_wrapper> outc2(out); > + std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq()); > + check_and_reset(out); > +} > + > +void > +test_forward_dest_diff_type() > +{ > + // The input range uses input iterators. > + // The output range uses forward iterators, but with different value > type. > + test_container<short, input_iterator_wrapper> inc(in); > + test_container<int, forward_iterator_wrapper> outc(out); > + std::unique_copy(inc.begin(), inc.end(), outc.begin()); > + check_and_reset(out); > + > + test_container<short, input_iterator_wrapper> inc2(in); > + test_container<int, forward_iterator_wrapper> outc2(out); > + std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq()); > + check_and_reset(out); > +} > + > +void > +test_forward_dest_same_type() > +{ > + // The input range uses input iterators. > + // The output range uses forward iterators, with same value type. > + test_container<short, input_iterator_wrapper> inc(in); > + test_container<short, forward_iterator_wrapper> outc(out_shrt); > + std::unique_copy(inc.begin(), inc.end(), outc.begin()); > + check_and_reset(out_shrt); > + > + test_container<short, input_iterator_wrapper> inc2(in); > + test_container<short, forward_iterator_wrapper> outc2(out_shrt); > + std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq3()); > + check_and_reset(out_shrt); > +} > + > +int main() > +{ > + test_forward_source(); > + test_output_dest(); > + test_forward_dest_diff_type(); > + test_forward_dest_same_type(); > +} > -- > 2.49.0 > >