On Thu, May 29, 2025 at 10:38 PM Jonathan Wakely <jwak...@redhat.com> wrote:

> The current overload set for __unique_copy handles three cases:
>
> - The input range uses forward iterators, the output range does not.
>   This is the simplest case, and can just compare adjacent elements of
>   the input range.
>
> - Neither the input range nor output range use forward iterators.
>   This requires a local variable copied from the input range and updated
>   by assigning each element to the local variable.
>
> - The output range uses forward iterators.
>   For this case we compare the current element from the input range with
>   the element just written to the output range.
>
> There are two problems with this implementation. Firstly, the third case
> assumes that the value type of the output range can be compared to the
> value type of the input range, which might not be possible at all, or
> might be possible but give different results to comparing elements of
> the input range. This is the problem identified in LWG 2439.
>
> Secondly, the third case is used when both ranges use forward iterators,
> even though the first case could (and should) be used. This means that
> we compare elements from the output range instead of the input range,
> with the problems described above (either not well-formed, or might give
> the wrong results).
>
> The cause of the second problem is that the overload for the first case
> looks like:
>
> OutputIterator
> __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred,
>               forward_iterator_tag, output_iterator_tag);
>
> When the output range uses forward iterators this overload cannot be
> used, because forward_iterator_tag does not inherit from
> output_iterator_tag, so is not convertible to it.
>
> To fix these problems we need to implement the resolution of LWG 2439 so
> that the third case is only used when the value types of the two ranges
> are the same. This ensures that the comparisons are well behaved. We
> also need to ensure that the first case is used when both ranges use
> forward iterators.
>
> This change replaces a single step of tag dispatching to choose between
> three overloads with two step of tag dispatching, choosing between two
> overloads at each step. The first step dispatches based on the iterator
> category of the input range, ignoring the category of the output range.
> The second step only happens when the input range uses non-forward
> iterators, and dispatches based on the category of the output range and
> whether the value type of the two ranges is the same. So now the cases
> that are handled are:
>
> - The input range uses forward iterators.
> - The output range uses non-forward iterators or a different value type.
> - The output range uses forward iterators and has the same value type.
>
> For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val
> to wrap the predicate in another level of indirection. That seems
> unnecessary, as we can just use a pointer to the local variable instead
> of an iterator referring to it.
>
> During review of this patch, it was discovered that all known
> implementations of std::unique_copy and ranges::unique_copy (except
> cmcstl2) disagree with the specification. The standard (and the SGI STL
> documentation) say that it uses pred(*i, *(i-1)) but everybody uses
> pred(*(i-1), *i) instead, and apparently always has done. This patch
> adjusts ranges::unique_copy to be consistent.
>
> In the first __unique_copy overload, the local copy of the iterator is
> changed to be the previous position not the next one, so that we use
> ++first as the "next" iterator, consistent with the logic used in the
> other overloads. This makes it easier to compare them, because we aren't
> using pred(*first, *next) in one and pred(something, *first) in the
> others. Instead it's always pred(something, *first).
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/120386
>         * include/bits/ranges_algo.h (__unique_copy_fn): Reorder
>         arguments for third case to match the first two cases.
>         * include/bits/stl_algo.h (__unique_copy): Replace three
>         overloads with two, depending only on the iterator category of
>         the input range.  Dispatch to __unique_copy_1 for the
>         non-forward case.
>         (__unique_copy_1): New overloads for the case where the input
>         range uses non-forward iterators.
>         (unique_copy): Only pass the input range category to
>         __unique_copy.
>         * testsuite/25_algorithms/unique_copy/lwg2439.cc: New test.
> ---
>
> Patch v2 adds a new test, and doesn't reorder the predicate arguments to
> match the standard, instead reorder the one inconsistent case in
> ranges::unique_copy. And I've submitted a library issue to change the
> standard: https://cplusplus.github.io/LWG/issue4269
>
> Tested x86_64-linux.
>
LGTM. Please add the __GLIBCXX_RESOLVE_ISSUE for 4269 comment in the
ranges_algo.


>  libstdc++-v3/include/bits/ranges_algo.h       |   4 +-
>  libstdc++-v3/include/bits/stl_algo.h          |  86 ++++++------
>  .../25_algorithms/unique_copy/lwg2439.cc      | 127 ++++++++++++++++++
>  3 files changed, 176 insertions(+), 41 deletions(-)
>  create mode 100644
> libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc
>
> diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> index f36e7dd59911..0d9639533db4 100644
> --- a/libstdc++-v3/include/bits/ranges_algo.h
> +++ b/libstdc++-v3/include/bits/ranges_algo.h
> @@ -1250,8 +1250,8 @@ namespace ranges
>             while (++__first != __last)
>               {
>                 if (!(bool)std::__invoke(__comp,
> -                                        std::__invoke(__proj, *__first),
> -                                        std::__invoke(__proj, __value)))
> +                                        std::__invoke(__proj, __value),
> +                                        std::__invoke(__proj, *__first)))
>                   {
>                     __value = *__first;
>                     *++__result = __value;
> diff --git a/libstdc++-v3/include/bits/stl_algo.h
> b/libstdc++-v3/include/bits/stl_algo.h
> index f5361aeab7e2..b984cfaa5878 100644
> --- a/libstdc++-v3/include/bits/stl_algo.h
> +++ b/libstdc++-v3/include/bits/stl_algo.h
> @@ -918,52 +918,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>  __gnu_cxx::__ops::__iter_comp_iter(__binary_pred));
>      }
>
> -  /**
> -   *  This is an uglified
> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
> -   *              _BinaryPredicate)
> -   *  overloaded for forward iterators and output iterator as result.
> -  */
> +  // Implementation of std::unique_copy for forward iterators.
> +  // This case is easy, just compare *i with *(i-1).
>    template<typename _ForwardIterator, typename _OutputIterator,
>            typename _BinaryPredicate>
>      _GLIBCXX20_CONSTEXPR
>      _OutputIterator
>      __unique_copy(_ForwardIterator __first, _ForwardIterator __last,
>                   _OutputIterator __result, _BinaryPredicate __binary_pred,
> -                 forward_iterator_tag, output_iterator_tag)
> +                 forward_iterator_tag)
>      {
> -      _ForwardIterator __next = __first;
> +      _ForwardIterator __prev = __first;
>        *__result = *__first;
> -      while (++__next != __last)
> -       if (!__binary_pred(__first, __next))
> +      while (++__first != __last)
> +       if (!__binary_pred(__prev, __first))
>           {
> -           __first = __next;
>             *++__result = *__first;
> +           __prev = __first;
>           }
>        return ++__result;
>      }
>
> -  /**
> -   *  This is an uglified
> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
> -   *              _BinaryPredicate)
> -   *  overloaded for input iterators and output iterator as result.
> -  */
> +  // Implementation of std::unique_copy for non-forward iterators,
> +  // where we cannot compare with elements written to the output.
>    template<typename _InputIterator, typename _OutputIterator,
>            typename _BinaryPredicate>
>      _GLIBCXX20_CONSTEXPR
>      _OutputIterator
> -    __unique_copy(_InputIterator __first, _InputIterator __last,
> -                 _OutputIterator __result, _BinaryPredicate __binary_pred,
> -                 input_iterator_tag, output_iterator_tag)
> +    __unique_copy_1(_InputIterator __first, _InputIterator __last,
> +                   _OutputIterator __result, _BinaryPredicate
> __binary_pred,
> +                   __false_type)
>      {
> -      typename iterator_traits<_InputIterator>::value_type __value =
> *__first;
> -      __decltype(__gnu_cxx::__ops::__iter_comp_val(__binary_pred))
> -       __rebound_pred
> -       = __gnu_cxx::__ops::__iter_comp_val(__binary_pred);
> +      typedef typename iterator_traits<_InputIterator>::value_type _Val;
> +      _Val __value = *__first;
>        *__result = __value;
>        while (++__first != __last)
> -       if (!__rebound_pred(__first, __value))
> +       if (!__binary_pred(std::__addressof(__value), __first))
>           {
>             __value = *__first;
>             *++__result = __value;
> @@ -971,19 +961,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return ++__result;
>      }
>
> -  /**
> -   *  This is an uglified
> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
> -   *              _BinaryPredicate)
> -   *  overloaded for input iterators and forward iterator as result.
> -  */
> +  // Implementation of std::unique_copy for non-forward iterators,
> +  // where we can compare with the last element written to the output.
>    template<typename _InputIterator, typename _ForwardIterator,
>            typename _BinaryPredicate>
> -    _GLIBCXX20_CONSTEXPR
>      _ForwardIterator
> -    __unique_copy(_InputIterator __first, _InputIterator __last,
> -                 _ForwardIterator __result, _BinaryPredicate
> __binary_pred,
> -                 input_iterator_tag, forward_iterator_tag)
> +    __unique_copy_1(_InputIterator __first, _InputIterator __last,
> +                   _ForwardIterator __result, _BinaryPredicate
> __binary_pred,
> +                   __true_type)
>      {
>        *__result = *__first;
>        while (++__first != __last)
> @@ -992,6 +977,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return ++__result;
>      }
>
> +  // Implementation of std::unique_copy for non-forward iterators.
> +  // We cannot compare *i to *(i-1) so we need to either make a copy
>
+  // or compare with the last element written to the output range.
> +  template<typename _InputIterator, typename _OutputIterator,
> +          typename _BinaryPredicate>
> +    _GLIBCXX20_CONSTEXPR
> +    _OutputIterator
> +    __unique_copy(_InputIterator __first, _InputIterator __last,
> +                 _OutputIterator __result, _BinaryPredicate __binary_pred,
> +                 input_iterator_tag)
> +    {
> +      // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +      // 2439. unique_copy() sometimes can't fall back to reading its
> output
> +      typedef iterator_traits<_InputIterator> _InItTraits;
> +      typedef iterator_traits<_OutputIterator> _OutItTraits;
> +      typedef typename _OutItTraits::iterator_category _Cat;
> +      const bool __output_is_fwd = __is_base_of(forward_iterator_tag,
> _Cat);
> +      const bool __same_type = __is_same(typename
> _OutItTraits::value_type,
> +                                        typename _InItTraits::value_type);
> +      typedef __truth_type<__output_is_fwd && __same_type>
> __cmp_with_output;
> +      return std::__unique_copy_1(__first, __last, __result,
> __binary_pred,
> +                                 typename __cmp_with_output::__type());
> +    }
> +
> +
>    /**
>     *  This is an uglified reverse(_BidirectionalIterator,
>     *                              _BidirectionalIterator)
> @@ -4456,8 +4466,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
>         return __result;
>        return std::__unique_copy(__first, __last, __result,
>                                 __gnu_cxx::__ops::__iter_equal_to_iter(),
> -                               std::__iterator_category(__first),
> -                               std::__iterator_category(__result));
> +                               std::__iterator_category(__first));
>      }
>
>    /**
> @@ -4499,8 +4508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
>         return __result;
>        return std::__unique_copy(__first, __last, __result,
>                         __gnu_cxx::__ops::__iter_comp_iter(__binary_pred),
> -                               std::__iterator_category(__first),
> -                               std::__iterator_category(__result));
> +                               std::__iterator_category(__first));
>      }
>
>  #if __cplusplus <= 201103L || _GLIBCXX_USE_DEPRECATED
> diff --git a/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc
> b/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc
> new file mode 100644
> index 000000000000..f2ec3e368099
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc
> @@ -0,0 +1,127 @@
> +// { dg-do run }
> +
> +#include <algorithm>
> +#include <testsuite_iterators.h>
> +
> +using namespace __gnu_test;
> +
> +int out[4];
> +short out_shrt[4];
> +short in[7] = { 1, 2, 2, 2, 3, 4, 4 };
> +
> +template<typename T>
> +void
> +check_and_reset(T* p)
> +{
> +  VERIFY( p[0] == 1 );
> +  VERIFY( p[1] == 2 );
> +  VERIFY( p[2] == 3 );
> +  VERIFY( p[3] == 4 );
> +  std::fill_n(p, 4, 0);
> +}
> +
> +struct Eq
> +{
> +  bool operator()(short i, short j) const { return i == j; }
> +  bool operator()(short, int) const { VERIFY(false); }
> +  bool operator()(int, short) const { VERIFY(false); }
> +};
> +
> +struct Eq2
> +{
> +  bool operator()(const short& i, const short& j) const
> +  {
> +    // Both arguments should be elements of the 'in' array.
> +    VERIFY( in+0 <= &i && &i < in+7 );
> +    VERIFY( in+0 <= &j && &j < in+7 );
> +    VERIFY( &i < &j );
> +    return i == j;
> +  }
> +  bool operator()(short, int) const { VERIFY(false); }
> +  bool operator()(int, short) const { VERIFY(false); }
> +};
> +
> +struct Eq3
> +{
> +  bool operator()(const short& i, const short& j) const
> +  {
> +    // First argument should be element of the 'out' array.
> +    // Second argument should be element of the 'in' array.
> +    VERIFY( out_shrt+0 <= &i && &i < out_shrt+4 );
> +    VERIFY( in+0 <= &j && &j < in+7 );
> +    return i == j;
> +  }
> +  bool operator()(short, int) const { VERIFY(false); }
> +  bool operator()(int, short) const { VERIFY(false); }
> +};
> +
> +void
> +test_forward_source()
> +{
> +  // The input range uses forward iterators
> +  test_container<short, forward_iterator_wrapper> inc(in);
> +  test_container<int, output_iterator_wrapper> outc(out);
> +  std::unique_copy(inc.begin(), inc.end(), outc.begin());
> +  check_and_reset(out);
> +
> +  test_container<short, forward_iterator_wrapper> inc2(in);
> +  test_container<int, output_iterator_wrapper> outc2(out);
> +  std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq2());
> +  check_and_reset(out);
> +}
> +
> +void
> +test_output_dest()
> +{
> +  // The input range uses input iterators.
> +  // The output range uses output iterators.
> +  test_container<short, input_iterator_wrapper> inc(in);
> +  test_container<int, output_iterator_wrapper> outc(out);
> +  std::unique_copy(inc.begin(), inc.end(), outc.begin());
> +  check_and_reset(out);
> +
> +  test_container<short, input_iterator_wrapper> inc2(in);
> +  test_container<int, output_iterator_wrapper> outc2(out);
> +  std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq());
> +  check_and_reset(out);
> +}
> +
> +void
> +test_forward_dest_diff_type()
> +{
> +  // The input range uses input iterators.
> +  // The output range uses forward iterators, but with different value
> type.
> +  test_container<short, input_iterator_wrapper> inc(in);
> +  test_container<int, forward_iterator_wrapper> outc(out);
> +  std::unique_copy(inc.begin(), inc.end(), outc.begin());
> +  check_and_reset(out);
> +
> +  test_container<short, input_iterator_wrapper> inc2(in);
> +  test_container<int, forward_iterator_wrapper> outc2(out);
> +  std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq());
> +  check_and_reset(out);
> +}
> +
> +void
> +test_forward_dest_same_type()
> +{
> +  // The input range uses input iterators.
> +  // The output range uses forward iterators, with same value type.
> +  test_container<short, input_iterator_wrapper> inc(in);
> +  test_container<short, forward_iterator_wrapper> outc(out_shrt);
> +  std::unique_copy(inc.begin(), inc.end(), outc.begin());
> +  check_and_reset(out_shrt);
> +
> +  test_container<short, input_iterator_wrapper> inc2(in);
> +  test_container<short, forward_iterator_wrapper> outc2(out_shrt);
> +  std::unique_copy(inc2.begin(), inc2.end(), outc2.begin(), Eq3());
> +  check_and_reset(out_shrt);
> +}
> +
> +int main()
> +{
> +  test_forward_source();
> +  test_output_dest();
> +  test_forward_dest_diff_type();
> +  test_forward_dest_same_type();
> +}
> --
> 2.49.0
>
>

Reply via email to