On Thu, May 22, 2025 at 5:15 PM Tomasz Kaminski <tkami...@redhat.com> wrote:

>
>
> On Thu, May 22, 2025 at 5:04 PM Jonathan Wakely <jwak...@redhat.com>
> wrote:
>
>> On Thu, 22 May 2025 at 15:50, Tomasz Kaminski <tkami...@redhat.com>
>> wrote:
>> >
>> >
>> >
>> > On Thu, May 22, 2025 at 1:42 PM Jonathan Wakely <jwak...@redhat.com>
>> wrote:
>> >>
>> >> The current overload set for __unique_copy handles three cases:
>> >>
>> >> - The input range uses forward iterators, the output range does not.
>> >>   This is the simplest case, and can just compare adjacent elements of
>> >>   the input range.
>> >>
>> >> - Neither the input range nor output range use forward iterators.
>> >>   This requires a local variable copied from the input range and
>> updated
>> >>   by assigning each element to the local variable.
>> >>
>> >> - The output range uses forward iterators.
>> >>   For this case we compare the current element from the input range
>> with
>> >>   the element just written to the output range.
>> >>
>> >> There are two problems with this implementation. Firstly, the third
>> case
>> >> assumes that the value type of the output range can be compared to the
>> >> value type of the input range, which might not be possible at all, or
>> >> might be possible but give different results to comparing elements of
>> >> the input range. This is the problem identified in LWG 2439.
>> >>
>> >> Secondly, the third case is used when both ranges use forward
>> iterators,
>> >> even though the first case could (and should) be used. This means that
>> >> we compare elements from the output range instead of the input range,
>> >> with the problems described above (either not well-formed, or might
>> give
>> >> the wrong results).
>> >>
>> >> The cause of the second problem is that the overload for the first case
>> >> looks like:
>> >>
>> >> OutputIterator
>> >> __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred,
>> >>               forward_iterator_tag, output_iterator_tag);
>> >>
>> >> When the output range uses forward iterators this overload cannot be
>> >> used, because forward_iterator_tag does not inherit from
>> >> output_iterator_tag, so is not convertible to it.
>> >>
>> >> To fix these problems we need to implement the resolution of LWG 2439
>> so
>> >> that the third case is only used when the value types of the two ranges
>> >> are the same. This ensures that the comparisons are well behaved. We
>> >> also need to ensure that the first case is used when both ranges use
>> >> forward iterators.
>> >>
>> >> This change replaces a single step of tag dispatching to choose between
>> >> three overloads with two step of tag dispatching, choosing between two
>> >> overloads at each step. The first step dispatches based on the iterator
>> >> category of the input range, ignoring the category of the output range.
>> >> The second step only happens when the input range uses non-forward
>> >> iterators, and dispatches based on the category of the output range and
>> >> whether the value type of the two ranges is the same. So now the cases
>> >> that are handled are:
>> >>
>> >> - The input range uses forward iterators.
>> >> - The output range uses non-forward iterators or a different value
>> type.
>> >> - The output range uses forward iterators and has the same value type.
>> >>
>> >> For the second case, the old code used
>> __gnu_cxx::__ops::__iter_comp_val
>> >> to wrap the predicate in another level of indirection. That seems
>> >> unnecessary, as we can just use a pointer to the local variable instead
>> >> of an iterator referring to it.
>> >>
>> >> libstdc++-v3/ChangeLog:
>> >>
>> >>         PR libstdc++/120386
>> >>         * include/bits/stl_algo.h (__unique_copy_1): New overloads for
>> >>         the case where the input range uses non-forward iterators.
>> >>         (__unique_copy): Replace three overloads with two, depending
>> >>         only on the iterator category of the input range. Dispatch to
>> >>         __unique_copy_1 for the non-forward case.
>> >>         (unique_copy): Only pass the input range category to
>> >>         __unique_copy.
>> >> ---
>> >>
>> >> Tested x86_64-linux.
>> >
>> > LGTM. Only small suggestion, regarding the change of order of arguments.
>>
>> I forgot to say that I need to add tests for each of the cases,
>> especially the case that fails with the existing code!
>>
>> >>
>> >>
>> >>  libstdc++-v3/include/bits/stl_algo.h | 80 +++++++++++++++-------------
>> >>  1 file changed, 44 insertions(+), 36 deletions(-)
>> >>
>> >> diff --git a/libstdc++-v3/include/bits/stl_algo.h
>> b/libstdc++-v3/include/bits/stl_algo.h
>> >> index f5361aeab7e2..c0bb17f9c8b2 100644
>> >> --- a/libstdc++-v3/include/bits/stl_algo.h
>> >> +++ b/libstdc++-v3/include/bits/stl_algo.h
>> >> @@ -918,24 +918,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >>
>> __gnu_cxx::__ops::__iter_comp_iter(__binary_pred));
>> >>      }
>> >>
>> >> -  /**
>> >> -   *  This is an uglified
>> >> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
>> >> -   *              _BinaryPredicate)
>> >> -   *  overloaded for forward iterators and output iterator as result.
>> >> -  */
>> >> +  // Implementation of std::unique_copy for forward iterators.
>> >> +  // This case is easy, just compare *i with *(i-1).
>> >>    template<typename _ForwardIterator, typename _OutputIterator,
>> >>            typename _BinaryPredicate>
>> >>      _GLIBCXX20_CONSTEXPR
>> >>      _OutputIterator
>> >>      __unique_copy(_ForwardIterator __first, _ForwardIterator __last,
>> >>                   _OutputIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> -                 forward_iterator_tag, output_iterator_tag)
>> >> +                 forward_iterator_tag)
>> >>      {
>> >>        _ForwardIterator __next = __first;
>> >>        *__result = *__first;
>> >>        while (++__next != __last)
>> >> -       if (!__binary_pred(__first, __next))
>> >> +       if (!__binary_pred(__next, __first))
>> >
>> > I would prefer if you will not do this change, and pass iterators that
>> were already seen as the first argument.
>>
>> The standard seems clear that it should be bool(pred(*i, *(i - 1)))
>> In theory a predicate could depend on that.
>>
> Oh, indeed.
>
>>
>> > Note that the forward-output overload, preserves this order:
>> >       *__result = *__first;
>> >       while (++__first != __last)
>> >         if (!__binary_pred(__result, __first))
>>
>> Ah yes, well I should have changed that too ;-)
>>
>> What's your reason for preferring the current order?
>>
> That what I would intuitively expect, that left argument is an element
> that is left to right argument.
> And if you sorted range with predicate lt, then passing not_fn(lt) is
> equivalent to checking equality.
>
It seems that I have built my expectation based on unique, that check i-1,
i.
https://eel.is/c++draft/alg.unique#1

>
>
>> >           *++__result = *__first;
>> >
>> >>           {
>> >>             __first = __next;
>> >>             *++__result = *__first;
>> >> @@ -943,27 +939,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >>        return ++__result;
>> >>      }
>> >>
>> >> -  /**
>> >> -   *  This is an uglified
>> >> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
>> >> -   *              _BinaryPredicate)
>> >> -   *  overloaded for input iterators and output iterator as result.
>> >> -  */
>> >> +  // Implementation of std::unique_copy for non-forward iterators,
>> >> +  // where we cannot compare with elements written to the output.
>> >>    template<typename _InputIterator, typename _OutputIterator,
>> >>            typename _BinaryPredicate>
>> >>      _GLIBCXX20_CONSTEXPR
>> >>      _OutputIterator
>> >> -    __unique_copy(_InputIterator __first, _InputIterator __last,
>> >> -                 _OutputIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> -                 input_iterator_tag, output_iterator_tag)
>> >> +    __unique_copy_1(_InputIterator __first, _InputIterator __last,
>> >> +                   _OutputIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> +                   __false_type)
>> >>      {
>> >> -      typename iterator_traits<_InputIterator>::value_type __value =
>> *__first;
>> >> -      __decltype(__gnu_cxx::__ops::__iter_comp_val(__binary_pred))
>> >> -       __rebound_pred
>> >> -       = __gnu_cxx::__ops::__iter_comp_val(__binary_pred);
>> >> +      typedef typename iterator_traits<_InputIterator>::value_type
>> _Val;
>> >> +      _Val __value = *__first;
>> >>        *__result = __value;
>> >>        while (++__first != __last)
>> >> -       if (!__rebound_pred(__first, __value))
>> >> +       if (!__binary_pred(__first, std::__addressof(__value)))
>> >
>> > I would instead change the order here to  std::__addressof(__value),
>> __first.
>> >>
>> >>           {
>> >>             __value = *__first;
>> >>             *++__result = __value;
>> >> @@ -971,19 +961,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >>        return ++__result;
>> >>      }
>> >>
>> >> -  /**
>> >> -   *  This is an uglified
>> >> -   *  unique_copy(_InputIterator, _InputIterator, _OutputIterator,
>> >> -   *              _BinaryPredicate)
>> >> -   *  overloaded for input iterators and forward iterator as result.
>> >> -  */
>> >> +  // Implementation of std::unique_copy for non-forward iterators,
>> >> +  // where we can compare with the last element written to the output.
>> >>    template<typename _InputIterator, typename _ForwardIterator,
>> >>            typename _BinaryPredicate>
>> >> -    _GLIBCXX20_CONSTEXPR
>> >>      _ForwardIterator
>> >> -    __unique_copy(_InputIterator __first, _InputIterator __last,
>> >> -                 _ForwardIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> -                 input_iterator_tag, forward_iterator_tag)
>> >> +    __unique_copy_1(_InputIterator __first, _InputIterator __last,
>> >> +                   _ForwardIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> +                   __true_type)
>> >>      {
>> >>        *__result = *__first;
>> >>        while (++__first != __last)
>> >> @@ -992,6 +977,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >>        return ++__result;
>> >>      }
>> >>
>> >> +  // Implementation of std::unique_copy for non-forward iterators.
>> >> +  // We cannot compare *i to *(i-1) so we need to either make a copy
>> >> +  // or compare with the last element written to the output range.
>> >> +  template<typename _InputIterator, typename _OutputIterator,
>> >> +          typename _BinaryPredicate>
>> >> +    _GLIBCXX20_CONSTEXPR
>> >> +    _OutputIterator
>> >> +    __unique_copy(_InputIterator __first, _InputIterator __last,
>> >> +                 _OutputIterator __result, _BinaryPredicate
>> __binary_pred,
>> >> +                 input_iterator_tag)
>> >> +    {
>> >> +      // _GLIBCXX_RESOLVE_LIB_DEFECTS
>> >> +      // 2439. unique_copy() sometimes can't fall back to reading its
>> output
>> >> +      typedef iterator_traits<_InputIterator> _InItTraits;
>> >> +      typedef iterator_traits<_OutputIterator> _OutItTraits;
>> >> +      typedef typename _OutItTraits::iterator_category _Cat;
>> >> +      const bool __output_is_fwd = __is_base_of(forward_iterator_tag,
>> _Cat);
>> >> +      const bool __same_type = __is_same(typename
>> _OutItTraits::value_type,
>> >> +                                        typename
>> _InItTraits::value_type);
>> >>
>> >> +      typedef __truth_type<__output_is_fwd && __same_type>
>> __cmp_with_output;
>> >
>> > No change needed, but I was wondering of output only iterator, can have
>> non-void value type,
>> > but making sure that they are also forwards mitigates this question.
>> >>
>> >> +      return std::__unique_copy_1(__first, __last, __result,
>> __binary_pred,
>> >> +                                 typename
>> __cmp_with_output::__type());
>> >> +    }
>> >> +
>> >> +
>> >>    /**
>> >>     *  This is an uglified reverse(_BidirectionalIterator,
>> >>     *                              _BidirectionalIterator)
>> >> @@ -4456,8 +4466,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
>> >>         return __result;
>> >>        return std::__unique_copy(__first, __last, __result,
>> >>
>>  __gnu_cxx::__ops::__iter_equal_to_iter(),
>> >> -                               std::__iterator_category(__first),
>> >> -                               std::__iterator_category(__result));
>> >> +                               std::__iterator_category(__first));
>> >>      }
>> >>
>> >>    /**
>> >> @@ -4499,8 +4508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
>> >>         return __result;
>> >>        return std::__unique_copy(__first, __last, __result,
>> >>
>>  __gnu_cxx::__ops::__iter_comp_iter(__binary_pred),
>> >> -                               std::__iterator_category(__first),
>> >> -                               std::__iterator_category(__result));
>> >> +                               std::__iterator_category(__first));
>> >>      }
>> >>
>> >>  #if __cplusplus <= 201103L || _GLIBCXX_USE_DEPRECATED
>> >> --
>> >> 2.49.0
>> >>
>>
>>

Reply via email to