On Thu, 22 May 2025 at 16:25, Tomasz Kaminski <tkami...@redhat.com> wrote: > > > > On Thu, May 22, 2025 at 5:15 PM Tomasz Kaminski <tkami...@redhat.com> wrote: >> >> >> >> On Thu, May 22, 2025 at 5:04 PM Jonathan Wakely <jwak...@redhat.com> wrote: >>> >>> On Thu, 22 May 2025 at 15:50, Tomasz Kaminski <tkami...@redhat.com> wrote: >>> > >>> > >>> > >>> > On Thu, May 22, 2025 at 1:42 PM Jonathan Wakely <jwak...@redhat.com> >>> > wrote: >>> >> >>> >> The current overload set for __unique_copy handles three cases: >>> >> >>> >> - The input range uses forward iterators, the output range does not. >>> >> This is the simplest case, and can just compare adjacent elements of >>> >> the input range. >>> >> >>> >> - Neither the input range nor output range use forward iterators. >>> >> This requires a local variable copied from the input range and updated >>> >> by assigning each element to the local variable. >>> >> >>> >> - The output range uses forward iterators. >>> >> For this case we compare the current element from the input range with >>> >> the element just written to the output range. >>> >> >>> >> There are two problems with this implementation. Firstly, the third case >>> >> assumes that the value type of the output range can be compared to the >>> >> value type of the input range, which might not be possible at all, or >>> >> might be possible but give different results to comparing elements of >>> >> the input range. This is the problem identified in LWG 2439. >>> >> >>> >> Secondly, the third case is used when both ranges use forward iterators, >>> >> even though the first case could (and should) be used. This means that >>> >> we compare elements from the output range instead of the input range, >>> >> with the problems described above (either not well-formed, or might give >>> >> the wrong results). >>> >> >>> >> The cause of the second problem is that the overload for the first case >>> >> looks like: >>> >> >>> >> OutputIterator >>> >> __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred, >>> >> forward_iterator_tag, output_iterator_tag); >>> >> >>> >> When the output range uses forward iterators this overload cannot be >>> >> used, because forward_iterator_tag does not inherit from >>> >> output_iterator_tag, so is not convertible to it. >>> >> >>> >> To fix these problems we need to implement the resolution of LWG 2439 so >>> >> that the third case is only used when the value types of the two ranges >>> >> are the same. This ensures that the comparisons are well behaved. We >>> >> also need to ensure that the first case is used when both ranges use >>> >> forward iterators. >>> >> >>> >> This change replaces a single step of tag dispatching to choose between >>> >> three overloads with two step of tag dispatching, choosing between two >>> >> overloads at each step. The first step dispatches based on the iterator >>> >> category of the input range, ignoring the category of the output range. >>> >> The second step only happens when the input range uses non-forward >>> >> iterators, and dispatches based on the category of the output range and >>> >> whether the value type of the two ranges is the same. So now the cases >>> >> that are handled are: >>> >> >>> >> - The input range uses forward iterators. >>> >> - The output range uses non-forward iterators or a different value type. >>> >> - The output range uses forward iterators and has the same value type. >>> >> >>> >> For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val >>> >> to wrap the predicate in another level of indirection. That seems >>> >> unnecessary, as we can just use a pointer to the local variable instead >>> >> of an iterator referring to it. >>> >> >>> >> libstdc++-v3/ChangeLog: >>> >> >>> >> PR libstdc++/120386 >>> >> * include/bits/stl_algo.h (__unique_copy_1): New overloads for >>> >> the case where the input range uses non-forward iterators. >>> >> (__unique_copy): Replace three overloads with two, depending >>> >> only on the iterator category of the input range. Dispatch to >>> >> __unique_copy_1 for the non-forward case. >>> >> (unique_copy): Only pass the input range category to >>> >> __unique_copy. >>> >> --- >>> >> >>> >> Tested x86_64-linux. >>> > >>> > LGTM. Only small suggestion, regarding the change of order of arguments. >>> >>> I forgot to say that I need to add tests for each of the cases, >>> especially the case that fails with the existing code! >>> >>> >> >>> >> >>> >> libstdc++-v3/include/bits/stl_algo.h | 80 +++++++++++++++------------- >>> >> 1 file changed, 44 insertions(+), 36 deletions(-) >>> >> >>> >> diff --git a/libstdc++-v3/include/bits/stl_algo.h >>> >> b/libstdc++-v3/include/bits/stl_algo.h >>> >> index f5361aeab7e2..c0bb17f9c8b2 100644 >>> >> --- a/libstdc++-v3/include/bits/stl_algo.h >>> >> +++ b/libstdc++-v3/include/bits/stl_algo.h >>> >> @@ -918,24 +918,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>> >> >>> >> __gnu_cxx::__ops::__iter_comp_iter(__binary_pred)); >>> >> } >>> >> >>> >> - /** >>> >> - * This is an uglified >>> >> - * unique_copy(_InputIterator, _InputIterator, _OutputIterator, >>> >> - * _BinaryPredicate) >>> >> - * overloaded for forward iterators and output iterator as result. >>> >> - */ >>> >> + // Implementation of std::unique_copy for forward iterators. >>> >> + // This case is easy, just compare *i with *(i-1). >>> >> template<typename _ForwardIterator, typename _OutputIterator, >>> >> typename _BinaryPredicate> >>> >> _GLIBCXX20_CONSTEXPR >>> >> _OutputIterator >>> >> __unique_copy(_ForwardIterator __first, _ForwardIterator __last, >>> >> _OutputIterator __result, _BinaryPredicate >>> >> __binary_pred, >>> >> - forward_iterator_tag, output_iterator_tag) >>> >> + forward_iterator_tag) >>> >> { >>> >> _ForwardIterator __next = __first; >>> >> *__result = *__first; >>> >> while (++__next != __last) >>> >> - if (!__binary_pred(__first, __next)) >>> >> + if (!__binary_pred(__next, __first)) >>> > >>> > I would prefer if you will not do this change, and pass iterators that >>> > were already seen as the first argument. >>> >>> The standard seems clear that it should be bool(pred(*i, *(i - 1))) >>> In theory a predicate could depend on that. >> >> Oh, indeed. >>> >>> >>> > Note that the forward-output overload, preserves this order: >>> > *__result = *__first; >>> > while (++__first != __last) >>> > if (!__binary_pred(__result, __first)) >>> >>> Ah yes, well I should have changed that too ;-) >>> >>> What's your reason for preferring the current order? >> >> That what I would intuitively expect, that left argument is an element that >> is left to right argument. >> And if you sorted range with predicate lt, then passing not_fn(lt) is >> equivalent to checking equality. > > It seems that I have built my expectation based on unique, that check i-1, i. > https://eel.is/c++draft/alg.unique#1
The standard seems clear, however, all three of libstdc++, libc++ and MSVC use the "wrong" order in unique_copy. So I think the standard should be changed.