[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

hubicka at gcc dot gnu.org via Gcc-bugs Tue, 23 Apr 2024 05:08:48 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821


--- Comment #6 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Thanks. I thought relocate_a only cares whether the pointed-to type can be
bitwise copied.  It would be nice to produce memcpy early from libstdc++ for
std::pair, so the second patch makes sense to me (I did not test whether it
works).

I think it would still be nice to tell GCC that the copy loop never gets
overlapping memory locations, so the cases which are not optimized to memcpy
early can still be optimized later (or vectorized if the loop really does
something non-trivial).
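
For illustration, a minimal sketch of the kind of annotation meant here. The
function copy_pairs is hypothetical and not part of libstdc++; __restrict is
the GCC/Clang extension spelling. Whether a given GCC version then turns this
loop into memcpy or vectorizes it is not something this sketch demonstrates.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper (not a libstdc++ function): copies n pairs from src
// to dst. The __restrict qualifiers promise the compiler that the two
// ranges do not overlap, so the stores to dst cannot alias later loads
// from src.
inline void copy_pairs(std::pair<int, int>* __restrict dst,
                       const std::pair<int, int>* __restrict src,
                       std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}
```

Calling such a function with overlapping ranges would be undefined behaviour,
which is why the annotation only fits places like reallocation, where the
destination is freshly allocated storage.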

So I tried your second patch, fixed so that it compiles:
diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h
index 7f84da31578..0d2e588ae5e 100644
--- a/libstdc++-v3/include/bits/stl_uninitialized.h
+++ b/libstdc++-v3/include/bits/stl_uninitialized.h
@@ -1109,8 +1109,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template <typename _Tp, typename _Up>
     _GLIBCXX20_CONSTEXPR
     inline __enable_if_t<std::__is_bitwise_relocatable<_Tp>::value, _Tp*>
-    __relocate_a_1(_Tp* __first, _Tp* __last,
-                  _Tp* __result,
+    __relocate_a_1(_Tp* __restrict __first, _Tp* __last,
+                  _Tp* __restrict __result,
                   [[__maybe_unused__]] allocator<_Up>& __alloc) noexcept
     {
       ptrdiff_t __count = __last - __first;
@@ -1147,6 +1147,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
                                 std::__niter_base(__result), __alloc);
     }

+  template <typename _Tp, typename _Up>
+    _GLIBCXX20_CONSTEXPR
+    inline _Tp*
+    __relocate_a(_Tp* __restrict __first, _Tp* __last,
+                _Tp* __restrict __result,
+                allocator<_Up>& __alloc)
+    noexcept(std::__is_bitwise_relocatable<_Tp>::value)
+    {
+      return std::__relocate_a_1(__first, __last, __result, __alloc);
+    }
+
   /// @endcond
 #endif // C++11

It does not make ldist (loop distribution) hit, so the restrict info is still
lost.  I think the problem is that when relocate_object is called, the scope of
the restrict is reduced, so we only know that the elements are pairwise
disjoint, not that the vectors are.  This is because restrict is interpreted
early, before inlining, but this is really Richard's area.
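
A rough sketch of the shape of code being described, under the assumption that
the qualifiers sit on a per-element helper. The names relocate_one and
relocate_range are hypothetical stand-ins, not the actual libstdc++ functions,
and the sketch shows only the structure, not what GCC does with it.

```cpp
#include <cstddef>
#include <utility>

using Elem = std::pair<int, int>;

// Hypothetical per-element step. The restrict qualifiers here only state
// that *dst and *src are distinct objects for the duration of this call.
inline void relocate_one(Elem* __restrict dst, Elem* __restrict src)
{
  *dst = std::move(*src);
}

// Hypothetical range loop. Nothing at this level says that the whole
// range [first, last) is disjoint from the range starting at result;
// only each individual pair of elements is covered by the helper.
inline Elem* relocate_range(Elem* first, Elem* last, Elem* result)
{
  for (; first != last; ++first, ++result)
    relocate_one(result, first);
  return result;
}
```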

It seems that the patch makes us go through __uninitialized_copy_a instead of
__uninit_copy.  I am not even sure how these differ, so I need to stare at the
code a bit more to make sense of it :)
