Together with the preparatory compiler patches, this patch restores
unrolling in std::__find_if, but this time relying on the compiler to do
it by using:

  #pragma GCC unroll 4

which should recover most of the performance lost relative to the
hand-unrolled version while still being vectorizable with WIP alignment
peeling enhancements.
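
For readers unfamiliar with the pragma, here is a minimal standalone
sketch of the same idea outside libstdc++. The names find_if_sketch and
first_even_index are hypothetical and used only for illustration; unlike
std::__find_if, whose predicate takes the iterator, this version calls
the predicate on the dereferenced element. The pragma is only a request
to the compiler and does not change what the loop computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for std::__find_if (not the libstdc++ code).
template<typename Iterator, typename Predicate>
inline Iterator
find_if_sketch(Iterator first, Iterator last, Predicate pred)
{
  // Request that GCC unroll the following loop by a factor of 4.
  // The pragma has to appear immediately before the loop statement.
#pragma GCC unroll 4
  while (first != last && !pred(*first))
    ++first;
  return first;
}

// Hypothetical helper: index of the first even element, or v.size()
// if there is none.
inline std::size_t
first_even_index(const std::vector<int>& v)
{
  auto it = find_if_sketch(v.begin(), v.end(),
                           [](int x) { return x % 2 == 0; });
  return static_cast<std::size_t>(it - v.begin());
}
```

Since the loop is still written as a single plain loop, the source stays
identical in meaning to the non-unrolled form, which is what leaves the
vectorizer free to handle it later.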

On Neoverse V1 with LTO, this reduces the regression in xalancbmk (from
SPEC CPU 2017) from 5.8% to 1.7% (restoring ~71% of the lost
performance).

Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

libstdc++-v3/ChangeLog:

        PR libstdc++/116140
        * include/bits/stl_algobase.h (std::__find_if): Add #pragma to
        request GCC to unroll the loop.
---
 libstdc++-v3/include/bits/stl_algobase.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 27f6c377ad6..f13662fc448 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -2104,6 +2104,7 @@ _GLIBCXX_END_NAMESPACE_ALGO
     inline _Iterator
     __find_if(_Iterator __first, _Iterator __last, _Predicate __pred)
     {
+#pragma GCC unroll 4
       while (__first != __last && !__pred(__first))
 	++__first;
       return __first;
