From: Andi Kleen <[email protected]>
find_end_of_line is the hot function in input.cc.
The vectorizer can vectorize it now, but it is not vectorized in a
generic-CPU -O2 x86 build.
Add an automatic target clone to handle this on x86 and build
that function with -O3.
The ifdef here is ugly, perhaps gcc should have a more convenient
"clone for vectorization if possible" attribute to handle this portably.
gcc/ChangeLog:
* input.cc (VECTORIZE): Add definition for x86.
(find_end_of_line): Mark for vectorizer.
---
gcc/input.cc | 7 +++++++
1 file changed, 7 insertions(+)
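
For readers unfamiliar with the attribute, here is a minimal standalone
sketch of the same pattern outside of GCC's tree. Everything in it is
illustrative and assumed rather than taken from the patch: the names
VECTORIZE_SKETCH and find_eol_sketch are hypothetical, the clone list is
shortened to "default,avx2" so that compilers older than the one the patch
requires can build it, the guard additionally checks for glibc because the
clones are dispatched through ifunc, and the loop is a simplified stand-in
for find_end_of_line, not the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>  // on glibc systems this also makes __GLIBC__ visible

// Assumed guard for this sketch only: GCC proper (not clang) on x86-64
// with glibc.  The patch itself tests
// defined(__x86_64__) && __GNUC__ >= 15.
#if defined(__x86_64__) && defined(__GNUC__) && !defined(__clang__) \
    && defined(__GLIBC__)
#define VECTORIZE_SKETCH \
  __attribute__((target_clones("default,avx2"), optimize("O3")))
#else
#define VECTORIZE_SKETCH
#endif

// Simplified stand-in for find_end_of_line.  Follows the memchr
// convention: returns nullptr if no line terminator is found.  Any of
// \n, \r\n, or \r counts as a terminator; for \r\n the returned pointer
// is the \n.  Unlike a real reader, it does not special-case a \r that
// is the very last byte of the buffer.
VECTORIZE_SKETCH
const char *
find_eol_sketch (const char *s, std::size_t len)
{
  assert (s != nullptr || len == 0);
  for (std::size_t i = 0; i < len; ++i)
    {
      if (s[i] == '\n')
        return s + i;
      if (s[i] == '\r')
        return (i + 1 < len && s[i + 1] == '\n') ? s + i + 1 : s + i;
    }
  return nullptr;
}
```

Where the guard is true, the compiler emits one body per listed target plus
a resolver that picks one at load time; elsewhere the macro expands to
nothing and the function is built with the translation unit's normal
options, which is the same fallback the #else branch in the patch provides.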
diff --git a/gcc/input.cc b/gcc/input.cc
index d5d7dbb043e..f1a15de66f1 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -740,11 +740,18 @@ file_cache_slot::maybe_read_data ()
return read_data ();
}
+#if defined(__x86_64__) && __GNUC__ >= 15
+#define VECTORIZE __attribute__((target_clones("default,avx2,avx10.2,avx512f"), optimize("O3")))
+#else
+#define VECTORIZE
+#endif
+
/* Helper function for file_cache_slot::get_next_line (), to find the end of
the next line. Returns with the memchr convention, i.e. nullptr if a line
terminator was not found. We need to determine line endings in the same
manner that libcpp does: any of \n, \r\n, or \r is a line ending. */
+VECTORIZE
static const char *
find_end_of_line (const char *s, size_t len)
{
--
2.47.1