On Mon, Jan 27, 2025 at 9:59 PM David Malcolm <dmalc...@redhat.com> wrote:
>
> On Sat, 2025-01-25 at 23:31 -0800, Andi Kleen wrote:
> > From: Andi Kleen <a...@gcc.gnu.org>
> >
> > This is the hot function in input.cc
> >
> > The vectorizer can vectorize it now, but in a generic cpu O2 x86
> > build it isn't.
> > Add an automatic target clone to handle it for x86 and build
> > that function with O3.
> >
> > The ifdef here is ugly, perhaps gcc should have a more convenient
> > "clone for vectorization if possible" attribute to handle this
> > portably.
>
> This patch is very cool (no pun intended); how much does it help?
>
> The patch is OK by me, but given that we're in stage 4, does a release
> manager approve?  [CCed]

I'd like to see good evidence that it helps and doesn't cause issues.
IIRC target_clones requires IFUNCs, so this likely needs to be guarded
not only on the host architecture but also on the host OS.

IIRC I've seen optimized intrinsic code for a function like this from
Alexander(?); I'd be fine doing something like what we have in libcpp.

Richard.

> Thanks
> Dave
>
> >
> > gcc/ChangeLog:
> >
> >       * input.cc: (VECTORIZE): Add definition for x86.
> >       (find_end_of_line): Mark for vectorizer.
> > ---
> >  gcc/input.cc | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/gcc/input.cc b/gcc/input.cc
> > index d5d7dbb043e..f1a15de66f1 100644
> > --- a/gcc/input.cc
> > +++ b/gcc/input.cc
> > @@ -740,11 +740,18 @@ file_cache_slot::maybe_read_data ()
> >    return read_data ();
> >  }
> >
> > +#if defined(__x86_64__) && __GNUC__ >= 15
> > +#define VECTORIZE __attribute__((target_clones("default,avx2,avx10.2,avx512f"), optimize("O3")))
> > +#else
> > +#define VECTORIZE
> > +#endif
> > +
> >  /* Helper function for file_cache_slot::get_next_line (), to find the end of
> >     the next line.  Returns with the memchr convention, i.e. nullptr if a line
> >     terminator was not found.  We need to determine line endings in the same
> >     manner that libcpp does: any of \n, \r\n, or \r is a line ending.  */
> >
> > +VECTORIZE
> >  static const char *
> >  find_end_of_line (const char *s, size_t len)
> >  {
>
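
To make the IFUNC concern concrete, here is a rough sketch of a guard that also checks the host OS. It is not the patch and not what input.cc does. The choice of `__linux__` plus `__GLIBC__` as a stand-in for "the host supports IFUNCs" is my assumption, as is trimming the clone list to "default,avx2". `find_end_of_line_sketch` and `check_examples` are hypothetical names, and the function body is a simplified stand-in for the real helper that only follows the convention described in the quoted comment.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: only x86-64 GNU/Linux hosts using glibc are treated as having
// IFUNC support. __GLIBC__ is only visible after a libc header has been
// included, hence the includes above. All other hosts get an empty macro.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) \
    && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 6
#define VECTORIZE \
  __attribute__((target_clones("default,avx2"), optimize("O3")))
#else
#define VECTORIZE
#endif

// Simplified stand-in (hypothetical): any of \n, \r\n, or \r ends a line.
// Returns a pointer to the last character of the terminator, or nullptr if
// no terminator was found (memchr convention).
VECTORIZE
static const char *
find_end_of_line_sketch (const char *s, std::size_t len)
{
  const char *end = s + len;
  for (; s != end; ++s)
    {
      if (*s == '\n')
        return s;
      if (*s == '\r')
        {
          // Treat \r\n as a single terminator and point at the \n.
          if (s + 1 != end && s[1] == '\n')
            return s + 1;
          return s;
        }
    }
  return nullptr;
}

// Small usage check covering the three terminator forms.
void
check_examples ()
{
  const char unix_line[] = "ab\ncd";
  assert (find_end_of_line_sketch (unix_line, 5) == unix_line + 2);
  const char dos_line[] = "ab\r\ncd";
  assert (find_end_of_line_sketch (dos_line, 6) == dos_line + 3);
  const char mac_line[] = "ab\rcd";
  assert (find_end_of_line_sketch (mac_line, 5) == mac_line + 2);
  const char no_eol[] = "abcd";
  assert (find_end_of_line_sketch (no_eol, 4) == nullptr);
}
```

On hosts that fail the guard, VECTORIZE expands to nothing and the function is built once with the default options, so the behaviour is the same and only the dispatch is lost. Whether this guard is strict enough for every host GCC builds on is exactly the open question above.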