On Mon, Jan 27, 2025 at 9:59 PM David Malcolm <[email protected]> wrote:
>
> On Sat, 2025-01-25 at 23:31 -0800, Andi Kleen wrote:
> > From: Andi Kleen <[email protected]>
> >
> > This is the hot function in input.cc
> >
> > The vectorizer can vectorize it now, but in a generic-CPU -O2 x86
> > build it isn't vectorized.
> > Add an automatic target clone to handle it for x86 and build
> > that function with -O3.
> >
> > The ifdef here is ugly, perhaps gcc should have a more convenient
> > "clone for vectorization if possible" attribute to handle this
> > portably.
>
> This patch is very cool (no pun intended); how much does it help?
>
> The patch is OK by me, but given that we're in stage 4, does a release
> manager approve? [CCed]
I'd like to see good evidence that it helps and doesn't cause issues. IIRC
target_clones requires IFUNCs, so this likely needs to be guarded not only
on the host architecture but also on the host OS.
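
To illustrate the guard concern, here is a minimal sketch of what a
stricter condition could look like. The names VECTORIZE_SKETCH and
find_end_of_line_sketch are hypothetical, the function body is a
simplified stand-in rather than the actual input.cc code, and using
__gnu_linux__ (glibc-based Linux) as a proxy for "IFUNC is available" is
an assumption, not something GCC guarantees. The clone list is reduced to
"default,avx2" so the sketch also builds with older compilers.

```cpp
#include <cstddef>

// Assumption: x86-64 + glibc-based Linux + real GCC (not Clang) is used as a
// conservative stand-in for "the host supports IFUNC".  Everything else gets
// the plain, unattributed function.
#if defined(__x86_64__) && defined(__gnu_linux__) \
    && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 6
# define VECTORIZE_SKETCH \
    __attribute__((target_clones("default,avx2"), optimize("O3")))
#else
# define VECTORIZE_SKETCH
#endif

// Simplified stand-in for find_end_of_line: any of \n, \r\n, or \r ends a
// line.  Returns a pointer to the last byte of the terminator, or nullptr if
// none was found.  A \r in the very last byte is not reported, because a \n
// might still follow in data that has not been read yet.
VECTORIZE_SKETCH
const char *
find_end_of_line_sketch (const char *s, std::size_t len)
{
  const char *end = s + len;
  for (; s != end; ++s)
    {
      if (*s == '\n')
        return s;
      if (*s == '\r')
        {
          const char *next = s + 1;
          if (next == end)
            break;
          return *next == '\n' ? next : s;
        }
    }
  return nullptr;
}
```

On hosts that fail the guard the macro expands to nothing, so the function
is built once with the options of the rest of the file.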
IIRC I've seen optimized intrinsic code for a function like this from
Alexander(?);
I'd be fine doing something like what we have in libcpp.
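
For reference, a hand-written intrinsic version could look roughly like
the sketch below. It is not copied from libcpp or from any posted patch;
the name find_line_terminator_sketch is hypothetical, and it only locates
the first \n or \r, leaving the \r\n pairing to the caller. It needs no
IFUNC because SSE2 is part of the x86-64 baseline, and it falls back to a
scalar loop when SSE2 or the GCC builtins are unavailable.

```cpp
#include <cstddef>

#if defined(__SSE2__) && defined(__GNUC__)
# include <emmintrin.h>
# define HAVE_SSE2_SKETCH 1
#else
# define HAVE_SSE2_SKETCH 0
#endif

// Return a pointer to the first '\n' or '\r' in [s, s + len), or nullptr if
// there is none.
const char *
find_line_terminator_sketch (const char *s, std::size_t len)
{
  const char *end = s + len;
#if HAVE_SSE2_SKETCH
  const __m128i nl = _mm_set1_epi8 ('\n');
  const __m128i cr = _mm_set1_epi8 ('\r');
  // Scan 16 bytes at a time with unaligned loads; never reads past END.
  while (end - s >= 16)
    {
      __m128i chunk
        = _mm_loadu_si128 (reinterpret_cast<const __m128i *> (s));
      __m128i hit = _mm_or_si128 (_mm_cmpeq_epi8 (chunk, nl),
                                  _mm_cmpeq_epi8 (chunk, cr));
      unsigned mask = static_cast<unsigned> (_mm_movemask_epi8 (hit));
      if (mask != 0)
        return s + __builtin_ctz (mask);  // lowest set bit = first match
      s += 16;
    }
#endif
  // Scalar tail, and the whole loop on hosts without SSE2.
  for (; s != end; ++s)
    if (*s == '\n' || *s == '\r')
      return s;
  return nullptr;
}
```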
Richard.
> Thanks
> Dave
>
> >
> > gcc/ChangeLog:
> >
> > * input.cc (VECTORIZE): Add definition for x86.
> > (find_end_of_line): Mark for vectorizer.
> > ---
> > gcc/input.cc | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/gcc/input.cc b/gcc/input.cc
> > index d5d7dbb043e..f1a15de66f1 100644
> > --- a/gcc/input.cc
> > +++ b/gcc/input.cc
> > @@ -740,11 +740,18 @@ file_cache_slot::maybe_read_data ()
> > return read_data ();
> > }
> >
> > +#if defined(__x86_64__) && __GNUC__ >= 15
> > +#define VECTORIZE __attribute__((target_clones("default,avx2,avx10.2,avx512f"), optimize("O3")))
> > +#else
> > +#define VECTORIZE
> > +#endif
> > +
> > /* Helper function for file_cache_slot::get_next_line (), to find the end of
> > the next line. Returns with the memchr convention, i.e. nullptr if a line
> > terminator was not found. We need to determine line endings in the same
> > manner that libcpp does: any of \n, \r\n, or \r is a line ending. */
> >
> > +VECTORIZE
> > static const char *
> > find_end_of_line (const char *s, size_t len)
> > {
>