https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89929

--- Comment #10 from Nikolay Bogoychev <nheart at gmail dot com> ---
(In reply to H.J. Lu from comment #9)
> (In reply to Martin Liška from comment #8)
> > Ok, let me first focus on the functional part of the patch.
> > If I'm correct feature_list in get_builtin_code_for_version function should
> > be basically aligned with isa_names_table in fold_builtin_cpu. Difference is
> > following:
> > 
> > +"avx5124fmaps"
> > +"avx5124vnniw"
> > +"avx512bitalg"
> > +"avx512bw"
> > +"avx512cd"
> > +"avx512dq"
> > +"avx512er"
> > +"avx512ifma"
> > +"avx512pf"
> > +"avx512vbmi"
> > +"avx512vbmi2"
> > +"avx512vl"
> > +"avx512vnni"
> > +"avx512vpopcntdq"
> > +"cmov"
> > +"gfni"
> > +"vpclmulqdq"
> > 
> > Adding that should be possible, but one needs to define priorities for
> > these, as seen here:
> > 
> > ```
> >   /* Priority of i386 features, greater value is higher priority.   This is
> >      used to decide the order in which function dispatch must happen.  For
> >      instance, a version specialized for SSE4.2 should be checked for
> >      dispatch before a version for SSE3, as SSE4.2 implies SSE3.  */
> >   enum feature_priority
> > ```
> > 
> > H.J. can you please help me with the priorities?
> 
> What do we gain with these extra target attributes for function
> multiversioning?

Hey,

tl;dr: We would be able to target specific processors and not crash on
Knights Landing and Knights Mill.

The problem is that AVX-512 is split into a large number of extensions, and
different CPUs implement different subsets of them:
https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512

Some of them are available on the same set of processors (e.g. VL, DQ and BW),
but others are limited to specific processors. We are developing an
application that uses a lot of intrinsics and we are targeting several
different architectures. We rely on instructions that are part of AVX512BW.
If we target the closest option that is currently accepted (AVX512F), we crash
with an illegal instruction on Knights Landing and Knights Mill processors,
which support AVX512F but not AVX512BW and should use the AVX2 codepath
instead.

We are also about to add some VNNI code for upcoming Intel processors and we
would need a function version for those, because AVX512F is too broad.

Cheers,

Nick
