On Thu, Mar 02, 2017 at 10:03:28AM +0100, Thomas Koenig wrote:
> Am 02.03.2017 um 09:43 schrieb Jakub Jelinek:
> > On Wed, Mar 01, 2017 at 10:00:08PM +0100, Thomas Koenig wrote:
> > > @@ -101,7 +93,7 @@
> > > `static void
> > > 'matmul_name` ('rtype` * const restrict retarray,
> > > 'rtype` * const restrict a, 'rtype` * const restrict b, int try_blas,
> > > - int blas_limit, blas_call gemm) __attribute__((__target__("avx2")));
> > > + int blas_limit, blas_call gemm) __attribute__((__target__("avx2,fma")));
> > > static' include(matmul_internal.m4)dnl
> > > `#endif /* HAVE_AVX2 */
> > >
> >
> > I guess the question here is whether there are any CPUs that have AVX2 but
> > don't have FMA3. If there are none, then this is not controversial; if there
> > are some, it depends on how widely they are used compared to ones that have
> > both AVX2 and FMA3. Going just from our -march= bitsets, it seems that
> > wherever there is PTA_AVX2, there is also PTA_FMA: haswell, broadwell,
> > skylake, skylake-avx512, knl, bdver4, znver1. There are CPUs that have just
> > PTA_AVX and not PTA_AVX2 and still have PTA_FMA: bdver2, bdver3 (but that is
> > not relevant to this patch).
>
> In a previous incarnation of the patch, I saw that the compiler
> generated the same floating point code for AVX and AVX2 (which is why
> there currently is no AVX2 floating point version). I could also
> generate an AVX+FMA version for floating point and an AVX2 version
> for integer (if anybody cares about integer matmul).
I think having another avx,fma version is not worth it; avx+fma (without
avx2) is far less common than avx without fma.
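
To make the selection order that follows from this concrete, here is a
minimal sketch. It is not the libgfortran code: the enum and both function
names are made up for illustration, the plain-AVX fallback is an assumption
about the surrounding dispatcher, and the AVX512F test is shown on its own
because that hunk is not quoted in this mail. The host query uses GCC's
__builtin_cpu_supports and is only compiled on x86.

```c
/* Hypothetical names, for illustration only; not the libgfortran ones.  */
enum matmul_variant
{
  MATMUL_VANILLA,
  MATMUL_AVX,
  MATMUL_AVX2_FMA,
  MATMUL_AVX512F
};

/* Pure selection logic, most capable variant first.  Each argument is
   nonzero if the CPU reports the corresponding feature.  */
enum matmul_variant
pick_matmul_variant (int has_avx512f, int has_avx2, int has_fma, int has_avx)
{
  if (has_avx512f)
    return MATMUL_AVX512F;

  /* The avx2 variant is built with target ("avx2,fma"), so both features
     have to be present before it may be called.  */
  if (has_avx2 && has_fma)
    return MATMUL_AVX2_FMA;

  /* No separate avx,fma variant: CPUs with AVX and FMA but without AVX2
     simply get the plain AVX code.  */
  if (has_avx)
    return MATMUL_AVX;

  return MATMUL_VANILLA;
}

/* Query the running CPU.  Falls back to the generic variant when not
   compiled by a GNU-compatible compiler for x86.  */
enum matmul_variant
pick_matmul_variant_for_host (void)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  __builtin_cpu_init ();
  return pick_matmul_variant (__builtin_cpu_supports ("avx512f") != 0,
                              __builtin_cpu_supports ("avx2") != 0,
                              __builtin_cpu_supports ("fma") != 0,
                              __builtin_cpu_supports ("avx") != 0);
#else
  return MATMUL_VANILLA;
#endif
}
```

With this ordering a bdver2 style CPU (AVX and FMA, no AVX2) ends up in the
plain AVX variant, which is the trade-off argued for above.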
> > > @@ -147,7 +141,8 @@
> > > #endif /* HAVE_AVX512F */
> > >
> > > #ifdef HAVE_AVX2
> > > - if (__cpu_model.__cpu_features[0] & (1 << FEATURE_AVX2))
> > > + if ((__cpu_model.__cpu_features[0] & (1 << FEATURE_AVX2))
> > > + && (__cpu_model.__cpu_features[0] & (1 << FEATURE_FMA)))
> > > {
> > > matmul_p = matmul_'rtype_code`_avx2;
> > > goto tailcall;
> >
> > and this too.
>
> Will do.
Note that I obviously meant the FEATURE_AVX512F related hunk, not this one,
sorry.
Jakub