Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Thomas Koenig
On 02.03.2017 at 13:02, Jakub Jelinek wrote: And this needs to use *matmul_fn instead of *matmul_p too. The whole point is that matmul_p is only loaded using __atomic_load_n and only optionally stored using __atomic_store_n. Ok with those changes. Thanks! Committed as https://gcc.gnu.org/vie
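
For readers skimming the thread, the point about matmul_p and matmul_fn can be illustrated with a small sketch. This is not the committed libgfortran code: the signature, matmul_vanilla, and the trivial "resolver" are hypothetical stand-ins, and the relaxed memory order is an assumption. Only the pattern is taken from the discussion: load the shared pointer once with __atomic_load_n, optionally store it with __atomic_store_n, and call through the local copy rather than through the shared variable.

```c
#include <stddef.h>

/* Hypothetical signature; the real generated routines take array descriptors. */
typedef void (*matmul_fn_t)(double *c, const double *a, const double *b, size_t n);

/* Plain fallback kernel: C = A * B for n x n row-major matrices. */
static void matmul_vanilla(double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        double s = 0.0;
        for (size_t k = 0; k < n; k++)
          s += a[i * n + k] * b[k * n + j];
        c[i * n + j] = s;
      }
}

/* Shared cache of the selected kernel; NULL until the first call resolves it. */
static matmul_fn_t matmul_p;

void matmul(double *c, const double *a, const double *b, size_t n)
{
  /* Read the shared pointer exactly once into a local. */
  matmul_fn_t matmul_fn = __atomic_load_n(&matmul_p, __ATOMIC_RELAXED);

  if (matmul_fn == NULL)
    {
      /* Assumption: a real resolver would choose by CPU feature here. */
      matmul_fn = matmul_vanilla;
      __atomic_store_n(&matmul_p, matmul_fn, __ATOMIC_RELAXED);
    }

  /* Call through the local copy, never through matmul_p directly. */
  (*matmul_fn)(c, a, b, n);
}
```

If two threads race on the first call, both resolve the same value and both stores are harmless; every access to matmul_p goes through an atomic builtin, so there is no plain read or write left that could constitute a data race.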

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Thu, Mar 02, 2017 at 12:57:05PM +0100, Thomas Koenig wrote: > --- m4/matmul.m4 (revision 245836) > +++ m4/matmul.m4 (working copy) > @@ -123,9 +123,14 @@ void matmul_'rtype_code` ('rtype` * const restrict > 'rtype` * const restrict a, 'rtype` * const restrict b, int try_blas, >

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Thomas Koenig
Hi Jakub, Actually, I see a problem, but not related to this patch. I bet e.g. tsan would complain heavily on the wrappers, because the code is racy: Here is a patch implementing your suggestion. Tested at least so far that all matmul test cases pass on my machine. OK for trunk? Regards

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Thu, Mar 02, 2017 at 11:45:59AM +0100, Thomas Koenig wrote: > Here's the updated version, which just uses FMA for AVX2. > > OK for trunk? > > Regards > > Thomas > > 2017-03-01 Thomas Koenig > > PR fortran/78379 > * m4/matmul.m4: (matmul_'rtype_code`_avx2): Also gene

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Thu, Mar 02, 2017 at 11:45:59AM +0100, Thomas Koenig wrote: > Here's the updated version, which just uses FMA for AVX2. > > OK for trunk? > > Regards > > Thomas > > 2017-03-01 Thomas Koenig > > PR fortran/78379 > * m4/matmul.m4: (matmul_'rtype_code`_avx2): Also gene

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Thomas Koenig
Here's the updated version, which just uses FMA for AVX2. OK for trunk? Regards Thomas 2017-03-01 Thomas Koenig PR fortran/78379 * m4/matmul.m4: (matmul_'rtype_code`_avx2): Also generate for reals. Add fma to target options. (matmul_'rtype_code`):
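
As a rough illustration of what "Add fma to target options" means in practice, here is a minimal sketch. It assumes GCC or Clang on x86; the axpy functions and the exact target string "avx2,fma" are hypothetical choices for this example and are not copied from the patch. The target attribute lets the compiler use AVX2 and FMA instructions inside one function only, and the caller checks at run time that the CPU supports them.

```c
#include <stddef.h>

/* Baseline version, compiled with the default target options. */
static void axpy_generic(double *y, const double *x, double a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}

#if defined(__x86_64__) || defined(__i386__)
/* Same loop, but the compiler may vectorize it with AVX2 and contract
   the multiply-add into FMA instructions (e.g. vfmadd231pd).  */
__attribute__((__target__("avx2,fma")))
static void axpy_avx2_fma(double *y, const double *x, double a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
#endif

/* Dispatcher: only call the AVX2/FMA variant when the CPU has both features. */
void axpy(double *y, const double *x, double a, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma"))
    {
      axpy_avx2_fma(y, x, a, n);
      return;
    }
#endif
  axpy_generic(y, x, a, n);
}
```

Whether the compiler actually emits FMA instructions for the attributed function depends on the optimization level and floating-point contraction settings; inspecting the disassembly, as done elsewhere in this thread, is the reliable way to check.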

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Thu, Mar 02, 2017 at 10:03:28AM +0100, Thomas Koenig wrote: > On 02.03.2017 at 09:43, Jakub Jelinek wrote: > > On Wed, Mar 01, 2017 at 10:00:08PM +0100, Thomas Koenig wrote: > > > @@ -101,7 +93,7 @@ > > > `static void > > > 'matmul_name` ('rtype` * const restrict retarray, > > > 'rtype` * c

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Thomas Koenig
On 02.03.2017 at 09:43, Jakub Jelinek wrote: On Wed, Mar 01, 2017 at 10:00:08PM +0100, Thomas Koenig wrote: @@ -101,7 +93,7 @@ `static void 'matmul_name` ('rtype` * const restrict retarray, 'rtype` * const restrict a, 'rtype` * const restrict b, int try_blas, - int blas_limit, b

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Wed, Mar 01, 2017 at 10:00:08PM +0100, Thomas Koenig wrote: > @@ -101,7 +93,7 @@ > `static void > 'matmul_name` ('rtype` * const restrict retarray, > 'rtype` * const restrict a, 'rtype` * const restrict b, int try_blas, > - int blas_limit, blas_call gemm) __attribute__((__target__("

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Jakub Jelinek
On Thu, Mar 02, 2017 at 10:09:31AM +0200, Janne Blomqvist wrote: > > Here's something from the new matmul_r8_avx2: > > > > 156c: c4 62 e5 b8 fd vfmadd231pd %ymm5,%ymm3,%ymm15 > > 1571: c4 c1 79 10 04 06 vmovupd (%r14,%rax,1),%xmm0 > > 1577: c4 62 dd b8 d

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Richard Biener
On Thu, Mar 2, 2017 at 9:09 AM, Janne Blomqvist wrote: > On Thu, Mar 2, 2017 at 9:50 AM, Thomas Koenig wrote: >> On 02.03.2017 at 08:32, Janne Blomqvist wrote: >>> >>> On Wed, Mar 1, 2017 at 11:00 PM, Thomas Koenig >>> wrote: Hello world, the attached patch enables FMA for t

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-02 Thread Janne Blomqvist
On Thu, Mar 2, 2017 at 9:50 AM, Thomas Koenig wrote: > On 02.03.2017 at 08:32, Janne Blomqvist wrote: >> >> On Wed, Mar 1, 2017 at 11:00 PM, Thomas Koenig >> wrote: >>> >>> Hello world, >>> >>> the attached patch enables FMA for the AVX2 and AVX512F variants of >>> matmul. This should bring a v

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-01 Thread Thomas Koenig
On 02.03.2017 at 08:32, Janne Blomqvist wrote: On Wed, Mar 1, 2017 at 11:00 PM, Thomas Koenig wrote: Hello world, the attached patch enables FMA for the AVX2 and AVX512F variants of matmul. This should bring a very nice speedup (although I have been unable to run benchmarks due to lack of a

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-01 Thread Janne Blomqvist
On Wed, Mar 1, 2017 at 11:00 PM, Thomas Koenig wrote: > Hello world, > > the attached patch enables FMA for the AVX2 and AVX512F variants of > matmul. This should bring a very nice speedup (although I have > been unable to run benchmarks due to lack of a suitable machine). In lieu of benchmarks,

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-01 Thread Thomas Koenig
Hi Jerry, I would prefer that it was tested on the actual expected platform. Does anyone anywhere on this list have access to one of these machines to test? If anybody wants to test who does not have --enable-maintainer-mode activated, here is a patch that works "out of the box". Regards

Re: [patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-01 Thread Jerry DeLisle
On 03/01/2017 01:00 PM, Thomas Koenig wrote: Hello world, the attached patch enables FMA for the AVX2 and AVX512F variants of matmul. This should bring a very nice speedup (although I have been unable to run benchmarks due to lack of a suitable machine). Question: Is this still appropriate for

[patch, fortran] Enable FMA for AVX2 and AVX512F for matmul

2017-03-01 Thread Thomas Koenig
Hello world, the attached patch enables FMA for the AVX2 and AVX512F variants of matmul. This should bring a very nice speedup (although I have been unable to run benchmarks due to lack of a suitable machine). Question: Is this still appropriate for the current state of trunk? Or rather, OK for