Hi!
On Thu, Nov 17, 2016 at 02:18:57PM -0800, H.J. Lu wrote:
> > Hi HJ, could you please commit it?
>
> Done.
I'm seeing lots of ICEs with this.
E.g. this reduced testcase:
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
typedef unsigned char __mmask8;
typedef float __v4sf __attribute__ ((__vector_size__ (16)));
static inline __m128 __attribute__((__gnu_inline__, __always_inline__,
                                    __artificial__))
_mm_setzero_ps (void)
{
  return __extension__ (__m128){ 0.0f, 0.0f, 0.0f, 0.0f };
}

__m128
foo (__mmask8 __U, __m128 __A, __m128 __B, __m128 __C, __m128 __D, __m128 __E,
     __m128 *__F)
{
  return (__m128) __builtin_ia32_4fmaddss_mask ((__v4sf) __B,
                                                (__v4sf) __C,
                                                (__v4sf) __D,
                                                (__v4sf) __E,
                                                (__v4sf) __A,
                                                (const __v4sf *) __F,
                                                (__v4sf) _mm_setzero_ps (),
                                                (__mmask8) __U);
}
This ICEs with -mavx5124fmaps -O0, but compiles fine with
-mavx512vl -mavx5124fmaps -O0 or with -mavx5124fmaps -O2.
  fcn_mask = gen_avx5124fmaddps_4fmaddss_mask;
  fcn_maskz = gen_avx5124fmaddps_4fmaddss_maskz;
  msk_mov = gen_avx512vl_loadv4sf_mask;
looks wrong.  While -mavx5124fmaps implies -mavx512f, it doesn't imply
-mavx512vl, so using AVX512VL insns unconditionally is just wrong.
You need some fallback for when AVX512VL isn't available; perhaps use the
AVX512F 512-bit masked insns, with the bits in the mask forced so that they
pick only the elements you want?
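To illustrate the fallback idea, here is a minimal scalar model in plain C.
It is only a sketch of the lane semantics, not GCC code and not the real
insn patterns: every function name in it is hypothetical, and it assumes the
masked move is a simple per-lane merge.  It shows that a 16-lane masked move
whose mask is forced to the low 4 bits leaves the low 4 lanes exactly as a
4-lane masked move would.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a merge-masked vector move (hypothetical helper):
   lane i of DST is overwritten with lane i of SRC when bit i of MASK
   is set, otherwise DST keeps its old value.  */
static void
model_masked_move (float *dst, const float *src, uint16_t mask, int nlanes)
{
  assert (nlanes >= 0 && nlanes <= 16);
  for (int i = 0; i < nlanes; i++)
    if (mask & (1u << i))
      dst[i] = src[i];
}

/* Model of what a 128-bit (4 x float) masked move would compute.  */
static void
model_move_v4sf (float dst[4], const float src[4], uint8_t mask)
{
  model_masked_move (dst, src, (uint16_t) (mask & 0xf), 4);
}

/* Model of the suggested fallback: do the move on 16 lanes, but force
   the mask so that only the low 4 lanes can ever be selected, then use
   only the low 4 lanes of the result.  */
static void
model_move_v4sf_via_v16sf (float dst[4], const float src[4], uint8_t mask)
{
  float wide_dst[16];
  float wide_src[16];

  /* The upper 12 lanes are don't-care values in this model.  */
  for (int i = 0; i < 16; i++)
    {
      wide_dst[i] = -99.0f;
      wide_src[i] = 99.0f;
    }
  memcpy (wide_dst, dst, 4 * sizeof (float));
  memcpy (wide_src, src, 4 * sizeof (float));

  model_masked_move (wide_dst, wide_src, (uint16_t) (mask & 0xf), 16);

  memcpy (dst, wide_dst, 4 * sizeof (float));
}

/* Return 1 if both models produce the same 4 lanes for MASK.  */
static int
model_moves_agree (uint8_t mask)
{
  const float src[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
  float a[4] = { -1.0f, -2.0f, -3.0f, -4.0f };
  float b[4] = { -1.0f, -2.0f, -3.0f, -4.0f };

  model_move_v4sf (a, src, mask);
  model_move_v4sf_via_v16sf (b, src, mask);
  return memcmp (a, b, sizeof a) == 0;
}
```

Calling model_moves_agree for every 8-bit mask value should return 1 in
this model, since the mask bits above the low 4 are cleared before the wide
move.  Whether the real expander can do the equivalent cheaply is a separate
question.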
Also, there seem to be various formatting issues in the change, e.g.
shortly after s4fma_expand: the indentation is 3 chars relative to the {
above instead of 2, gen_rtx_SUBREG (V16SFmode, tmp, 0)); has 1 extra char
of indentation, and some lines are too long.
Jakub