On Sat, Dec 28, 2019 at 11:48:12AM +0100, Uros Bizjak wrote:
> On Sat, Dec 28, 2019 at 10:33 AM Jakub Jelinek <ja...@redhat.com> wrote:
> >
> > Hi!
> >
> > In i386.md, we have nearbyint<mode>2 and rint<mode>2 patterns that expand
> > SF/DF/XF mode patterns to rounding instructions. For pre-sse4.1 that is
> > done using XFmode and so inappropriate for vectorization, but for sse4.1
> > and later we can just use the {,v}{round,rndscale}p{s,d} instructions
> > when we emit {,v}rounds{s,d} for SF/DF mode.
>
> In i386-builtins.c, ix86_builtin_vectorized_function, we already have:
>
> --cut here--
>     CASE_CFN_RINT:
>       /* The round insn does not trap on denormals. */
>       if (flag_trapping_math || !TARGET_SSE4_1)
>         break;
>
>       if (out_mode == DFmode && in_mode == DFmode)
>         {
>           if (out_n == 2 && in_n == 2)
>             return ix86_get_builtin (IX86_BUILTIN_RINTPD);
>           else if (out_n == 4 && in_n == 4)
>             return ix86_get_builtin (IX86_BUILTIN_RINTPD256);
>         }
>       if (out_mode == SFmode && in_mode == SFmode)
>         {
>           if (out_n == 4 && in_n == 4)
>             return ix86_get_builtin (IX86_BUILTIN_RINTPS);
>           else if (out_n == 8 && in_n == 8)
>             return ix86_get_builtin (IX86_BUILTIN_RINTPS256);
>         }
>       break;
> --cut here--

Ok, will test removing that stuff, seems nothing in the headers uses that.

> which is converting rint functions to corresponding x86 builtin. If we
> want to go through generic path, then the above code is probably
> redundant and should be removed together with corresponding builtins.
> OTOH, the existing code also bails out for flag_trapping_math, so this
> condition should also be considered in named expanders.

The conditions are:
(define_expand "nearbyint<mode>2"
  [(use (match_operand:MODEF 0 "register_operand"))
   (use (match_operand:MODEF 1 "nonimmediate_operand"))]
  "(TARGET_USE_FANCY_MATH_387
    && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
        || TARGET_MIX_SSE_I387)
    && !flag_trapping_math)
   || (TARGET_SSE4_1 && TARGET_SSE_MATH)"
and:
(define_expand "rint<mode>2"
  [(use (match_operand:MODEF 0 "register_operand"))
   (use (match_operand:MODEF 1 "nonimmediate_operand"))]
  "TARGET_USE_FANCY_MATH_387
   || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)"
Only nearbyint tests flag_trapping_math, and only for the pre-sse4.1 case,
with sse4.1 it is enabled regardless of that (just depends on
TARGET_SSE_MATH, but I think for vectorization we don't really test that,
vectorization is always done in sse*).

	Jakub
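
To make the difference between the two expander conditions concrete, here is
a minimal C sketch that models them as plain boolean functions. It is not GCC
code: the struct, its field names and the function names are hypothetical,
made up for illustration, and SSE_FLOAT_MODE_P (<MODE>mode) is simply
modelled as one more flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target macros and flags used in the
   two define_expand conditions; not a GCC data structure.  */
struct target_flags
{
  bool fancy_math_387;  /* models TARGET_USE_FANCY_MATH_387 */
  bool sse_float_mode;  /* models SSE_FLOAT_MODE_P (<MODE>mode) */
  bool sse_math;        /* models TARGET_SSE_MATH */
  bool mix_sse_i387;    /* models TARGET_MIX_SSE_I387 */
  bool sse4_1;          /* models TARGET_SSE4_1 */
  bool trapping_math;   /* models flag_trapping_math */
};

/* Mirrors the nearbyint<mode>2 condition.  */
bool
nearbyint_expander_enabled (const struct target_flags *t)
{
  return (t->fancy_math_387
          && (!(t->sse_float_mode && t->sse_math) || t->mix_sse_i387)
          && !t->trapping_math)
         || (t->sse4_1 && t->sse_math);
}

/* Mirrors the rint<mode>2 condition.  */
bool
rint_expander_enabled (const struct target_flags *t)
{
  return t->fancy_math_387 || (t->sse_float_mode && t->sse_math);
}

/* Spot checks of the observation about flag_trapping_math.  */
void
check_conditions (void)
{
  /* SSE4.1 with SSE math: both enabled even with trapping math on.  */
  struct target_flags sse41 = { .sse_float_mode = true, .sse_math = true,
                                .sse4_1 = true, .trapping_math = true };
  assert (nearbyint_expander_enabled (&sse41));
  assert (rint_expander_enabled (&sse41));

  /* x87 only: trapping math disables nearbyint but not rint.  */
  struct target_flags x87 = { .fancy_math_387 = true,
                              .trapping_math = true };
  assert (!nearbyint_expander_enabled (&x87));
  assert (rint_expander_enabled (&x87));
}
```

Calling check_conditions () from a small main should pass silently. In this
model the trapping_math field only ever influences the x87 half of the
nearbyint condition, which is the asymmetry with the
flag_trapping_math || !TARGET_SSE4_1 bail-out in the CASE_CFN_RINT code
quoted above.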