Hi Jakub,
On 27 Apr 23:34, Jakub Jelinek wrote:
> Hi!
> 
> While AVX512F doesn't contain EVEX encoded vround{ss,sd,ps,pd} instructions,
> it contains vrndscale*, which perform the same operation if bits [4:7] of the
> immediate are zero.
> 
> For _mm*_round_{ps,pd} we actually already emit vrndscale* for -mavx512f
> instead of vround* unconditionally (because
> <avx512>_rndscale<mode><mask_name><round_saeonly_name>
> instruction has the same RTL as <sse4_1>_round<ssemodesuffix><avxsizesuffix>
> and the former, enabled for TARGET_AVX512F, comes first).  For the scalar
> cases (i.e. __builtin_round* or _mm*_round_s{s,d}), however, the patterns we have
> don't allow extended registers and thus we end up with unnecessary moves
> if the inputs and/or outputs are or could be most effectively allocated
> in the xmm16+ registers.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
Your patch is OK.
> 
> 2016-04-27  Jakub Jelinek  <ja...@redhat.com>
> 
>       * config/i386/i386.md (sse4_1_round<mode>2): Add avx512f alternative.
>       * config/i386/sse.md (sse4_1_round<ssescalarmodesuffix>): Likewise.
> 
>       * gcc.target/i386/avx-vround-1.c: New test.
>       * gcc.target/i386/avx-vround-2.c: New test.
>       * gcc.target/i386/avx512vl-vround-1.c: New test.
>       * gcc.target/i386/avx512vl-vround-2.c: New test.

--
Thanks, K
