Hi Jakub,

On 27 Apr 23:34, Jakub Jelinek wrote:
> Hi!
>
> While AVX512F doesn't contain EVEX encoded vround{ss,sd,ps,pd} instructions,
> it contains vrndscale* which performs the same thing if bits [4:7] of the
> immediate are zero.
>
> For _mm*_round_{ps,pd} we actually already emit vrndscale* for -mavx512f
> instead of vround* unconditionally (because
> <avx512>_rndscale<mode><mask_name><round_saeonly_name>
> instruction has the same RTL as <sse4_1>_round<ssemodesuffix><avxsizesuffix>
> and the former, enabled for TARGET_AVX512F, comes first), for the scalar
> cases (thus __builtin_round* or _mm*_round_s{s,d}) the patterns we have
> don't allow extended registers and thus we end up with unnecessary moves
> if the inputs and/or outputs are or could be most effectively allocated
> in the xmm16+ registers.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Your patch is OK.

> 2016-04-27  Jakub Jelinek  <ja...@redhat.com>
>
> 	* config/i386/i386.md (sse4_1_round<mode>2): Add avx512f alternative.
> 	* config/i386/sse.md (sse4_1_round<ssescalarmodesuffix>): Likewise.
>
> 	* gcc.target/i386/avx-vround-1.c: New test.
> 	* gcc.target/i386/avx-vround-2.c: New test.
> 	* gcc.target/i386/avx512vl-vround-1.c: New test.
> 	* gcc.target/i386/avx512vl-vround-2.c: New test.
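
For readers following along, the immediate condition from the quoted description can be sketched in plain C. This is only an illustration, not code from the patch: the helper names (rnd_imm_control, rnd_imm_scale, rndscale_acts_like_round) are hypothetical, and the field split assumes the layout described above, with the rounding control in bits [3:0] and the vrndscale scale in bits [7:4].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not part of GCC or of the patch under review.
   Assumed layout of the 8-bit immediate:
     bits [3:0]  rounding control, shared by vround* and vrndscale*
     bits [7:4]  scale field, meaningful only for vrndscale*  */

/* Low nibble: the rounding control bits. */
static inline unsigned rnd_imm_control(uint8_t imm8)
{
    return imm8 & 0x0Fu;
}

/* High nibble: the vrndscale* scale field. */
static inline unsigned rnd_imm_scale(uint8_t imm8)
{
    return (imm8 >> 4) & 0x0Fu;
}

/* vrndscale* performs the same operation as vround* only when the
   scale field is zero, i.e. no bit in [7:4] is set. */
static inline bool rndscale_acts_like_round(uint8_t imm8)
{
    return rnd_imm_scale(imm8) == 0;
}

/* Small self-check showing the intended use. */
static inline void rnd_imm_self_check(void)
{
    assert(rndscale_acts_like_round(0x09));   /* scale 0: same as vround* */
    assert(!rndscale_acts_like_round(0x19));  /* scale 1: differs */
}
```

An immediate such as 0x09 has a zero high nibble, so under these assumptions it could be used with either instruction; 0x19 carries the same rounding control but a non-zero scale, so only vrndscale* would honour it.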
--
Thanks, K