On Mon, Jun 3, 2019 at 3:50 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, May 21, 2019 at 8:54 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Wed, May 15, 2019 at 2:29 PM Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> > >
> > > "H.J. Lu" <hjl.to...@gmail.com> writes:
> > > > On Thu, Feb 7, 2019 at 9:49 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >>
> > > >> Standard scalar operation patterns which preserve the rest of the
> > > >> vector look like
> > > >>
> > > >> (vec_merge:V2DF
> > > >>   (vec_duplicate:V2DF
> > > >>     (op:DF (vec_select:DF (reg/v:V2DF 85 [ x ])
> > > >>                           (parallel [ (const_int 0 [0])]))
> > > >>            (reg:DF 87))
> > > >>   (reg/v:V2DF 85 [ x ])
> > > >>   (const_int 1 [0x1])]))
> > > >>
> > > >> Add such patterns to i386 backend and convert VEC_CONCAT patterns to
> > > >> standard scalar operation patterns.
> > >
> > > It looks like there's some variety in the patterns used, e.g.:
> > >
> > > (define_insn "<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>"
> > >   [(set (match_operand:VF_128 0 "register_operand" "=x,v")
> > >         (vec_merge:VF_128
> > >           (smaxmin:VF_128
> > >             (match_operand:VF_128 1 "register_operand" "0,v")
> > >             (match_operand:VF_128 2 "vector_operand" "xBm,<round_saeonly_scalar_constraint>"))
> > >           (match_dup 1)
> > >           (const_int 1)))]
> > >   "TARGET_SSE"
> > >   "@
> > >    <maxmin_float><ssescalarmodesuffix>\t{%2, %0|%0, %<iptr>2}
> > >    v<maxmin_float><ssescalarmodesuffix>\t{<round_saeonly_scalar_mask_op3>%2, %1, %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %1, %<iptr>2<round_saeonly_scalar_mask_op3>}"
> > >   [(set_attr "isa" "noavx,avx")
> > >    (set_attr "type" "sse")
> > >    (set_attr "btver2_sse_attr" "maxmin")
> > >    (set_attr "prefix" "<round_saeonly_scalar_prefix>")
> > >    (set_attr "mode" "<ssescalarmode>")])
> > >
> > > makes the operand a full vector operation, which seems simpler.
> >
> > This pattern is used to implement scalar smaxmin intrinsics.
> >
> > > The above would then be:
> > >
> > > (vec_merge:V2DF
> > >   (op:V2DF
> > >     (reg:V2DF 85)
> > >     (vec_duplicate:V2DF (reg:DF 87)))
> > >   (reg/v:V2DF 85 [ x ])
> > >   (const_int 1 [0x1])]))
> > >
> > > I guess technically the two have different faulting behaviour though,
> > > since the smaxmin gets applied to all elements, not just element 0.
> >
> > This is the issue. We don't use the correct mode for scalar instructions:
> >
> > ---
> > #include <immintrin.h>
> >
> > __m128d
> > foo1 (__m128d x, double *p)
> > {
> >   __m128d y = _mm_load_sd (p);
> >   return _mm_max_pd (x, y);
> > }
> > ---
> >
> > movq (%rdi), %xmm1
> > maxpd %xmm1, %xmm0
> > ret
> >
> >
> > Here is the updated patch to add standard floating point scalar
> > operation patterns to i386 backend. Then we can do
> >
> > ---
> > #include <immintrin.h>
> >
> > extern __inline __m128d __attribute__((__gnu_inline__,
> > __always_inline__, __artificial__))
> > _new_mm_max_pd (__m128d __A, __m128d __B)
> > {
> >   __A[0] = __A[0] > __B[0] ? __A[0] : __B[0];
> >   return __A;
> > }
> >
> > __m128d
> > foo2 (__m128d x, double *p)
> > {
> >   __m128d y = _mm_load_sd (p);
> >   return _new_mm_max_pd (x, y);
> > }
> > ---
> >
> > maxsd (%rdi), %xmm0
> > ret
> >
> > We should use generic vector operations to implement i386 intrinsics
> > as much as we can.
> >
> > > The patch seems very specific. E.g. why just PLUS, MINUS, MULT and DIV?
> >
> > This patch only adds +, -, *, /, > and <. We can add more if there
> > are testcases for them.
> >
> > > Thanks,
> > > Richard
> > >
> > > >>
> > > >> gcc/
> > > >>
> > > >> PR target/54855
> > > >> * simplify-rtx.c (simplify_binary_operation_1): Convert
> > > >> VEC_CONCAT patterns to standard scalar operation
> > > >> patterns.
> > > >> * config/i386/sse.md (*<sse>_vm<plusminus_insn><mode>3): New.
> > > >> (*<sse>_vm<multdiv_mnemonic><mode>3): Likewise.
> > > >>
> > > >> gcc/testsuite/
> > > >>
> > > >> PR target/54855
> > > >> * gcc.target/i386/pr54855-1.c: New test.
> > > >> * gcc.target/i386/pr54855-2.c: Likewise.
> > > >> * gcc.target/i386/pr54855-3.c: Likewise.
> > > >> * gcc.target/i386/pr54855-4.c: Likewise.
> > > >> * gcc.target/i386/pr54855-5.c: Likewise.
> > > >> * gcc.target/i386/pr54855-6.c: Likewise.
> > > >> * gcc.target/i386/pr54855-7.c: Likewise.
> > > >
> > > > PING:
> > > >
> > > > https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00398.html
> >
> > Thanks.
> >
> > PING:
> > https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01416.html
>
PING.

--
H.J.
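
For anyone following the thread, the two RTL shapes discussed above (operate on element 0 only, versus operate on the whole vector and then vec_merge element 0 back into x) can be sketched in plain C with the GCC/Clang vector extension. This is only an illustration and not code from the patch: the names v2df, scalar_add_elem0 and full_add_then_merge are made up here, and addition stands in for the generic "op".

```c
/* Two-element double vector via the GCC/Clang vector extension.
   The type name is hypothetical, chosen to mirror the V2DF mode.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Shape 1: apply the operation to element 0 only and keep the rest
   of x, as in the (vec_duplicate (op:DF (vec_select ...))) form.  */
static v2df
scalar_add_elem0 (v2df x, double s)
{
  x[0] = x[0] + s;
  return x;
}

/* Shape 2: apply the operation to every element, then take element 0
   from the result and the remaining element from x (merge mask 1).
   Element 1 is computed and then discarded.  */
static v2df
full_add_then_merge (v2df x, double s)
{
  v2df dup = { s, s };
  v2df full = x + dup;
  v2df merged = x;
  merged[0] = full[0];
  return merged;
}
```

Both functions return the same values. The difference is that the second one also evaluates the operation on element 1, which is where the differing faulting behaviour mentioned above would come from.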