Re: [PATCH] Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct.

2021-08-08 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 3, 2021 at 7:12 PM Hongtao Liu wrote: > > On Tue, Aug 3, 2021 at 6:20 PM Richard Biener > wrote: > > > > On Tue, Aug 3, 2021 at 11:20 AM Richard Biener > > wrote: > > > > > > On Wed, Jul 28, 2021 at 4:51 AM Hongtao Liu via Gcc-patches

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-09 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches wrote: > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford > wrote: > > > > Richard Biener via Gcc-patches writes: > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt wrote: > > >> > > >> Hi: > > >> --- > > >> OK, I think sth is amiss he

Re: [PATCH] i386: Fix typos in amxbf16 runtime test.

2021-08-10 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 10, 2021 at 4:11 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch fixes some typo in amxbf16-dpbf16ps-2 test. > > Tested under sde/spr machine and passed. > > OK for master and backport to GCC 11? Ok for master, and i don't think the backport is necessary. > > gcc/testsuite

Re: [PATCH] i386: Improve single operand AVX512F permutations [PR80355]

2021-08-10 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 10, 2021 at 4:44 PM Jakub Jelinek wrote: > > Hi! > > On the following testcase we emit > vmovdqa32 .LC0(%rip), %zmm1 > vpermd %zmm0, %zmm1, %zmm0 > and > vmovdqa64 .LC1(%rip), %zmm1 > vpermq %zmm0, %zmm1, %zmm0 > instead of > vshufi

Re: [PATCH] i386: Allow some V32HImode and V64QImode permutations even without AVX512BW [PR80355]

2021-08-10 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 10, 2021 at 4:54 PM Jakub Jelinek wrote: > > Hi! > > When working on the PR, I've noticed we generate terrible code for > V32HImode or V64QImode permutations for -mavx512f -mno-avx512bw. > Generally we can't do much with such permutations, but since PR68655 > we can handle at least som

Re: [PATCH] [i386] Combine avx_vec_concatv16si and avx512f_zero_extendv16hiv16si2_1 to avx512f_zero_extendv16hiv16si2_2.

2021-08-11 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 11, 2021 at 3:58 PM Jakub Jelinek wrote: > > On Wed, Aug 11, 2021 at 02:43:06PM +0800, liuhongt wrote: > > Add define_insn_and_split to combine avx_vec_concatv16si/2 and > > avx512f_zero_extendv16hiv16si2_1 since the latter already zero_extend > > the upper bits, similar for other pa

Re: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-11 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 11, 2021 at 7:16 PM Uros Bizjak wrote: > > On Wed, Aug 11, 2021 at 8:36 AM Uros Bizjak wrote: > > > > On Tue, Aug 10, 2021 at 2:13 PM liuhongt wrote: > > > > > > Hi: > > > AVX512F supported vscalefs{s,d} which is the same as ldexp except the > > > second operand should be floating

Re: [PATCH] [i386] Introduce a scalar version of avx512f_vmscalef and adjust ldexp3 for it.

2021-08-11 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 12, 2021 at 12:05 PM liuhongt wrote: > > Hi: > This is the patch i'm going to checkin. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}; > > > 2021-08-12 Uros Bizjak > > gcc/ChangeLog: > > PR target/98309 > * config/i386/i386.md (avx512f_scalef2): New >

Re: [PATCH] i386: Fix up V32HImode permutations with -mno-avx512bw [PR101860]

2021-08-12 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 12, 2021 at 3:49 PM Jakub Jelinek wrote: > > Hi! > > My patch from yesterday apparently broke some V32HImode permutations > as the testcase shows. > The first function assumed it would never be called in d->testing_p mode > and so went right away into emitting the code. > And the secon

Re: [PATCH] [i386] Optimize vec_perm_expr to match vpmov{dw,qd,wb}.

2021-08-12 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 12, 2021 at 5:23 PM Jakub Jelinek wrote: > > On Thu, Aug 12, 2021 at 01:43:23PM +0800, liuhongt wrote: > > Hi: > > This is another patch to optimize vec_perm_expr to match vpmov{dw,dq,wb} > > under AVX512. > > For scenarios(like pr101846-2.c) where the upper half is not used, this

Re: [PATCH] [i386] Optimize __builtin_shuffle_vector.

2021-08-16 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 16, 2021 at 3:11 PM Jakub Jelinek via Gcc-patches wrote: > > On Mon, Aug 16, 2021 at 01:18:38PM +0800, liuhongt via Gcc-patches wrote: > > + /* Accept VNxHImode and VNxQImode now. */ > > + if (!TARGET_AVX512VL && GET_MODE_SIZE (mode) < 64) > > +return false; > > + > > + /* vper

Re: [PATCH] [i386] Optimize __builtin_shuffle_vector.

2021-08-16 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 16, 2021 at 3:25 PM Hongtao Liu wrote: > > On Mon, Aug 16, 2021 at 3:11 PM Jakub Jelinek via Gcc-patches > wrote: > > > > On Mon, Aug 16, 2021 at 01:18:38PM +0800, liuhongt via Gcc-patches wrote: > > > + /* Accept VNxHImode and VNxQImode now. */

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-16 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote: > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches > wrote: > > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford > > wrote: > > > > > > Richard Biener via Gcc-patches writes: >

Re: [PATCH 4/6] Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.

2021-08-16 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 6, 2021 at 2:06 PM Hongtao Liu wrote: > > On Tue, Aug 3, 2021 at 10:44 AM Hongtao Liu wrote: > > > > On Tue, Aug 3, 2021 at 3:34 AM Joseph Myers wrote: > > > > > > On Mon, 2 Aug 2021, liuhongt via Gcc-patches wrote: > > > > > > >

Re: [PATCH] Revert "Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct."

2021-08-17 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 8:56 PM H.J. Lu via Gcc-patches wrote: > > On Tue, Aug 17, 2021 at 5:43 AM liuhongt via Gcc-patches > wrote: > > > > This reverts commit 872da9a6f664a06d73c987aa0cb2e5b830158a10. > > > > PR target/101936 > > PR target/101929 > > > > Bootstrapped and regtested on x86_64-l

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-17 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches wrote: > > On Tue, Aug 17, 2021 at 3:29 PM Richard Biener via Gcc-patches > wrote: > > > > This is an attempt to start moving the x86 backend to use > > standard pattern names for [mask_]gather_load and [mask_]scatter_store > > rathe

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-17 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches wrote: > > On Tue, Aug 17, 2021 at 3:29 PM Richard Biener via Gcc-patches > wrote: > > > > This is an attempt to start moving the x86 backend to use > > standard pattern names for [mask_]gather_load and [mask_]scatter_store > > rathe

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-17 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 11:24 AM Hongtao Liu wrote: > > On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches > wrote: > > > > On Tue, Aug 17, 2021 at 3:29 PM Richard Biener via Gcc-patches > > wrote: > > > > > > This is an attempt to start

Re: [PATCH] [i386] Add x86 tune to enable v2df vector reduction by paddpd.

2021-08-17 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 5:06 PM liuhongt wrote: > > Hi: > This patch add a new x86 tune named X86_TUNE_V2DF_REDUCTION_PREFER_HADDPD > to enable haddpd for v2df vector reduction, the tune is disabled by default. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,} > Ok for trunk? > Pushe

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 4:32 PM Richard Biener wrote: > > On Wed, 18 Aug 2021, Hongtao Liu wrote: > > > On Wed, Aug 18, 2021 at 11:24 AM Hongtao Liu wrote: > > > > > > On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches > > > wrote: >

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 5:54 PM Richard Biener wrote: > > > So in the end I seem to be able to combine AVX & AVX512 arriving > at the following which passes basic testing. I will now see to > teach the vectorizer the required "promotion" to handle > mask_gather_loadv4dfv4si and mask_gather_loadv4

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote: > > On Wed, 18 Aug 2021, Richard Biener wrote: > > > > > So in the end I seem to be able to combine AVX & AVX512 arriving > > at the following which passes basic testing. I will now see to > > teach the vectorizer the required "promotion" to h

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 7:30 PM Hongtao Liu wrote: > > On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote: > > > > On Wed, 18 Aug 2021, Richard Biener wrote: > > > > > > > > So in the end I seem to be able to combine AVX & AVX512 arriving > >

Re: [PATCH] move x86 to use gather/scatter internal functions

2021-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 18, 2021 at 7:37 PM Hongtao Liu wrote: > > On Wed, Aug 18, 2021 at 7:30 PM Hongtao Liu wrote: > > > > On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote: > > > > > > On Wed, 18 Aug 2021, Richard Biener wrote: > > > > > > >

Re: [PATCH] x86: Allow CONST_VECTOR for vector load in combine

2021-08-23 Thread Hongtao Liu via Gcc-patches
On Sun, Aug 22, 2021 at 8:54 PM H.J. Lu via Gcc-patches wrote: > > In vector move pattern, replace nonimmediate_or_sse_const_operand with > nonimmediate_or_sse_const_vector_operand to allow vector load from > non-uniform CONST_VECTOR. Non-uniform CONST_VECTOR is enabled only in > the combine pass

Re: [PATCH] x86: Broadcast from integer to a pseudo vector register

2021-08-23 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 9:01 AM H.J. Lu via Gcc-patches wrote: > > Broadcast from integer to a pseudo vector register instead of a hard > vector register to allow LRA to remove redundant move instruction after > broadcast. > > gcc/ > > PR target/102021 > * config/i386/i386-expand.c

Re: [PATCH v2] x86: Allow CONST_VECTOR for vector load in combine

2021-08-23 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 23, 2021 at 9:14 PM H.J. Lu wrote: > > On Mon, Aug 23, 2021 at 03:23:26PM +0800, Hongtao Liu wrote: > > On Sun, Aug 22, 2021 at 8:54 PM H.J. Lu via Gcc-patches > > wrote: > > > > > > In vector move pattern, replace no

Re: [PATCH] x86: Broadcast from integer to a pseudo vector register

2021-08-23 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 9:43 AM H.J. Lu wrote: > > On Mon, Aug 23, 2021 at 6:17 PM Hongtao Liu wrote: > > > > On Tue, Aug 24, 2021 at 9:01 AM H.J. Lu via Gcc-patches > > wrote: > > > > > > Broadcast from integer to a pseudo vector register instead of a

Re: [PATCH 4/6] Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 9:53 AM Hongtao Liu wrote: > > On Fri, Aug 6, 2021 at 2:06 PM Hongtao Liu wrote: > > > > On Tue, Aug 3, 2021 at 10:44 AM Hongtao Liu wrote: > > > > > > On Tue, Aug 3, 2021 at 3:34 AM Joseph Myers > > > wrote: > > >

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote: > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote: > > > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford >

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote: > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote: > > > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote: > > > > > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches > > > wro

Re: [PATCH] [i386] Optimize (a & b) | (c & ~b) to vpternlog instruction.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 9:36 AM liuhongt wrote: > > Also optimize below 3 forms to vpternlog, op1, op2, op3 are > register_operand or unary_p as (not reg) > > A: (any_logic (any_logic op1 op2) op3) > B: (any_logic (any_logic op1 op2) (any_logic op3 op4)) op3/op4 should > be equal to op1/op2 > C: (

Re: [PATCH] [i386] Enable avx512 embedde broadcast for vpternlog.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 6:25 PM liuhongt wrote: > > gcc/ChangeLog: > > PR target/101989 > * config/i386/sse.md (_vternlog): > Enable avx512 embedded broadcast. > (*_vternlog_all): Ditto. > (_vternlog_mask): Ditto. > > gcc/testsuite/ChangeLog: > > PR

Re: [PATCH] [i386] Optimize (a & b) | (c & ~b) to vpternlog instruction.

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 9:11 PM Bernhard Reutner-Fischer wrote: > > On Tue, 24 Aug 2021 17:53:27 +0800 > Hongtao Liu via Gcc-patches wrote: > > > On Tue, Aug 24, 2021 at 9:36 AM liuhongt wrote: > > > > > > Also optimize below 3 forms to vpternlog, op1, op

Re: [PATCH] i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

2021-08-24 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 25, 2021 at 2:14 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > For avx512f_scattersi, mask operand only affect set src, we > need to refine the pattern to let gcc know mask register also affect the dest. > So we put mask operand into UNSPEC_VSIBADDR. > > Bootstrapped and regress

Re: [PATCH] Change illegitimate constant into memref of constant pool in change_zero_ext.

2021-08-25 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 25, 2021 at 5:14 AM Segher Boessenkool wrote: > > Hi! > > On Tue, Aug 24, 2021 at 04:55:30PM +0800, liuhongt wrote: > > This patch extend change_zero_ext to change illegitimate constant > > into constant pool, this will enable simplification of below: > > It should be in a separate f

Re: [PATCH] Change illegitimate constant into memref of constant pool in change_zero_ext.

2021-08-25 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 25, 2021 at 5:14 AM Segher Boessenkool wrote: > > Hi! > > On Tue, Aug 24, 2021 at 04:55:30PM +0800, liuhongt wrote: > > This patch extend change_zero_ext to change illegitimate constant > > into constant pool, this will enable simplification of below: > > It should be in a separate f

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-25 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 24, 2021 at 7:39 PM Richard Biener wrote: > > On Tue, Aug 24, 2021 at 11:38 AM Hongtao Liu wrote: > > > > On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote: > > > > > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote: > > > > >

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-25 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 26, 2021 at 7:16 AM Jeff Law wrote: > > > > On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote: > > On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote: > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote: > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao

Re: [PATCH] Fold more shuffle builtins to VEC_PERM_EXPR.

2021-08-25 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 26, 2021 at 12:57 PM liuhongt wrote: > > This patch is a follow-up to [1], it fold all shufps/shufpd builtins into > gimple. Of course for non-mask or mask all-ones version. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-26 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches wrote: > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford > wrote: > > > > Richard Biener via Gcc-patches writes: > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford > > > wrote: > > >> > > >> Richard Biener via Gcc-patches

Re: [PATCH] i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

2021-08-26 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 27, 2021 at 10:03 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > For avx512f_scattersi, mask operand only affect set src, we need > to refine the pattern to let gcc know mask register also affect the dest. > So we put mask operand into UNSPEC_VSIBADDR. > > Bootstrapped and regre

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 31, 2021 at 2:11 PM Richard Biener wrote: > > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote: > > > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandi

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu wrote: > > On Tue, Aug 31, 2021 at 2:11 PM Richard Biener > wrote: > > > > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote: > > > > > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches > > >

Re: [PATCH] Check the type of mask while generating cond_op in gimple simplication.

2021-08-31 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 30, 2021 at 8:25 PM Richard Biener via Gcc-patches wrote: > > On Fri, Aug 27, 2021 at 8:53 AM liuhongt wrote: > > > > When gimple simplification tries to combine op and vec_cond_expr to cond_op, > > it doesn't check if mask type matches. It causes an ICE when expanding cond_op > > with mi

Re: i386: Fix array index in expander

2020-09-14 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 14, 2020 at 3:51 PM Richard Biener via Gcc-patches wrote: > > On Fri, Sep 11, 2020 at 11:19 PM Nathan Sidwell wrote: > > > > I noticed a compiler warning about out-of-bound access. Fixed thusly. > > > > gcc/ > > * config/i386/sse.md (mov): Fix operand indices. > > >

[PATCH] Return mask <-> integer cost for non-AVX512 micro-architecture.

2020-09-14 Thread Hongtao Liu via Gcc-patches
Hi: This patch would avoid spilling gprs to mask registers for non-AVX512 micro-architecture and fix regression in PR96744. Bootstrap is ok, regression test for i386/x86-64 backend is ok. No big performance impact on SPEC2017. gcc/ChangeLog: PR target/96744 * config/i386/x86-t

[PATCH] Increase rtx_cost of sse_to_integer in skylake_cost.

2020-09-15 Thread Hongtao Liu via Gcc-patches
Hi: Rtx cost of sse_to_integer would be used by pass_stv as a measurement for the scalar-to-vector transformation. As https://gcc.gnu.org/pipermail/gcc-patches/2019-August/528839.html indicates, movement between sse regs and gprs should be much expensive than movement inside gprs(which is 2 as de

[PATCH] -mno-xsave should imply -mno-avx since -mavx implies -mxsave

2020-09-15 Thread Hongtao Liu via Gcc-patches
Hi: If -mavx implies -mxsave, then -mno-xsave should imply -mno-avx. Current status is -mno-avx implies -mno-xsave which should be wrong. Bootstrap is ok, Regression test is ok for i386/x86 backend. Ok for trunk? gcc/ChangeLog * common/config/i386/i386-common.c (OPTION_MASK_ISA_A

Re: [PATCH] Increase rtx_cost of sse_to_integer in skylake_cost.

2020-09-16 Thread Hongtao Liu via Gcc-patches
Thanks. On Wed, Sep 16, 2020 at 8:54 PM Uros Bizjak wrote: > > > gcc/ChangeLog > > > > PR target/96861 > > * config/i386/x86-tune-costs.h (skylake_cost): increase rtx > > cost of sse_to_integer from 2 to 6. > > > > gcc/testsuite > > > > * gcc.target/i386/pr95021-3.

Re: [PATCH] -mno-xsave should imply -mno-avx since -mavx implies -mxsave

2020-09-16 Thread Hongtao Liu via Gcc-patches
Thanks! On Wed, Sep 16, 2020 at 8:57 PM Uros Bizjak wrote: > > > gcc/ChangeLog > > > > * common/config/i386/i386-common.c > > (OPTION_MASK_ISA_AVX_UNSET): Remove OPTION_MASK_ISA_XSAVE_UNSET. > > (OPTION_MASK_ISA_XSAVE_UNSET): Add OPTION_MASK_ISA_AVX_UNSET. > > > > gcc/test

Re: [PATCH] Increase rtx_cost of sse_to_integer in skylake_cost.

2020-09-17 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 17, 2020 at 12:10 PM Jeff Law wrote: > > > On 9/15/20 9:20 PM, Hongtao Liu via Gcc-patches wrote: > > Hi: > > Rtx cost of sse_to_integer would be used by pass_stv as a > > measurement for the scalar-to-vector transformation. As > > https://gcc.g

[PATCH 1/2] [target 87767] Refactor AVX512 broadcast patterns with speical memory constraint.

2020-10-11 Thread Hongtao Liu via Gcc-patches
Hi: This is done in 2 steps: 1. Extend special memory constraint to handle non MEM_P cases, i.e. (vec_duplicate:V4SF (mem:SF (addr))) 2. Refactor implementation of *_bcst{_1,_2,_3} patterns. Add new predicate bcst_mem_operand and corresponding constraint "Br" to merge "$(pattern)_bcst{_1,_2,_

[PATCH 2/2] [target 87767] Refactor AVX512 broadcast patterns with speical memory constraint.

2020-10-11 Thread Hongtao Liu via Gcc-patches
Add new predicate bcst_mem_operand and corresponding constraint "Br" to merge "$(pattern)_bcst{_1,_2,_3}" into "$(pattern)", also delete those separate "*_bcst{_1,_2,_3}" patterns. gcc/ChangeLog: PR target/87767 * config/i386/constraints.md ("Br"): New special memory con

Re: [PATCH] i386: Fixed vec_init_dup_v16bf [PR106887]

2022-09-14 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 15, 2022 at 11:36 AM Kong, Lingling via Gcc-patches wrote: > > Hi > > The patch is to fix vec_init_dup_v16bf, add correct handle for v16bf mode in > ix86_expand_vector_init_duplicate. > Add testcase with sse2 without avx2. > > OK for master? > > gcc/ChangeLog: > > PR target/10

Re: [PATCH] Modernize ix86_builtin_vectorized_function with corresponding expanders.

2022-09-15 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 8:55 AM liuhongt wrote: > > For ifloor/lfloor/iceil/lceil/irint/lrint/iround/lround when size of > in_mode is not equal out_mode, vectorizer doesn't go to internal fn > way,still left that part in the ix86_builtin_vectorized_function. > > Remove others builtins and add corr

Re: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg

2022-09-15 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 9:09 AM liuhongt via Gcc-patches wrote: > > There's peephole2 submit in 1990s which split cmp mem, 0 to load mem, > reg + test reg, reg. I don't know exact reason why gcc do this. > > For latest x86 processors, ciscization should help processor frontend > also codesize, for

Re: [PATCH] Support 64-bit vectorization for single-precision floating rounding operation.

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 20, 2022 at 10:14 AM liuhongt wrote: > > Here's list the patch supported. > rint/nearbyint/ceil/floor/trunc/lrint/lceil/lfloor/round/lround. > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? > > gcc/ChangeLog: > > PR target/106910 > * config

Re: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches wrote: > > On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote: > > > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches > > wrote: > > > > > > > > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote: > > > > There's peepho

Re: [PATCH] Fix incorrect handle in vectorizable_induction for mixed induction type.

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 20, 2022 at 10:23 AM liuhongt wrote: > > The codes in vectorizable_induction for slp_node assume all phi_info > have same induction type(vect_step_op_add), but since we support > nonlinear induction, it could be wrong handled. > So the patch return false when slp_node has mixed inducti

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > The following patch implements the compiler part of C++23 > P1467R9 - Extended floating-point types and standard names compiler part > by introducing _Float{16,32,64,128} as keywords and builtin types > like they are

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-20 Thread Hongtao Liu via Gcc-patches
+My intel folk phoebe working for llvm side. On Tue, Sep 20, 2022 at 11:35 AM Hongtao Liu wrote: > > On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches > wrote: > > > > Hi! > > > > The following patch implements the compiler part of C++23 > >

Re: [PATCH] Don't check can_vec_perm_const_p for nonlinear iv_init when it's constant.

2022-09-21 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 21, 2022 at 3:41 PM Richard Biener via Gcc-patches wrote: > > On Wed, Sep 21, 2022 at 1:41 AM liuhongt via Gcc-patches > wrote: > > > > When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not > > necessary since there's no real vec_perm needed, but > > vec_gen_perm_mask

Re: [PATCH] [x86] Fix typo in floorv2sf2, should be register_operand for op1, not vector_operand.

2022-09-21 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 9:17 AM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Verify 526.blend_r can be rebuilt with the fix. > > Ok for trunk? > > gcc/ChangeLog: > > PR target/106994 > * config/i386/mmx.md (floorv2sf2): Fix typo, use > reg

Re: [RFC PATCH] __trunc{tf, xf, df, sf, hf}bf2, __truncbfhf2 and __extendbfsf2

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 11:56 PM Jakub Jelinek wrote: > > On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote: > > On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote: > > > > The question is (mainly for aarch64, arm and x86 backend mai

Re: [PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 3:20 PM Hu, Lin1 via Gcc-patches wrote: > > Hi all, > > This patch aims to optimize code generation of > __mm256_zextsi128_si256(__mm_set1_epi8(-1)). Reduce the number of > instructions required to achieve the final result. > > Regtested on x86_64-pc-linux-gnu. Ok for tru

Re: [PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 23, 2022 at 11:07 AM Hu, Lin1 wrote: > > Hi, Hongtao > > I have modified this patch and regtested on x86_64-pc-linux-gnu. > Ok. > BRs. > Lin > > -Original Message- > From: Hongtao Liu > Sent: Friday, September 23, 2022 9:48 AM > To: Hu,

Re: [PATCH] i386: Mark XMM4-XMM6 as clobbered by encodekey128/encodekey256

2022-09-27 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 28, 2022 at 7:35 AM H.J. Lu via Gcc-patches wrote: > > encodekey128 and encodekey256 operations clear XMM4-XMM6. But it is > documented that XMM4-XMM6 are reserved for future usages and software > should not rely upon them being zeroed. Change encodekey128 and Indeed. Ok for trunk an

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-10-11 Thread Hongtao Liu via Gcc-patches
This commit failed tests FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq FAIL: gcc.target/i386/pr92645.c scan-tree-dump-times optimized "vec_unpack_" 4 FAIL: gcc.target/i38

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-10-11 Thread Hongtao Liu via Gcc-patches
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107226 On Wed, Oct 12, 2022 at 9:55 AM Hongtao Liu wrote: > > This commit failed tests > > FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq > FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq > FAIL: gcc.target/i3

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > .. in ix86_expand_vector_move and > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > ix86_expand_vector_move is called by emit_move_insn which is use

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > > > .. in ix86_expand_vector_move and > > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > >

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-03-01 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote: > > On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote: > > > > > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches > > > wrote: > >

Re: [PATCH] x86: Always return pseudo register in ix86_gen_scratch_sse_rtx

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches wrote: > > ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector > register to prevent RTL optimizers from removing vector register. It > introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it > is called by R

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote: > > This is incremental patch based on [1], it enables optimization as below > > - vbroadcastss .LC1(%rip), %xmm0 > + movl $-45, %edx > + vmovd %edx, %xmm0 > + vpshufd $0, %xmm0, %xmm0 > > According to

Re: [PATCH] i386: Fix up cond_{and,ior,xor,mul}* [PR104779]

2022-03-06 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote: > > Hi! > > The following testcase ICEs, because the cond_andv* expander > has vector_operand predicates in both of the commutative inputs > and calls gen_andv*_mask which calls ix86_binary_operator_ok > in its condition, but nothing calls ix86_f

Re: [PATCH V2] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-06 Thread Hongtao Liu via Gcc-patches
Met some problem in git send-email --cc=a,b,c, so manually CC. On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches wrote: > > >What happens if you set preferred_for_speed to false for alternative 1? > It works, and I've removed the newly added splitter in this patch. > Also i tried to do simi

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches wrote: > > For parameter passing through stack, vectorized load from parm_decl > in callee may trigger serious STF issue. This is why GCC12 regresses > 50% for cray at -O2 compared to GCC11. > > The patch add an extremely large number to stmt

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-07 Thread Hongtao Liu via Gcc-patches
ping^1 On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for main trunk? > > > g

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches wrote: > > On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote: > > > > For parameter passing through stack, vectorized load from parm_decl > > in callee may trigger serious STF issue. This is why GCC12 regresses > > 50% for cray at -O2 comp

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-10 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote: > > ping^1 > > On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > >

Re: [PATCH] target/104762 - vectorization costs of CONSTRUCTORs

2022-03-11 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches wrote: > > After accounting for GPR -> XMM move cost for vec_construct the > base cost needs adjustments to not double-cost those. This also > lowers the cost when such move is not necessary. > > This fixes the observed 538.imagick_r

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-13 Thread Hongtao Liu via Gcc-patches
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote: > > Hi! > > These intrinsics are supposed to do an unaligned may_alias load > of a 16-bit or 32-bit value and store it as the first element of > a 128-bit integer vector, with all other elements cleared. > > The current _mm_storeu_* implementati
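
The semantics described in this message (an unaligned, may_alias load of 16 or 32 bits into the first element, with everything else cleared) can be sketched in portable C roughly as follows. This is a model only, not the actual header implementation: vec128_model, model_loadu_si32 and model_loadu_si16 are hypothetical names, and memcpy is used here as the standard C way to read from a pointer of any alignment and any effective type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer vector, viewed as four
   32-bit lanes. This is not the real __m128i type. */
typedef struct { uint32_t lane[4]; } vec128_model;

/* Model of a 32-bit unaligned load into lane 0. memcpy makes the read
   valid for any alignment and any effective type of *p. */
static vec128_model model_loadu_si32(const void *p)
{
    vec128_model v = { { 0, 0, 0, 0 } };   /* all other lanes cleared */
    assert(p != NULL);
    memcpy(&v.lane[0], p, sizeof v.lane[0]);
    return v;
}

/* Same idea for 16 bits: the value lands in the lowest 16 bits and
   every other bit of the vector stays zero. */
static vec128_model model_loadu_si16(const void *p)
{
    vec128_model v = { { 0, 0, 0, 0 } };
    uint16_t tmp;
    assert(p != NULL);
    memcpy(&tmp, p, sizeof tmp);
    v.lane[0] = tmp;                        /* zero-extended */
    return v;
}
```

Calling model_loadu_si32 on an odd address inside a char buffer is well defined in this model, which is the point of the "unaligned may_alias" wording.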

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > LGTM, thanks for handling this. > > Thanks, committed. > > > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2, > > > f

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote: > > On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > > LGTM, thanks for handling this. > > > > Thanks, committed. > >

Re: [PATCH v2] x86: Also check _SOFT_FLOAT in

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote: > > On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote: > > > > Push target("general-regs-only") in if x87 is enabled. > > > > gcc/ > > > > PR target/104890 > > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before > > push

Re: [x86 PATCH] PR target/94680: Clear upper bits of V2DF using movq (like V2DI).

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote: > > > This simple i386 patch unblocks a more significant change. The testcase > gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and > alas the fix for PR target/94680 doesn't (yet) handle V2DF mode. > > For the first test fro

Re: [PATCH] [i386] Add extra cost for unsigned_load which may have stall forward issue.

2022-03-17 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches wrote: > > On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote: > > > > This patch only handle pure-slp for by-value passed parameter which > > has nothing to do with IPA but psABI. For by-reference passed > > parameter IPA is required. >

Re: [PATCH] x86: Correct march=sapphirerapids to base on icelake server

2022-03-18 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote: > > Hi Hongtao, > > This patch is to correct march=sapphirerapids to base on icelake server. > and update sapphirerapids in the documentation. > > OK for master and backport to GCC 11? Ok. > > > gcc/Changelog: > > PR target/104963 >

Re: [PATCH] AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch fixes a typo in subst for the scalar complex mask_round operand. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > Ok. > gcc/ChangeLog: > > PR target/104977 > * c

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsics like _mm_mask_fcmadd_sch, the > mask should be ANDed with 1 to ensure the mask is bound to the lowest byte. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > > gcc

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b) https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss&ig_expand=3807,3081,3082,3084,3083,4837,4838 > > LLVM generates mask & 1 for these intrinsics. > > Hongtao Liu via Gcc-patches 于20

Re: [PATCH v2] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsics like _mm_mask_fcmadd_sch, the > mask should be ANDed with 1 to ensure the mask is bound to the lowest byte. > Use a masked vmovss to perform the same operation, which ignores the higher bits > of the mask. > > Bootst
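
As a rough illustration of the "mask & 1" behaviour discussed in this thread, the following portable C sketch models a scalar masked move in the style of _mm_mask_move_ss (src, k, a, b). It is a model under stated assumptions, not the patch itself: vec4f_model, model_mask_move_ss and model_mask_example are hypothetical names, and the assumed semantics are that lane 0 comes from b when bit 0 of k is set and from src otherwise, while the upper lanes are copied from a.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector of four floats. */
typedef struct { float lane[4]; } vec4f_model;

/* Model of a scalar masked move: only bit 0 of the mask k is
   consulted, so the higher mask bits have no effect on the result. */
static vec4f_model model_mask_move_ss(vec4f_model src, uint8_t k,
                                      vec4f_model a, vec4f_model b)
{
    vec4f_model dst = a;                    /* lanes 1..3 come from a */
    dst.lane[0] = (k & 1u) ? b.lane[0] : src.lane[0];
    return dst;
}

/* Small usage example: 0xFE has every bit set except bit 0, so the
   result still takes lane 0 from src. */
static void model_mask_example(void)
{
    vec4f_model src = { { 1.0f, 0.0f, 0.0f, 0.0f } };
    vec4f_model a   = { { 2.0f, 3.0f, 4.0f, 5.0f } };
    vec4f_model b   = { { 9.0f, 0.0f, 0.0f, 0.0f } };
    vec4f_model r   = model_mask_move_ss(src, 0xFE, a, b);
    assert(r.lane[0] == 1.0f && r.lane[1] == 3.0f);
}
```

A mask such as 0xFE therefore behaves exactly like 0, and 0xFF exactly like 1, which is the property an explicit AND with 1 guarantees.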

Re: [PATCH] [i386] Extend splitter pattern to reversed condition by swapping then and else rtx. [PR target/104982]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote: > > Failed to match this instruction: > (set (reg/v:SI 88 [ z ]) > (if_then_else:SI (eq (zero_extract:SI (reg:SI 92) > (const_int 1 [0x1]) > (zero_extend:SI (subreg:QI (reg:SI 93) 0))) > (const_int 0 [0

Re: [PATCH] Fix ICE caused by NULL_RTX returned by lowpart_subreg.

2022-03-22 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches wrote: > > In validate_subreg, both (subreg:V2HF (reg:SI) 0) > and (subreg:V8HF (reg:V2HF) 0) are valid, but not > for (subreg:V8HF (reg:SI) 0) which causes ICE. > > Ideally it should be handled in validate_subreg to support > subreg for all

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches wrote: > > Since we're now vectorizing by default at -O2 issues like PR101908 > become more important where we apply basic-block vectorization to > parts of the function covering loads from function parameters passed > on the stack. S

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote: > > On Fri, 25 Mar 2022, Hongtao Liu wrote: > > > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches > > wrote: > > > > > > Since we're now vectorizing by default at -O2 issues like P

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches wrote: > > Since KL instructions have no AVX512 version, replace the "v" register > constraint with the "x" register constraint. > > PR target/105058 > * config/i386/sse.md (loadiwkey): Replace "v" with "x". > (aesu8):

Re: [PATCH] x86: Use x constraint on SSSE3 patterns with MMX operands

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches wrote: > > Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND > have no AVX512 version, replace the "Yv" register constraint with the > "x" register constraint. LGTM, please backport to GCC10/GCC11 branch. > > PR t

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches wrote: > > > > Is it possible to create a test case that gas would throw an error for > > > invalid operands? > > > > You can use -ffix-xmmN to disable XMM0-15. > > I mean can we create an intrinsic test for this PR that produces xmm16-3
