Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-08-10 Thread Hongtao Liu via Gcc-patches
Ping^3 On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote: > > ping ^2 > > On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > > > ping > > > > On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > > > > > Those two define_insns h

[PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-10 Thread Hongtao Liu via Gcc-patches
Hi: The issue is described in the bugzilla. Bootstrap is ok, regression test for i386/x86-64 backend is ok. Ok for trunk? ChangeLog gcc/ PR target/96350 * config/i386/i386.c (ix86_legitimate_constant_p): Return false for ENDBR immediate. (ix86_legitimate_addre

Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-11 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 11, 2020 at 4:38 PM Uros Bizjak wrote: > > On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu wrote: > > > > Hi: > > The issue is described in the bugzilla. > > Bootstrap is ok, regression test for i386/x86-64 backend is ok. > > Ok for trunk? >

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-11 Thread Hongtao Liu via Gcc-patches
Hi: The issue is described in the bugzilla. Bootstrap is ok, regression test for i386/x86-64 backend is ok. Ok for trunk? ChangeLog gcc/ PR target/96551 * config/i386/sse.md (vec_unpacku_float_hi_v16si): For vector compare to integer mask, don't use gen_rtx_LT , use

Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-13 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 11, 2020 at 5:56 PM Uros Bizjak wrote: > > On Tue, Aug 11, 2020 at 11:36 AM Hongtao Liu wrote: > > > > On Tue, Aug 11, 2020 at 4:38 PM Uros Bizjak wrote: > > > > > > On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu wrote: > > > > >

[PATCH]Don't use pinsr for struct initialization.

2020-08-13 Thread Hongtao Liu via Gcc-patches
Hi: For struct initialization, when it fits in a TImode, gcc will use pinsr insn which causes poor codegen described in PR93897 and PR96562. Bootstrap is ok, regression test is ok for i386/x86-64 backend. Ok for trunk? ChangeLog gcc/ PR target/96562 PR target/93897 *

[PATCH 1/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-14 Thread Hongtao Liu via Gcc-patches
Hi: First, since avx512 masks involve both vector isa and general part, so i add both maintainers to the maillist. I'm doing this in 4 steps: 1 - Add cost model for operation of mask registers. 2 - Introduce new cover class INT_MASK_REGS, this will enable direct move between gpr and mask r

[PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-14 Thread Hongtao Liu via Gcc-patches
Enable direct move between masks and gprs in pass_reload with consideration of cost model. Changelog gcc/ * config/i386/i386.c (inline_secondary_memory_needed): No memory is needed between mask regs and gpr. (ix86_hard_regno_mode_ok): Add condition TARGET_AVX512F for

[PATCH 3/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-14 Thread Hongtao Liu via Gcc-patches
1. Set cost of movement inside mask registers a bit higher than gpr's. 2. Set cost of movement between mask register and gpr much higher than movement inside gpr, but still less equal than load/store. 3. Set cost of mask register load/store a bit higher than gpr load/store. -- BR, Hongtao From

[PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-14 Thread Hongtao Liu via Gcc-patches
Enable operator or/xor/and/andn/not for mask register, kxnor is not enabled since there's no corresponding instruction for general registers. gcc/ PR target/88808 * config/i386/i386.md: (*movsi_internal): Adjust constraints for mask registers. (*movhi_internal): Dit

[PATCH] Adjust testcase gcc.target/i386/pr92865-1.c

2020-08-16 Thread Hongtao Liu via Gcc-patches
Hi: Since this testcase is used to check generation of AVX512 vector comparison, the scan-assembler for the vmov instruction can be deleted; also, -mprefer-vector-width=512 is added to avoid the impact of different default arch/tune settings of GCC. Sorry for the inaccuracy of the testcase. ChangeLog gcc/testsuit

Re: [PATCH]Don't use pinsr for struct initialization.

2020-08-17 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 14, 2020 at 5:57 PM Uros Bizjak wrote: > > On Fri, Aug 14, 2020 at 8:03 AM Hongtao Liu wrote: > > > > Hi: > > For struct initialization, when it fits in a TImode, gcc will use > > pinsr insn which causes poor codegen described in PR93897 and PR96562.

[PATCH]Adjust testcase.

2020-08-18 Thread Hongtao Liu via Gcc-patches
Hi: Rewriting testcase with cpp source file, then compare operator could be used directly for vector, this would avoid impact of vectorizer. gcc/testsuite/ChangeLog: PR target/96667 * gcc.target/i386/avx512bw-pr96246-1.c: Moved to... * g++.target/i386/avx512bw-pr96246-1.C

Re: [PATCH 1/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-18 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 17, 2020 at 5:20 PM Uros Bizjak wrote: > > On Fri, Aug 14, 2020 at 10:22 AM Hongtao Liu wrote: > > > > Hi: > > First, since avx512 masks involve both vector isa and general part, > > so i add both maintainers to the maillist. > > > > I

Re: [PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-18 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 19, 2020 at 10:17 AM Hongtao Liu wrote: > > On Mon, Aug 17, 2020 at 5:34 PM Uros Bizjak wrote: > > > > On Fri, Aug 14, 2020 at 10:24 AM Hongtao Liu wrote: > > > > > > Enable direct move between masks and gprs in pass_reload wi

Re: [PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-18 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 17, 2020 at 5:34 PM Uros Bizjak wrote: > > On Fri, Aug 14, 2020 at 10:24 AM Hongtao Liu wrote: > > > > Enable direct move between masks and gprs in pass_reload with > > consideration of cost model. > > > > Changelog >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-18 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 17, 2020 at 6:08 PM Uros Bizjak wrote: > > On Fri, Aug 14, 2020 at 10:26 AM Hongtao Liu wrote: > > > > Enable operator or/xor/and/andn/not for mask register, kxnor is not > > enabled since there's no corresponding instruction for general > > reg

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-08-19 Thread Hongtao Liu via Gcc-patches
ping ^ 4, it's a very simple fix for ICE. On Mon, Aug 10, 2020 at 6:00 PM Hongtao Liu wrote: > > Ping^3 > > On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote: > > > > ping ^2 > > > > On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > > >

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-19 Thread Hongtao Liu via Gcc-patches
ping^1 On Tue, Aug 11, 2020 at 5:43 PM Hongtao Liu wrote: > > Hi: > The issue is described in the bugzilla. > Bootstrap is ok, regression test for i386/x86-64 backend is ok. > Ok for trunk? > > ChangeLog > gcc/ > PR target/96551 >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-20 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 19, 2020 at 3:05 PM Uros Bizjak wrote: > > On Wed, Aug 19, 2020 at 4:25 AM Hongtao Liu wrote: > > > > On Mon, Aug 17, 2020 at 6:08 PM Uros Bizjak wrote: > > > > > > On Fri, Aug 14, 2020 at 10:26 AM Hongtao Liu wrote: > > > > >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-20 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 20, 2020 at 3:24 PM Hongtao Liu wrote: > > On Wed, Aug 19, 2020 at 3:05 PM Uros Bizjak wrote: > > > > On Wed, Aug 19, 2020 at 4:25 AM Hongtao Liu wrote: > > > > > > On Mon, Aug 17, 2020 at 6:08 PM Uros Bizjak wrote: > > > > >

Re: [PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-20 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 19, 2020 at 2:31 PM Uros Bizjak wrote: > > On Wed, Aug 19, 2020 at 4:17 AM Hongtao Liu wrote: > > OK, modulo: > > +/* { dg-final { scan-assembler-not "%xmm" } } */ > > It is not clear to me what the testcase is testing here. The scan > string i

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-20 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 20, 2020 at 3:40 PM Uros Bizjak wrote: > > On Thu, Aug 20, 2020 at 9:31 AM Hongtao Liu wrote: > > > > On Thu, Aug 20, 2020 at 3:24 PM Hongtao Liu wrote: > > > > > > On Wed, Aug 19, 2020 at 3:05 PM Uros Bizjak wrote: > > > > > >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-21 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak wrote: > > > > > gcc/ > > > > PR target/88808 > > > > * config/i386/i386.c (ix86_preferred_reload_class): Allow > > > > QImode data go into mask registers. > > > > * config/i386/i386.md: (*movhi_internal): Adjust constrain

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-08-21 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 21, 2020 at 5:44 PM Richard Sandiford wrote: > > Hongtao Liu via Gcc-patches writes: > > ping ^ 4, it's a very simple fix for ICE. > > OK, thanks. (Reviewing on the basis that I agree it's a simple rtx > correctness fix.) > Thanks for the review.

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-21 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak wrote: > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu wrote: > > > > On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak wrote: > > > > > > > > > gcc/ > > > > > > PR target/88808 >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-21 Thread Hongtao Liu via Gcc-patches
On Sat, Aug 22, 2020 at 12:36 AM H.J. Lu wrote: > > On Fri, Aug 21, 2020 at 9:29 AM Hongtao Liu wrote: > > > > On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak wrote: > > > > > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu wrote: > > > > > >

Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.

2020-08-21 Thread Hongtao Liu via Gcc-patches
On Sat, Aug 22, 2020 at 1:08 AM H.J. Lu wrote: > > On Fri, Aug 21, 2020 at 10:02 AM H.J. Lu wrote: > > > > On Fri, Aug 21, 2020 at 9:46 AM Hongtao Liu wrote: > > > > > > On Sat, Aug 22, 2020 at 12:36 AM H.J. Lu wrote: > > > > > >

[PATCH] Fix ICE.

2020-08-24 Thread Hongtao Liu via Gcc-patches
Hi: This patch is to fix a typo in my last patch [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551982.html Bootstrap is ok, gcc regression test hosted on CLX for i386/x86-64 backend is ok. Ok for trunk? gcc/ChangeLog: PR target/96755 * config/i386/sse.md:

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-08-28 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 27, 2020 at 8:24 PM Jakub Jelinek wrote: > > On Thu, Jul 09, 2020 at 04:33:46PM +0800, Hongtao Liu via Gcc-patches wrote: > > +static void > > +replace_constant_pool_with_broadcast (rtx_insn* insn) > > +{ > > + subrtx_ptr_iterator::array_type array;

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-30 Thread Hongtao Liu via Gcc-patches
ping ^2 On Wed, Aug 19, 2020 at 7:37 PM Hongtao Liu wrote: > > ping^1 > > On Tue, Aug 11, 2020 at 5:43 PM Hongtao Liu wrote: > > > > Hi: > > The issue is described in the bugzilla. > > Bootstrap is ok, regression test for i386/x86-64 backend is ok.

[PATCH] Adjust testcase

2020-08-30 Thread Hongtao Liu via Gcc-patches
Hi: This patch is to adjust testcases which failed the regression test when gcc is built with -march=skylake-avx512. Also add runtime check for AVX512 tests. gcc/testsuite/ChangeLog: PR target/96246 PR target/96855 PR target/96856 PR target/96857 * g++.t

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-09-01 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 27, 2020 at 8:24 PM Jakub Jelinek wrote: > > On Thu, Jul 09, 2020 at 04:33:46PM +0800, Hongtao Liu via Gcc-patches wrote: > > +static void > > +replace_constant_pool_with_broadcast (rtx_insn* insn) > > +{ > > + subrtx_ptr_iterator::array_type array;

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-09-01 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 1, 2020 at 6:11 PM Jakub Jelinek wrote: > > On Tue, Sep 01, 2020 at 05:55:18PM +0800, Hongtao Liu wrote: > > I tried define_split, but there's too many of them(considering usage > > of define_subst for mask). > > Also for new added instructions whi

[PATCH][AVX512]Lower AVX512 vector compare to AVX version when dest is vector

2020-09-02 Thread Hongtao Liu via Gcc-patches
Hi: Add define_peephole2 to eliminate potential redundant conversion from mask to vector. Bootstrap is ok, regression test is ok for i386/x86-64 backend. Ok for trunk? gcc/ChangeLog: PR target/96891 * config/i386/sse.md (VI_128_256): New mode iterator. (define_peephol

Re: [PATCH] Adjust testcase

2020-09-02 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 31, 2020 at 2:19 PM Hongtao Liu wrote: > > Hi: > This patch is to adjust testcases which failed the regression test > when gcc is built with -march=skylake-avx512. > Also add runtime check for AVX512 tests. > > gcc/testsuite/ChangeLog: > PR target/9

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-09-02 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 2, 2020 at 5:58 PM Jakub Jelinek wrote: > > On Wed, Sep 02, 2020 at 09:57:08AM +0800, Hongtao Liu via Gcc-patches wrote: > > + > > + first = XVECEXP (constant, 0, 0); > > + /* There could be some rtx like > > + (mem/u/c:V16QI (symbol_ref/u

[PATCH] Optimize comparison between result of us_minus and 0.

2020-09-03 Thread Hongtao Liu via Gcc-patches
Hi: Add define_peephole2 to perform optimization like below: +/* Optimize for TARGET_AVX512F + vpsubusw op1, op2, dst1; + vxorps xmm, xmm, dst2; > vpcmpleuw op1, op2, dst3 + vpcmpeqw dst1, dst2, dst3 */ and +/* Optimize for target above TARGET_SSE4_1 + vpsubusw op1, op2, dst1;
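The rewrite quoted in the snippet above appears to rest on a simple identity: an unsigned saturating subtraction yields 0 exactly when the minuend is less than or equal to the subtrahend. The following is a minimal scalar sketch of that identity only, not code from the patch. The names us_minus_u16, count_us_minus_mismatches and check_us_minus_identity are hypothetical, and the mapping of a and b onto the op1/op2 operands of the instructions is not confirmed by the truncated snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of an unsigned saturating subtract
   (us_minus): the result clamps at 0 instead of wrapping around.
   Hypothetical helper, not part of GCC.  */
static uint16_t us_minus_u16(uint16_t a, uint16_t b)
{
    return (a > b) ? (uint16_t)(a - b) : 0;
}

/* Count sampled inputs for which "us_minus(a, b) == 0" and "a <= b"
   disagree.  Every value of a is paired with boundary values of b.  */
static unsigned count_us_minus_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t a = 0; a <= 0xFFFF; a++) {
        const uint16_t x = (uint16_t)a;
        const uint16_t probes[5] = {
            0, (uint16_t)(a - 1), x, (uint16_t)(a + 1), 0xFFFF
        };
        for (int i = 0; i < 5; i++) {
            int via_sub = (us_minus_u16(x, probes[i]) == 0);
            int via_cmp = (x <= probes[i]);
            mismatches += (unsigned)(via_sub != via_cmp);
        }
    }
    return mismatches;
}

/* Self-check: the two formulations agree on every sampled pair.  */
static void check_us_minus_identity(void)
{
    assert(count_us_minus_mismatches() == 0);
}
```

If the identity holds, the subtract-then-compare-with-zero sequence can be replaced by a single unsigned less-or-equal compare, which is what the quoted comment seems to describe.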

Re: [r11-1851 Regression] FAIL: gcc.dg/vect/slp-46.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2 on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-07 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 31, 2020 at 8:35 PM H.J. Lu via Gcc-patches wrote: > > On Mon, Aug 31, 2020 at 12:25 AM Richard Biener wrote: > > > > On Sat, 29 Aug 2020, sunil.k.pandey wrote: > > > > > On Linux/x86_64, > > > > > > dccbf1e2a6e544f71b4a5795f0c79015db019fc3 is the first bad commit > > > commit dccbf1e

[PATCH] Implement __builtin_thread_pointer for x86 TLS

2020-09-08 Thread Hongtao Liu via Gcc-patches
Hi: We have "*load_tp_" in i386.md for load of thread pointer in i386.md, so this patch merely adds the expander for __builtin_thread_pointer. Bootstrap is ok, regression test is ok for i386/x86-64 backend. Ok for trunk? gcc/ChangeLog: PR target/96955 * config/i386/i386.md (

Re: [PATCH] Implement __builtin_thread_pointer for x86 TLS

2020-09-08 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 8, 2020 at 4:52 PM Jakub Jelinek wrote: > > On Tue, Sep 08, 2020 at 04:14:52PM +0800, Hongtao Liu wrote: > > Hi: > > We have "*load_tp_" in i386.md for load of thread pointer in > > i386.md, so this patch merely adds the expander for > > __bui

Re: [PATCH] Implement __builtin_thread_pointer for x86 TLS

2020-09-09 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 9, 2020 at 2:35 PM Jakub Jelinek wrote: > > On Wed, Sep 09, 2020 at 10:30:46AM +0800, Hongtao Liu wrote: > > From 400418fadce46e7db7bd37be45ef5ff5beb08d19 Mon Sep 17 00:00:00 2001 > > From: liuhongt > > Date: Tue, 8 Sep 2020 15:44:58 +0800 >

[PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-09 Thread Hongtao Liu via Gcc-patches
g embedded broadcast instead. 2020-07-09 Hongtao Liu gcc/ChangeLog: PR target/87767 * config/i386/i386-features.c (replace_constant_pool_with_broadcast): New function. (constant_pool_broadcast): Ditto. (class pass_constant_pool_broadcast): New pass. (make_pass_constant_pool_broadcast): Ditto.

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-10 Thread Hongtao Liu via Gcc-patches
+ maintainer. cc H.J On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote: > > Hi: > For a constant vector having one duplicated value, there's no need > to put the whole vector in the constant pool, using embedded broadcast > instead. > > Bootstrap test is Ok, regre
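The condition described in the snippet above (a constant vector in which one value is duplicated in every element) can be sketched in plain C as follows. This is only an illustration under stated assumptions: the actual pass works on RTL constant-pool references, and is_single_value_vector, example_uniform and example_mixed are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 and stores the repeated value in *scalar when every element
   of vec equals the first one; returns 0 otherwise (or when n == 0).
   Hypothetical helper, not part of GCC.  */
static int is_single_value_vector(const int32_t *vec, size_t n,
                                  int32_t *scalar)
{
    if (n == 0)
        return 0;
    for (size_t i = 1; i < n; i++)
        if (vec[i] != vec[0])
            return 0;
    *scalar = vec[0];
    return 1;
}

/* A 16-element vector of identical values: only the scalar 7 would need
   to be stored, and it could be broadcast when loaded.  */
static int32_t example_uniform(void)
{
    int32_t v[16];
    int32_t scalar = -1;
    for (int i = 0; i < 16; i++)
        v[i] = 7;
    return is_single_value_vector(v, 16, &scalar) ? scalar : -1;
}

/* A vector with one differing element is not a broadcast candidate.  */
static int example_mixed(void)
{
    int32_t v[16] = { 7, 7, 7, 8 };
    int32_t scalar = 0;
    return is_single_value_vector(v, 16, &scalar);
}
```

For a 512-bit vector of 32-bit elements this would reduce the constant-pool entry from 64 bytes to 4 bytes, assuming the consuming instruction supports an embedded broadcast operand.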

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-17 Thread Hongtao Liu via Gcc-patches
ping! On Fri, Jul 10, 2020 at 5:24 PM Hongtao Liu wrote: > > + maintainer. > cc H.J > > On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote: > > > > Hi: > > For a constant vector having one duplicated value, there's no need > > to put the whole

[PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-19 Thread Hongtao Liu via Gcc-patches
Hi: For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a boolean value and try to do some optimization. But it is not true for vector compare, also other places in rtl passes hold the same assumption. Bootstrap is ok, regression test is ok for i386 backend. 2020-07-20 Hongtao Liu
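To illustrate why such a compare cannot be treated as a boolean, here is a minimal scalar model of an 8-lane equality compare that writes one result bit per lane. It is a sketch, not GCC code; cmpeq_mask_v8si, example_mask and check_mask_is_not_boolean are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of an 8-lane, 32-bit equality compare that writes a mask:
   bit i of the result is set when lane i of a equals lane i of b.
   Hypothetical helper, not part of GCC.  */
static uint16_t cmpeq_mask_v8si(const int32_t a[8], const int32_t b[8])
{
    uint16_t mask = 0;
    for (int i = 0; i < 8; i++)
        if (a[i] == b[i])
            mask |= (uint16_t)(1u << i);
    return mask;
}

/* Fixed example: the even lanes match, the odd lanes differ.  */
static uint16_t example_mask(void)
{
    const int32_t a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    const int32_t b[8] = { 1, 0, 3, 0, 5, 0, 7, 0 };
    return cmpeq_mask_v8si(a, b);
}

/* The result is a bit pattern, neither 0 nor 1, so an optimization that
   assumes a boolean result would be wrong for it.  */
static void check_mask_is_not_boolean(void)
{
    uint16_t m = example_mask();
    assert(m == 0x55);
    assert(m != 0 && m != 1);
}
```

Any pass that folds the result as if it were a scalar comparison (for example, assuming it is either 0 or 1) would miscompile a value such as 0x55, which is the concern the snippet raises.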

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-20 Thread Hongtao Liu via Gcc-patches
Correct PR number in ChangeLog it's pr96243. On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: > > Hi: > For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a > boolean value and try to do some optimization. But it is not true for > vector compare, also other

[PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-07-21 Thread Hongtao Liu via Gcc-patches
matched. 2020-07-21 Hongtao Liu gcc/ PR target/96246 * config/i386/sse.md (_load_mask, _load_mask): Extend to generate blendm instructions. (_blendm, _blendm): Change define_insn to define_expand. gcc/testsuite/ * gcc.target/i386/avx512bw-p

[PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-07-22 Thread Hongtao Liu via Gcc-patches
Bootstrap is ok, regression test is ok for i386 backend. gcc/ PR target/96262 * config/i386/i386-expand.c (ix86_expand_vec_shift_qihi_constant): Refine. gcc/testsuite/ * gcc.target/i386/pr96262-1.c: New test. --- gcc/config/i386/i386-expand.c | 6 +

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote: > > Hello, > sorry for taking so long to get to this. > > diff --git a/gcc/config/i386/i386-features.c > > b/gcc/config/i386/i386-features.c > > index 535fc7e981d..8f81d101382 100644 > > --- a/gcc/config/i386/i386-features.c > > +++ b/gcc/config/

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote: > > On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote: > > > > Hello, > > sorry for taking so long to get to this. > > > diff --git a/gcc/config/i386/i386-features.c > > > b/gcc/config/i386/i386-featu

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote: > > Bootstrap is ok, regression test is ok for i386 backend. > > gcc/ > PR target/96262 > * config/i386/i386-expand.c > (ix86_expand_vec_shift_qihi_constant): Refine. > > gcc/testsuite/

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote: > > Correct PR number in ChangeLog > it's pr96243. > > On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: > > > > Hi: > > For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a > > b

Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > Those two define_insns have same pattern, and > _load_mask would always be matched since it show up > earlier in the md file, and it may lose some opportunity in > pass_reload since _load_mask only have constraint

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-08-03 Thread Hongtao Liu via Gcc-patches
Update patch. There are a lot of avx512 define_insns which lack corresponding memory broadcast version, i only add *avx512f_mul3_bcst and *avx512dq_mul3_bcst in this patch. On Fri, Jul 24, 2020 at 10:37 AM Hongtao Liu wrote: > > On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote: > >

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote: > > > > Correct PR number in ChangeLog > > it's pr96243. > > > > On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: >

Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping ^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > > > Those two define_insns have same pattern, and > > _load_mask would always be matched since it show up > > earlier

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping ^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote: > > > > Bootstrap is ok, regression test is ok for i386 backend. > > > > gcc/ > > PR target/962

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-04 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 4, 2020 at 6:28 PM Kirill Yukhin wrote: > > On 04 Aug 13:26, Kirill Yukhin wrote: > > Could you please clarify how your patch is related to [1]? > > I see from the bug that it describes perf issue w.r.t. scalar > > operations. > Sorry for the typo, it's pr96243. https://gcc.gnu.org/bugzilla/

[PATH] Enable GCC support for SERIALIZE

2020-04-01 Thread Hongtao Liu via Gcc-patches
2001 From: liuhongt Date: Wed, 4 Mar 2020 14:08:40 +0800 Subject: [PATCH] Enable GCC support for SERIALIZE 2020-03-04 Hongtao Liu 2020-03-04 Wei Xiao gcc/Changelog: * gcc/common/config/i386/i386-common.c (OPTION_MASK_ISA2_SERIALIZE_SET, OPTION_MASK_ISA2_SERIALIZE_UNSET): New macros. (ix86_ha

[PATCH] Enable GCC support for TSXLDTRK

2020-04-01 Thread Hongtao Liu via Gcc-patches
Hi: This patch is about to enable GCC support for TSXLDTRK which would be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more details please refer to https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf I kno

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-04-01 Thread Hongtao Liu via Gcc-patches
On Wed, Apr 1, 2020 at 3:32 PM Hongtao Liu wrote: > > Hi: > This patch is about to enable GCC support for TSXLDTRK which would > be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more > details please > refer to > https://software.intel.com/sites/d

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-05-05 Thread Hongtao Liu via Gcc-patches
On Mon, May 4, 2020 at 12:58 AM Uros Bizjak wrote: > > The part above is OK, but you are missing support for > __attribute__((__target__("..."))). Please see how for example -msgx > is handled in isa2_opts in i386-options.c and in > gcc.target/i386/funcspec-56.h test source. > > Please repost the

Re: [PATH] Enable GCC support for SERIALIZE

2020-05-05 Thread Hongtao Liu via Gcc-patches
On Mon, May 4, 2020 at 1:17 AM Uros Bizjak wrote: > > On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote: > > > > Hi: > > This patch is about to enable GCC support for SERIALIZE which would > > be in GLC. There's only 1 instruction: SERIALIZE, more det

[PATCH] Add enqcmd,avx512bf16,avx512vp2intersect to funcspec-56.inc

2020-05-06 Thread Hongtao Liu via Gcc-patches
Hi: Test is ok for funcspec-5.c, funcspec-6.c. gcc/testuite/ChangeLog * gcc.target/i386/funcspec-56.inc: Add enqcmd, avx512bf16, avx512vp2intersect. gcc/testsuite/gcc.target/i386/funcspec-56.inc | 6 ++ 1 file changed, 6 insertions(+) diff --git a/gcc/testsuite/gcc.target/

[PATCH] [PR94118]] Update documentation for x86 operand modifier.

2020-05-11 Thread Hongtao Liu via Gcc-patches
Documents operand modifiers which are available in asm stmt but missing in document.
| Modifier | Description | Available in asm stmt | Existed in documentation |
| --- | --- | --- | --- |
| L,W,B,Q,S,T | print the opcode suffix for specified size of operand. | Available | Not |
| C | pr

[PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Hongtao Liu via Gcc-patches
Hi: Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: PR target/92658 * config/i386/sse.md (trunc2, truncv32hiv32qi2, trunc2): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/pr92658-avx512f.c: New test. * gcc.

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Hongtao Liu via Gcc-patches
On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > > > Hi: > > Bootstrap is ok, regression test on i386/x86-64 backend is ok. > > > > gcc/ChangeLog: > > PR target/92658 > >

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-21 Thread Hongtao Liu via Gcc-patches
On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote: > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > > > >

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Hongtao Liu via Gcc-patches
On Fri, May 22, 2020 at 2:41 PM Uros Bizjak wrote: > > On Fri, May 22, 2020 at 6:55 AM Hongtao Liu wrote: > > > > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote: > > > > > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > > > >

[PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-23 Thread Hongtao Liu via Gcc-patches
Hi: This patch fixes the non-conforming expanders for floatv2div2sf2, floatunsv2div2sf2, fix_truncv2sfv2di and fixuns_truncv2sfv2di; refer to PR95211 and PR95256. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: PR target/95211 PR target/95256 * config/i386/sse.md v2d

[PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]

2020-05-24 Thread Hongtao Liu via Gcc-patches
Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog PR target/95125 * config/i386/sse.md (sf2dfmode_lower): New mode attribute. (trunc2) New expander. (extend2): Ditto. gcc/testsuite/ChangeLog * gcc.target/i386/pr95125-avx.c: New

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches
On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > Hi: > > This patch fix non-conforming expander for > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di, > > refer to

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote: > > On Mon, 25 May 2020, Uros Bizjak wrote: > > > On Mon, May 25, 2020 at 8:27 AM Richard Biener wrote: > > > > > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak > > > wrote: > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 8:00 PM Uros Bizjak wrote: > > On Mon, May 25, 2020 at 1:56 PM Hongtao Liu wrote: > > > > On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote: > > > > > > On Mon, 25 May 2020, Uros Bizjak wrote: > > > > >

[PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-25 Thread Hongtao Liu via Gcc-patches
rrent implementation. Also for other vpmov instructions which have memory_operand narrower than 128bits. 2020-05-25 Hongtao Liu gcc/ChangeLog * config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine size of memory_operand according to Intel SDM. (avx512vl_v2div2qi2_mask_store):

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Tue, May 26, 2020 at 6:49 AM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > I've turned the strict mode of Martin Liška's hook changes, > which means that from now on no commits to the trunk or release branches > should be changing any ChangeLog files together with the other files, > ChangeLo

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc-patches
Great, thanks! On Tue, May 26, 2020 at 2:08 PM Martin Liška wrote: > > On 5/26/20 7:22 AM, Hongtao Liu via Gcc wrote: > > i commit a separate patch alone only for ChangeLog files, should i revert > > it? > > Hello. > > I've just done it. > > Martin -- BR, Hongtao

Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-26 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote: > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote: > > > > According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit > > memory_operand instead of 128-bit one which exists in current > > implem

Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Hongtao Liu via Gcc-patches
On Wed, May 27, 2020 at 8:01 PM Uros Bizjak wrote: > > On Wed, May 27, 2020 at 8:02 AM Hongtao Liu wrote: > > > > On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote: > > > > > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote: > > > > > >

Re: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-28 Thread Hongtao Liu via Gcc-patches
On Thu, May 28, 2020 at 11:37 PM H.J. Lu wrote: > > On Thu, May 28, 2020 at 8:00 AM Richard Sandiford > wrote: > > > > "Yangfei (Felix)" writes: > > > Thanks for reviewing this. > > > Attached please find the v5 patch. > > > Note: we also need to modify local variable "mode" once we catch one >

[PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-02 Thread Hongtao Liu via Gcc-patches
Hi: When dest is memory, zero-masking is not valid; only merging-masking is available. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: * gcc/config/i386/sse.md (*vcvtps2ph_store): Refine from *vcvtps2ph_store. (vcvtps2ph256): Refine constr

Re: [PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-03 Thread Hongtao Liu via Gcc-patches
Hi Richard: Could you help review this patch? Uros said he wouldn't review patches related to x86 vector ISA anymore. On Wed, Jun 3, 2020 at 10:26 AM Hongtao Liu wrote: > > Hi: > When dest is memory, zero-masking is not valid, only merging-masking > is available, >

[PATCH] Fix typo in expander trunc2 [AVX512]

2020-06-03 Thread Hongtao Liu via Gcc-patches
This patch is to fix the uppercase mode in trunc2; it should be lowercase for a standard pattern name. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: * config/i386/sse.md (pmov_dst_3_lower): New mode attribute. (trunc2): Refine from trunc2. g

Re: [PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-03 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 4, 2020 at 2:43 PM Richard Biener wrote: > > On Thu, 4 Jun 2020, Hongtao Liu wrote: > > > Hi Richard: > > Could you help review this patch. > > uros said he wouldn't review patches related to x86 vector ISA anymore. > > I can't spot any

[PATCH] Optimize multiplication for V8QI,V16QI,V32QI under TARGET_AVX512BW [target/95488]

2020-06-04 Thread Hongtao Liu via Gcc-patches
Hi: +/* Optimize vector MUL generation for V8QI, V16QI and V32QI + under TARGET_AVX512BW. i.e. for v16qi a * b, it has + + vpmovzxbw ymm2, xmm0 + vpmovzxbw ymm3, xmm1 + vpmullw ymm4, ymm2, ymm3 + vpmovwb xmm0, ymm4 + + it would take less instructions than ix86_expand_vecop_qihi. +

Re: [PATCH] Fix up testcase.

2020-12-10 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 10, 2020 at 8:52 PM Prathamesh Kulkarni wrote: > > On Wed, 9 Dec 2020 at 15:52, Hongtao Liu wrote: > > > > On Wed, Dec 9, 2020 at 5:22 PM Prathamesh Kulkarni via Gcc-patches > > wrote: > > > > > > On Wed, 9 Dec 2020 at 00:29, sunil.k.pand

[PATCH] [X86] Fold more shuffle builtins to VEC_PERM_EXPR.

2020-12-15 Thread Hongtao Liu via Gcc-patches
Hi: As indicated in PR98167, this patch is a follow-up to [1]. Bootstrapped and regtested on x86_64-linux-gnu. Ok for trunk? gcc/ PR target/98167 * config/i386/i386.c (ix86_gimple_fold_builtin): Handle IX86_BUILTIN_SHUFPD512, IX86_BUILTIN_SHUFPS512, IX86_BUILT

Re: [PATCH] [X86] Fold more shuffle builtins to VEC_PERM_EXPR.

2020-12-16 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 15, 2020 at 7:11 PM Jakub Jelinek wrote: > > On Tue, Dec 15, 2020 at 06:10:57PM +0800, Hongtao Liu via Gcc-patches wrote: > > --- a/gcc/config/i386/i386.c > > +++ b/gcc/config/i386/i386.c > > @@ -18187,21 +18187,67 @@ ix86_gimple_fold_builtin (gimple

[PATCH][X86] Fix Typo

2020-12-21 Thread Hongtao Liu via Gcc-patches
While working on PR98348, I noticed a typo in define_insn "*one_cmpl2_1": there are 2 alternatives, so the index can't be 2. Bootstrap and regression tests are ok on x86_64-unknown-linux. gcc/ChangeLog * config/i386/i386.md (*one_cmpl2_1): Fix typo, change alternative fr

[PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-03 Thread Hongtao Liu via Gcc-patches
Hi: The following patch adds define_insn_and_split to optimize vpmovmskb %xmm0, %eax - movzwl %ax, %eax notl %eax Bootstrapped/regtested on x86_64-linux-gnu {,-m32}. Ok for trunk? gcc/ChangeLog PR target/98461 * config/i386/sse.md (*sse2_pmovs

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-03 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 4, 2021 at 3:40 PM Uros Bizjak wrote: > > On Mon, Jan 4, 2021 at 6:54 AM Hongtao Liu wrote: > > > > Hi: > > The following patch adds define_insn_and_split to optimize > > > >vpmovmskb %xmm0, %eax > > -

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-04 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 4, 2021 at 4:49 PM Jakub Jelinek wrote: > > On Mon, Jan 04, 2021 at 01:56:44PM +0800, Hongtao Liu via Gcc-patches wrote: > > +(define_insn_and_split "*sse2_pmovskb_zexthisi" > > + [(set (match_operand:SI 0 "register_operand") > > +

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-04 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 4, 2021 at 4:59 PM Hongtao Liu wrote: > > On Mon, Jan 4, 2021 at 4:49 PM Jakub Jelinek wrote: > > > > On Mon, Jan 04, 2021 at 01:56:44PM +0800, Hongtao Liu via Gcc-patches wrote: > > > +(define_insn_and_split "*sse2_pmovskb_zexthisi"

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-05 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 5, 2021 at 3:20 PM Uros Bizjak wrote: > > On Tue, Jan 5, 2021 at 8:04 AM Uros Bizjak wrote: > > > > > > +(define_split > > > + [(set (match_operand:SI 0 "register_operand") > > > +(zero_extend:SI > > > + (not:HI > > > +(subreg:HI > > > + (uns

Re: [PATCH][AVX512]Lower AVX512 vector compare to AVX version when dest is vector

2021-01-05 Thread Hongtao Liu via Gcc-patches
> >> > >> Note there's a data dependency between them. insn 7 feeds insn 9. When > >> there's a data dependency, combiner patterns are usually the better > >> choice than peepholes. I think you'd be looking to match something > >> like this (from the . combine dump): > >> Using combiner patterns

[PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

2021-01-05 Thread Hongtao Liu via Gcc-patches
Hi: ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn for vector comparison to vector mask, but ix86_expand_sse_cmp (which is called by the above 2 functions) may return an integer mask whenever an integer mask is available, so convert the integer mask back to a vector mask if needed. gcc/Cha

Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

2021-01-06 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek wrote: > > On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote: > > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn > > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is >
