Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-29 Thread Hongtao Liu via Gcc-patches
On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches wrote: > > > On 2022-05-24 23:39, liuhongt wrote: > > Right now, mem_cost for separate mem alternative is 1 * frequency which > > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > > to rework backend cost mo

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches wrote: > > > > In the PR, the spill happens in the initial basic block of the function, > > > i.e. > > > the one with the highest frequency. > > > > > > Also as noted in the PR, swapping the 'unlikely' branch to 'likely' > > > avo

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote: > > On Mon, 30 May 2022, Hongtao Liu wrote: > > > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches > > wrote: > > > > > > > > The spill is mainly decided by 3 insns related

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-31 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 12:40 AM Richard Sandiford wrote: > > Vladimir Makarov via Gcc-patches writes: > > On 2022-05-29 23:05, Hongtao Liu wrote: > >> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches > >> wrote: > >>> > >>>

Re: [x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle wrote: > > > This patch resolves PR target/105791 which is a regression that was > accidentally introduced for my workaround to PR tree-optimization/10566. > (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it > shouldn't). The latest is

Re: [PATCH] x86: harmonize __builtin_ia32_psadbw*() types

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 6, 2022 at 3:17 AM Uros Bizjak via Gcc-patches wrote: > > On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote: > > > > The 64-bit, 128-bit, and 512-bit variants have VDI return type, in > > line with instruction behavior. Make the 256-bit builtin match, thus > > also making it match the

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 11:56 PM H.J. Lu via Gcc-patches wrote: > > On Tue, May 31, 2022 at 10:06 PM Cui,Lili wrote: > > > > This patch is to update {skylake,icelake,alderlake}_cost to add a bit > > preference to vector store. > > Since the integer vector construction cost has changed, we need t

Re: [PATCH] Disparages SSE_REGS alternatives sligntly with ?v instead of *v in *mov{si, di}_internal.

2022-06-07 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 7, 2022 at 3:41 PM liuhongt via Gcc-patches wrote: > > So alternative v won't be ignored in record_reg_classess. > > Similar for *r alternatives in some vector patterns. > > It helps testcase in the PR, also RA now makes better decisions for > gcc.target/i386/extract-insert-combining.c

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-09 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 8, 2022 at 11:44 AM Cui, Lili wrote: > > > -Original Message- > > From: Hongtao Liu > > Sent: Monday, June 6, 2022 1:25 PM > > To: H.J. Lu > > Cc: Cui, Lili ; Liu, Hongtao ; > > GCC > > Patches > > Subject: Re: [PATCH] U

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha wrote: > > Pass and return __Bfloat16 values in XMM registers. > > Background: > __Bfloat16 (BF16) is a new floating-point format that can accelerate machine > learning (deep learning training, in particular) algorithms. > It's first introdu

Re: [PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches wrote: > > This patch is to change dg-options for two testcases. > > Use -mtune=generic to limit these two testcases. Because configuring them with > -mtune=cascadelake or znver3 will vectorize them. > > regtested on x86_64-linux-gnu{-m32,}.

Re: [PATCH] Add optional __Bfloat16 support

2022-06-12 Thread Hongtao Liu via Gcc-patches
On Sat, Jun 11, 2022 at 1:46 AM H.J. Lu wrote: > > On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote: > > > > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote: > > > > > > * liuhongt via Libc-alpha: > > > > > > > +\subsubsection{Special Types} > > > > + > > > > +The \code{__Bfloat16} type uses a

Re: PING^1 [PATCH] x86: Skip ENDBR when emitting direct call/jmp to local function

2022-06-26 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 21, 2022 at 3:50 AM Uros Bizjak via Gcc-patches wrote: > > On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote: > > > > On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote: > > > > > > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at > > > function entry. Skip the 4-byte

Re: [PATCH gcc 0/1] [PATCH] target: Fix asm generation for AVX builtins when using -masm=intel [PR106095]

2022-06-27 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 28, 2022 at 9:26 AM ~antoyo via Gcc-patches wrote: > > Hi. > > This fixes the following bug: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106095 The patch LGTM, thanks for handling this. > > It's the first time I work outside of the jit component, so please tell > me if I forgot anyt

Re: [PATCH gcc 0/1] [PATCH] target: Fix asm generation for AVX builtins when using -masm=intel [PR106095]

2022-06-28 Thread Hongtao Liu via Gcc-patches
this. Yes for the case in your patch, I think it's a typo. But there could be some difference for operand modifiers between AT&T and Intel syntaxes in some patterns. .i.e the use of mode attr . > > On Tue, 2022-06-28 at 14:22 +0800, Hongtao Liu wrote: > > On Tue, J

Re: [PATCH] Change march=alderlake ISA list and add m_ALDERLAKE to m_CORE_AVX2

2021-04-12 Thread Hongtao Liu via Gcc-patches
On Mon, Apr 12, 2021 at 3:20 PM Uros Bizjak via Gcc-patches wrote: > > On Mon, Apr 12, 2021 at 5:13 AM Cui, Lili wrote: > > > > Hi Uros, > > > > This patch is about to change Alder Lake ISA list to GCC add m_ALDERLAKE to > > m_CORE_AVX2. > > Alder Lake Intel Hybrid Technology is based on Tremont

[PATCH] [GCC-9] backport -march=tigerlake to GCC9 [PR target/100009]

2021-04-13 Thread Hongtao Liu via Gcc-patches
From-SVN: r274693 -- BR, Hongtao From 89a130f2e626d9e7d92ec8d51956a4ae0d10d277 Mon Sep 17 00:00:00 2001 From: Hongtao Liu Date: Tue, 20 Aug 2019 07:06:03 + Subject: [PATCH] backport TIGERLAKE part to GCC9. 2019-08-20 Lili Cui gcc/ * common/config/i386/i386-common.c (processor_n

Re: [PATCH] [GCC-9] backport -march=tigerlake to GCC9 [PR target/100009]

2021-04-13 Thread Hongtao Liu via Gcc-patches
On Tue, Apr 13, 2021 at 6:38 PM Uros Bizjak wrote: > > On Tue, Apr 13, 2021 at 12:18 PM Hongtao Liu wrote: > > > > Hi: > > As described in PR, we introduced tigerlake string in driver-i386.c > > by r9-8652 w/o support -march/tune=tigerlake which causes an er

Re: [PATCH wwwdoc] Mention Rocketlake [GCC11]

2021-04-13 Thread Hongtao Liu via Gcc-patches
On Mon, Apr 12, 2021 at 6:20 PM Cui, Lili via Gcc-patches wrote: > > > Updated wwwdocs for Rocketlake [GCC11], thanks. > > [PATCH] Mention Rocketlake > --- > htdocs/gcc-11/changes.html | 4 > 1 file changed, 4 insertions(+) > > diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes

[PATCH] [i386] Fix a small performance regression. [PR98348]

2021-04-18 Thread Hongtao Liu via Gcc-patches
Hi: This patch is about to add a pre-reload splitter to transform vpcmpeqd with a zero operand to vptestnmd, which could save a vpxor instruction. .i.e - vpxor %xmm1, %xmm1, %xmm1 - vpcmpd $0, %zmm1, %zmm0, %k0 + vptestnmd %zmm0, %zmm0, %k0 vpmovm2d zmm0, k0 B
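The equivalence this splitter relies on can be checked lane by lane in portable C. The sketch below is only an illustration, not code from the patch: `cmpeq_zero_mask` and `testnm_mask` are hypothetical scalar models of "compare each 32-bit lane against zero" and "set the mask bit when the AND of two lanes is zero", assuming a 512-bit vector of sixteen lanes.

```c
#include <stdint.h>

#define LANES 16 /* a 512-bit vector holds sixteen 32-bit lanes */

/* Hypothetical scalar model: mask bit i is set when lane i equals zero,
   which is what a compare-equal against an all-zero vector computes. */
static uint16_t cmpeq_zero_mask(const uint32_t v[LANES])
{
    uint16_t k = 0;
    for (int i = 0; i < LANES; i++)
        if (v[i] == 0)
            k |= (uint16_t)(1u << i);
    return k;
}

/* Hypothetical scalar model of a test-not-mask operation:
   mask bit i is set when (a[i] & b[i]) is zero. */
static uint16_t testnm_mask(const uint32_t a[LANES], const uint32_t b[LANES])
{
    uint16_t k = 0;
    for (int i = 0; i < LANES; i++)
        if ((a[i] & b[i]) == 0)
            k |= (uint16_t)(1u << i);
    return k;
}
```

Since `x & x` is `x`, calling `testnm_mask(v, v)` yields the same mask as `cmpeq_zero_mask(v)`, which is why no zeroed register is needed in the second form.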

[PATCH] [i386] MASK_AVX256_SPLIT_UNALIGNED_STORE/LOAD should be cleared in opts->x_target_flags when X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL is enabled by target attribute.

2021-04-22 Thread Hongtao Liu via Gcc-patches
Hi: Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/100093 * config/i386/i386-options.c (ix86_option_override_internal): Clear MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE in x_target_flags when X86_TUNE_AVX256_UNALIGNED_

[PATCH] [i386] Optimize __builtin_shuffle when it's used to zero the upper bits of the dest. [PR target/94680]

2021-04-22 Thread Hongtao Liu via Gcc-patches
Hi: If the second operand of __builtin_shuffle is const vector 0, and with specific mask, it can be optimized to movq/vmovps. .i.e. foo128: - vxorps %xmm1, %xmm1, %xmm1 - vmovlhps%xmm1, %xmm0, %xmm0 + vmovq %xmm0, %xmm0 foo256: - vxorps %xmm1, %xmm1, %xmm1 -

[PATCH] Add folding and remove expanders for x86 *pcmp{et,gt}* builtins [PR target/98911]

2021-04-22 Thread Hongtao Liu via Gcc-patches
Hi: The patch is a follow-up to https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564320.html. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/98911 * config/i386/i386-builtin.def (BDESC): Change the icode of the foll

Re: [PATCH] Add folding and remove expanders for x86 *pcmp{et,gt}* builtins [PR target/98911]

2021-04-23 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 23, 2021 at 2:50 PM Uros Bizjak wrote: > > On Fri, Apr 23, 2021 at 8:36 AM Hongtao Liu wrote: > > > > Hi: > > The patch is a follow-up to > > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564320.html. > > Bootstrapped and regtested

Re: [PATCH] Add folding and remove expanders for x86 *pcmp{et,gt}* builtins [PR target/98911]

2021-04-23 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 23, 2021 at 3:18 PM Uros Bizjak wrote: > > On Fri, Apr 23, 2021 at 9:15 AM Hongtao Liu wrote: > > > > On Fri, Apr 23, 2021 at 2:50 PM Uros Bizjak wrote: > > > > > > On Fri, Apr 23, 2021 at 8:36 AM Hongtao Liu wrote: > > > > &

Re: [PATCH] [Refactor] [AVX512] Combine VI12_AVX512VL with VI48_AVX512VL into VI_AVX512VLBW

2021-04-24 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 2, 2020 at 9:57 AM Hongtao Liu wrote: > > On Wed, Dec 2, 2020 at 8:28 AM Jeff Law wrote: > > > > > > > > On 11/30/20 10:17 PM, Hongtao Liu via Gcc-patches wrote: > > > Hi: > > > There're many pairs of define_insn/define_expand

Re: [PATCH] [i386] Optimize __builtin_shuffle when it's used to zero the upper bits of the dest. [PR target/94680]

2021-04-24 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 23, 2021 at 5:13 PM Jakub Jelinek wrote: > > On Fri, Apr 23, 2021 at 12:53:58PM +0800, Hongtao Liu via Gcc-patches wrote: > > + if (!CONST_INT_P (er)) > > + return 0; > > + ei = INTVAL (er); > > + if (i < nelt2 && ei !

Re: [PATCH] Disparage slightly the mask register alternative for bitwise operations. [PR target/101142]

2021-06-21 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 21, 2021 at 3:28 PM Uros Bizjak via Gcc-patches wrote: > > On Mon, Jun 21, 2021 at 6:56 AM liuhongt wrote: > > > > The avx512 supports bitwise operations with mask registers, but the > > throughput of those instructions is much lower than that of the > > corresponding gpr version, so

Re: [PATCH][AVX512] Fix ICE for vpexpand*.

2021-06-21 Thread Hongtao Liu via Gcc-patches
This is the patch I'm going to push to the trunk. On Wed, May 12, 2021 at 3:29 PM Hongtao Liu wrote: > > ping > > On Fri, Apr 30, 2021 at 12:42 PM Hongtao Liu wrote: > > > > Hi: > > This patch is to fix ice which was introduced by my > > r11-5696-

Re: [PATCH][AVX512] Optimize vpexpand* to mask mov when mask have all ones in it's lower part (including 0 and -1).

2021-06-21 Thread Hongtao Liu via Gcc-patches
This is the patch I'm going to push to the trunk. On Wed, May 12, 2021 at 3:28 PM Hongtao Liu wrote: > > ping > > On Fri, Apr 30, 2021 at 12:49 PM Hongtao Liu wrote: > > > > Hi: > > For v{,p}expand* when mask is 0, -1, or has all one bits in its >

Re: [PATCH] Add vect_recog_popcount_pattern to handle mismatch between the vectorized popcount IFN and scalar popcount builtin.

2021-06-21 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 21, 2021 at 6:05 PM Richard Biener wrote: > > On Thu, Jun 17, 2021 at 8:29 AM liuhongt wrote: > > > > The patch remove those pro- and demotions when backend support direct > > optab. > > > > For i386: it enables vectorization for vpopcntb/vpopcntw and optimized > > for vpopcntq. > > >

Re: [PATCH] Add vect_recog_popcount_pattern to handle mismatch between the vectorized popcount IFN and scalar popcount builtin.

2021-06-21 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 22, 2021 at 10:43 AM Hongtao Liu wrote: > > On Mon, Jun 21, 2021 at 6:05 PM Richard Biener > wrote: > > > > On Thu, Jun 17, 2021 at 8:29 AM liuhongt wrote: > > > > > > The patch remove those pro- and demotions when backend support direct > &

Re: [PATCH] [i386] Support avx512 vector shift with vector [PR98434]

2021-06-23 Thread Hongtao Liu via Gcc-patches
B(just like what we did in ix86_expand_vec_shift_qihi_constant). So I guess maybe gimple should handle such situations to avoid "nonoptimal codegen". On Mon, May 24, 2021 at 5:49 PM Hongtao Liu wrote: > > Hi: > This patch is about to add expanders for vashl, > vlshr, > vashr and vashr. > >

Re: [PATCH] [i386] Support avx512 vector shift with vector [PR98434]

2021-06-23 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 23, 2021 at 3:23 PM Hongtao Liu wrote: > > Here's the patch I'm going to check in. > > The patch will regress pr91838.C with extra options: -march=cascadelake > > using T = unsigned char; // or ushort, or uint > using V [[gnu::vector_size(8)]] = T; >

Re: [PATCH] Disparage slightly the mask register alternative for bitwise operations. [PR target/101142]

2021-06-23 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 23, 2021 at 3:59 PM Uros Bizjak wrote: > > On Mon, Jun 21, 2021 at 10:08 AM Hongtao Liu wrote: > > > > On Mon, Jun 21, 2021 at 3:28 PM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Mon, Jun 21, 2021 at 6:56 AM liuhongt wrote: &

Re: [PATCH] Disparage slightly the mask register alternative for bitwise operations. [PR target/101142]

2021-06-23 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 23, 2021 at 4:50 PM Hongtao Liu wrote: > > On Wed, Jun 23, 2021 at 3:59 PM Uros Bizjak wrote: > > > > On Mon, Jun 21, 2021 at 10:08 AM Hongtao Liu wrote: > > > > > > On Mon, Jun 21, 2021 at 3:28 PM Uros Bizjak via Gcc-patches > > > wro

Re: [PATCH] Disparage slightly the mask register alternative for bitwise operations. [PR target/101142]

2021-06-23 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 23, 2021 at 5:55 PM Uros Bizjak wrote: > > On Wed, Jun 23, 2021 at 11:41 AM Uros Bizjak wrote: > > > > On Wed, Jun 23, 2021 at 11:32 AM Hongtao Liu wrote: > > > > > > > > > > Also when allocano cost of GENERAL_REGS

Re: [PATCH] [i386] Support avx512 vector shift with vector [PR98434]

2021-06-23 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 23, 2021 at 5:08 PM Richard Biener wrote: > > On Wed, Jun 23, 2021 at 10:01 AM Jakub Jelinek wrote: > > > > On Wed, Jun 23, 2021 at 09:53:27AM +0200, Richard Biener via Gcc-patches > > wrote: > > > On Wed, Jun 23, 2021 at 9:19 AM Hongtao

Re: [PATCH] x86: Compile CPUID functions with -mgeneral-regs-only

2021-06-24 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches wrote: > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote: > > > > CPUID functions are used to detect CPU features. If vector ISAs > > are enabled, compiler is free to use them in these functions. Add > > __attribute__ ((target("genera

Re: PING^1 [PATCH v4 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-24 Thread Hongtao Liu via Gcc-patches
I didn't receive https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572436.html in my gmail account, does anyone know why? >--- a/gcc/config/i386/i386-protos.h >+++ b/gcc/config/i386/i386-protos.h >@@ -260,6 +260,7 @@ extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, >bool, bool); > extern

Re: PING^1 [PATCH v4 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-24 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 25, 2021 at 2:01 PM Hongtao Liu wrote: > > I didn't receive > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572436.html in my > gmail account, does anyone know why? > > > >--- a/gcc/config/i386/i386-protos.h > >+++ b/gcc/config/i386/i386-pro

Re: [PATCH 1/2] [i386] Fold blendv builtins into gimple.

2021-06-24 Thread Hongtao Liu via Gcc-patches
Hi: After a second thought, I gave up on refactoring blendv's pattern, we already have vec_merge with const_int mask, integer mask, and introducing vector mask doesn't look very good. Here is the final patch I'm going to check in. Fold __builtin_ia32_pblendvb128 (a, b, c) as VEC_COND_EXPR (c < 0
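The snippet above is cut off, so the following is a hedged sketch rather than the patch's own formulation. It assumes the usual byte-blend behavior, where each result byte comes from `b` when the sign bit of the corresponding mask byte in `c` is set and from `a` otherwise, and shows why that is the same as selecting on `c < 0`. Both function names are hypothetical.

```c
#include <stdint.h>

#define BYTES 16 /* a 128-bit vector holds sixteen bytes */

/* Hypothetical scalar model of a byte blend driven by the mask's sign bit. */
static void blendv_bytes(int8_t out[BYTES], const int8_t a[BYTES],
                         const int8_t b[BYTES], const int8_t c[BYTES])
{
    for (int i = 0; i < BYTES; i++)
        out[i] = (c[i] & 0x80) ? b[i] : a[i];
}

/* The same selection written as a comparison against zero, which is the
   shape a vector conditional expression takes. */
static void select_lt_zero(int8_t out[BYTES], const int8_t a[BYTES],
                           const int8_t b[BYTES], const int8_t c[BYTES])
{
    for (int i = 0; i < BYTES; i++)
        out[i] = (c[i] < 0) ? b[i] : a[i];
}
```

For a signed byte, "sign bit set" and "less than zero" are the same condition, so the two functions always agree.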

Re: [PATCH 2/2] [i386] For 128/256-bit vec_cond_expr, When mask operands is lt reg const0_rtx, blendv can be used instead of avx512 mask. [PR target/100648]

2021-06-24 Thread Hongtao Liu via Gcc-patches
On Mon, May 24, 2021 at 12:59 PM Hongtao Liu wrote: > > Hi: > This patch is about to add define_insn_and_split to convert avx512 > mask mov back to pblendv instructions when mask operand is (lt: reg > const0_rtx). > Hi: Here's the patch I'm going to check in.

Re: [PATCH v5 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-27 Thread Hongtao Liu via Gcc-patches
On Sun, Jun 27, 2021 at 4:02 AM H.J. Lu wrote: > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > operands to vector broadcast from an integer with AVX2. > 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which > won't increase stack alignment requirement

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-28 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin wrote: > > Hi! > > on 2021/6/9 下午1:18, Kewen.Lin via Gcc-patches wrote: > > Hi, > > > > PR100328 has some details about this issue, I am trying to > > brief it here. In the hottest function LBM_performStreamCollideTRT > > of SPEC2017 bmk 519.lbm_r, there

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-28 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu wrote: > > On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin wrote: > > > > Hi! > > > > on 2021/6/9 下午1:18, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR100328 has some details about this i

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 28, 2021 at 3:27 PM Kewen.Lin wrote: > > on 2021/6/28 下午3:20, Hongtao Liu wrote: > > On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu wrote: > >> > >> On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin wrote: > >>> > >>> Hi! > >

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 30, 2021 at 5:42 PM Kewen.Lin wrote: > > on 2021/6/30 下午4:53, Hongtao Liu wrote: > > On Mon, Jun 28, 2021 at 3:27 PM Kewen.Lin wrote: > >> > >> on 2021/6/28 下午3:20, Hongtao Liu wrote: > >>> On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu wrote:

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 1:48 PM liuhongt wrote: > > Hi: > AVX512FP16 is disclosed, refer to [1]. > There're 100+ instructions for AVX512FP16, 67 gcc patches, for the > convenience of review, we divide the 67 patches into 2 major parts. > The first part is 2 patches containing basic support f

Re: [PATCH 2/2] AVX512FP16: Add HFmode support in libgcc.

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 1:48 PM liuhongt wrote: > > 1. Add extendhftf2, extendhfxf2, truncxfhf2, trunctfhf2, fixhfti, > fixunshfti, floattihf and floatuntihf. > 2. Always add _divhc3.c and _mulhc3.c. If assembler doesn't support > AVX512FP16, they are empty. > > 2019-01-01 H.J. Lu > gcc/ChangeLo

Re: [PATCH 1/2] AVX512FP16: Initial support for _Float16 type and AVX512FP16 feature.

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 1:48 PM liuhongt wrote: > > From: "Guo, Xuepeng" > > gcc/ChangeLog: > > * common/config/i386/cpuinfo.h (get_available_features): > Detect FEATURE_AVX512FP16. > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_AVX512FP16_SET, > OP

Re: [PATCH] [i386] Clear odata for aes(enc|dec)(wide)?kl intrinsics

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 3:51 PM Hongyu Wang wrote: > > For Keylocker aesenc/aesdec intrinsics, current implementation > moves idata to odata unconditionally, which causes safety issue when > the instruction meets runtime error. So we add a branch to clear > odata when ZF is set after instruction ex

Re: [PATCH v6 2/2] x86: Add vec_duplicate expander

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > Add vec_duplicate expander for SSE2 if we can move from GPR to SSE > register directly. > > * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): > Make it global. > * config/i386/i386-protos.h (ix86_expand_vecto

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > operands to vector broadcast from an integer with AVX. > 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which > won't increase stack alignment requirement

Re: [PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16).

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 5:51 PM Richard Biener via Gcc-patches wrote: > > On Thu, Jul 1, 2021 at 9:20 AM liuhongt via Gcc-patches > wrote: > > How does this look on GIMPLE and why's it not better handled there? Do you mean in match.pd, I'll try that. C++ FE doesn't support _Float16, and the plac

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 7:10 PM Uros Bizjak wrote: > > [Sorry for double post, gcc-patches address was wrong in original post] > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > Hi: > > AVX512FP16 is disclosed, refer to [1]. > > There're 100+ instructions for AVX512FP16, 67 gcc patches

Re: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-02 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 2, 2021 at 3:46 PM Richard Biener via llvm-dev wrote: > > On Fri, Jul 2, 2021 at 1:34 AM Jacob Lifshay via Gcc-patches > wrote: > > > > On Thu, Jul 1, 2021, 15:28 H.J. Lu via llvm-dev > > wrote: > > > > > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers > > > wrote: > > > > > > > > On Th

Re: [PATCH] i386: Punt on broadcasts from TImode integers [PR101286]

2021-07-02 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 2, 2021 at 3:32 PM Jakub Jelinek wrote: > > Hi! > > ix86_expand_vector_init_duplicate doesn't handle TImode -> V2TImode > or TImode -> V4TImode broadcasts, so I think we should punt on TImode > inner mode in ix86_broadcast_from_integer_constant, otherwise we ICE > in ix86_expand_vector

Re: [PATCH] i386: Uglify some local identifiers in *intrin.h [PR107748]

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Sat, Nov 19, 2022 at 4:38 PM Jakub Jelinek wrote: > > Hi! > > While reporting PR107748 (where is a problem with non-uglified names, > but I've left it out because it needs fixing anyway), I've noticed > various spots where identifiers in *intrin.h headers weren't uglified. > The following patch

Re: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 21, 2022 at 9:01 AM Liu, Hongtao via Gcc-patches wrote: > > > > > -Original Message- > > From: Wang, Hongyu > > Sent: Saturday, November 19, 2022 2:26 PM > > To: gcc-patches@gcc.gnu.org > > Cc: richard.guent...@gmail.com; ubiz...@gmail.com; Liu, Hongtao > > > > Subject: [PATC

Re: [PATCH] [x86] Some tidy up for RA related hooks.

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 21, 2022 at 10:13 AM liuhongt wrote: > > When i'm working at [1] for ix86_can_change_mode_class, > I notice there're some incorrectness/misoptimization in current RA-related > hook. > This patch tries to do some fix and tidy up for them: > > 1. We also need to guard size of TO to be >

Re: [PATCH] [x86] Some tidy up for RA related hooks.

2022-11-21 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 21, 2022 at 3:17 PM Uros Bizjak wrote: > > On Mon, Nov 21, 2022 at 6:24 AM Hongtao Liu wrote: > > > > On Mon, Nov 21, 2022 at 10:13 AM liuhongt wrote: > > > > > > When i'm working at [1] for ix86_can_change_mode_class, > > > I not

Re: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]

2022-11-21 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 22, 2022 at 1:41 AM Jeff Law via Gcc-patches wrote: > > > On 11/18/22 23:25, Hongyu Wang via Gcc-patches wrote: > > Hi, > > > > Followed by the discussion in pr107602, -munroll-only-small-loops > > Does not turns on/off -funroll-loops, and current check in > > pass_rtl_unroll_loops::ga

Re: [PATCH] [x86] Fix incorrect implementation for mm_cvtsbh_ss.

2022-11-23 Thread Hongtao Liu via Gcc-patches
On Wed, Nov 23, 2022 at 8:40 PM Jakub Jelinek wrote: > > On Wed, Nov 23, 2022 at 08:28:20PM +0800, liuhongt via Gcc-patches wrote: > > After supporting real __bf16 type, implementation of mm_cvtsbh_ss went > > wrong. > > The patch supports extendbfsf2/truncsfbf2 with pslld/psrld, > > and then ref

Re: [PATCH v2] [x86] Fix incorrect _mm_cvtsbh_ss.

2022-11-24 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 24, 2022 at 4:53 PM Jakub Jelinek wrote: > > On Thu, Nov 24, 2022 at 09:22:00AM +0800, liuhongt via Gcc-patches wrote: > > --- a/gcc/config/i386/i386.md > > +++ b/gcc/config/i386/i386.md > > @@ -130,6 +130,7 @@ (define_c_enum "unspec" [ > >;; For AVX/AVX512F support > >UNSPEC_S

Re: [PATCH 0/2] Support HWASAN with Intel LAM

2022-11-27 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 11, 2022 at 9:26 AM liuhongt wrote: > > 2 years ago, ARM folks support HWASAN[1] in GCC[2], and introduced several > target hooks(Many thanks to their work) so other backends can do similar > things if they have similar feature. > Intel LAM(Linear Address Masking)[3 Chapter 14] su

Re: [PATCH 0/2] Support HWASAN with Intel LAM

2022-11-28 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 28, 2022 at 10:40 PM Martin Liška wrote: > > On 11/11/22 02:26, liuhongt via Gcc-patches wrote: > >2 years ago, ARM folks support HWASAN[1] in GCC[2], and introduced > > several > > target hooks(Many thanks to their work) so other backends can do similar > > things if they have si

Re: [PATCH] [x86] Fix unrecognizable insn due to illegal immediate_operand (const_int 255) of QImode.

2022-11-28 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 28, 2022 at 9:06 PM liuhongt wrote: > > For __builtin_ia32_vec_set_v16qi (a, -1, 2) with > !flag_signed_char. it's transformed to > __builtin_ia32_vec_set_v16qi (_4, 255, 2) in the gimple, > and expanded to (const_int 255) in the rtl. But for immediate_operand, > it expects (const_int
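The mismatch described here comes from how RTL stores integer constants: a `const_int` is kept sign-extended from the precision of its mode, so the 8-bit pattern 0xff in QImode is canonically -1 rather than 255. The sketch below illustrates that canonicalization in plain C. `sign_extend_low_bits` is a hypothetical helper written for this example, similar in spirit to GCC's `trunc_int_for_mode`, not code from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low `bits` bits of `value` and sign-extend
   them to 64 bits. `bits` must be in the range 1..64. */
static int64_t sign_extend_low_bits(int64_t value, unsigned bits)
{
    uint64_t mask = (bits >= 64) ? ~UINT64_C(0)
                                 : ((UINT64_C(1) << bits) - 1);
    uint64_t sign = UINT64_C(1) << (bits - 1);
    uint64_t low = (uint64_t)value & mask;
    /* XOR-then-subtract propagates the sign bit into the upper bits. */
    return (int64_t)((low ^ sign) - sign);
}
```

With `bits` set to 8, both 255 and -1 map to -1, which is the form an 8-bit immediate is expected to take, while 127 stays 127.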

Re: [PATCH] [x86] Fix unrecognizable insn due to illegal immediate_operand (const_int 255) of QImode.

2022-11-29 Thread Hongtao Liu via Gcc-patches
On Wed, Nov 30, 2022 at 3:12 AM H.J. Lu wrote: > > On Mon, Nov 28, 2022 at 11:04 PM Hongtao Liu wrote: > > > > On Mon, Nov 28, 2022 at 9:06 PM liuhongt wrote: > > > > > > For __builtin_ia32_vec_set_v16qi (a, -1, 2) with > >

Re: [PATCH 0/2] Support HWASAN with Intel LAM

2022-12-08 Thread Hongtao Liu via Gcc-patches
On Wed, Nov 30, 2022 at 10:07 PM Martin Liška wrote: > > On 11/29/22 03:37, Hongtao Liu wrote: > > On Mon, Nov 28, 2022 at 10:40 PM Martin Liška wrote: > >> > >> On 11/11/22 02:26, liuhongt via Gcc-patches wrote: > >>>2 years ago, ARM folks s

Re: [PATCH] [i386] Add option -mvect-compare-costs

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 5:00 PM Richard Sandiford via Gcc-patches wrote: > > Obviously I'm not in a position to comment on the target bits, but: > > liuhongt via Gcc-patches writes: > > Also with corresponding target attribute, option default disabled. > > > > Bootstrapped and regtested on x86_64

Re: [PATCH] [i386][avx512]Add combine splitter to transform vpternlogd/vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 15, 2021 at 9:26 AM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fix the regression previously reported on the combine splitter > under '-m32 -march=cascadelake' options. > > Regtested on x86_64-pc-linux-gnu. Ok. > > BRs, > Haochen > > gcc/ChangeLog: > > PR

Re: [PATCH] [i386] Optimize bit_and op1 float_vector_all_ones_operands to op1.

2021-12-19 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 1:59 PM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? Pushed to trunk. > > gcc/ChangeLog: > > PR target/98468 > * config/i386/sse.md (*bit_and_float_vector_all_ones): New > pre-reload splitter. > > gcc/

Re: [PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > > This patch is to enable intrinsics that convert float and bf16 data to each > other. > Ok for master? > Ok. > gcc/ChangeLog: > > * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic. >

Re: [PATCH] [i386] Add define_insn_and_split for vpcmp{b, w, d, q} vpcmp{ph, ps, pd}.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 21, 2021 at 2:27 PM liuhongt wrote: > > The purpose of those define_insn_and_split: > 1. Combine vpcmpuw and zero_extend into vpcmpuw. > 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just > kmov > 3. Use DImode as dest of zero_extend so cprop_hardreg can elim

Re: [PATCH] [i386]Fix tdpbf16ps testcase

2021-12-27 Thread Hongtao Liu via Gcc-patches
On Fri, Dec 24, 2021 at 4:51 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fix the testcase of amxbf16-dpbf16ps-2.c. Previously the type > convert has some issue. > > Ok for trunk? Ok. > > BRs, > Haochen > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/amx-check.h (

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-09 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 2:23 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch removes the register restriction on operands for andnot insn so > that it can be used from memory. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > BRs, > Haochen > > gcc/ChangeLog: > >

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-10 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 3:21 PM Jiang, Haochen wrote: > > Hi Hongtao, > > I have changed that message in this patch. Ok for trunk? Ok. > > Thx, > Haochen > > -Original Message- > From: Hongtao Liu > Sent: Monday, January 10, 2022 3:25 PM > To: Jia

Re: [PATCH] [i386] Fix ICE of unrecognizable insn. [PR target/104001]

2022-01-13 Thread Hongtao Liu via Gcc-patches
Here's the patch I'm going to check in, the patch is pre-approved in PR. On Thu, Jan 13, 2022 at 11:59 PM liuhongt wrote: > > For define_insn_and_split "*xor2andn": > > 1. Refine predicate of operands[0] from nonimmediate_operand to > register_operand. > 2. Remove TARGET_AVX512BW from condition t

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Hongtao Liu via Gcc-patches
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches wrote: > > On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote: > > > > Thanks for the suggestion, here is the updated patch that survived > > bootstrap/regtest. > > LGTM for me, but please get the final approval from Hongtao. > Ok, thank

Re: [PATCH] [i386]Adjust testcase for --target_board='unix{-m64\ -march=cascadelake}'

2022-01-17 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 18, 2022 at 10:57 AM liuhongt via Gcc-patches wrote: > > Change scan-assembler from "\tucomisd" to "\t\[v\]?ucomisd". It's an obvious "fix", Pushed to trunk. > > refer to https://gcc.gnu.org/pipermail/gcc-regression/2022-January/076241.html > > gcc/testsuite/ChangeLog: > > * g+

Re: [PATCH] RTL: Bugfix for wrong code with v16hi compare & mask

2023-03-26 Thread Hongtao Liu via Gcc-patches
On Sun, Mar 26, 2023 at 3:01 AM Jeff Law via Gcc-patches wrote: > > > > On 3/24/23 08:11, pan2.li--- via Gcc-patches wrote: > > From: Pan Li > > > > Fix the bug of the incorrect code generation for the > > below code sample. > > > > typedef unsigned short __attribute__((__vector_size__ (32))) V;

Re: [PATCH] Adjust memory_move_cost for MASK_REGS when MODE_SIZE > 8.

2023-03-30 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 31, 2023 at 1:57 PM Uros Bizjak wrote: > > On Fri, Mar 31, 2023 at 7:11 AM liuhongt wrote: > > > > RA sometimes will use lowest the cost of the mode with all different > > regclasses > > w/o check if it's hard_regno_mode_ok. > > It's impossible to put modes whose size > 8 into MASK_R

Re: [PATCH 0/2] Support Intel AMX-COMPLEX

2023-04-05 Thread Hongtao Liu via Gcc-patches
On Mon, Apr 3, 2023 at 4:51 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > These patch aims to add Intel AMX-COMPLEX instructions. Also we added > AMX-COMPLEX to -march=graniterapids. > > The information is based on newly released > Intel Architecture Instruction Set Extensions and Future

Re: [PATCH] combine: Fix simplify_comparison AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

2023-04-09 Thread Hongtao Liu via Gcc-patches
On Sun, Apr 9, 2023 at 9:15 AM Jeff Law via Gcc-patches wrote: > > > > On 4/6/23 05:37, Jakub Jelinek wrote: > > On Thu, Apr 06, 2023 at 12:51:20PM +0200, Eric Botcazou wrote: > >>> If we want to fix it in the combiner, I think the fix would be following. > >>> The optimization is about > >>> (and

Re: [PATCH] combine: Fix simplify_comparison AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

2023-04-09 Thread Hongtao Liu via Gcc-patches
On Mon, Apr 10, 2023 at 1:13 PM Hongtao Liu wrote: > > On Sun, Apr 9, 2023 at 9:15 AM Jeff Law via Gcc-patches > wrote: > > > > > > > > On 4/6/23 05:37, Jakub Jelinek wrote: > > > On Thu, Apr 06, 2023 at 12:51:20PM +0200, Eric Botcazou wrote: > > &

Re: [PATCH] gcc-13: Mention Intel AMX-COMPLEX ISA support and revise march support

2023-04-12 Thread Hongtao Liu via Gcc-patches
On Mon, Apr 10, 2023 at 10:08 AM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch mentions Intel AMX-COMPLEX ISA support in GCC 13. > > Also it revises the march support according to newly released > Intel Architecture Instruction Set Extensions and Future Features. > > Ok for trunk
