m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b)
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss&ig_expand=3807,3081,3082,3084,3083,4837,4838
>
> LLVM generates mask & 1 for these intrinsics.
>
> Hongtao Liu via Gcc-patches wrote on 20
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
> Use a masked vmovss to perform the same operation, which omits the higher bits
> of the mask.
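
The masking rule quoted above can be sketched in plain C. This is only a scalar model, not the intrinsic itself: `vec4` and `model_mask_move_ss` are hypothetical names, and the behaviour follows the Intel Intrinsics Guide description of `_mm_mask_move_ss` (element 0 comes from `b` when bit 0 of `k` is set, otherwise from `src`; the upper three elements come from `a`).

```c
#include <stdint.h>

/* Hypothetical 4 x float stand-in for __m128 so the sketch needs no AVX-512. */
typedef struct { float f[4]; } vec4;

/* Scalar model of a masked scalar move: only bit 0 of k is consulted,
   which is the same as ANDing the mask with 1 first. */
static vec4 model_mask_move_ss(vec4 src, uint8_t k, vec4 a, vec4 b)
{
    vec4 dst = a;                              /* elements 1..3 come from a */
    dst.f[0] = (k & 1) ? b.f[0] : src.f[0];    /* higher mask bits are ignored */
    return dst;
}
```

With this model a mask of 0xFE behaves exactly like a mask of 0, because bit 0 is clear in both.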
>
> Bootst
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote:
>
> Failed to match this instruction:
> (set (reg/v:SI 88 [ z ])
> (if_then_else:SI (eq (zero_extract:SI (reg:SI 92)
> (const_int 1 [0x1])
> (zero_extend:SI (subreg:QI (reg:SI 93) 0)))
> (const_int 0 [0
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches
wrote:
>
> In validate_subreg, both (subreg:V2HF (reg:SI) 0)
> and (subreg:V8HF (reg:V2HF) 0) are valid, but not
> for (subreg:V8HF (reg:SI) 0) which causes ICE.
>
> Ideally it should be handled in validate_subreg to support
> subreg for all
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
wrote:
>
> Since we're now vectorizing by default at -O2 issues like PR101908
> become more important where we apply basic-block vectorization to
> parts of the function covering loads from function parameters passed
> on the stack. S
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote:
>
> On Fri, 25 Mar 2022, Hongtao Liu wrote:
>
> > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > Since we're now vectorizing by default at -O2 issues like P
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches
wrote:
>
> Since KL instructions have no AVX512 version, replace the "v" register
> constraint with the "x" register constraint.
>
> PR target/105058
> * config/i386/sse.md (loadiwkey): Replace "v" with "x".
> (aesu8):
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches
wrote:
>
> Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
> have no AVX512 version, replace the "Yv" register constraint with the
> "x" register constraint.
LGTM, please backport to GCC10/GCC11 branch.
>
> PR t
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches
wrote:
>
> > > Is it possible to create a test case that gas would throw an error for
> > > invalid operands?
> >
> > You can use -ffix-xmmN to disable XMM0-15.
>
> I mean can we create an intrinsic test for this PR that produces xmm16-3
On Thu, Mar 31, 2022 at 6:45 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Mar 31, 2022 at 7:51 AM liuhongt wrote:
> >
> > Since cfg is freed before machine_reorg, just do a rough calculation
> > of the window according to the layout.
> > Also according to an experiment on CLX, set window
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches
> wrote:
> >
> > Update in V2:
> > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS.
> > 2. Return for any_uncondjump_p and ANY_RETURN_P.
> > 3. Add dump i
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches
wrote:
>
> Update in V3:
> 1. Add --param=x86-stlf-window-ninsns= (default 64).
> 2. Exclude call in the window.
>
> Since cfg is freed before machine_reorg, just do a rough calculation
> of the window according to the layout.
> Also according
On Wed, Apr 6, 2022 at 5:56 AM Roger Sayle wrote:
>
> This simple patch allows the i386 backend to generate pandn instructions
> for V1TI mode. Currently, the testcase:
>
> typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));
>
> v1ti andnot1(v1ti x, v1ti y) { return ~
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> Add missing macro under O0 and adjust macro format for scalf
> intrinsics.
>
Please add the corresponding intrinsic test in sse-14.c.
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for master and backpor
On Wed, Jan 19, 2022 at 8:00 AM Jakub Jelinek wrote:
>
> On Sun, Jan 16, 2022 at 12:22:18PM +0800, Hongtao Liu via Gcc-patches wrote:
> > On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Sat, Jan 15, 2022
On Wed, Jan 19, 2022 at 9:40 AM Jakub Jelinek wrote:
>
> Hi!
>
> On Wed, Jan 19, 2022 at 09:09:41AM +0800, Hongtao Liu wrote:
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > Yes, thanks.
>
> Thanks. Committed.
> grep '{[^|}
On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches
wrote:
>
> Return false for invalid mode on memory broadcast in bcst_mem_operand:
>
> (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ])))
>
Yes, thanks.
> gcc/
>
> PR target/104188
> * config/i386/predicates.md (bcst_mem
On Fri, Jan 28, 2022 at 5:53 AM H.J. Lu via Gcc-patches
wrote:
>
> The v3 patch was posted at
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html
>
> There is no progress with repeated pings since then. Glibc 2.35 and
> binutils 2.38 will support GNU_PROPERTY_1_NEEDED_INDIRECT_EXT
On Wed, Feb 9, 2022 at 10:53 AM H.J. Lu via Gcc-patches
wrote:
>
> commit 9775e465c1fbfc32656de77c618c61acf5bd905d
> Author: H.J. Lu
> Date: Tue Jul 27 07:46:04 2021 -0700
>
> x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
>
> called ix86_check_avx_upper_register to check mode
On Thu, Feb 10, 2022 at 9:58 PM H.J. Lu via Gcc-patches
wrote:
>
> 1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513
> run-time tests.
> 2. Compile pr35513-8.c to scan assembly code.
>
> PR testsuite/104481
> * g++.target/i386/pr35513-1.C: Require property_1_needed
On Tue, Feb 1, 2022 at 2:56 AM H.J. Lu via Gcc-patches
wrote:
>
> Before MPX was removed, "%!" was mapped to
>
> case '!':
> if (ix86_bnd_prefixed_insn_p (current_output_insn))
> fputs ("bnd ", file);
> return;
>
> After CET was added and MPX was removed, "%
On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches
wrote:
>
> Backport -mindirect-branch-cs-prefix:
>
> commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a
> Author: H.J. Lu
> Date: Wed Oct 27 06:27:15 2021 -0700
>
> x86: Add -mindirect-branch-cs-prefix
>
> Add -mindirect-branch-cs-pref
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote:
> > > > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > > > + (if (types_mat
On Thu, Feb 17, 2022 at 12:00 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/cpuid.h (bit_MPX): Removed.
> (bit_BNDREGS): Ditto.
> (bit_BNDCSR): Ditto.
> ---
> gcc/config/i386/cpuid.h | 5
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
wrote:
>
> Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> generate vzero
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches
wrote:
>
> The x86 backend piggy-backs on mode-switching for insertion of
> vzeroupper. A recent improvement there was implemented in a way
> to walk possibly the whole basic-block for all DF reg def definitions
> in its mode_needed h
On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
wrote:
>
> This uses the now passed SLP node to the vectorizer costing hook
> to adjust vector construction costs for the cost of moving an
> integer component from a GPR to a vector register when that's
> required for building a vect
On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
>
> On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
>
On Tue, Feb 22, 2022 at 2:35 AM H.J. Lu wrote:
>
> On Sun, Feb 20, 2022 at 6:01 PM Hongtao Liu wrote:
> >
> > On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
> > >
> > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > > > On
On Tue, Feb 22, 2022 at 12:46 AM Jakub Jelinek wrote:
>
> Hi!
>
> We ICE on the following testcase for -m32 since r12-3435, because
> operands[2] is (subreg:SF (reg:DI ...) 0) and
According to validate_subreg, (subreg:V4SF (reg:DI ...) 0) should be
valid (but I'm not sure if it really works).
For -m64
On Mon, Feb 21, 2022 at 5:10 PM Richard Biener wrote:
>
> On Mon, 21 Feb 2022, Hongtao Liu wrote:
>
> > On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > This uses the now passed SLP node to the vectorizer costing h
On Wed, Feb 23, 2022 at 5:48 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 23, 2022 at 05:21:26PM +0800, liuhongt via Gcc-patches wrote:
> > For evex encoding vp{xor,or,and}, suffix is needed.
> >
> > Or there would be an error for
> > vpxor %ymm0, %ymm31, %ymm1
>
> The insn is about V1T
On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
>
> The patch fixes ICE in ix86_gimple_fold_builtin.
>
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for main trunk?
> gcc/ChangeLog:
>
> PR target/104666
> * config/i386/i386-expand.cc
> (ix86_check_builtin_isa_m
On Fri, Feb 25, 2022 at 4:44 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch intends to sync with llvm change in
> https://reviews.llvm.org/D120307 to add enumeration and truncate
This will be documented in the Intel Intrinsics Guide.
> imm to unsigned char, so users could use ~ on immedia
On Tue, Nov 16, 2021 at 4:35 PM Hongtao Liu wrote:
>
> On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches
> wrote:
> >
> > Hi,
> >
> > This patch adds aliases for the f*mul_*ch intrinsics.
> >
> > Ok for master?
> This patch j
On Fri, Nov 19, 2021 at 3:53 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Fri, Nov 19, 2021 at 8:50 AM Uros Bizjak wrote:
> >
> > On Fri, Nov 19, 2021 at 2:14 AM liuhongt wrote:
> > >
> > > >Why is the above declared as a special memory constraint? Also the
> > > Change to define_memory_constrai
On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.
>
> As evaluated, the emulated gather capability of vectorizer
> (r12-2733) can help to speed up SPEC2017 5
On Thu, Nov 25, 2021 at 12:18 PM H.J. Lu via Gcc-patches
wrote:
>
> Replace long with int64_t to work with -mx32.
Thanks.
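
A small sketch of why the type matters, assuming the usual x32 data model where `long` is 32 bits: `int64_t` keeps its width on every ABI. `widen_mul` is a hypothetical helper for illustration and is not part of the test case.

```c
#include <limits.h>
#include <stdint.h>

/* int64_t is exactly 64 bits wherever it exists; long is only 32 bits
   under -mx32 and -m32. */
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t must be 64 bits");

/* Hypothetical helper: widen both operands before multiplying so the
   product is computed at 64 bits on every ABI. */
static int64_t widen_mul(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}
```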
>
> * gcc.target/i386/pr103194-5.c: Replace long with int64_t.
> ---
> gcc/testsuite/gcc.target/i386/pr103194-5.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletio
On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote:
>
> On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote:
> >
> > There're several failures reported in [1]:
> > 1. unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
> > %vpextrw should be used in output templates.
> > 2. ICE in get_a
On Tue, Nov 30, 2021 at 5:21 AM Uros Bizjak wrote:
>
> On Mon, Nov 29, 2021 at 10:48 AM Hongtao Liu wrote:
> >
> > On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote:
> > >
> > > On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote:
> > > >
> > >
On Tue, Nov 30, 2021 at 5:44 PM liuhongt via Gcc-patches
wrote:
>
> ix86_attr_length_immediate_default assumes TYPE ishift only has 1
> constant operand,
> but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 has 2, with
> condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
> INTV
On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak wrote:
>
> Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar
> element 0 inserts from a GP register, SSE register or memory. Also
> add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is
> split after reload t
On Thu, Dec 2, 2021 at 4:27 PM liuhongt wrote:
>
> The patch helps reload to choose GENERAL_REGS alternatives for
> SSE_FLOAT_MODE and enables optimizations like
>
> - vmovd %xmm0, -4(%rsp)
> - movl $1, %eax
> - addl -4(%rsp), %eax
> + movd %xmm0, %eax
> +
care for 64-bit moves which are expensive on 32-bit
> targets.
I like your version, update patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} w/ and w/o -march=k8.
On Mon, Dec 6, 2021 at 11:41 AM liuhongt wrote:
>
> When moves between integer and sse registers are cheap.
>
>
On Wed, Dec 8, 2021 at 11:13 AM Jiang, Haochen via Gcc-patches
wrote:
>
> Hi Uros,
>
> I have fixed that in this patch attached for checking in. Is that ok for
> trunk?
>
Uros already said it's ok with that change, let me check in the patch for you.
> Regtested on x86_64-pc-linux-gnu.
>
> Thx,
>
On Wed, Dec 8, 2021 at 2:47 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi,
>
> This patch adds a combine splitter to transform vashr/vlshr/vashl_optab to
> ashr/lshr/ashl_optab for const vector duplicate operand.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
> gcc/Cha
On Fri, Oct 14, 2022 at 4:14 PM Iain Sandoe via Gcc-patches
wrote:
>
> Hi Haochen
>
> > On 14 Oct 2022, at 08:54, Haochen Jiang via Gcc-patches
> > wrote:
> >
>
> > These six patches aim to add Intel Sierra Forest instructions, including
> > AVX-IFMA, AVX-VNNI-INT8, AVX-NE-CONVERT, CMPccXADD.
On Fri, Oct 14, 2022 at 4:24 PM Iain Sandoe wrote:
>
>
>
> > On 14 Oct 2022, at 09:20, Hongtao Liu wrote:
> >
> > On Fri, Oct 14, 2022 at 4:14 PM Iain Sandoe via Gcc-patches
> > wrote:
> >>
> >> Hi Haochen
> >>
> >>>
> >> I could not see any target-requires changes in the testcases .. hence my
> >> question.
> >>
> > Guess you are looking at compile tests?
>
> yes, compile tests would need support from the assembler.
> >
In my understanding, dg-do compile tests don't need assembler support;
they just scan dump o
This patch tries to add a parameter to generate instruction prefetch
instead of data prefetch. Currently, __builtin_prefetch assumes data
prefetch only.
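
For reference, a minimal sketch of the existing data-prefetch form of `__builtin_prefetch` in GCC, whose optional second and third arguments are the read/write flag and the temporal-locality hint. `sum_with_prefetch` and the look-ahead distance of 16 elements are illustrative assumptions, and the sketch does not show the new parameter proposed by the patch.

```c
#include <stddef.h>

/* Sum an array while hinting that data a few elements ahead will be read.
   The prefetch is only a hint; the result is the same without it. */
static long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        if (i + 16 < n)   /* assumed look-ahead distance, not tuned */
            __builtin_prefetch(&data[i + 16], 0 /* read */, 3 /* high locality */);
#endif
        total += data[i];
    }
    return total;
}
```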
On Fri, Oct 14, 2022 at 4:39 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * builtins.cc (expand_builtin_prefetch): Han
On Fri, Oct 14, 2022 at 4:36 PM Iain Sandoe wrote:
>
>
>
> > On 14 Oct 2022, at 09:30, Hongtao Liu wrote:
> >
> > On Fri, Oct 14, 2022 at 4:24 PM Iain Sandoe wrote:
> >>
> >>
> >>
> >>> On 14 Oct 2022, at 09:20, Hongtao Liu wrot
On Fri, Oct 14, 2022 at 3:41 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h:
> (get_intel_cpu): Handle Raptorlake.
> * common/config/i386/i386-common.cc:
> (processor_alias_table): Add Raptorlake.
Ok.
> ---
> gcc/commo
On Fri, Oct 14, 2022 at 3:41 PM Haochen Jiang via Gcc-patches
wrote:
>
> From: "Hu, Lin1"
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h:
> (get_intel_cpu): Handle Meteorlake.
> * common/config/i386/i386-common.cc:
> (processor_alias_table): Add Meteorlake.
On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
wrote:
>
> On 17 October 2022 03:02:22 CEST, Hongtao Liu via Gcc-patches
>
> >> >> Do you have this series as a branch somewhere that I can try on one of
> >> >> the
> >> >> like affect
On Mon, Oct 17, 2022 at 11:26 AM Liwei Xu via Gcc-patches
wrote:
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/forwprop-19.c: Move scanning pass from forwprop1 to
> dse1. This fixes
> the test case failure.
Looks like an obvious fix to me.
> ---
> gcc/testsuite/gcc.dg/tree-ssa/f
On Fri, Oct 14, 2022 at 3:57 PM Haochen Jiang via Gcc-patches
wrote:
>
> From: Kong Lingling
>
> gcc/ChangeLog
>
> * common/config/i386/cpuinfo.h (get_available_features): Detect
> avxvnniint8.
> * common/config/i386/i386-common.cc
> (OPTION_MASK_ISA2_AVXVNNIINT8_S
On Fri, Oct 14, 2022 at 3:58 PM Haochen Jiang via Gcc-patches
wrote:
>
> From: Kong Lingling
>
> gcc/ChangeLog:
>
> * common/config/i386/i386-common.cc
> (OPTION_MASK_ISA2_AVXNECONVERT_SET,
> OPTION_MASK_ISA2_AVXNECONVERT_UNSET): New.
> (ix86_handle_option): Handle
On Mon, Oct 17, 2022 at 2:27 PM Jiang, Haochen wrote:
>
> > -Original Message-
> > From: Hongtao Liu
> > Sent: Monday, October 17, 2022 12:05 PM
> > To: Jiang, Haochen
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > Subject: Re: [PATCH 2/6] Sup
On Wed, Oct 5, 2022 at 5:33 AM H.J. Lu wrote:
>
> On Wed, Sep 21, 2022 at 1:42 PM H.J. Lu wrote:
> >
> > If shadow stack is enabled, when unwinding stack, we count how many stack
> > frames we pop to reach the landing pad and adjust shadow stack by the same
> > amount. When counting the stack fr
This should be already fixed.
On Mon, Oct 17, 2022 at 4:34 PM haochen.jiang via Gcc-patches
wrote:
>
> On Linux/x86_64,
>
> 25413fdb2ac24933214123e24ba165026452a6f2 is the first bad commit
> commit 25413fdb2ac24933214123e24ba165026452a6f2
> Author: Andre Vieira
> Date: Tue Oct 11 10:49:27 2022
On Wed, Oct 19, 2022 at 7:49 AM H.J. Lu wrote:
>
> On Tue, Oct 18, 2022 at 4:25 PM liuhongt wrote:
> >
> > Fix unexpected non-canon form from gimple vector selector.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR targe
On Tue, Oct 18, 2022 at 5:13 PM Haochen Jiang via Gcc-patches
wrote:
>
> From: Kong Lingling
>
> Hi all,
>
> This is our v2 patch on AVX-VNNI-INT8. The main change in this patch is to
> rename the previous UNSPEC_VPMADDxxx things to new vnni style.
>
> Ok for trunk?
The patch LGTM, but please le
On Tue, Oct 18, 2022 at 5:18 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> We would like to add one more patch to enhance the codegen with avxvnniint8.
> Also renamed two awkwardly named mode_attrs to make them more aligned with
> others.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe wrote:
>
> Hi Hongtao
>
> > On 17 Oct 2022, at 02:56, Hongtao Liu wrote:
> >
> > On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
> > wrote:
> >>
> >> On 17 October 2022 03:02:22 CEST, Hongtao
On Thu, Oct 20, 2022 at 5:15 AM Segher Boessenkool
wrote:
>
> On Wed, Oct 19, 2022 at 10:14:28AM -0700, Andrew Pinski wrote:
> > Do the testcases really need to be changed rather than adding new testcases?
> > Usually it is better if the testcases do not change unless they really need
> > to. That is
On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
wrote:
>
> On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> > * config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter
> > for
>
> (Many too long lines here, this is the first one. Changelog lines are
> max. 80
On Thu, Oct 20, 2022 at 9:39 AM Hongtao Liu wrote:
>
> On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
> wrote:
> >
> > On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> > > * config/s390/s390.cc (s390_expand_cpymem): Generate fourth
> >
* gcc.target/i386/prefetchi-3.c: Ditto.
> > * gcc.target/i386/sse-12.c: Add -mprefetchi.
> > * gcc.target/i386/sse-13.c: Ditto.
> > * gcc.target/i386/sse-14.c: Ditto.
> > * gcc.target/i386/sse-22.c: Add prefetchi.
> > *
On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu wrote:
>
> On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe wrote:
> >
> > Hi Hongtao
> >
> > > On 17 Oct 2022, at 02:56, Hongtao Liu wrote:
> > >
> > > On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fis
On Thu, Oct 20, 2022 at 5:17 PM Iain Sandoe wrote:
>
>
>
> > On 20 Oct 2022, at 10:09, Hongtao Liu via Gcc-patches
> > wrote:
> >
> > On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu wrote:
> >>
> >> On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe
>
> Thanks for giving me a chance to test, this seems OK on Darwin (no large-scale
> fallout, anyway) ..
>
Good to hear that.
> I tested the ise046 branch which looks like it collects several of the posted
> patch
> series, so I’ve covered those too. (not had a chance to test on AVX512 yet,
> but i
On Wed, Oct 19, 2022 at 2:04 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> Here is the updated patch that aligns the implementation to AVX-VNNI,
> and corrects some spelling errors in the AVX512IFMA pattern.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu and sde. Ok for trunk?
Ok for this one.
>
On Wed, Oct 19, 2022 at 9:41 AM Hongtao Liu wrote:
>
> On Tue, Oct 18, 2022 at 5:13 PM Haochen Jiang via Gcc-patches
> wrote:
> >
> > From: Kong Lingling
> >
> > Hi all,
> >
> > This is our v2 patch on AVX-VNNI-INT8. The main change in this patch i
On Wed, Oct 19, 2022 at 9:43 AM Hongtao Liu wrote:
>
> On Tue, Oct 18, 2022 at 5:18 PM Haochen Jiang via Gcc-patches
> wrote:
> >
> > Hi all,
> >
> > We would like to add one more patch to enhance the codegen with avxvnniint8.
> > Also renamed two aw
On Mon, Oct 24, 2022 at 2:20 PM Kong, Lingling wrote:
>
> > From: Gcc-patches
> > On Behalf Of Hongtao Liu via Gcc-patches
> > Sent: Monday, October 17, 2022 1:47 PM
> > To: Jiang, Haochen
> > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org
> > Subject: Re: [
Any comments?
On Mon, Oct 24, 2022 at 10:46 AM Cui,Lili via Gcc-patches
wrote:
>
> Hi Hongtao,
>
> This patch introduces the functions finish_cost and
> determine_suggested_unroll_factor for the x86 backend, so that it is
> able to suggest the unroll factor for a given loop being vectorized.
> Referring t
I'm going to check in this patch.
On Wed, Oct 26, 2022 at 10:30 AM liuhongt wrote:
>
> Enable V4BFmode and V2BFmode with the same ABI as V4HFmode and
> V2HFmode. No real operation is supported for them except for movement.
> This should solve PR target/107261.
>
> Also I notice there's redundancy
On Thu, Oct 27, 2022 at 2:59 AM H.J. Lu via Gcc-patches
wrote:
>
> In i386.md, neg patterns which set MODE_CC register like
>
> (set (reg:CCC FLAGS_REG)
> (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 0)))
>
> can lead to errors when operand 1 is a constant value. If FLAGS
On Fri, Oct 28, 2022 at 1:56 PM Hongtao Liu wrote:
>
> On Thu, Oct 27, 2022 at 2:59 AM H.J. Lu via Gcc-patches
> wrote:
> >
> > In i386.md, neg patterns which set MODE_CC register like
> >
> > (set (reg:CCC FLAGS_REG)
> > (ne:CCC (match_operand:SWI4
On Fri, Oct 28, 2022 at 2:20 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> Previously we used unsigned short to represent bf16. It's not a good
> representation, and at the time the front end didn't support the bf16 type.
> Now we have introduced __bf16 to the x86 psABI, so we can switch intrinsics to the n
On Tue, Nov 1, 2022 at 9:21 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi
>
> The patch mentions Intel __bf16 support in AVX512BF16 intrinsics.
> Ok for master ?
>
> Thanks,
> Lingling
>
> ---
> htdocs/gcc-13/changes.html | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/htdocs/g
On Mon, Oct 31, 2022 at 5:22 PM Uros Bizjak wrote:
>
> On Mon, Oct 31, 2022 at 2:10 AM liuhongt wrote:
> >
> > >You have a couple of other patterns where operand 1 is matched to
> > >produce vmovddup insn. These are *avx512f_unpcklpd512 and
> > >avx_unpcklpd256. You can also remove expander in bo
On Fri, Oct 14, 2022 at 3:57 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_intel_cpu):
> Add Sierra Forest.
> * common/config/i386/i386-common.cc
> (processor_names): Add Sierra Forest.
> (processor_alias_
On Thu, Nov 3, 2022 at 2:53 PM Kong, Lingling wrote:
>
> > > > diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> > > > index 7c6bfa6e..cd0282f1 100644
> > > > --- a/htdocs/gcc-13/changes.html
> > > > +++ b/htdocs/gcc-13/changes.html
> > > > @@ -230,6 +230,8 @@ a work-in-progre
On Wed, Aug 10, 2022 at 1:42 PM Alexandre Oliva via Gcc-patches
wrote:
>
> On Aug 9, 2022, Alexandre Oliva wrote:
>
> > Ping?
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598276.html
>
> Oops, sorry, I linked to the wrong patch. This is the one I meant to ping:
>
> https://gcc.gnu.or
On Tue, Aug 16, 2022 at 3:50 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The patch supports vector init/broadcast/set/extract for the __bf16 type.
> The __bf16 type is a storage type.
>
> OK for master?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (ix86_expand_sse_movcc):
On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote:
>
> On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches
> wrote:
> >
> > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-patches
> > wrote:
> > >
> > > Hi all,
> > >
> > &
On Mon, Aug 22, 2022 at 9:02 AM Hongtao Liu wrote:
>
> On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote:
> >
> > On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-pa
On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote:
>
> On 64-bit Windows, long is 32 bits and can't be used as stride in memory
> operand when base is a pointer which is 64 bits. Cast stride to
> __PTRDIFF_TYPE__, instead of long.
Ok.
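
The width issue can be sketched as follows, assuming an LLP64 target such as 64-bit Windows where `long` is 32 bits and pointers are 64 bits. `row_ptr` is a hypothetical helper for illustration; `ptrdiff_t` is the standard typedef that corresponds to GCC's `__PTRDIFF_TYPE__`.

```c
#include <stddef.h>
#include <stdint.h>

/* Address the start of a row using a pointer-sized signed stride, so the
   offset is computed at pointer width rather than at the width of long. */
static const uint8_t *row_ptr(const uint8_t *base, ptrdiff_t stride, ptrdiff_t row)
{
    return base + stride * row;
}
```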
>
> PR target/106714
> * config/i386/amxtileintrin
On Mon, Aug 22, 2022 at 10:16 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch adds the __m128bf16/__m256bf16/__m512bf16 types in testcases.
Ok.
>
> BRs,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/x86_64/abi/bf16/bf16-helper.h:
> Add _m128bf16/m256bf16/_m
On Wed, Aug 24, 2022 at 9:15 AM liuhongt wrote:
>
> 256-bit vector integer comparison is under TARGET_AVX2, and gimple
> folding for vblendvpd/vblendvps/vpblendvb relies on it, so restrict the
> gimple fold condition to TARGET_AVX2.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
On Sat, Aug 27, 2022 at 12:51 AM H.J. Lu wrote:
>
> On Mon, Aug 22, 2022 at 7:05 PM Hongtao Liu wrote:
> >
> > On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote:
> > >
> > > On 64-bit Windows, long is 32 bits and can't be used as stride in memory
> >
On Wed, Aug 31, 2022 at 2:52 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> Handle E_V8BFmode in expand_vec_perm_broadcast_1 and
> ix86_expand_vector_init_duplicate.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106742
> * config/i386/i386-expand.cc (ix86_expand_vector_in
On Fri, Sep 2, 2022 at 4:08 PM Kong, Lingling wrote:
>
> Hi,
>
> I fixed it in a new patch. And added BF vector mode in SUBST_V and
> avx512fmaskhalfmode for @vec_interleave_high.
> Ok for trunk ?
Ok.
>
> > > Hi,
> > >
> > > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and
> > ix86_expand_ve
On Mon, Sep 5, 2022 at 10:44 AM liuhongt wrote:
>
> The zmm version of vcvtps2ph is special: it encodes {sae} in EVEX but puts
> round control in the imm. For intrinsic _mm512_cvt_roundps_ph (a,
> imm), imm contains both {sae} and round control, we need to separate
> it in the assembly output since vcvtp
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang wrote:
>
> > Please add the corresponding intrinsic test in sse-14.c
>
> Sorry for forgetting this part. Updated patch. Thanks.
>
LGTM.
> Hongtao Liu via Gcc-patches wrote on Fri, Apr 22, 2022 at 16:49:
> >
> > On Fri, Apr 22, 20
On Thu, May 5, 2022 at 3:37 PM liuhongt wrote:
>
> Enable optimization for TImode only under 32-bit target; for 64-bit
> target there could be an extra integer <-> sse move regarding psABI,
> which is not efficient.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLo
On Thu, May 5, 2022 at 4:09 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, May 5, 2022 at 9:50 AM Richard Biener via Gcc-patches
> wrote:
> >
> > On Thu, May 5, 2022 at 9:37 AM liuhongt via Gcc-patches
> > wrote:
> > >
> > > Enable optimization for TImode only under 32-bit target, for 64-bit
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches
wrote:
>
> This is the adjusted patch, only for OImode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104610
> * config/i386/i386-expand.cc (ix86_expand_branch): Use
On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches
wrote:
>
> pand/pandn may be used to clear upper/lower bits of the operands, in
> that case there will be 4-5 instructions for permutation, and it's
> still better than scalar codes.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,