On Wed, Jun 5, 2024 at 10:44 PM Jeff Law wrote:
>
>
>
> On 6/4/24 10:22 PM, liuhongt wrote:
> >> Can you add a testcase for this? I don't mind if it's x86 specific and
> >> does a bit of asm scanning.
> >>
> >> Also note that the context for this patch has changed, so it won't
> >> automatically
On Thu, Jun 6, 2024 at 2:39 PM Hongyu Wang wrote:
>
> The current target apxf check does not specify the sub-features that the
> assembler supports, so the check with older binutils will fail at the
> assemble stage for new APX features like NF, CCMP or CFCMOV. Adjust the assembler check
> for latest apx subfeatur
vpternlogd[ \\t] 694
>
>
> 2024-06-06 Roger Sayle
> Hongtao Liu
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call
> fixup_modeless_constant before testing predicates. Only call
> copy_to_mode_reg on memory
On Mon, Jun 10, 2024 at 3:20 PM Roger Sayle wrote:
>
>
> This patch fixes PR target/115397, a recent regression caused by my
> ternlog patch that results in an ICE (building numpy) with -m32 -fPIC.
> The problem is that ix86_broadcast_from_constant, which calls
> get_pool_constant, doesn't handle
On Mon, Jun 10, 2024 at 2:37 PM Collin Funk wrote:
>
> A shift of 31 on a signed int is undefined behavior. Since unsigned
> int is 32 bits wide, this change fixes it and silences the warning.
Ok.
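
For illustration only, here is a minimal C sketch of the signed versus unsigned shift issue described above. bit_mask is a hypothetical helper, not code from the patch, and the sketch assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: returns a mask with
   only bit N set. N must be less than 32.

   The constant is unsigned (1U), so shifting by 31 is well defined.
   Writing (1 << 31) with a signed int constant would shift a 1 into
   the sign bit, which is the case the quoted message calls undefined
   and which -Wshift-overflow reports. */
static inline uint32_t
bit_mask (unsigned int n)
{
  return (uint32_t) 1U << n;
}
```

Calling bit_mask (31) yields 0x80000000 without touching a signed sign bit.
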
>
> gcc/ChangeLog:
>
> PR target/115409
> * config/i386/avx512fp16intrin.h (_mm512_co
On Thu, Jun 13, 2024 at 4:20 AM Roger Sayle wrote:
>
>
> This patch makes more use of m32bcst and m64bcst addressing modes in
> ix86_expand_ternlog. Previously, the i386 backend would only consider
> using a m32bcst if the inner mode of the vector was 32-bits, or using
> m64bcst if the inner mode
On Thu, Jun 6, 2024 at 4:49 PM Kong, Lingling wrote:
>
> Enable ZU for IMUL (opcodes 0x69 and 0x6B) and SETcc.
>
> gcc/ChangeLog:
>
> * config/i386/i386-opts.h (enum apx_features): Add apx_zu.
> * config/i386/i386.h (TARGET_APX_ZU): Define.
> * config/i386/i386.md (*imulhizu
On Thu, May 30, 2024 at 1:52 PM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to extend __builtin_ia32_cmp[p|s][s|d] from avx to
> sse/sse2/avx, where its immediate is in range of [0, 7].
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
On Fri, Jun 14, 2024 at 6:31 PM Richard Biener wrote:
>
> The following retires vcond{,u,eq} optabs by no longer using them
> from the middle-end. Targets instead (should) implement vcond_mask
> and vec_cmp{,u,eq} optabs. The PR this change refers to lists
> possibly affected targets - those imp
On Sat, Jun 15, 2024 at 1:22 AM Jeff Law wrote:
>
>
>
> On 6/14/24 11:10 AM, Alexander Monakov wrote:
> >
> > On Fri, 14 Jun 2024, Kong, Lingling wrote:
> >
> >> APX CFCMOV[1] feature implements conditionally faulting which means that
> >> all memory faults are suppressed
> >> when the condition
On Fri, Jun 14, 2024 at 10:53 PM Hongtao Liu wrote:
>
> On Fri, Jun 14, 2024 at 6:31 PM Richard Biener wrote:
> >
> > The following retires vcond{,u,eq} optabs by no longer using them
> > from the middle-end. Targets instead (should) implement vcond_mask
> > and
On Thu, Jun 13, 2024 at 3:13 PM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to refine all cvtt* instructions with UNSPEC instead of
> FIX/UNSIGNED_FIX, because the intrinsics should behave as documented.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
On Fri, Jun 14, 2024 at 9:35 AM Levy Hsu wrote:
>
> This patch updates the GCC x86 backend to efficiently handle
> odd, incrementally increasing permutations of BF16 vectors
> using the cvtne2ps2bf16 instruction.
> It modifies ix86_vectorize_vec_perm_const to support these operations
> and adds a
On Wed, Jun 19, 2024 at 5:04 AM Roger Sayle wrote:
>
>
> This patch tweaks ix86_ternlog_idx to allow any SUBREG that matches
> the register_operand predicate, and is split out as an independent
> piece of a patch that I have to clean-up redundant ternlog patterns
> in sse.md. It turns out that so
On Wed, Oct 25, 2023 at 2:49 AM Richard Sandiford
wrote:
>
> This patch adds a combine pass that runs late in the pipeline.
> There are two instances: one between combine and split1, and one
> after postreload.
>
> The pass currently has a single objective: remove definitions by
> substituting int
On Sat, Jun 22, 2024 at 5:49 AM Collin Funk wrote:
>
> Hi Hongtao,
>
> I submitted a patch silencing -Wshift-overflow on a signed int
> constant here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654016.html
>
> You OK'd it here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/202
On Wed, Jun 26, 2024 at 2:52 PM Richard Biener
wrote:
>
> On Wed, Jun 26, 2024 at 8:09 AM liuhongt wrote:
> >
> > 416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0.
> > The commit adjusts rtx_cost of mem to reduce the cost of (add op0 disp).
> > But Cost of ADDR could be cheaper tha
On Wed, Jun 26, 2024 at 4:02 PM Richard Biener
wrote:
>
> On Wed, Jun 26, 2024 at 9:14 AM Hongtao Liu wrote:
> >
> > On Wed, Jun 26, 2024 at 2:52 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Jun 26, 2024 at 8:09 AM liuhongt wrote:
> > >
On Thu, Jul 18, 2024 at 5:29 PM Kong, Lingling wrote:
>
> I adjusted my patch based on the comments by H.J.
> And I will add the testcase like gcc.target/i386/pr101395-1.c when the march
> for APX is determined.
>
> Ok for trunk?
Synced with LLVM folks, they agreed to this solution.
Ok.
>
> Than
On Wed, Jul 24, 2024 at 3:11 PM Kong, Lingling wrote:
>
> Tested spec2017 performance in Sierra Forest, Icelake, CascadeLake, at least
> there is no obvious regression.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
> OK for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386
On Wed, Jul 24, 2024 at 3:57 PM liuhongt wrote:
>
> For the pattern below, RA may still allocate r162 as a v/k register and try to
> reload the address with leaq __libc_tsd_CTYPE_B@gottpoff(%rip), %rsi,
> which results in a linker error.
>
> (set (reg:DI 162)
> (mem/u/c:DI
>(const:DI (unspec:DI
>
On Fri, Jul 26, 2024 at 2:28 PM Jiang, Haochen wrote:
>
> Ping for this patch
>
> Thx,
> Haochen
>
> > -Original Message-
> > From: Haochen Jiang
> > Sent: Thursday, July 18, 2024 9:45 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Liu, Hongtao ; hjl.to...@gmail.com;
> > ubiz...@gmail.com
> >
On Fri, Jul 26, 2024 at 2:59 PM liuhongt wrote:
>
> (insn 98 94 387 2 (parallel [
> (set (reg:TI 337 [ _32 ])
> (ashift:TI (reg:TI 329)
> (reg:QI 521)))
> (clobber (reg:CC 17 flags))
> ]) "test.c":11:13 953 {ashlti3_doubleword}
>
On Thu, Jul 25, 2024 at 3:23 PM Hongtao Liu wrote:
>
> On Wed, Jul 24, 2024 at 3:57 PM liuhongt wrote:
> >
> > For the pattern below, RA may still allocate r162 as a v/k register and try to
> > reload the address with leaq __libc_tsd_CTYPE_B@gottpoff(%rip), %rsi
> > >
On Fri, Jul 26, 2024 at 4:55 PM Haochen Jiang wrote:
>
> Hi all,
>
> I added related O0 testcase in this patch.
>
> Ok for trunk and backport to GCC 14 and GCC 13?
Ok.
>
> Thx,
> Haochen
>
> ---
>
> Changes in v2: Add testcases.
>
> ---
>
> Under -O0, with the "newly" introduced intrins, the varia
On Tue, Jul 30, 2024 at 9:27 AM Hongtao Liu wrote:
>
> On Fri, Jul 26, 2024 at 4:55 PM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > I added related O0 testcase in this patch.
> >
> > Ok for trunk and backport to GCC 14 and GCC 13?
> Ok.
I mean for tru
On Wed, Jul 31, 2024 at 2:08 PM Kong, Lingling wrote:
>
> *add_4 and *adddi_4 are for shorter opcode from cmp to inc/dec or add
> $128.
>
> But NDD code is longer than the cmp code, so there is no need to support NDD.
>
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for tr
On Wed, Jul 31, 2024 at 1:06 AM Uros Bizjak wrote:
>
> On Tue, Jul 30, 2024 at 3:00 PM Richard Biener wrote:
> >
> > On Tue, 30 Jul 2024, Alexander Monakov wrote:
> >
> > >
> > > On Tue, 30 Jul 2024, Richard Biener wrote:
> > >
> > > > > Oh, and please add a small comment why we don't use XFmode
On Wed, Jul 31, 2024 at 3:17 PM Uros Bizjak wrote:
>
> On Wed, Jul 31, 2024 at 9:11 AM Hongtao Liu wrote:
> >
> > On Wed, Jul 31, 2024 at 1:06 AM Uros Bizjak wrote:
> > >
> > > On Tue, Jul 30, 2024 at 3:00 PM Richard Biener wrote:
> > > >
>
On Tue, Jul 30, 2024 at 1:05 PM Hongyu Wang wrote:
>
> Richard Biener 于2024年7月26日周五 19:45写道:
> >
> > On Fri, Jul 26, 2024 at 10:50 AM Hongyu Wang wrote:
> > >
> > > Hi,
> > >
> > > When introducing munroll-only-small-loops, the option was marked as
> > > Target Save and added to -O2 default whic
On Thu, Aug 1, 2024 at 10:03 AM Kong, Lingling wrote:
>
>
>
> > -Original Message-
> > From: Liu, Hongtao
> > Sent: Thursday, August 1, 2024 9:35 AM
> > To: Kong, Lingling ; gcc-patches@gcc.gnu.org
> > Cc: Wang, Hongyu
> > Subject: RE: [PATCH] i386: Fix memory constraint for APX NF
> >
>
On Tue, Jul 30, 2024 at 11:04 AM liuhongt wrote:
>
> (insn 98 94 387 2 (parallel [
> (set (reg:TI 337 [ _32 ])
> (ashift:TI (reg:TI 329)
> (reg:QI 521)))
> (clobber (reg:CC 17 flags))
> ]) "test.c":11:13 953 {ashlti3_doubleword}
>
On Sat, Apr 13, 2024 at 6:42 AM H.J. Lu wrote:
>
> The x86 instruction size limit is 15 bytes. If a NDD instruction has
> a segment prefix byte, a 4-byte opcode prefix, a MODRM byte, a SIB byte,
> a 4-byte displacement and a 4-byte immediate, adding an address size
> prefix will exceed the size l
On Wed, Apr 24, 2024 at 1:46 PM Haochen Jiang wrote:
>
> Hi all,
>
> When we are using -mavx10.1-256 in command line and avx10.1-256 in
> target attribute together, zmm should never be generated. But current
> GCC will generate zmm since it wrongly enables EVEX512 for non-explicitly
> set AVX512.
On Tue, Apr 30, 2024 at 3:38 PM Jakub Jelinek wrote:
>
> On Tue, Apr 30, 2024 at 09:30:00AM +0200, Richard Biener wrote:
> > On Mon, Apr 29, 2024 at 5:30 PM H.J. Lu wrote:
> > >
> > > On Mon, Apr 29, 2024 at 6:47 AM liuhongt wrote:
> > > >
> > > > The Fortran standard does not specify what the r
CC uros.
On Mon, May 6, 2024 at 11:03 AM Kong, Lingling wrote:
>
> Hi,
> (if_then_else:SI (eq (reg:CCZ 17 flags)
> (const_int 0 [0]))
> (reg/v:SI 101 [ e ])
> (reg:SI 102))
> The cost is 8 for the rtx, the cost for
> (eq (reg:CCZ 17 flags) (const_int 0 [0])) is 4, but this is just
On Mon, May 6, 2024 at 3:40 PM Kong, Lingling wrote:
>
> Hi,
> Originally eliminate_regs_in_insn will transform
> (parallel [
> (set (reg:QI 130)
> (plus:QI (subreg:QI (reg:DI 19 frame) 0)
> (const_int 96)))
> (clobber (reg:CC 17 flag))]) {*addqi_1}
> to
> (set (reg:QI 130)
> (subr
On Wed, May 8, 2024 at 10:13 AM Hu, Lin1 wrote:
>
> Hi all,
>
> This patch aims to fix some intrinsics that have no alignment requirement but
> raise a runtime error.
>
> Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR target/845
On Fri, May 10, 2024 at 6:26 AM Roger Sayle wrote:
>
>
> The following one-line patch improves the code generated for V8QI and V4QI
> shifts when AVX512BW and AVX512VL functionality is available.
+ /* With AVX512 its cheaper to do vpmovsxbw/op/vpmovwb. */
+ && !(TARGET_AVX512BW && TARGET
, that would also fix this mem operand
> issue. I hope to submit it for review this weekend.
I opened a PR for that. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
>
> Thanks again,
> Roger
>
> > From: Hongtao Liu
> > On Fri, May 10, 2024 at 6:26 AM Roger Sayle
>
On Mon, Aug 26, 2024 at 2:43 PM Haochen Jiang wrote:
>
> Hi all,
>
> I have just committed AVX10.2 new instructions patches into trunk hours
> ago. The next and final part for AVX10.2 upstream is to optimize code
> with AVX10.2 new instructions.
>
> In this patch series, it will contain the followi
On Mon, Sep 2, 2024 at 4:33 PM Levy Hsu wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> This patch introduces new mode iterators and expands for the i386
> architecture to support partial vectorization of bf16 operations using
> AVX10.2 instructions. Thes
On Mon, Sep 2, 2024 at 4:42 PM Levy Hsu wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
Ok.
>
> This patch supports sminmax for partial vectorized V2BF/V4BF.
>
> gcc/ChangeLog:
>
> * config/i386/mmx.md (3): New define_expand for
> V2BF/V4BFsmaxmin
>
>
On Tue, Sep 3, 2024 at 9:45 AM Jiang, Haochen via Gcc-regression
wrote:
>
> As with each AVX10.2 testcase previously, this is caused by the option combination
> warning, which is expected.
>
Can we put the warning for mixed usage of mavx10 and -mavx512f under -Wpsabi?
And add -Wno-psabi in addition to -ma
On Tue, Sep 3, 2024 at 2:24 PM Haochen Jiang wrote:
>
> Hi all,
>
> The intrinsic for non-optimized builds has a typo in its mask type, which
> causes the high bits of __mmask32 to be unexpectedly zeroed.
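
To illustrate how a wrong mask type can zero the high bits, here is a minimal C sketch. It assumes the mistyped mask was a 16-bit type, which the message above does not state. mask16_t, mask32_t and both functions are hypothetical stand-ins, not the real intrinsic definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the mask types; the real __mmask32 lives
   in the AVX-512 intrinsic headers. */
typedef uint16_t mask16_t;
typedef uint32_t mask32_t;

/* Assumed shape of the bug: the mask is passed through a narrower
   type, so the conversion drops bits 16..31. */
static inline mask32_t
pass_mask_wrong_type (mask32_t m)
{
  mask16_t narrowed = (mask16_t) m;   /* high 16 bits are lost here */
  return narrowed;
}

/* With the correct type, all 32 mask bits survive. */
static inline mask32_t
pass_mask_right_type (mask32_t m)
{
  mask32_t same_width = m;
  return same_width;
}
```

With an input of 0xFFFF0001, the first function returns 0x00000001 while the second returns the input unchanged.
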
>
> The test does not fail under O0 with current 1b since the testcase is
> wrong. We need to include a
On Wed, Sep 4, 2024 at 11:31 AM Levy Hsu wrote:
>
> Hi
>
> Bootstrapped and tested on x86-64-pc-linux-gnu.
> Ok for trunk?
Ok.
>
> This patch introduces support for vectorized FMA operations for bf16 types in
> V2BF and V4BF modes on the i386 architecture. New mode iterators and
> define_expand en
On Wed, Sep 4, 2024 at 10:53 AM Levy Hsu wrote:
>
> Hi
>
> This patch adds support for bf16 operations in V2BF and V4BF modes on i386,
> handling signbit, xorsign, copysign, abs, neg, and various logical operations.
>
> Bootstrapped and tested on x86-64-pc-linux-gnu.
> Ok for trunk?
Ok.
>
> gcc/Ch
On Wed, Sep 4, 2024 at 9:32 AM Levy Hsu wrote:
>
> Hi
>
> This change adds BFmode support to the ix86_preferred_simd_mode function
> enhancing SIMD vectorization for BF16 operations. The update ensures
> optimized usage of SIMD capabilities improving performance and aligning
> vector sizes with pr
On Fri, Sep 6, 2024 at 10:34 AM Jiang, Haochen wrote:
>
> > From: Levy Hsu
> > Sent: Thursday, September 5, 2024 4:55 PM
> > To: gcc-patches@gcc.gnu.org
> >
> > Simple testcase fix, ok for trunk?
> >
> > This patch removes specific register checks to account for possible
> > register spills and d
On Tue, Sep 10, 2024 at 3:35 PM Levy Hsu wrote:
>
> Simple testcase fix, ok for trunk?
Ok.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx10_2-partial-bf-vector-fma-1.c: Separated 32-bit
> scan
> and removed register checks in spill situations.
> ---
> .../i386/avx10_2-par
On Thu, Sep 5, 2024 at 10:05 AM Haochen Jiang wrote:
>
> Hi all,
>
> In avx512f-mask-type.h, SIZE must be defined for MASK_TYPE to be
> defined correctly. Fix those testcases where SIZE is not
> defined before the include of avx512f-mask-type.h.
>
> Note that for convert intrins in AVX10.2, t
On Wed, Sep 11, 2024 at 4:04 PM Richard Biener
wrote:
>
> On Wed, Sep 11, 2024 at 4:17 AM liuhongt wrote:
> >
> > GCC12 enables vectorization for O2 with very cheap cost model which is
> > restricted
> > to constant tripcount. The vectorization capacity is very limited w/
> > consideration
> >
On Thu, Sep 12, 2024 at 9:55 AM Levy Hsu wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386/i386.cc (ix86_get_mask_mode):
> Enable BFmode for targetm.vectorize.get_mask_mode with AVX10.2.
> * config/
On Wed, Sep 11, 2024 at 4:21 PM Hongtao Liu wrote:
>
> On Wed, Sep 11, 2024 at 4:04 PM Richard Biener
> wrote:
> >
> > On Wed, Sep 11, 2024 at 4:17 AM liuhongt wrote:
> > >
> > > GCC12 enables vectorization for O2 with very cheap cost model which is
>
On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote:
>
>
>
> On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote:
>>
>> > Changelog
>> > gcc/
>> >* config/i386/avx512vbmi2intrin.h
>> >(_mm512_[,mask_,maskz_]shrdi_epi16,
>> >_mm512_[,mask_,maskz_]shrdi_epi32,
>> >_m512_
On Tue, Feb 18, 2020 at 7:00 PM Hongtao Liu wrote:
>
> On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote:
> >
> >
> >
> > On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote:
> >>
> >> > Changelog
> >> > gcc/
> >> >
Hi:
This patch enables the missing avx512f intrinsics listed below:
_mm_mask_roundscale_sd
_mm_mask_roundscale_round_sd
_mm_maskz_roundscale_sd
_mm_maskz_roundscale_round_sd
_mm_mask_roundscale_ss
_mm_mask_roundscale_round_ss
_mm_maskz_roundscale_ss
_mm_maskz_roundscale_round_ss
Bootstrap ok, reg
On Sat, Oct 12, 2019 at 4:15 PM Jakub Jelinek wrote:
>
> Hi!
>
> > gcc/
> > * config/i386/avx512fintrin.h (_mm_mask_roundscale_ss,
> > _mm_maskz_roundscale_ss, _mm_maskz_roundscale_round_ss,
> > _mm_maskz_roundscale_round_ss, _mm_mask_roundscale_sd,
> > _mm_maskz_roundscale
On Mon, Oct 21, 2019 at 1:15 AM Gerald Pfeifer wrote:
>
> On Fri, 11 Oct 2019, liuho...@gcc.gnu.org wrote:
> > commit 63fbcfeaf27d9dd2083ccbd34bdff8fccb63949c
> > Author: liuhongt
> > Date: Fri Oct 11 14:27:47 2019 +0800
> >
> > Update gcc10 changes with new intel ISA.
>
> I just applied th
On Sat, Nov 16, 2019 at 7:27 AM Jeff Law wrote:
>
> On 11/14/19 5:21 AM, Richard Biener wrote:
> > On Tue, Nov 12, 2019 at 11:35 AM Hongtao Liu wrote:
> >>
> >> Hi:
> >> As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html
> >
Hi Jakub:
VF is used to differentiate AVX512F/AVX/SSE, but avx512f_maskcmp3 has the
condition TARGET_AVX512F, so VF must be a typo
and should be VF_AVX512VL instead.
Bootstrap and regression tests on the i386/x86_64 backend are ok.
OK for trunk?
diff --git a/gcc/config/i386/sse.md b/gcc/config/i3
Hi:
Currently for VCOND_EXPR, integer mask operations are only available
for 512-bit vectors, but since the mask register is tied to the ISA rather
than the vector size, under avx512f we can also have 128/256-bit vector
conditional moves. My local tests show there's no boost frequency penalty
for using integer mask reg
Hi Uros and all:
This patch enables support for ENQCMD (Enqueue Command),
which will be in Willow Cove.
There are two instructions for ENQCMD: ENQCMD and ENQCMDS. For more
details, please refer to
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-exte
On Fri, May 24, 2019 at 3:51 PM Uros Bizjak wrote:
>
> On Fri, May 24, 2019 at 9:43 AM Uros Bizjak wrote:
> >
> > On Fri, May 24, 2019 at 7:16 AM Hongtao Liu wrote:
> > >
> > > Hi Uros and all:
> > > This patch is about to enable support for E
On Thu, May 30, 2019 at 3:23 AM Jeff Law wrote:
>
> On 5/9/19 10:54 PM, Hongtao Liu wrote:
> > On Fri, May 10, 2019 at 3:55 AM Jeff Law wrote:
> >>
> >> On 5/6/19 11:38 PM, Hongtao Liu wrote:
> >>> Hi Uros and GCC:
> >>> This patch is
On Sat, Jun 1, 2019 at 6:08 AM Jeff Law wrote:
>
> On 5/30/19 2:53 AM, Hongtao Liu wrote:
> > On Thu, May 30, 2019 at 3:23 AM Jeff Law wrote:
> >> On 5/9/19 10:54 PM, Hongtao Liu wrote:
> >>> On Fri, May 10, 2019 at 3:55 AM Jeff Law wrote:
> >>
Hi Jeff:
The following patch adds forgotten avx512f fpclass intrinsics for
masked scalar operations.
Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
ok for trunk?
Changelog:
gcc/
+2019-03-24 Hongtao Liu
+
+ PR target/89803
+ * config/i386/avx512dqintrin.h
On Mon, Jun 3, 2019 at 7:06 PM Jakub Jelinek wrote:
>
> On Mon, Jun 03, 2019 at 06:01:40PM +0800, Hongtao Liu wrote:
> > The following patch adds forgotten avx512f fpclass intrinsics for
> > masked scalar operations.
> >
> > Bootstrapped/regtested on x86_64-li
On Tue, Jun 4, 2019 at 3:59 PM Jakub Jelinek wrote:
>
> On Tue, Jun 04, 2019 at 03:38:08PM +0800, Hongtao Liu wrote:
> > --- gcc/ChangeLog (revision 271853)
> > +++ gcc/ChangeLog (working copy)
> > @@ -4706,6 +4706,26 @@
> > reprocessing. Always ca
On Tue, Jun 4, 2019 at 5:21 PM Jakub Jelinek wrote:
>
> On Tue, Jun 04, 2019 at 05:00:05PM +0800, Hongtao Liu wrote:
> > Thanks for the reminder, here is the updated patch:
>
> You've missed some notes. Ok for trunk with:
> 1) the following patch applied on top of your patch
> 2
On Tue, Jun 4, 2019 at 5:56 PM Hongtao Liu wrote:
>
> On Tue, Jun 4, 2019 at 5:21 PM Jakub Jelinek wrote:
> >
> > On Tue, Jun 04, 2019 at 05:00:05PM +0800, Hongtao Liu wrote:
> > > Thanks for the reminder, here is the updated patch:
> >
> > You've missed some not
ed on x86_64-linux and i686-linux (on skylake-avx512),
ok for trunk?
Changelog
gcc/
2019-06-05 Hongtao Liu
* config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
(define_insn "avx512dq_fpclass"):
Enable memory operand for it.
(define_insn "avx512dq_vmfpclass"): Ditto.
On Thu, Jun 6, 2019 at 6:18 AM Jeff Law wrote:
>
> On 6/5/19 1:39 AM, Hongtao Liu wrote:
> > Hi Jeff and Jakub:
> > When adding new intrinsics (PR target/89803), I found that the vfpclassp[sd] and
> > vfpclasss[sd] patterns didn't support a memory operand, which is
> > suppo
-instruction-set-extensions-programming-reference.pdf
Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
Changelog:
gcc/
+2019-06-06 Hongtao Liu
+ H.J. Lu
+ Olga Makhotina
+
+ * common/config/i386/i386-common.c
+ (OPTION_MASK_ISA_AVX512VP2INTERSECT_SET
On Sat, Jun 8, 2019 at 4:12 AM Uros Bizjak wrote:
>
> On 6/7/19, H.J. Lu wrote:
>
> >> > > +/* Register pair. */
> >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI */
> >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI P4QI */
> >> > >
> >> > > I think
> >> > >
> >> > > INT_MODE (P2QI,
On Thu, Jun 20, 2019 at 2:13 PM Uros Bizjak wrote:
>
> On Thu, Jun 20, 2019 at 7:36 AM Hongtao Liu wrote:
> >
> > On Sat, Jun 8, 2019 at 4:12 AM Uros Bizjak wrote:
> > >
> > > On 6/7/19, H.J. Lu wrote:
> > >
> > > >> > > +/* Re
On Thu, Jun 20, 2019 at 10:58 PM H.J. Lu wrote:
>
> On Thu, Jun 20, 2019 at 3:54 AM Hongtao Liu wrote:
> >
> > On Thu, Jun 20, 2019 at 2:13 PM Uros Bizjak wrote:
> > >
> > > On Thu, Jun 20, 2019 at 7:36 AM Hongtao Liu wrote:
> > > >
> >
On Thu, Jun 20, 2019 at 7:37 PM Uros Bizjak wrote:
>
> On Thu, Jun 20, 2019 at 12:54 PM Hongtao Liu wrote:
> >
> > On Thu, Jun 20, 2019 at 2:13 PM Uros Bizjak wrote:
> > >
> > > On Thu, Jun 20, 2019 at 7:36 AM Hongtao Liu wrote:
> > > >
> >
On Fri, Jun 21, 2019 at 1:56 PM Uros Bizjak wrote:
>
> On Fri, Jun 21, 2019 at 4:21 AM Hongtao Liu wrote:
> >
> > On Thu, Jun 20, 2019 at 10:58 PM H.J. Lu wrote:
> > >
> > > On Thu, Jun 20, 2019 at 3:54 AM Hongtao Liu wrote:
> > > >
> >
his or attach the patch instead.
> > >> >>
> > >> >> > Index: ChangeLog
> > >> >> > ===
> > >> >> > --- ChangeLog (revision 272668)
> > >> >>
On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
>
> 2019-08-28 Uroš Bizjak
>
> * config/i386/i386.c (ix86_register_move_cost): Do not
> limit the cost of moves to/from XMM register to minimum 8.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Actually committe
On Fri, Aug 30, 2019 at 8:10 AM Hongtao Liu wrote:
>
> On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
> >
> > 2019-08-28 Uroš Bizjak
> >
> > * config/i386/i386.c (ix86_register_move_cost): Do not
> > limit the cost of moves to/from XMM registe
On Fri, Aug 30, 2019 at 2:18 PM Uros Bizjak wrote:
>
> On Fri, Aug 30, 2019 at 2:08 AM Hongtao Liu wrote:
> >
> > On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
> > >
> > > 2019-08-28 Uroš Bizjak
> > >
> > > * config/i386/i386.c
> which is not the case with core_cost (and similar with skylake_cost):
>
> 2, 2, 4,/* cost of moving XMM,YMM,ZMM register */
> {6, 6, 6, 6, 12},/* cost of loading SSE registers
>in 32,64,128,256 and 512-bit */
> {6, 6, 6, 6, 12},
On Mon, Sep 2, 2019 at 6:23 PM Richard Biener
wrote:
>
> On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote:
> >
> > > which is not the case with core_cost (and similar with skylake_cost):
> > >
> > > 2, 2, 4,/* cost of moving XMM,YMM,
On Mon, Sep 2, 2019 at 4:41 PM Uros Bizjak wrote:
>
> On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote:
> >
> > > which is not the case with core_cost (and similar with skylake_cost):
> > >
> > > 2, 2, 4,/* cost of moving XMM,YMM,
On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote:
>
> On Tue, Sep 3, 2019 at 1:33 PM Richard Biener
> wrote:
>
> > > > Note:
> > > > Removing the cost limit would introduce lots of regressions in SPEC2017,
> > > > as follows:
> > > >
> > > > 531.deepsjeng_r -7.18%
On Wed, Sep 4, 2019 at 9:44 AM Hongtao Liu wrote:
>
> On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote:
> >
> > On Tue, Sep 3, 2019 at 1:33 PM Richard Biener
> > wrote:
> >
> > > > > Note:
> > > > > Removing limit of cost would in
Hi Uros:
This patch extends pass rpad to handle AVX512F vcvtusi2ss/vcvtusi2sd.
538.image_r would be improved by 4% with a single-copy run on a Skylake
workstation.
Bootstrap ok. Regression test for i386/x86 backend ok.
Ok for trunk?
Changelog
gcc/
* config/i386/i386.md
(*floatuns2_avx512)
to invent something like SPECIAL_INT_MODE, which would
> avoid mode promotion functionality (basically, it should not be listed
> in mode_wider and similar arrays). This would prevent mode promotion
> issues, while it would still allow to have mode, having the same width
> as existing mode, but with special properties.
>
> I'
On Wed, Jun 26, 2019 at 1:13 AM Uros Bizjak wrote:
>
> On Tue, Jun 25, 2019 at 4:44 AM Hongtao Liu wrote:
> >
> > On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak wrote:
> > >
> > > On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu wrote:
> > >
>
On Wed, Jun 26, 2019 at 5:21 PM Martin Liška wrote:
>
> Hi.
>
> Starting from r272668 I see:
>
> /tmp/ccqxwVjt.s: Assembler messages:
>
> /tmp/ccqxwVjt.s:22: Error: no such instruction: `vp2intersectq
> .LC1(%rip),%zmm0,%k0'
>
> /tmp/ccqxwVjt.s:33: Error: no such instruction: `vp2intersectd
> .LC
_avx512ifma { } {
> > return [check_no_compiler_messages avx512ifma object {
>
> as usual, the new effective-target keyword needs documenting in
> sourcebuild.texi.
Like this?
Index: ChangeLog
===
--- ChangeLog (revis
> single space. Please fix this or attach the patch instead.
>
> > Index: ChangeLog
> > ===
> > --- ChangeLog (revision 272668)
> > +++ ChangeLog (working copy)
> > @@ -1,3 +1,8 @@
> > +2019-06-27
> > =======
> >> > --- ChangeLog (revision 272668)
> >> > +++ ChangeLog (working copy)
> >> > @@ -1,3 +1,8 @@
> >> > +2019-06-27 Hongtao Liu
> >> > +
> >> >
269894)
+++ ChangeLog (working copy)
@@ -1,3 +1,16 @@
+2019-03-24 Hongtao Liu
+
+ PR target/89803
+ * config/i386/avx512dqintrin.h
+ (_mm_mask_fpclass_ss_mask,_mm_mask_fpclass_sd_mask):
+ New intrinsics.
+ * config/i386/i386-builtin.def
+ (__builtin_ia32_fpcla_mask
Hi Uros:
Would you help review this patch?
Regards,
Hongtao.
On Sun, Mar 24, 2019 at 8:13 PM Hongtao Liu wrote:
>
> Hi:
> The following patch adds forgotten avx512f fpclass intrinsics for
> masked scalar operations.
>
> Bootstrapped/regtested on x86_64-linux and i686
On Sat, Mar 30, 2019 at 5:34 AM Jeff Law wrote:
>
> On 3/28/19 1:38 AM, Uros Bizjak wrote:
> > On Thu, Mar 28, 2019 at 7:47 AM Hongtao Liu wrote:
> >>
> >> Hi Uros:
> >> would you help to review this patch?
> >
> > This is AVX512F patch, you w
On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak wrote:
>
> On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao wrote:
> >
> > Hi :
> > This patch enables support for bfloat16, which will be in the
> > future Cooper Lake. Please refer to
> > https://software.intel.com/en-us/download/intel-archite
On Tue, Apr 16, 2019 at 11:41 PM H.J. Lu wrote:
>
> On Tue, Apr 16, 2019 at 8:36 AM Martin Liška wrote:
> >
> > On 4/16/19 4:50 PM, H.J. Lu wrote:
> > > On Tue, Apr 16, 2019 at 1:28 AM Martin Liška wrote:
> > >>
> > >> On 4/15/19 5:09 PM, H.J. Lu wrote:
> > >>> On Mon, Apr 15, 2019 at 12:26 AM M