On Tue, Aug 3, 2021 at 7:12 PM Hongtao Liu wrote:
>
> On Tue, Aug 3, 2021 at 6:20 PM Richard Biener
> wrote:
> >
> > On Tue, Aug 3, 2021 at 11:20 AM Richard Biener
> > wrote:
> > >
> > > On Wed, Jul 28, 2021 at 4:51 AM Hongtao Liu via Gcc-patches
On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> wrote:
> >
> > Richard Biener via Gcc-patches writes:
> > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt wrote:
> > >>
> > >> Hi:
> > >> ---
> > >> OK, I think sth is amiss he
On Tue, Aug 10, 2021 at 4:11 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch fixes some typos in the amxbf16-dpbf16ps-2 test.
>
> Tested under sde/spr machine and passed.
>
> OK for master and backport to GCC 11?
OK for master, and I don't think the backport is necessary.
>
> gcc/testsuite
On Tue, Aug 10, 2021 at 4:44 PM Jakub Jelinek wrote:
>
> Hi!
>
> On the following testcase we emit
> vmovdqa32 .LC0(%rip), %zmm1
> vpermd %zmm0, %zmm1, %zmm0
> and
> vmovdqa64 .LC1(%rip), %zmm1
> vpermq %zmm0, %zmm1, %zmm0
> instead of
> vshufi
On Tue, Aug 10, 2021 at 4:54 PM Jakub Jelinek wrote:
>
> Hi!
>
> When working on the PR, I've noticed we generate terrible code for
> V32HImode or V64QImode permutations for -mavx512f -mno-avx512bw.
> Generally we can't do much with such permutations, but since PR68655
> we can handle at least som
On Wed, Aug 11, 2021 at 3:58 PM Jakub Jelinek wrote:
>
> On Wed, Aug 11, 2021 at 02:43:06PM +0800, liuhongt wrote:
> > Add define_insn_and_split to combine avx_vec_concatv16si/2 and
> > avx512f_zero_extendv16hiv16si2_1 since the latter already zero_extend
> > the upper bits, similar for other pa
On Wed, Aug 11, 2021 at 7:16 PM Uros Bizjak wrote:
>
> On Wed, Aug 11, 2021 at 8:36 AM Uros Bizjak wrote:
> >
> > On Tue, Aug 10, 2021 at 2:13 PM liuhongt wrote:
> > >
> > > Hi:
> > > AVX512F supports vscalefs{s,d}, which is the same as ldexp except that the
> > > second operand should be floating
On Thu, Aug 12, 2021 at 12:05 PM liuhongt wrote:
>
> Hi:
> This is the patch I'm going to check in.
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,};
>
>
> 2021-08-12 Uros Bizjak
>
> gcc/ChangeLog:
>
> PR target/98309
> * config/i386/i386.md (avx512f_scalef2): New
>
On Thu, Aug 12, 2021 at 3:49 PM Jakub Jelinek wrote:
>
> Hi!
>
> My patch from yesterday apparently broke some V32HImode permutations
> as the testcase shows.
> The first function assumed it would never be called in d->testing_p mode
> and so went right away into emitting the code.
> And the secon
On Thu, Aug 12, 2021 at 5:23 PM Jakub Jelinek wrote:
>
> On Thu, Aug 12, 2021 at 01:43:23PM +0800, liuhongt wrote:
> > Hi:
> > This is another patch to optimize vec_perm_expr to match vpmov{dw,dq,wb}
> > under AVX512.
> > For scenarios(like pr101846-2.c) where the upper half is not used, this
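A rough illustration of why a permutation can stand in for a truncation here: on a little-endian target, narrowing each 32-bit lane to 16 bits gives the same result as viewing the source as 16-bit elements and selecting the even-indexed ones. This is only a scalar sketch; the function names and the lane count of eight are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Narrow eight 32-bit lanes to eight 16-bit lanes, keeping the low
   half of each lane (the effect of a vpmov-style truncation).  */
void trunc_dw_model(uint16_t dst[8], const uint32_t src[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint16_t)src[i];
}

/* The same result expressed as a permutation: reinterpret the source
   as sixteen 16-bit elements and pick elements 0, 2, 4, ...
   Assumes a little-endian target, where the low half of each 32-bit
   lane is the even-indexed 16-bit element.  */
void perm_even_model(uint16_t dst[8], const uint32_t src[8])
{
    uint16_t halves[16];
    memcpy(halves, src, sizeof halves);
    for (int i = 0; i < 8; i++)
        dst[i] = halves[2 * i];
}
```

Both functions fill dst with the same values on x86, which is the kind of equivalence that lets a constant permutation be recognized as a truncation.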
On Mon, Aug 16, 2021 at 3:11 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Mon, Aug 16, 2021 at 01:18:38PM +0800, liuhongt via Gcc-patches wrote:
> > + /* Accept VNxHImode and VNxQImode now. */
> > + if (!TARGET_AVX512VL && GET_MODE_SIZE (mode) < 64)
> > +return false;
> > +
> > + /* vper
On Mon, Aug 16, 2021 at 3:25 PM Hongtao Liu wrote:
>
> On Mon, Aug 16, 2021 at 3:11 PM Jakub Jelinek via Gcc-patches
> wrote:
> >
> > On Mon, Aug 16, 2021 at 01:18:38PM +0800, liuhongt via Gcc-patches wrote:
> > > + /* Accept VNxHImode and VNxQImode now. */
On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote:
>
> On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> wrote:
> >
> > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > wrote:
> > >
> > > Richard Biener via Gcc-patches writes:
>
On Fri, Aug 6, 2021 at 2:06 PM Hongtao Liu wrote:
>
> On Tue, Aug 3, 2021 at 10:44 AM Hongtao Liu wrote:
> >
> > On Tue, Aug 3, 2021 at 3:34 AM Joseph Myers wrote:
> > >
> > > On Mon, 2 Aug 2021, liuhongt via Gcc-patches wrote:
> > >
> > > &
On Tue, Aug 17, 2021 at 8:56 PM H.J. Lu via Gcc-patches
wrote:
>
> On Tue, Aug 17, 2021 at 5:43 AM liuhongt via Gcc-patches
> wrote:
> >
> > This reverts commit 872da9a6f664a06d73c987aa0cb2e5b830158a10.
> >
> > PR target/101936
> > PR target/101929
> >
> > Bootstrapped and regtested on x86_64-l
On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches
wrote:
>
> On Tue, Aug 17, 2021 at 3:29 PM Richard Biener via Gcc-patches
> wrote:
> >
> > This is an attempt to start moving the x86 backend to use
> > standard pattern names for [mask_]gather_load and [mask_]scatter_store
> > rathe
On Wed, Aug 18, 2021 at 11:24 AM Hongtao Liu wrote:
>
> On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches
> wrote:
> >
> > On Tue, Aug 17, 2021 at 3:29 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > This is an attempt to start
On Tue, Aug 17, 2021 at 5:06 PM liuhongt wrote:
>
> Hi:
> This patch adds a new x86 tune named X86_TUNE_V2DF_REDUCTION_PREFER_HADDPD
> to enable haddpd for v2df vector reduction; the tune is disabled by default.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> Ok for trunk?
>
Pushe
On Wed, Aug 18, 2021 at 4:32 PM Richard Biener wrote:
>
> On Wed, 18 Aug 2021, Hongtao Liu wrote:
>
> > On Wed, Aug 18, 2021 at 11:24 AM Hongtao Liu wrote:
> > >
> > > On Tue, Aug 17, 2021 at 10:43 PM Richard Biener via Gcc-patches
> > > wrote:
>
On Wed, Aug 18, 2021 at 5:54 PM Richard Biener wrote:
>
>
> So in the end I seem to be able to combine AVX & AVX512 arriving
> at the following which passes basic testing. I will now see to
> teach the vectorizer the required "promotion" to handle
> mask_gather_loadv4dfv4si and mask_gather_loadv4
On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote:
>
> On Wed, 18 Aug 2021, Richard Biener wrote:
>
> >
> > So in the end I seem to be able to combine AVX & AVX512 arriving
> > at the following which passes basic testing. I will now see to
> > teach the vectorizer the required "promotion" to h
On Wed, Aug 18, 2021 at 7:30 PM Hongtao Liu wrote:
>
> On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote:
> >
> > On Wed, 18 Aug 2021, Richard Biener wrote:
> >
> > >
> > > So in the end I seem to be able to combine AVX & AVX512 arriving
> >
On Wed, Aug 18, 2021 at 7:37 PM Hongtao Liu wrote:
>
> On Wed, Aug 18, 2021 at 7:30 PM Hongtao Liu wrote:
> >
> > On Wed, Aug 18, 2021 at 6:28 PM Richard Biener wrote:
> > >
> > > On Wed, 18 Aug 2021, Richard Biener wrote:
> > >
> > > > >
On Sun, Aug 22, 2021 at 8:54 PM H.J. Lu via Gcc-patches
wrote:
>
> In vector move pattern, replace nonimmediate_or_sse_const_operand with
> nonimmediate_or_sse_const_vector_operand to allow vector load from
> non-uniform CONST_VECTOR. Non-uniform CONST_VECTOR is enabled only in
> the combine pass
On Tue, Aug 24, 2021 at 9:01 AM H.J. Lu via Gcc-patches
wrote:
>
> Broadcast from integer to a pseudo vector register instead of a hard
> vector register to allow LRA to remove redundant move instruction after
> broadcast.
>
> gcc/
>
> PR target/102021
> * config/i386/i386-expand.c
On Mon, Aug 23, 2021 at 9:14 PM H.J. Lu wrote:
>
> On Mon, Aug 23, 2021 at 03:23:26PM +0800, Hongtao Liu wrote:
> > On Sun, Aug 22, 2021 at 8:54 PM H.J. Lu via Gcc-patches
> > wrote:
> > >
> > > In vector move pattern, replace no
On Tue, Aug 24, 2021 at 9:43 AM H.J. Lu wrote:
>
> On Mon, Aug 23, 2021 at 6:17 PM Hongtao Liu wrote:
> >
> > On Tue, Aug 24, 2021 at 9:01 AM H.J. Lu via Gcc-patches
> > wrote:
> > >
> > > Broadcast from integer to a pseudo vector register instead of a
On Tue, Aug 17, 2021 at 9:53 AM Hongtao Liu wrote:
>
> On Fri, Aug 6, 2021 at 2:06 PM Hongtao Liu wrote:
> >
> > On Tue, Aug 3, 2021 at 10:44 AM Hongtao Liu wrote:
> > >
> > > On Tue, Aug 3, 2021 at 3:34 AM Joseph Myers
> > > wrote:
> > >
On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote:
>
> On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote:
> >
> > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
>
On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote:
>
> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote:
> >
> > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu wrote:
> > >
> > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > > wro
On Tue, Aug 24, 2021 at 9:36 AM liuhongt wrote:
>
> Also optimize the 3 forms below to vpternlog; op1, op2, op3 are
> register_operand or unary_p as (not reg)
>
> A: (any_logic (any_logic op1 op2) op3)
> B: (any_logic (any_logic op1 op2) (any_logic op3 op4)) op3/op4 should
> be equal to op1/op2
> C: (
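As a sketch of why nested logic operations on three inputs can collapse into one ternary-logic instruction: each result bit is looked up in an 8-bit truth table indexed by the three corresponding input bits. The helper below is a scalar model only; the names ternlog, TL_A, TL_B and TL_C are made up for illustration and do not come from the patch. Evaluating an expression on the three constants yields the immediate that encodes it.

```c
#include <stdint.h>

/* Truth-table constants: evaluating a logic expression on these three
   values yields the imm8 that encodes the expression.  */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Bitwise model of a three-input ternary-logic operation on 64-bit
   values.  For every bit position the three input bits form an index
   (a is bit 2, b is bit 1, c is bit 0) into the 8-bit table imm8.  */
uint64_t ternlog(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
    uint64_t r = 0;
    for (int bit = 0; bit < 64; bit++) {
        unsigned idx = (unsigned)((((a >> bit) & 1) << 2)
                                  | (((b >> bit) & 1) << 1)
                                  | ((c >> bit) & 1));
        r |= (uint64_t)((imm8 >> idx) & 1) << bit;
    }
    return r;
}
```

For form A with (a & b) | c, the table is (TL_A & TL_B) | TL_C, so ternlog(a, b, c, (TL_A & TL_B) | TL_C) computes the same value as the two separate operations.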
On Tue, Aug 24, 2021 at 6:25 PM liuhongt wrote:
>
> gcc/ChangeLog:
>
> PR target/101989
> * config/i386/sse.md (_vternlog):
> Enable avx512 embedded broadcast.
> (*_vternlog_all): Ditto.
> (_vternlog_mask): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR
On Tue, Aug 24, 2021 at 9:11 PM Bernhard Reutner-Fischer
wrote:
>
> On Tue, 24 Aug 2021 17:53:27 +0800
> Hongtao Liu via Gcc-patches wrote:
>
> > On Tue, Aug 24, 2021 at 9:36 AM liuhongt wrote:
> > >
> > > Also optimize below 3 forms to vpternlog, op1, op
On Wed, Aug 25, 2021 at 2:14 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> For avx512f_scattersi, the mask operand only affects the set src; we
> need to refine the pattern to let GCC know the mask register also affects the dest.
> So we put the mask operand into UNSPEC_VSIBADDR.
>
> Bootstrapped and regress
On Wed, Aug 25, 2021 at 5:14 AM Segher Boessenkool
wrote:
>
> Hi!
>
> On Tue, Aug 24, 2021 at 04:55:30PM +0800, liuhongt wrote:
> > This patch extends change_zero_ext to change an illegitimate constant
> > into a constant pool reference; this will enable the simplification below:
>
> It should be in a separate f
On Tue, Aug 24, 2021 at 7:39 PM Richard Biener
wrote:
>
> On Tue, Aug 24, 2021 at 11:38 AM Hongtao Liu wrote:
> >
> > On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote:
> > >
> > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote:
> > > >
>
On Thu, Aug 26, 2021 at 7:16 AM Jeff Law wrote:
>
>
>
> On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote:
>
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu wrote:
>
> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu wrote:
>
> On Mon, Aug 9, 2021 at 4:34 PM Hongtao
On Thu, Aug 26, 2021 at 12:57 PM liuhongt wrote:
>
> This patch is a follow-up to [1]; it folds all shufps/shufpd builtins into
> gimple.
Of course for non-mask or mask all-ones version.
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/
On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> wrote:
> >
> > Richard Biener via Gcc-patches writes:
> > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > wrote:
> > >>
> > >> Richard Biener via Gcc-patches
On Fri, Aug 27, 2021 at 10:03 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> For avx512f_scattersi, the mask operand only affects the set src; we need
> to refine the pattern to let GCC know the mask register also affects the dest.
> So we put the mask operand into UNSPEC_VSIBADDR.
>
> Bootstrapped and regre
On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
wrote:
>
> On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote:
> >
> > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandi
On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu wrote:
>
> On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
> wrote:
> >
> > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote:
> > >
> > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > &
On Mon, Aug 30, 2021 at 8:25 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Aug 27, 2021 at 8:53 AM liuhongt wrote:
> >
> > When gimple simplification tries to combine op and vec_cond_expr into cond_op,
> > it doesn't check whether the mask type matches. This causes an ICE when expanding cond_op
> > with mi
On Mon, Sep 14, 2020 at 3:51 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Sep 11, 2020 at 11:19 PM Nathan Sidwell wrote:
> >
> > I noticed a compiler warning about out-of-bound access. Fixed thusly.
> >
> > gcc/
> > * config/i386/sse.md (mov): Fix operand indices.
> >
>
Hi:
This patch avoids spilling GPRs to mask registers on non-AVX512
microarchitectures and fixes the regression in PR96744.
Bootstrap is ok; regression test for the i386/x86-64 backend is ok.
No big performance impact on SPEC2017.
gcc/ChangeLog:
PR target/96744
* config/i386/x86-t
Hi:
Rtx cost of sse_to_integer would be used by pass_stv as a
measurement for the scalar-to-vector transformation. As
https://gcc.gnu.org/pipermail/gcc-patches/2019-August/528839.html
indicates, movement between SSE regs and GPRs should be much more expensive
than movement inside GPRs (which is 2 as de
Hi:
If -mavx implies -mxsave, then -mno-xsave should imply -mno-avx.
The current status is that -mno-avx implies -mno-xsave, which should be wrong.
Bootstrap is ok; regression test is ok for the i386/x86 backend.
Ok for trunk?
gcc/ChangeLog
* common/config/i386/i386-common.c
(OPTION_MASK_ISA_A
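To make the direction of the implication concrete, here is a minimal bitmask sketch. The ISA_* names and bit values are hypothetical stand-ins chosen for illustration; they are not GCC's actual OPTION_MASK_ISA_* definitions. The idea is that a "set" mask pulls in what a feature depends on, while an "unset" mask must also clear everything that depends on the feature being disabled.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.  */
enum { ISA_XSAVE = 1 << 0, ISA_AVX = 1 << 1 };

/* Enabling a feature also enables what it depends on:
   AVX depends on XSAVE.  */
#define ISA_XSAVE_SET   (ISA_XSAVE)
#define ISA_AVX_SET     (ISA_AVX | ISA_XSAVE_SET)

/* Disabling a feature also disables its dependents:
   turning off XSAVE must turn off AVX, but not the other way round.  */
#define ISA_AVX_UNSET   (ISA_AVX)
#define ISA_XSAVE_UNSET (ISA_XSAVE | ISA_AVX_UNSET)

uint32_t isa_enable(uint32_t flags, uint32_t set_mask)
{
    return flags | set_mask;
}

uint32_t isa_disable(uint32_t flags, uint32_t unset_mask)
{
    return flags & ~unset_mask;
}
```

With these definitions, disabling XSAVE clears both bits, while disabling AVX leaves XSAVE enabled.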
Thanks.
On Wed, Sep 16, 2020 at 8:54 PM Uros Bizjak wrote:
>
> > gcc/ChangeLog
> >
> > PR target/96861
> > * config/i386/x86-tune-costs.h (skylake_cost): increase rtx
> > cost of sse_to_integer from 2 to 6.
> >
> > gcc/testsuite
> >
> > * gcc.target/i386/pr95021-3.
Thanks!
On Wed, Sep 16, 2020 at 8:57 PM Uros Bizjak wrote:
>
> > gcc/ChangeLog
> >
> > * common/config/i386/i386-common.c
> > (OPTION_MASK_ISA_AVX_UNSET): Remove OPTION_MASK_ISA_XSAVE_UNSET.
> > (OPTION_MASK_ISA_XSAVE_UNSET): Add OPTION_MASK_ISA_AVX_UNSET.
> >
> > gcc/test
On Thu, Sep 17, 2020 at 12:10 PM Jeff Law wrote:
>
>
> On 9/15/20 9:20 PM, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> > Rtx cost of sse_to_integer would be used by pass_stv as a
> > measurement for the scalar-to-vector transformation. As
> > https://gcc.g
Hi:
This is done in 2 steps:
1. Extend special memory constraint to handle non MEM_P cases, i.e.
(vec_duplicate:V4SF (mem:SF (addr)))
2. Refactor implementation of *_bcst{_1,_2,_3} patterns. Add new
predicate bcst_mem_operand and corresponding constraint "Br" to merge
"$(pattern)_bcst{_1,_2,_
Add new predicate bcst_mem_operand and corresponding constraint "Br"
to merge "$(pattern)_bcst{_1,_2,_3}" into "$(pattern)", also delete
those separate "*_bcst{_1,_2,_3}" patterns.
gcc/ChangeLog:
PR target/87767
* config/i386/constraints.md ("Br"): New special memory
con
On Thu, Sep 15, 2022 at 11:36 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi
>
> This patch fixes vec_init_dup_v16bf by adding correct handling of v16bf mode in
> ix86_expand_vector_init_duplicate.
> Add a testcase with sse2 without avx2.
>
> OK for master?
>
> gcc/ChangeLog:
>
> PR target/10
On Fri, Sep 16, 2022 at 8:55 AM liuhongt wrote:
>
> For ifloor/lfloor/iceil/lceil/irint/lrint/iround/lround, when the size of
> in_mode is not equal to that of out_mode, the vectorizer doesn't go the internal fn
> way, so that part is still left in ix86_builtin_vectorized_function.
>
> Remove other builtins and add corr
On Fri, Sep 16, 2022 at 9:09 AM liuhongt via Gcc-patches
wrote:
>
> There's a peephole2 submitted in the 1990s which splits cmp mem, 0 into load mem,
> reg + test reg, reg. I don't know the exact reason why GCC does this.
>
> For the latest x86 processors, ciscization should help the processor frontend
> and also code size, for
On Tue, Sep 20, 2022 at 10:14 AM liuhongt wrote:
>
> Here is the list of functions the patch supports:
> rint/nearbyint/ceil/floor/trunc/lrint/lceil/lfloor/round/lround.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106910
> * config
On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches
wrote:
>
> On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote:
>
> > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches
> > wrote:
> > >
> > >
> > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote:
> > > > There's peepho
On Tue, Sep 20, 2022 at 10:23 AM liuhongt wrote:
>
> The code in vectorizable_induction for slp_node assumes all phi_info
> have the same induction type (vect_step_op_add), but since we support
> nonlinear induction, it could be handled wrongly.
> So the patch returns false when slp_node has mixed inducti
On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches
wrote:
>
> Hi!
>
> The following patch implements the compiler part of C++23
> P1467R9 - Extended floating-point types and standard names compiler part
> by introducing _Float{16,32,64,128} as keywords and builtin types
> like they are
+My Intel colleague Phoebe, who works on the LLVM side.
On Tue, Sep 20, 2022 at 11:35 AM Hongtao Liu wrote:
>
> On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches
> wrote:
> >
> > Hi!
> >
> > The following patch implements the compiler part of C++23
> >
On Wed, Sep 21, 2022 at 3:41 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Sep 21, 2022 at 1:41 AM liuhongt via Gcc-patches
> wrote:
> >
> > When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not
> > necessary since there's no real vec_perm needed, but
> > vec_gen_perm_mask
On Thu, Sep 22, 2022 at 9:17 AM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Verify 526.blend_r can be rebuilt with the fix.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106994
> * config/i386/mmx.md (floorv2sf2): Fix typo, use
> reg
On Thu, Sep 22, 2022 at 11:56 PM Jakub Jelinek wrote:
>
> On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote:
> > > > The question is (mainly for aarch64, arm and x86 backend mai
On Thu, Sep 22, 2022 at 3:20 PM Hu, Lin1 via Gcc-patches
wrote:
>
> Hi all,
>
> This patch aims to optimize code generation of
> _mm256_zextsi128_si256(_mm_set1_epi8(-1)), reducing the number of
> instructions required to achieve the final result.
>
> Regtested on x86_64-pc-linux-gnu. Ok for tru
On Fri, Sep 23, 2022 at 11:07 AM Hu, Lin1 wrote:
>
> Hi, Hongtao
>
> I have modified this patch and regtested on x86_64-pc-linux-gnu.
>
Ok.
> BRs.
> Lin
>
> -Original Message-
> From: Hongtao Liu
> Sent: Friday, September 23, 2022 9:48 AM
> To: Hu,
On Wed, Sep 28, 2022 at 7:35 AM H.J. Lu via Gcc-patches
wrote:
>
> encodekey128 and encodekey256 operations clear XMM4-XMM6. But it is
> documented that XMM4-XMM6 are reserved for future usages and software
> should not rely upon them being zeroed. Change encodekey128 and
Indeed. Ok for trunk an
This commit failed tests
FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq
FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq
FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq
FAIL: gcc.target/i386/pr92645.c scan-tree-dump-times optimized "vec_unpack_" 4
FAIL: gcc.target/i38
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107226
On Wed, Oct 12, 2022 at 9:55 AM Hongtao Liu wrote:
>
> This commit failed tests
>
> FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq
> FAIL: gcc.target/i386/pr101668.c scan-assembler vpmovsxdq
> FAIL: gcc.target/i3
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> >
> > .. in ix86_expand_vector_move and
> > ix86_convert_const_wide_int_to_broadcast(called by the former).
> >
> > ix86_expand_vector_move is called by emit_move_insn which is use
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> > >
> > > .. in ix86_expand_vector_move and
> > > ix86_convert_const_wide_int_to_broadcast(called by the former).
> > >
> >
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote:
>
> On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote:
> > >
> > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
> > > wrote:
> >
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches
wrote:
>
> ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector
> register to prevent RTL optimizers from removing vector register. It
> introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it
> is called by R
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
wrote:
>
> This is an incremental patch based on [1]; it enables the optimization below
>
> - vbroadcastss .LC1(%rip), %xmm0
> + movl $-45, %edx
> + vmovd %edx, %xmm0
> + vpshufd $0, %xmm0, %xmm0
>
> According to
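For readers less familiar with the shuffle in the new sequence, the following scalar sketch models a four-lane 32-bit shuffle driven by an 8-bit immediate, and shows that an immediate of 0 replicates lane 0 into every lane. The function names are invented for this illustration, and the model ignores everything about the real instructions except lane selection.

```c
#include <stdint.h>

/* Model of a 4 x 32-bit shuffle controlled by an 8-bit immediate:
   destination lane i takes source lane ((imm8 >> (2 * i)) & 3).  */
void pshufd_model(uint32_t dst[4], const uint32_t src[4], uint8_t imm8)
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[(imm8 >> (2 * i)) & 3];
}

/* Model of the three-instruction sequence: a scalar constant is placed
   in lane 0 (movl + vmovd), then an immediate of 0 copies lane 0 to
   all four lanes (vpshufd $0).  */
void broadcast_si32_model(uint32_t dst[4], int32_t value)
{
    uint32_t tmp[4] = { (uint32_t)value, 0, 0, 0 };
    pshufd_model(dst, tmp, 0);
}
```

Calling broadcast_si32_model(d, -45) leaves all four lanes of d holding the bit pattern of -45, which is what the constant-pool broadcast produced.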
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase ICEs, because the cond_andv* expander
> has vector_operand predicates in both of the commutative inputs
> and calls gen_andv*_mask which calls ix86_binary_operator_ok
> in its condition, but nothing calls ix86_f
Ran into a problem with git send-email --cc=a,b,c, so CCing manually.
On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches
wrote:
>
> >What happens if you set preferred_for_speed to false for alternative 1?
> It works, and I've removed the newly added splitter in this patch.
> Also I tried to do simi
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches
wrote:
>
> For parameters passed on the stack, a vectorized load from parm_decl
> in the callee may trigger a serious STF issue. This is why GCC12 regresses
> 50% for cray at -O2 compared to GCC11.
>
> The patch adds an extremely large number to stmt
ping^1
On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
>
> On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> >
> > The patch fixes ICE in ix86_gimple_fold_builtin.
> >
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for main trunk?
>
> > g
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote:
> >
> > For parameters passed on the stack, a vectorized load from parm_decl
> > in the callee may trigger a serious STF issue. This is why GCC12 regresses
> > 50% for cray at -O2 comp
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote:
>
> ping^1
>
> On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
> >
> > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> > >
> > > The patch fixes ICE in ix86_gimple_fold_builtin.
> > >
>
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches
wrote:
>
> After accounting for GPR -> XMM move cost for vec_construct the
> base cost needs adjustments to not double-cost those. This also
> lowers the cost when such move is not necessary.
>
> This fixes the observed 538.imagick_r
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote:
>
> Hi!
>
> These intrinsics are supposed to do an unaligned may_alias load
> of a 16-bit or 32-bit value and store it as the first element of
> a 128-bit integer vector, with all other elements cleared.
>
> The current _mm_storeu_* implementati
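The load semantics described above can be sketched in portable C. This is a model only: vec128_model and the two function names are hypothetical, and memcpy is used here as one standard way to express an unaligned, aliasing-safe access; it is not claimed to be how the header implements the intrinsics.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a 128-bit integer vector, viewed as four 32-bit lanes.  */
typedef struct { uint32_t lane[4]; } vec128_model;

/* Unaligned, aliasing-safe 32-bit load into lane 0; all other lanes
   are cleared.  */
vec128_model loadu_si32_model(const void *p)
{
    vec128_model v = { { 0, 0, 0, 0 } };
    memcpy(&v.lane[0], p, sizeof(uint32_t));
    return v;
}

/* 16-bit variant: the loaded value is zero-extended into lane 0, so
   every bit above the low 16 bits of the vector is cleared.  */
vec128_model loadu_si16_model(const void *p)
{
    vec128_model v = { { 0, 0, 0, 0 } };
    uint16_t h;
    memcpy(&h, p, sizeof h);
    v.lane[0] = h;
    return v;
}
```

Both helpers accept a pointer with any alignment, for example an odd offset into a byte buffer.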
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
>
> On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > LGTM, thanks for handling this.
>
> Thanks, committed.
>
> > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
> > > f
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote:
>
> On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
> >
> > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > > LGTM, thanks for handling this.
> >
> > Thanks, committed.
> >
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote:
>
> On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote:
> >
> > Push target("general-regs-only") in if x87 is enabled.
> >
> > gcc/
> >
> > PR target/104890
> > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
> > push
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote:
>
>
> This simple i386 patch unblocks a more significant change. The testcase
> gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
> alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.
>
> For the first test fro
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote:
> >
> > This patch only handles pure-slp for by-value passed parameters, which
> > has nothing to do with IPA but with the psABI. For by-reference passed
> > parameters, IPA is required.
>
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote:
>
> Hi Hongtao,
>
> This patch corrects march=sapphirerapids to be based on icelake server,
> and updates sapphirerapids in the documentation.
>
> OK for master and backport to GCC 11?
Ok.
>
>
> gcc/Changelog:
>
> PR target/104963
>
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch fixes a typo in subst for the scalar complex mask_round operand.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> PR target/104977
> * c
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
> gcc
m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b)
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss&ig_expand=3807,3081,3082,3084,3083,4837,4838
>
> LLVM generates mask & 1 for these intrinsics.
>
> Hongtao Liu via Gcc-patches 于20
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
> Use a masked vmovss to perform the same operation, which omits the higher bits
> of the mask.
>
> Bootst
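The point about the masked scalar move ignoring the higher mask bits can be illustrated with a small scalar model. The function name is hypothetical and the vectors are plain float arrays; the operand roles follow the (src, k, a, b) order of the masked scalar move: element 0 comes from b when bit 0 of the mask is set and from src otherwise, and the upper elements come from a.

```c
/* Scalar model of a masked scalar move on 4-float vectors.
   Only bit 0 of k is consulted, so the result is the same as if the
   caller had computed k & 1 first.  */
void mask_move_ss_model(float dst[4], const float src[4], unsigned char k,
                        const float a[4], const float b[4])
{
    dst[0] = (k & 1) ? b[0] : src[0];
    for (int i = 1; i < 4; i++)
        dst[i] = a[i];
}
```

Passing a mask such as 0xFE, where every bit except bit 0 is set, therefore selects src[0] for element 0, exactly as a mask of 0 would.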
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote:
>
> Failed to match this instruction:
> (set (reg/v:SI 88 [ z ])
> (if_then_else:SI (eq (zero_extract:SI (reg:SI 92)
> (const_int 1 [0x1])
> (zero_extend:SI (subreg:QI (reg:SI 93) 0)))
> (const_int 0 [0
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches
wrote:
>
> In validate_subreg, both (subreg:V2HF (reg:SI) 0)
> and (subreg:V8HF (reg:V2HF) 0) are valid, but not
> for (subreg:V8HF (reg:SI) 0) which causes ICE.
>
> Ideally it should be handled in validate_subreg to support
> subreg for all
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
wrote:
>
> Since we're now vectorizing by default at -O2 issues like PR101908
> become more important where we apply basic-block vectorization to
> parts of the function covering loads from function parameters passed
> on the stack. S
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote:
>
> On Fri, 25 Mar 2022, Hongtao Liu wrote:
>
> > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > Since we're now vectorizing by default at -O2 issues like P
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches
wrote:
>
> Since KL instructions have no AVX512 version, replace the "v" register
> constraint with the "x" register constraint.
>
> PR target/105058
> * config/i386/sse.md (loadiwkey): Replace "v" with "x".
> (aesu8):
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches
wrote:
>
> Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
> have no AVX512 version, replace the "Yv" register constraint with the
> "x" register constraint.
LGTM, please backport to GCC10/GCC11 branch.
>
> PR t
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches
wrote:
>
> > > Is it possible to create a test case that gas would throw an error for
> > > invalid operands?
> >
> > You can use -ffix-xmmN to disable XMM0-15.
>
> I mean can we create an intrinsic test for this PR that produces xmm16-3