On Thu, Apr 24, 2025 at 12:50 AM Jan Hubicka wrote:
>
> > In some benchmarks, I noticed that stv failed because the cost model found
> > it unprofitable, even though the igain is inside the loop while the
> > sse<->integer conversion is outside the loop. The current cost model
> > doesn't consider the frequency of those gains/costs.
> >
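The frequency argument can be illustrated with a small sketch. It assumes a simplified model that is not GCC's actual STV cost code: each gain or cost is scaled by an estimated execution frequency of the block that contains it. The names `struct cost_item` and `weighted_total`, and all numbers, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item: a positive value is a gain from converting an
   instruction, a negative value is the cost of an sse<->integer move.  */
struct cost_item
{
  int value;      /* gain (> 0) or cost (< 0) per execution */
  int frequency;  /* estimated executions of the containing block */
};

/* Sum all values, optionally scaling each one by its frequency.  */
long
weighted_total (const struct cost_item *items, size_t n, int use_frequency)
{
  assert (items != NULL || n == 0);
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += (long) items[i].value * (use_frequency ? items[i].frequency : 1);
  return total;
}
```

With a gain of 2 in a block executed 100 times and a conversion cost of 5 executed once, the unweighted total is negative (unprofitable) while the frequency-weighted total is positive.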
On Fri, Apr 25, 2025 at 1:26 PM Jan Hubicka wrote:
>
> > On Thu, Apr 24, 2025 at 6:27 PM Jan Hubicka wrote:
> > >
> > > > Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> > > > or vpandn.
> > > > Current register_operand/vector_operand could lose some optimization
> > >
>
> I am not so sure about this when it comes to relatively common
> instructions. Hiding things in unspec prevents combine and other RTL
> passes from doing their job. I would say that it only makes sense for
> situations where the RTL equivalent is very inconvenient.
>
In the direction of using gener
On Mon, Apr 28, 2025 at 5:07 PM H.J. Lu wrote:
>
> On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote:
> >
>
> > > > This is what my patch does:
> > > But it iterates through vector_insns, using a def-ref chain to find
> > > those insns. I think we can just record those single_set with src as
> > > co
On Sun, Apr 27, 2025 at 10:58 AM H.J. Lu wrote:
>
> When passing 0xff as an unsigned char function argument with the C frontend
> promotion, expand_normal used to get
>
> constant
> 255>
>
> and returned the rtx value using the sign-extended representation:
>
> (const_int 255 [0xff])
>
> But aft
On Mon, Feb 17, 2025 at 9:51 AM Hongtao Liu wrote:
>
> On Thu, Feb 13, 2025 at 4:08 PM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > According to the previous feedback on our RFC for AVX10 option adjustment
> > and discussion with LLVM, we finalized how we a
On Wed, Mar 5, 2025 at 3:23 PM Haochen Jiang wrote:
>
> Hi all,
>
> For bf8 -> fp16 convert, when dst is 256 bit, the mask should be
> 16 bit since 16*16=256, not the 8 bit in the current intrin. In the
> 512 bit intrin, the mask width is also halved. This patch will fix
> both of them.
>
> Ok for trunk
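The mask-width arithmetic above can be written out as a small sketch. It assumes one mask bit per destination element; `mask_bits` is a hypothetical helper for illustration, not an intrinsic or a GCC function.

```c
#include <assert.h>

/* Number of mask bits needed for a vector of VECTOR_BITS bits whose
   elements are ELEM_BITS wide: one mask bit per element.  */
unsigned
mask_bits (unsigned vector_bits, unsigned elem_bits)
{
  assert (elem_bits != 0 && vector_bits % elem_bits == 0);
  return vector_bits / elem_bits;
}
```

For 16-bit destination elements, `mask_bits (256, 16)` gives 16 and `mask_bits (512, 16)` gives 32, twice the widths the message says the current intrinsics use.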
On Wed, Feb 26, 2025 at 6:01 AM H.J. Lu wrote:
>
> Move the TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P target hook from
> i386.h to i386.cc.
Ok for the patch, looks obvious.
>
> * config/i386/i386.h (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P):
> Moved to ...
> * config/i386/i386.cc (TARGET_SMALL_REGI
On Tue, Mar 4, 2025 at 6:31 PM Richard Biener
wrote:
>
> On Tue, Mar 4, 2025 at 11:18 AM Richard Sandiford
> wrote:
> >
> > Richard Sandiford writes:
> > > Jan Hubicka writes:
> > >>>
> > >>> Thanks for running these. I saw poor results for perlbench with my
> > >>> initial aarch64 hooks becau
On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
>
> Hi,
> this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
> and -O2 -flto. For non -Os and non-Windows ABI it should be practically the
> same as your variant that simply returned mem_cost - 2.
>
I've tested O2/(Ofast march
On Fri, Mar 28, 2025 at 1:55 PM Hu, Lin1 wrote:
>
> vaes patterns with the jm constraint and gpr16 attr require the "isa"
> attr to distinguish avx/avx512 alternatives in ix86_memory_address_reg_class.
> Also add the missing type and mode attributes for those vaes patterns.
Ok.
>
> gcc/ChangeLog:
>
>
On Fri, Mar 28, 2025 at 4:22 PM Haochen Jiang wrote:
>
> Hi all,
>
> For -march= handling, PTA_AVX10_1 will not imply PTA_AVX10_1_256,
> resulting in TARGET_AVX10_1 becoming true while TARGET_AVX10_1_256
> false. Since we will check TARGET_AVX10_1_256 in GCC 15 for AVX512
> feature enabling for AV
ngtao wrote on Wed, Apr 2, 2025 at 08:57:
> >
> >
> >
> > > -Original Message-
> > > From: Uros Bizjak
> > > Sent: Tuesday, April 1, 2025 5:24 PM
> > > To: Hongtao Liu
> > > Cc: Wang, Hongyu ; gcc-patches@gcc.gnu.org; Liu,
> > > Hongtao
> > >
On Wed, Mar 26, 2025 at 9:50 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to ensure that each alternative with constraint "jm"
> sets addr "gpr16"; otherwise it may raise an ICE in the reload pass.
>
> Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk?
Ok.
>
> BRs,
> Lin
>
On Mon, Mar 31, 2025 at 9:52 PM Richard Biener wrote:
>
> On Mon, 31 Mar 2025, Jakub Jelinek wrote:
>
> > On Mon, Mar 31, 2025 at 03:33:34PM +0200, Richard Biener wrote:
> > > On Mon, 31 Mar 2025, Jakub Jelinek wrote:
> > >
> > > > On Mon, Mar 31, 2025 at 03:12:56PM +0200, Richard Biener wrote:
>
On Thu, May 8, 2025 at 2:40 PM liuhongt wrote:
>
> The only part I changed is related to the size_cost of sse_to_integer, as below
>
> + /* Under TARGET_SSE4_1, it's vmovd + vpextrd/vpinsrd.
> + W/o it, it's movd + psrlq/unpckldq + movd. */
> + else if (!TARGET_64BIT && smode != SImod
On Wed, May 7, 2025 at 9:06 AM H.J. Lu wrote:
>
> On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote:
> >
> > On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote:
> > >
> > > On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote:
> > > >
> > > >
On Wed, May 14, 2025 at 9:22 AM liuhongt wrote:
>
> The Intel Decimal Floating-Point Math Library is available as open-source on
> Netlib[1].
>
> [1] https://www.netlib.org/misc/intel/
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready to push to trunk.
>
> libgcc/config/libbid/Ch
On Fri, Apr 18, 2025 at 7:10 PM H.J. Lu wrote:
>
> Add preserve_none attribute which is similar to no_callee_saved_registers
> attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
Could you split preserve_none into a separate patch,
It looks like it's different from clang's p
On Wed, May 14, 2025 at 3:29 PM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v2 patch to remove -mavx10.1/256-512 and -mno-evex512. I suppose
> this time all the patches will not be held due to size.
>
> As mentioned in GCC 15, we will remove -mavx10.1-256/512 and -mno-evex512
> options in GCC
It's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119181
On Fri, May 16, 2025 at 10:02 AM liuhongt wrote:
>
> The patch tries to solve the missed vectorization for the case below.
>
> void
> foo (int* a, int* restrict b)
> {
> b[0] = a[0] * a[64];
> b[1] = a[65] * a[1];
> b[2] = a[2] * a[66];
>
On Mon, May 26, 2025 at 4:55 PM Hu, Lin1 wrote:
>
> Hi, all
>
> Enabling -mapxf changes some patterns for adc/sbb.
>
> Hence gcc emits an extra mov like
>         movq    8(%rdi), %rax
>         adcq    %rax, 8(%rsi), %rax
>         movq    %rax, 8(%rdi)
> rather than
> movq
On Thu, May 29, 2025 at 4:56 PM Hu, Lin1 wrote:
>
> Hi,
>
> The patch aims to optimize
>         movb    (%rdi), %al
>         movq    %rdi, %rbx
>         xorl    %esi, %eax, %edx
>         movb    %dl, (%rdi)
>         cmpb    %sil, %al
>         jne
> to
>         xorb    %sil, (%rdi)
>
On Tue, Jun 3, 2025 at 2:59 PM H.J. Lu wrote:
>
> Extend the remove_redundant_vector pass to handle vector broadcasts from
> constant and variable scalars. When broadcasting from constants and
> function arguments, we can place a single widest vector broadcast at
> entry of the nearest common dom
On Thu, Jun 12, 2025 at 10:51 AM Hu, Lin1 wrote:
>
> Hi,
>
> This patch aims to set the SRF issue rate to 4 and the GNR issue rate to 6. According to
> tests on spec2017, the patch has little effect on performance.
>
> For GRR, CWF, DMR, ARL and PTL, the patch sets their issue rate to 6. Waiting
> for
> m
Ping
On Mon, May 19, 2025 at 10:06 AM liuhongt wrote:
>
> From: "hongtao.liu"
>
> An AutoFDO profile is a scaled profile; as a result, 0 samples does not
> mean never executed, especially when there is a profile from the function
> body. Prevent combine_with_ipa_count (ipa_count) from zeroing all
> bb->count.
>
On Mon, May 26, 2025 at 2:30 PM H.J. Lu wrote:
>
> On Sun, May 25, 2025 at 7:02 PM H.J. Lu wrote:
> >
> > On Sun, May 25, 2025 at 8:12 AM H.J. Lu wrote:
> > >
> > > On Sun, May 25, 2025 at 7:47 AM H.J. Lu wrote:
> > > >
> > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f
> > > > Author: Rog
On Wed, Jun 18, 2025 at 2:39 PM H.J. Lu wrote:
>
> On Mon, Jun 16, 2025 at 4:14 PM Hongtao Liu wrote:
> >
> > >+enum redundant_load_kind
> > >+{
> > >+ LOAD_CONST0_VECTOR,
> > >+ LOAD_CONSTM1_VECTOR,
> > >+ LOAD_VECTOR
> >
>+enum redundant_load_kind
>+{
>+ LOAD_CONST0_VECTOR,
>+ LOAD_CONSTM1_VECTOR,
>+ LOAD_VECTOR
>+};
Perhaps rename to x86_cse_kind, X86_CSE_CONST0_VECTOR,
X86_CSE_CONSTM1_VECTOR, X86_CSE_VEC_DUP?
LOAD sounds a bit ambiguous.
Similar to ix86_get_vector_load_mode -> ix86_get_vector_cse_mode?
>+
On Mon, Jun 16, 2025 at 4:30 PM Hongtao Liu wrote:
>
> >+enum redundant_load_kind
> >+{
> >+ LOAD_CONST0_VECTOR,
> >+ LOAD_CONSTM1_VECTOR,
> >+ LOAD_VECTOR
> >+};
> Perhaps rename to x86_cse_kind, X86_CSE_CONST0_VECTOR,
> X86_CSE_CONSTM1_VECTOR, X
Drop this patch since
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686830.html could
be a better alternative.
On Tue, Jun 10, 2025 at 9:50 AM Hongtao Liu wrote:
>
> Ping
>
> On Mon, May 19, 2025 at 10:06 AM liuhongt wrote:
> >
> > From: "hongtao.liu"
>
On Wed, Jun 18, 2025 at 6:38 PM H.J. Lu wrote:
>
> commit ef26c151c14a87177d46fd3d725e7f82e040e89f
> Author: Roger Sayle
> Date: Thu Dec 23 12:33:07 2021 +
>
> x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.
>
> added "*mov_and" and extended "*mov_or" to transform
> "
ping.
On Mon, May 8, 2023 at 9:59 AM liuhongt wrote:
>
> > > @@ -4799,7 +4800,8 @@ vect_create_vectorized_demotion_stmts (vec_info
> > > *vinfo, vec *vec_oprnds,
> > >stmt_vec_info stmt_info,
> > >vec &vec_dsts,
> >
On Tue, Jun 6, 2023 at 12:49 PM Andrew Pinski wrote:
>
> On Mon, Jun 5, 2023 at 9:34 PM liuhongt via Gcc-patches
> wrote:
> >
> > r14-1145 folds the intrinsics into gimple ABS_EXPR, which has UB for
> > TYPE_MIN, but PABSB will store an unsigned result into dst. The patch
> > uses ABSU_EXPR + VCE inst
On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak wrote:
>
> On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches
> wrote:
> >
> > r14-1145 folds the intrinsics into gimple ABS_EXPR, which has UB for
> > TYPE_MIN, but PABSB will store an unsigned result into dst. The patch
> > uses ABSU_EXPR + VCE instead
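The TYPE_MIN problem mentioned above can be shown with a scalar C analogue. This is only a sketch of the per-element idea (absolute value produced in the unsigned type, as ABSU_EXPR does), not the patch's code; `abs_u8` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute value of an 8-bit signed input, returned as unsigned so that
   INT8_MIN maps to 128 without any signed overflow.  */
uint8_t
abs_u8 (int8_t x)
{
  /* Reinterpret the value modulo 256, then negate in unsigned arithmetic;
     the final conversion to uint8_t wraps modulo 256.  */
  uint8_t ux = (uint8_t) x;
  return x < 0 ? (uint8_t) (0u - ux) : ux;
}
```

For `INT8_MIN` the result is 128, which does not fit in a signed 8-bit type; a signed absolute value has no defined result for that input, while the unsigned form does.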
On Tue, Jun 6, 2023 at 10:36 PM Uros Bizjak wrote:
>
> On Tue, Jun 6, 2023 at 1:42 PM Hongtao Liu wrote:
> >
> > On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak wrote:
> > >
> > > On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches
> > > wrote:
>
On Wed, Jun 7, 2023 at 8:31 AM Hongtao Liu wrote:
>
> On Tue, Jun 6, 2023 at 10:36 PM Uros Bizjak wrote:
> >
> > On Tue, Jun 6, 2023 at 1:42 PM Hongtao Liu wrote:
> > >
> > > On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak wrote:
> > > >
> >
On Tue, Jun 6, 2023 at 4:23 PM liuhongt wrote:
>
> > I think this is a better patch and will always be correct and still
> > get folded at the gimple level (correctly):
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index d4ff56ee8dd..02bf5ba93a5 100644
> > --- a/gcc/config
On Mon, Jun 5, 2023 at 9:26 AM liuhongt wrote:
>
> This patch only supports vec_pack/unpacks optabs for vector modes whose length
> is >= 128.
> For 32/64-bit vectors, they're more often handled by the BB vectorizer with
> truncmn2/extendmn2/fix{,uns}_truncmn2.
>
> Bootstrapped and regtested on x86_64-pc-linux-g
On Wed, Jun 14, 2023 at 1:55 PM Jan Beulich via Gcc-patches
wrote:
>
> Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
> double precision floating values - is more appropriate to use here, and
> it can also result in shorter insn encodings when source is memory or
> %xmm0...%x
On Wed, Jun 14, 2023 at 1:56 PM Jan Beulich via Gcc-patches
wrote:
>
> gcc/
>
> * config/i386/constraints.md: Mention k and r for B.
Ok.
>
> --- a/gcc/config/i386/constraints.md
> +++ b/gcc/config/i386/constraints.md
> @@ -162,7 +162,9 @@
> ;; g GOT memory operand.
> ;; m Vector memo
On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches
wrote:
>
> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
> never longer (yet sometimes shorter) than the corresponding VSHUFPS /
> VPSHUFD, due to the immediate operand of the shuffle insns balancing the
> need for
On Wed, Jun 14, 2023 at 1:59 PM Jan Beulich via Gcc-patches
wrote:
>
> There's no reason to constrain this to AVX512VL, as the wider operation
> is not usable for more narrow operands only when the possible memory
But this may require more resources (on AMD znver4 processor a zmm
instruction will
On Tue, Jun 13, 2023 at 10:07 AM Kewen Lin via Gcc-patches
wrote:
>
> This patch adjusts the cost handling on
> VMAT_CONTIGUOUS_PERMUTE in function vectorizable_load. We
> don't call function vect_model_load_cost for it any more.
>
> As the affected test case gcc.target/i386/pr70021.c shows,
> th
On Wed, Jun 14, 2023 at 5:03 PM Jan Beulich wrote:
>
> On 14.06.2023 09:41, Hongtao Liu wrote:
> > On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches
> > wrote:
> >>
> >> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
> >>
On Wed, Jun 14, 2023 at 5:32 PM Jan Beulich wrote:
>
> On 14.06.2023 10:10, Hongtao Liu wrote:
> > On Wed, Jun 14, 2023 at 1:59 PM Jan Beulich via Gcc-patches
> > wrote:
> >>
> >> There's no reason to constrain this to AVX512VL, as the wider operation
>
On Thu, Jun 15, 2023 at 1:23 PM Hongtao Liu wrote:
>
> On Wed, Jun 14, 2023 at 5:03 PM Jan Beulich wrote:
> >
> > On 14.06.2023 09:41, Hongtao Liu wrote:
> > > On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches
> > > wrote:
> > >>
On Thu, Jun 15, 2023 at 2:41 PM Jan Beulich wrote:
>
> On 15.06.2023 07:23, Hongtao Liu wrote:
> > On Wed, Jun 14, 2023 at 5:03 PM Jan Beulich wrote:
> >>
> >> On 14.06.2023 09:41, Hongtao Liu wrote:
> >>> On Wed, Jun 14, 2023 at 1:58 P
On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
> wrote:
> >
> > The input constraint for the %vmovddup alternative was wrong, as the
> > upper 16 XMM registers require AVX512VL to be used with this insn. To
> > co
On Wed, Jul 12, 2023 at 4:57 AM Roger Sayle wrote:
>
>
> > From: Hongtao Liu
> > Sent: 28 June 2023 04:23
> > > From: Roger Sayle
> > > Sent: 27 June 2023 20:28
> > >
> > > I've also come up with an alternate/complementary/supplemen
ping.
On Mon, May 22, 2023 at 4:08 PM Hongtao Liu wrote:
>
> ping.
>
> On Sat, May 13, 2023 at 5:20 PM liuhongt wrote:
> >
> > > I think this could be simplified if you use either EnumSet or
> > > EnumBitSet instead in common.opt for `-fcf-protection=`.
>
On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
wrote:
>
> The PRs ask for optimizing of
>
> _1 = BIT_FIELD_REF ;
> result_4 = BIT_INSERT_EXPR ;
>
> to a vector permutation. The following implements this as
> match.pd pattern, improving code generation on x86_64.
>
> On the RTL
On Thu, Jul 13, 2023 at 10:47 AM Hongtao Liu wrote:
>
> On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
> wrote:
> >
> > The PRs ask for optimizing of
> >
> > _1 = BIT_FIELD_REF ;
> > result_4 = BIT_INSERT_EXPR ;
> >
> > to a
On Thu, Jul 13, 2023 at 2:32 PM Richard Biener wrote:
>
> On Thu, 13 Jul 2023, Hongtao Liu wrote:
>
> > On Thu, Jul 13, 2023 at 10:47 AM Hongtao Liu wrote:
> > >
> > > On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
> > > wrote:
>
On Thu, Jul 13, 2023 at 2:06 PM Haochen Jiang via Gcc-patches
wrote:
>
> From: Kong Lingling
>
> gcc/ChangeLog
>
> * common/config/i386/cpuinfo.h (get_available_features): Detect
> avxvnniint16.
> * common/config/i386/i386-common.cc
> (OPTION_MASK_ISA2_AVXVNNIINT16
On Thu, Jul 13, 2023 at 2:06 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect SHA512.
> * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SHA512_SET,
> OPTION_MASK_ISA2_SHA512_UNSET)
On Thu, Jul 13, 2023 at 2:04 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect SM3.
> * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM3_SET,
> OPTION_MASK_ISA2_SM3_UNSET): New.
>
On Thu, Jul 13, 2023 at 2:04 PM Haochen Jiang via Gcc-patches
wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect SM4.
> * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM4_SET,
> OPTION_MASK_ISA2_SM4_UNSET): New.
>
On Fri, Jul 14, 2023 at 10:55 AM Mo, Zewei via Gcc-patches
wrote:
>
> Hi all,
>
> This patch adds initial support for Lunar Lake, Arrow Lake and Arrow Lake
> S to GCC.
>
> The link to related information is listed below:
> https://www.intel.com/content/www/us/en/develop/download/intel-archi
On Fri, Jul 14, 2023 at 5:40 PM Jan Beulich via Gcc-patches
wrote:
>
> Introduce a new alternative permitting all 32 registers to be used as
> source without AVX512VL, by broadcasting to the full 512 bits in that
> case. (The insn would also permit all registers to be used as
> destination, but V2
On Fri, Jul 14, 2023 at 5:42 PM Jan Beulich via Gcc-patches
wrote:
>
> In the (however unlikely) event that no insn can be found for the
> requested mode, using maybe_gen_...() without (really) checking its
> result for being a null rtx would lead to silent bad code generation.
Ok.
>
> gcc/
>
>
On Mon, Jul 17, 2023 at 2:20 PM Jan Beulich wrote:
>
> On 17.07.2023 08:09, Hongtao Liu wrote:
> > On Fri, Jul 14, 2023 at 5:40 PM Jan Beulich via Gcc-patches
> > wrote:
> >>
> >> Introduce a new alternative permitting all 32 registers to be used as
> >
Ping.
On Tue, Jul 11, 2023 at 5:16 PM liuhongt via Gcc-patches
wrote:
>
> Similar to what we did for CMPXCHG, but extended to all
> ix86_comparison_int_operator since CMPCCXADD sets EFLAGS exactly the same
> as CMP.
>
> When the operand order in the CMP insn is the same as that in CMPCCXADD,
> CMP insn can be elimin
I'd like to ping for this patch (only patch 1/2, for patch 2/2, I
think that may not be necessary).
On Mon, May 15, 2023 at 9:20 AM Hongtao Liu wrote:
>
> ping.
>
> On Fri, Apr 21, 2023 at 9:55 PM liuhongt wrote:
> >
> > > > + if (!TARGET_SSE2)
> > >
On Mon, Jul 17, 2023 at 7:38 PM Uros Bizjak wrote:
>
> On Mon, Jul 17, 2023 at 10:28 AM Hongtao Liu wrote:
> >
> > I'd like to ping for this patch (only patch 1/2, for patch 2/2, I
> > think that may not be necessary).
> >
> > On Mon, May 15, 2023 at 9:2
On Wed, Jul 12, 2023 at 3:27 PM Hongtao Liu wrote:
>
> ping.
>
> On Mon, May 22, 2023 at 4:08 PM Hongtao Liu wrote:
> >
> > ping.
> >
> > On Sat, May 13, 2023 at 5:20 PM liuhongt wrote:
> > >
> > > > I think this could be simplified i
On Thu, Jul 20, 2023 at 4:11 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jul 20, 2023 at 9:35 AM liuhongt wrote:
> >
> > For Intel processors, after TARGET_AVX, vmovdqu is as fast
> > as vlddqu, so UNSPEC_LDDQU can be removed to enable more optimizations.
> > Can someone confirm this
On Sat, Jul 29, 2023 at 11:55 AM haochen.jiang via Gcc-regression
wrote:
>
> On Linux/x86_64,
>
> b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb is the first bad commit
> commit b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb
> Author: Jan Hubicka
> Date: Fri Jul 28 09:16:09 2023 +0200
>
> loop-split im
On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote:
>
>
> This patch is a follow-up to Hongtao's fix for PR target/105854. That
> fix is perfectly correct, but the thing that caught my eye was why is
> the compiler generating a shift by zero at all. Digging deeper it
> turns out that we can easily
I think this can be taken as an obvious fix without prior approval.
"Obvious fixes can be committed without prior approval. Just check in
the fix and copy it to gcc-patches."
Quoted from https://gcc.gnu.org/gitwrite.html
On Fri, Jul 1, 2022 at 10:02 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi
On Fri, Jul 1, 2022 at 10:12 AM Hongtao Liu wrote:
>
> On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote:
> >
> >
> > This patch is a follow-up to Hongtao's fix for PR target/105854. That
> > fix is perfectly correct, but the thing that caught my eye was w
> This revised patch has been tested on x86_64-pc-linux-gnu with make
> bootstrap and make -k check, both with and without --target_board=unix{-32},
> with no new failures. Is this revised version Ok for mainline?
Ok.
>
>
> 2022-07-04 Roger Sayle
> Hongtao Liu
>
On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, Jul 11, 2022 at 5:44 AM liuhongt wrote:
> >
> > The patch only handles load/store (including ctor/permutation, except
> > gather/scatter) for complex type; other operations don't need to be
> > handled since they wi
On Mon, Jul 11, 2022 at 4:03 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote:
> >
> > And split it to GPR-version instruction after reload.
> >
> > This will enable below optimization for 16/32/64-bit vector bit_op
> >
> > -       movd    (%rdi), %xmm0
> >
On Tue, Jul 12, 2022 at 10:12 PM Richard Biener
wrote:
>
> On Tue, Jul 12, 2022 at 6:11 AM Hongtao Liu wrote:
> >
> > On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Mon, Jul 11, 2022 at 5:44 AM liuhongt wro
On Thu, Jul 14, 2022 at 4:20 PM Richard Biener
wrote:
>
> On Wed, Jul 13, 2022 at 9:34 AM Richard Biener
> wrote:
> >
> > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote:
> > >
> > > On Tue, Jul 12, 2022 at 10:12 PM Richard Biener
> > > wrote:
On Thu, Jul 14, 2022 at 4:53 PM Hongtao Liu wrote:
>
> On Thu, Jul 14, 2022 at 4:20 PM Richard Biener
> wrote:
> >
> > On Wed, Jul 13, 2022 at 9:34 AM Richard Biener
> > wrote:
> > >
> > > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote:
>
On Thu, Jul 14, 2022 at 3:22 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote:
> >
> > And split it to GPR-version instruction after reload.
> >
> > > ?r was introduced under the assumption that we want vector values
> > > mostly in vector registers. Curren
On Thu, Jul 14, 2022 at 2:11 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The patch fixes _mm_[u]comixx_{ss,sd} codegen and adds the PF result. These
> intrinsics have changed over time; for example, the old operation of `_mm_comieq_ss` is
> `RETURN ( a[31:0] == b[31:0] ) ? 1 : 0`, and new operation u
On Fri, Jul 15, 2022 at 1:44 AM H.J. Lu via Gcc-patches
wrote:
>
> When shadow stack is enabled, a function with the indirect_return attribute
> may return via an indirect jump. In this case, we need to disable sibcall
> if the caller doesn't have the indirect_return attribute and indirect branch
> tracking is en
On Sat, Jul 16, 2022 at 10:08 PM Roger Sayle wrote:
>
>
> This AVX512 specific patch to sse.md is split out from an earlier patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596199.html
>
> The new splitters proposed in that patch interfere with AVX512's
> kunpckdq instruction which is
On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
> >
> > And split it after reload.
> >
> > > You will need ix86_binary_operator_ok insn constraint here with
> > > corresponding expander using ix86_fixup_binary_operands_no_copy
On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
>
> On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote:
> >
> > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote:
>
> On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote:
> >
> > On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote:
> > >
> > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
> > > >
> >
On Wed, Jul 20, 2022 at 3:18 PM Uros Bizjak wrote:
>
> On Wed, Jul 20, 2022 at 8:54 AM Hongtao Liu wrote:
> >
> > On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote:
> > >
> > > On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote:
> > > >
> >
a.c.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
>
> OK.
>
> Are there cases left your vectorizer patch handles over this one?
No.
>
> Thanks,
> Richard.
>
> > 2022-07-20 Richard Biener
> > H
On Wed, Jul 20, 2022 at 3:59 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Jul 20, 2022 at 4:20 AM liuhongt wrote:
> >
> > __builtin_cexpi can't be vectorized since there's a gap between it and the
> > vectorized sincos version (in libmvec, it passes a double and two
> > double pointers and return
On Wed, Aug 3, 2022 at 4:41 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The old patch has a mistake in `*movbf_internal`; now disable the BFmode constant
> double move in `*movbf_internal`.
LGTM.
>
> Thanks,
> Lingling
>
> > -Original Message-
> > From: Kong, Lingling
> > Sent: Tues
On Thu, Aug 4, 2022 at 4:19 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Aug 4, 2022 at 6:29 AM liuhongt via Gcc-patches
> wrote:
> >
> > For neg, the patch creates a vec_init as [ a, -a, a, -a, ... ] and no
> > vec_step is needed to update the vectorized iv since vf is always a multiple
> > of
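The reasoning about the negated induction variable can be checked with a scalar sketch. It assumes an even vectorization factor (here a hypothetical `VF` of 4) and an `a` other than `INT_MIN`; the function names are made up for illustration and the inner loop stands in for vector lanes rather than real vector code.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 };  /* hypothetical vectorization factor; must be even */

/* Scalar reference: out[i] = a, -a, a, -a, ...  */
void
neg_iv_scalar (int a, int *out, size_t n)
{
  int x = a;
  for (size_t i = 0; i < n; i++)
    {
      out[i] = x;
      x = -x;
    }
}

/* Blocked version: the lane values [ a, -a, a, -a ] are set up once and
   reused unchanged for every block, so no per-block step is applied.  */
void
neg_iv_blocked (int a, int *out, size_t n)
{
  assert (VF % 2 == 0);
  const int init[VF] = { a, -a, a, -a };
  for (size_t i = 0; i < n; i += VF)
    for (size_t lane = 0; lane < (size_t) VF && i + lane < n; lane++)
      out[i + lane] = init[lane];
}
```

Because negating an even number of times returns the starting value, each block of `VF` elements begins with the same sign, which is why the same initial vector serves every block.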
On Fri, Aug 4, 2023 at 1:30 AM Alexander Monakov wrote:
>
>
> On Thu, 27 Jul 2023, Liu, Hongtao via Gcc-patches wrote:
>
> > > +;; If the first and the second operands of ternlog are invariant and
> > > +;; the third operand is memory
> > > +;; then we should add load third operand from memory to
On Thu, Aug 3, 2023 at 4:10 PM Jan Beulich via Gcc-patches
wrote:
>
> Drop SSE5 leftovers from both its comment and its default calculation.
> A value of 2 simply cannot occur anymore. Instead extend the comment to
> mention the use of the attribute in "length_vex", clarifying why
> "prefix_extra"
On Thu, Aug 3, 2023 at 4:10 PM Jan Beulich via Gcc-patches
wrote:
>
> Record common properties in other attributes' default calculations:
> There's always a 1-byte immediate, and they're always encoded in a VEX3-
> like manner (note that "prefix_extra" already evaluates to 1 in this
> case). The d
On Thu, Aug 3, 2023 at 4:11 PM Jan Beulich via Gcc-patches
wrote:
>
> They're all VEX3- (also covering XOP) or EVEX-encoded. Express that in
> the default calculation of "prefix". FMA4 insns also all have a 1-byte
> immediate operand.
>
> Where the default calculation is not sufficient / applicabl
On Thu, Aug 3, 2023 at 4:14 PM Jan Beulich via Gcc-patches
wrote:
>
> In the rdrand and rdseed cases "prefix_0f" is meant instead. For
> mmx_floatv2siv2sf2 1 is correct only for the first alternative. For
> the integer min/max cases 1 uniformly applies to legacy and VEX
> encodings (the UB and SW
On Thu, Aug 3, 2023 at 4:16 PM Jan Beulich via Gcc-patches
wrote:
>
> While the attribute is relevant for legacy- and VEX-encoded insns, it is
> of no relevance for EVEX-encoded ones.
>
> While there in avx512dq_broadcast_1 add
> the missing "length_immediate".
Ok.
>
> gcc/
>
> * config/i3
On Thu, Aug 3, 2023 at 4:11 PM Jan Beulich via Gcc-patches
wrote:
>
> In the three remaining instances separate "prefix_0f" and "prefix_rep"
> are what is wanted instead.
Ok.
>
> gcc/
>
> * config/i386/i386.md (rdbase): Add "prefix_0f" and
> "prefix_rep". Drop "prefix_extra".
>
On Thu, Aug 3, 2023 at 4:14 PM Jan Beulich via Gcc-patches
wrote:
>
> When first added explicitly in 3ddffba914b2 ("i386.md
> (sse4_1_round2): Add avx512f alternative"), "*" should not have
> been used for the pre-existing alternative. The attribute was plain
> missing. Subsequent changes adding m
On Thu, Aug 3, 2023 at 4:14 PM Jan Beulich via Gcc-patches
wrote:
>
> Many were lacking "prefix" and "prefix_extra", some had a bogus value of
> 2 for "prefix_extra" (presumably inherited from their SSE5 counterparts,
> which are long gone) and a meaningless "prefix_data16" one. Where
> missing, "
On Thu, Aug 3, 2023 at 4:17 PM Jan Beulich via Gcc-patches
wrote:
>
> The attribute defaults to 1 for TI-mode insns of type sselog, sselog1,
> sseiadd, sseimul, and sseishft.
>
> In *v8hi3 [smaxmin] and *v16qi3 [umaxmin] also drop the
> similarly stray "prefix_extra" at this occasion. These two ma
On Thu, Aug 3, 2023 at 4:16 PM Jan Beulich via Gcc-patches
wrote:
>
> gcc/
>
> * config/i386/sse.md
> (__): Add
> "prefix" attribute.
>
> (avx512fp16_sh_v8hf):
> Likewise.
Ok.
> ---
> Talking of "prefix": Shouldn't at least V32HF and V32BF have it also
> de