On Tue, Jul 15, 2025 at 2:36 PM Haochen Jiang wrote:
>
> Hi all,
>
> In ISE058, AMX-AVX512 no longer implies AVX10.2. This leads us to
> reconsider what AMX-AVX512 should imply.
>
> Since it uses zmm registers, and zmm registers only, it needs to
> imply at least AVX512F. AVX512VL
> -Original Message-
> From: Jiang, Haochen
> Sent: Monday, July 14, 2025 10:59 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Remove KEYLOCKER related feature since Panther Lake
> and Clearwater Forest
>
>
> -Original Message-
> From: Hu, Lin1
> Sent: Wednesday, June 4, 2025 3:26 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Add a new peephole2 for PR91384 under APX_F
>
> gcc/ChangeLog:
>
> PR targ
On Mon, Jul 7, 2025 at 3:27 PM Hongtao Liu wrote:
>
> On Tue, Jun 24, 2025 at 2:11 PM H.J. Lu wrote:
> >
> > On Mon, Jun 23, 2025 at 2:24 PM H.J. Lu wrote:
> > >
> > > On Wed, Jun 18, 2025 at 3:17 PM H.J. Lu wrote:
> > > >
> > > >
On Tue, Jun 24, 2025 at 2:11 PM H.J. Lu wrote:
>
> On Mon, Jun 23, 2025 at 2:24 PM H.J. Lu wrote:
> >
> > On Wed, Jun 18, 2025 at 3:17 PM H.J. Lu wrote:
> > >
> > > 1. Don't generate the loop if the loop count is 1.
> > > 2. For memset with vector on small size, use vector if small size supports
On Mon, Jul 7, 2025 at 3:18 PM Hongtao Liu wrote:
>
> On Fri, Jul 4, 2025 at 5:45 PM Richard Biener wrote:
> >
> > The following adds a x86 tuning to enable the use of AVX512 masked
> > epilogues in cases we heuristically determine it to be not detrimental
> >
On Fri, Jul 4, 2025 at 5:45 PM Richard Biener wrote:
>
> The following adds a x86 tuning to enable the use of AVX512 masked
> epilogues in cases we heuristically determine it to be not detrimental
> by high chance. Basically problematic cases are when there are
> data streams that are both stored
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, July 2, 2025 11:10 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Change Diamond Rapids feature detect when model
> number could not be distinguished
On Mon, Jun 30, 2025 at 11:46 AM H.J. Lu wrote:
>
> On Mon, Jun 30, 2025 at 11:17 AM H.J. Lu wrote:
> >
> > On Mon, Jun 30, 2025 at 10:41 AM Hongtao Liu wrote:
> > >
> > > On Mon, Jun 30, 2025 at 10:37 AM Hongtao Liu wrote:
> > > >
> >
On Mon, Jun 30, 2025 at 11:16 AM H.J. Lu wrote:
>
> On Mon, Jun 30, 2025 at 10:37 AM Hongtao Liu wrote:
> >
> > On Sat, Jun 28, 2025 at 8:30 PM H.J. Lu wrote:
> > >
> > > Update functions with no_callee_saved_registers/preserve_none attribute
> > > t
On Sat, Jun 28, 2025 at 8:30 PM H.J. Lu wrote:
>
> Update functions with no_callee_saved_registers/preserve_none attribute
> to preserve frame pointer since caller may use it to save the current
> stack:
>
> pushq %rbp
> movq %rsp, %rbp
> ...
> call function
> ...
> leave
> ret
>
> If callee chang
On Mon, Jun 30, 2025 at 10:37 AM Hongtao Liu wrote:
>
> On Sat, Jun 28, 2025 at 8:30 PM H.J. Lu wrote:
> >
> > Update functions with no_callee_saved_registers/preserve_none attribute
> > to preserve frame pointer since caller may use it to save the current
> > stac
On Thu, Jun 26, 2025 at 2:17 PM H.J. Lu wrote:
>
> On Thu, Jun 26, 2025 at 2:11 PM Hongtao Liu wrote:
> >
> > On Thu, Jun 26, 2025 at 1:59 PM H.J. Lu wrote:
> > >
> > > Use the inner scalar mode of vector broadcast source in:
> > >
> > >
On Thu, Jun 26, 2025 at 2:02 PM H.J. Lu wrote:
>
> Since float vector constant
>
> (const_vector:V4SF [(const_double:SF -QNaN [-QNaN]) repeated x4])
>
> is an all 1s float vector constant, update the remove_redundant_vector
> pass to replace
>
> (insn 20 18 21 2 (set (reg:V4SF 124)
> (cons
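As a side note on the constant quoted above: in IEEE 754 binary32, the all-ones
bit pattern 0xFFFFFFFF decodes as a quiet NaN with the sign bit set, which is
why an all 1s V4SF constant prints as -QNaN. A minimal C sketch of that
decoding follows. The helper names float_from_bits and is_negative_qnan are
made up for illustration and are not part of the patch, and the sketch assumes
that float is IEEE 754 binary32 with the x86 quiet-bit convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumed IEEE 754 binary32). */
static float float_from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);   /* well-defined way to copy the bits */
    return f;
}

/* Return 1 if F is a quiet NaN with the sign bit set (-QNaN), else 0. */
static int is_negative_qnan(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t sign = bits >> 31;               /* bit 31 */
    uint32_t exponent = (bits >> 23) & 0xFFu; /* bits 30..23 */
    uint32_t quiet = (bits >> 22) & 1u;       /* top mantissa bit */
    return sign == 1 && exponent == 0xFFu && quiet == 1;
}
```

Calling is_negative_qnan (float_from_bits (0xFFFFFFFFu)) is expected to return
1, while +QNaN (0x7FC00000) and -Inf (0xFF800000) do not qualify.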
On Thu, Jun 26, 2025 at 1:59 PM H.J. Lu wrote:
>
> Use the inner scalar mode of vector broadcast source in:
>
> (set (reg:V8DF 394)
>   (vec_duplicate:V8DF (reg:V2DF 190 [ alpha ])))
>
> to compute the vector mode for broadcast from vector source.
ix86_get_vector_cse_mode (unsigned int si
On Thu, Jun 26, 2025 at 1:56 PM H.J. Lu wrote:
>
> On Thu, Jun 26, 2025 at 1:24 PM Hongtao Liu wrote:
> >
> > On Thu, Jun 26, 2025 at 6:20 AM H.J. Lu wrote:
> > >
> > > For tcpsock_test.go in libgo tests,
> > >
> > > commit aba3b9d3
On Thu, Jun 26, 2025 at 6:20 AM H.J. Lu wrote:
>
> For tcpsock_test.go in libgo tests,
>
> commit aba3b9d3a48a0703fd565f7c5f0caf604f59970b
> Author: H.J. Lu
> Date: Fri May 9 07:17:07 2025 +0800
>
> x86: Extend the remove_redundant_vector pass
>
> added an instruction:
>
> (insn 501 101 102
On Wed, Jun 25, 2025 at 3:35 PM H.J. Lu wrote:
>
> Add preserve_none attribute which is similar to no_callee_saved_registers
> attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
> used for integer parameter passing. This can be used in an interpreter
> to avoid saving/rest
On Thu, Jun 26, 2025 at 6:21 AM H.J. Lu wrote:
>
> On Tue, Jun 24, 2025 at 2:21 PM H.J. Lu wrote:
> >
> > Add debug dump for the remove_redundant_vector pass with the following
> > output:
> >
> > Replace:
> >
> > (insn 7 4 8 2 (set (reg:V2DI 103)
> > (const_vector:V2DI [
> >
On Tue, Jun 17, 2025 at 8:54 PM Cui, Lili wrote:
>
>
>
> > -Original Message-
> > From: H.J. Lu
> > Sent: Monday, June 16, 2025 10:08 PM
> > To: Jan Hubicka
> > Cc: Uros Bizjak ; Cui, Lili ; gcc-
> > patc...@gcc.gnu.org; Liu, Hongtao ;
>
On Fri, May 23, 2025 at 1:56 PM H.J. Lu wrote:
>
> Add preserve_none attribute which is similar to no_callee_saved_registers
> attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
> used for integer parameter passing. This can be used in an interpreter
> to avoid saving/rest
On Wed, Jun 25, 2025 at 1:06 PM H.J. Lu wrote:
>
> -mtune=intel is used to generate a single binary to run well on both big
> core and small core, similar to hybrid CPUs. Update -mtune=intel to tune
> for Diamond Rapids and Clearwater Forest, instead of Silvermont.
>
> PR target/120815
> * common
On Tue, Jun 24, 2025 at 1:26 PM H.J. Lu wrote:
>
> On Mon, Jun 23, 2025 at 4:53 PM Hongtao Liu wrote:
> >
> > On Mon, Jun 23, 2025 at 4:45 PM H.J. Lu wrote:
> > >
> > > On Mon, Jun 23, 2025 at 4:10 PM H.J. Lu wrote:
> > > >
> >
On Thu, Jun 19, 2025 at 10:25 AM H.J. Lu wrote:
>
> Extend the remove_redundant_vector pass to handle vector broadcasts from
> constant and variable scalars. When broadcasting from constants and
> function arguments, we can place a single widest vector broadcast at
> entry of the nearest common d
On Mon, Jun 23, 2025 at 4:45 PM H.J. Lu wrote:
>
> On Mon, Jun 23, 2025 at 4:10 PM H.J. Lu wrote:
> >
> > On Mon, Jun 23, 2025 at 3:11 PM Hongtao Liu wrote:
> > >
> > > On Thu, Jun 19, 2025 at 10:25 AM H.J. Lu wrote:
> > > >
> > > &
On Mon, Jun 23, 2025 at 4:10 PM H.J. Lu wrote:
>
> On Mon, Jun 23, 2025 at 3:11 PM Hongtao Liu wrote:
> >
> > On Thu, Jun 19, 2025 at 10:25 AM H.J. Lu wrote:
> > >
> > > Extend the remove_redundant_vector pass to handle vector broadcasts from
> >
On Thu, Jun 19, 2025 at 10:25 AM H.J. Lu wrote:
>
> Extend the remove_redundant_vector pass to handle vector broadcasts from
> constant and variable scalars. When broadcasting from constants and
> function arguments, we can place a single widest vector broadcast at
> entry of the nearest common d
On Sat, Jun 21, 2025 at 11:09 PM H.J. Lu wrote:
>
> On Fri, Jun 20, 2025 at 4:12 PM H.J. Lu wrote:
> >
> > Don't use vmovdqu16/vmovdqu8 with non-EVEX registers even if AVX512BW is
> > available.
> >
> > gcc/
> >
> > PR target/120728
> > * config/i386/i386.cc (ix86_get_ssemov): Use vmovdqu16/vmovd
On Mon, Jun 23, 2025 at 11:03 AM H.J. Lu wrote:
>
> Add a PROCESSOR_XXX comment to each entry in processor_cost_table to
> describe which processor the cost entry is applied to.
Ok as obvious.
>
> * config/i386/i386-options.cc (processor_cost_table): Add a
> PROCESSOR_XXX comment to each entry.
>
>
On Fri, Jun 20, 2025 at 10:04 AM Haochen Jiang wrote:
>
> Hi all,
>
> CLDEMOTE is not enabled on clients according to the SDM. The SDM only
> mentions that it will be enabled on Xeon and Atom servers, not clients.
> Remove it from client processors since Alder Lake (where it was introduced).
>
> Also will backport this patch to GC
On Wed, Jun 18, 2025 at 6:38 PM H.J. Lu wrote:
>
> commit ef26c151c14a87177d46fd3d725e7f82e040e89f
> Author: Roger Sayle
> Date: Thu Dec 23 12:33:07 2021 +
>
> x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.
>
> added "*mov_and" and extended "*mov_or" to transform
> "
On Wed, Jun 18, 2025 at 2:39 PM H.J. Lu wrote:
>
> On Mon, Jun 16, 2025 at 4:14 PM Hongtao Liu wrote:
> >
> > >+enum redundant_load_kind
> > >+{
> > >+ LOAD_CONST0_VECTOR,
> > >+ LOAD_CONSTM1_VECTOR,
> > >+ LOAD_VECTOR
> >
On Mon, May 26, 2025 at 2:30 PM H.J. Lu wrote:
>
> On Sun, May 25, 2025 at 7:02 PM H.J. Lu wrote:
> >
> > On Sun, May 25, 2025 at 8:12 AM H.J. Lu wrote:
> > >
> > > On Sun, May 25, 2025 at 7:47 AM H.J. Lu wrote:
> > > >
> > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f
> > > > Author: Rog
Drop this patch since
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686830.html could
be a better alternative.
On Tue, Jun 10, 2025 at 9:50 AM Hongtao Liu wrote:
>
> Ping
>
> On Mon, May 19, 2025 at 10:06 AM liuhongt wrote:
> >
> > From: "hongtao.liu"
>
On Mon, Jun 16, 2025 at 4:30 PM Hongtao Liu wrote:
>
> >+enum redundant_load_kind
> >+{
> >+ LOAD_CONST0_VECTOR,
> >+ LOAD_CONSTM1_VECTOR,
> >+ LOAD_VECTOR
> >+};
> Perhaps rename to x86_cse_kind, X86_CSE_CONST0_VECTOR,
> X86_CSE_CONSTM1_VECTOR, X
>+enum redundant_load_kind
>+{
>+ LOAD_CONST0_VECTOR,
>+ LOAD_CONSTM1_VECTOR,
>+ LOAD_VECTOR
>+};
Perhaps rename to x86_cse_kind, X86_CSE_CONST0_VECTOR,
X86_CSE_CONSTM1_VECTOR, X86_CSE_VEC_DUP?
LOAD sounds a bit ambiguous.
Similar to ix86_get_vector_load_mode -> ix86_get_vector_cse_mode?
>+
On Thu, Jun 12, 2025 at 10:51 AM Hu, Lin1 wrote:
>
> Hi,
>
> This patch sets the SRF issue rate to 4 and the GNR issue rate to 6. According
> to SPEC2017 tests, the patch has little effect on performance.
>
> For GRR, CWF, DMR, ARL and PTL, the patch set their issue rate to 6. Waiting
> for
> m
Ping
On Mon, May 19, 2025 at 10:06 AM liuhongt wrote:
>
> From: "hongtao.liu"
>
> An AutoFDO profile is a scaled profile; as a result, a count of 0 samples
> does not mean never executed, especially when there is a profile from the
> function body. Prevent combine_with_ipa_count (ipa_count) from zeroing all
> bb->count.
>
On Tue, Jun 3, 2025 at 2:59 PM H.J. Lu wrote:
>
> Extend the remove_redundant_vector pass to handle vector broadcasts from
> constant and variable scalars. When broadcasting from constants and
> function arguments, we can place a single widest vector broadcast at
> entry of the nearest common dom
>= 0 always yields true (it's unsigned on
Windows)
--
Best regards,
LIU Hao
On Thu, May 29, 2025 at 4:56 PM Hu, Lin1 wrote:
>
> Hi,
>
> The patch aims to optimize
>         movb    (%rdi), %al
>         movq    %rdi, %rbx
>         xorl    %esi, %eax, %edx
>         movb    %dl, (%rdi)
>         cmpb    %sil, %al
>         jne
> to
>         xorb    %sil, (%rdi)
>
On Mon, May 26, 2025 at 4:55 PM Hu, Lin1 wrote:
>
> Hi, all
>
> Enabling -mapxf changes some adc/sbb patterns.
>
> As a result, GCC emits an extra mov, like
>         movq    8(%rdi), %rax
>         adcq    %rax, 8(%rsi), %rax
>         movq    %rax, 8(%rdi)
> rather than
> movq
On 2025-5-16 16:50, LIU Hao wrote:
This is a leftover of d6d7afcdbc04adb0ec42a44b2d7e05600945af42. After this change, configuration files of
all three thread models are in 'libgcc/config/mingw/'.
The patch has been bootstrapped on {x86_64,i686}-w64-mingw32. ARM64 port is still working i
On 2025-5-13 17:18, LIU Hao wrote:
Hello,
Attached is a patch for PR 53929, but is also required by PR 80881.
Ping.
Also, I just noticed that Clang also quotes mangled MSVC++ symbols in this way, at least since Clang 3.5,
so it's accepted by both GAS and LLVM:
(https://gcc.godbolt.
On Wed, May 14, 2025 at 3:29 PM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v2 patch to remove -mavx10.1-256/512 and -mno-evex512. I suppose
> this time all the patches will not be held due to size.
>
> As mentioned in GCC 15, we will remove -mavx10.1-256/512 and -mno-evex512
> options in GCC
NWIND_INFO in
gcc/config/i386/cygming.h
diff --git a/libgcc/config/i386/t-mingw-mcfgthread
b/libgcc/config/mingw/t-mingw-mcfgthread
similarity index 100%
rename from libgcc/config/i386/t-mingw-mcfgthread
rename to libgcc/config/mingw/t-mingw-mcfgthread
--
2.49.0
From b48e41b58158d6311906010954c987
It's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119181
On Fri, May 16, 2025 at 10:02 AM liuhongt wrote:
>
> The patch tries to fix the missed vectorization in the case below.
>
> void
> foo (int* a, int* restrict b)
> {
> b[0] = a[0] * a[64];
> b[1] = a[65] * a[1];
> b[2] = a[2] * a[66];
>
On Fri, Apr 18, 2025 at 7:10 PM H.J. Lu wrote:
>
> Add preserve_none attribute which is similar to no_callee_saved_registers
> attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
Could you split preserve_none into a separate patch,
It looks like it's different from clang's p
On Wed, May 14, 2025 at 9:22 AM liuhongt wrote:
>
> The Intel Decimal Floating-Point Math Library is available as open-source on
> Netlib[1].
>
> [1] https://www.netlib.org/misc/intel/
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready to push to trunk.
>
> libgcc/config/libbid/Ch
syntax, as some Linux headers contain inline assembly with
only AT&T templates. It is however possible to bootstrap GCC on {i686,x86_64}-w64-mingw32.
--
Best regards,
LIU Hao
From d733676c742f9af9b9ab34317433db242128e53d Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 22 Feb 2025 13
On Thu, May 8, 2025 at 2:40 PM liuhongt wrote:
>
> The only part I changed is related to the size_cost of sse_to_integer, as below
>
> 114+ /* Under TARGET_SSE4_1, it's vmovd + vpextrd/vpinsrd.
> 115+ W/o it, it's movd + psrlq/unpckldq + movd. */
> 116+ else if (!TARGET_64BIT && smode != SImod
On 2025-5-10 20:48, Jonathan Yong wrote:
On 5/9/25 4:26 PM, LIU Hao wrote:
On 2025-5-3 20:52, LIU Hao wrote:
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't ha
On 2025-5-3 20:52, LIU Hao wrote:
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
I suddenly realized the previous patch was for G
On Wed, May 7, 2025 at 9:06 AM H.J. Lu wrote:
>
> On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote:
> >
> > On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote:
> > >
> > > On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote:
> > > >
> > > >
On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote:
>
> On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote:
> >
> >
> >
> > > -Original Message-
> > > From: H.J. Lu
> > > Sent: Tuesday, May 6, 2025 2:16 PM
> > > To: Liu, Hongtao
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, May 6, 2025 2:16 PM
> To: Liu, Hongtao
> Cc: GCC Patches ; Uros Bizjak
>
> Subject: Re: [PATCH] x86: Skip if the mode size is smaller than its natural
> size
>
> On Tue, May 6, 2025 at
> -Original Message-
> From: H.J. Lu
> Sent: Thursday, May 1, 2025 6:39 AM
> To: GCC Patches ; Uros Bizjak
> ; Liu, Hongtao
> Subject: [PATCH] x86: Skip if the mode size is smaller than its natural size
>
> When generating a SUBREG from V16QI to V2HF, validate_
On 2025-4-28 14:43, LIU Hao wrote:
Hello, I'm sending this patch again after GCC 15 has been released.
This patch was sent in February, but there were no comments:
https://patchwork.sourceware.org/project/gcc/patch/eca6660c-6578-4e39-8aa9-be9fdd013...@126.com/
Ping.
--
Best regards
On 2025-4-28 15:05, LIU Hao wrote:
This is a response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c57
The patch was submitted to MSYS2 for testing in 2022-5. No issue reports have
been received so far:
* https://github.com/msys2/MINGW-packages/blob
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
I suddenly realized the previous patch was for GCC 15 branch. Here's
ly an ABI break
for code that uses `__thread`, `_Thread_local` or `thread_local`.
Other than that, this patch seems mostly fine.
--
Best regards,
LIU Hao
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
--
Best regards,
LIU Hao
From eeb30bf621baa3af1a73e8e91bff297ef478 Mon Sep 17 00:00:00 2001
From: LIU Hao
not always aligned to 16 bytes, but I don't
have any system with such a configuration, so can't test that for now.
--
Best regards,
LIU Hao
From 1c101f4903a9be7d56efa8d97be603284f6bd4d4 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Tue, 29 Apr 2025 10:43:06 +0800
Subject: [PATCH] i3
> -Original Message-
> From: Jan Hubicka
> Sent: Wednesday, April 30, 2025 4:11 AM
> To: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
> ro...@nextmovesoftware.com; ubiz...@gmail.com
> Subject: Make ix86 cost of VEC_SELECT equivalent to SUBREG same as of
> SUBREG
On 2025-4-29 13:03, LIU Hao wrote:
This fixes a long-standing issue where GCC assumed 16-byte stack alignment on i686-w64-mingw32,
which is not always the case for callbacks from system libraries.
CC Zeb Figura
This patch looks a bit risky. The overall effect of `__attribute__
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, April 29, 2025 2:59 PM
> To: Hongtao Liu
> Cc: GCC Patches ; Liu, Hongtao
> ; Uros Bizjak
> Subject: [PATCH v3] x86: Add a pass to remove redundant all 0s/1s vector
> load
>
> On Tue, Apr 29, 2
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, April 29, 2025 1:58 PM
> To: Hongtao Liu
> Cc: GCC Patches ; Uros Bizjak
> ; Liu, Hongtao
> Subject: Re: [PATCH] i386: Add
> ix86_expand_unsigned_small_int_cst_argument
>
> On Tue, Apr 29,
This fixes a long-standing issue where GCC assumed 16-byte stack alignment on i686-w64-mingw32,
which is not always the case for callbacks from system libraries.
--
Best regards,
LIU Hao
From 1b92f8105dbece1694dd3ab398cfb5e3ce2c15d9 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Tue
On Sun, Apr 27, 2025 at 10:58 AM H.J. Lu wrote:
>
> When passing 0xff as an unsigned char function argument with the C frontend
> promotion, expand_normal used to get
>
> constant
> 255>
>
> and returned the rtx value using the sign-extended representation:
>
> (const_int 255 [0xff])
>
> But aft
On Mon, Apr 28, 2025 at 5:07 PM H.J. Lu wrote:
>
> On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote:
> >
>
> > > > This is what my patch does:
> > > But it iterates through vector_insns, using a def-ref chain to find
> > > those insns. I think we can just record those single_set with src as
> > > co
-Allow-a-PCH-to-be-mapped-to-a-different-addr.patch
--
Best regards,
LIU Hao
From 5239275bb4df0e79bc4b2af57d90c2d10ad44863 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Wed, 11 May 2022 22:42:53 +0800
Subject: [PATCH] Allow a PCH to be mapped to a different address
First, try mapping the PCH
Hello, I'm sending this patch again after GCC 15 has been released.
This patch was sent in February, but there were no comments:
https://patchwork.sourceware.org/project/gcc/patch/eca6660c-6578-4e39-8aa9-be9fdd013...@126.com/
--
Best regards,
LIU Hao
, it's always necessary to
realign the stack, as Solaris does.
Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=07#c14
Signed-off-by: LIU Hao
gcc/ChangeLog:
PR target/07
* config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.
---
gcc/config
>
> I am not so sure about this when it comes to relatively common
> instructions. Hiding things in unspec prevents combine and other RTL
> passes from doing their job. I would say that it only makes sense for
> situations where the RTL equivalent is very inconvenient.
>
In the direction of using gener
On Fri, Apr 25, 2025 at 1:26 PM Jan Hubicka wrote:
>
> > On Thu, Apr 24, 2025 at 6:27 PM Jan Hubicka wrote:
> > >
> > > > Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> > > > or vpandn.
> > > > Current register_operand/vector_operand could lose some optimization
> > >
> -Original Message-
> From: Jan Hubicka
> Sent: Friday, April 25, 2025 12:27 AM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org; crazy...@gmail.com; hjl.to...@gmail.com
> Subject: Re: [PATCH] Accept allones or 0 operand for vcond_mask op1.
>
> > Since
On Thu, Apr 24, 2025 at 12:54 AM Jan Hubicka wrote:
>
> > From: "hongtao.liu"
> >
> > When FMA is available, N-R step can be rewritten with
> >
> > a / b = (a - (rcp(b) * a * b)) * rcp(b) + rcp(b) * a
> >
> > which generates 2 FMAs.[1]
> >
> > [1] https://bugs.llvm.org/show_bug.cgi?id=21385
>
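The rewritten N-R step quoted above can be sketched in plain C, under some
assumptions. approx_rcp is a hypothetical stand-in for a hardware reciprocal
estimate (it just perturbs 1/b by a chosen relative error), and nr_divide is an
illustrative name, not a function from the patch. The two multiply-add
expressions are written in ordinary C; a compiler targeting FMA hardware may
contract each one into a single fused instruction.

```c
#include <assert.h>

/* Hypothetical stand-in for a reciprocal estimate such as rcpps:
   returns 1/b scaled by (1 + rel_err) to model the estimate's error. */
static double approx_rcp(double b, double rel_err)
{
    assert(b != 0.0);
    return (1.0 / b) * (1.0 + rel_err);
}

/* One Newton-Raphson refinement of a / b from a reciprocal estimate r:
     a / b ~= (a - (r * a) * b) * r + r * a
   Each of the two statements below has the shape of one multiply-add. */
static double nr_divide(double a, double b, double r)
{
    double q0 = r * a;             /* initial quotient estimate */
    double residual = a - q0 * b;  /* first multiply-add: a - q0 * b */
    return residual * r + q0;      /* second multiply-add */
}
```

If the estimate r has relative error e, the refined quotient works out to
(a / b) * (1 - e * e), so an estimate that is good to about 1e-3 gives a
result that is good to about 1e-6.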
On Thu, Apr 24, 2025 at 12:50 AM Jan Hubicka wrote:
>
> > In some benchmarks, I noticed that STV failed because the cost model
> > judged it unprofitable. However, the igain is inside the loop while
> > the sse<->integer conversion is outside the loop, and the current cost
> > model doesn't consider the frequency of those gains/costs.
> >
On Mon, Apr 21, 2025 at 2:52 PM liuhongt wrote:
>
> ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> or vpandn, so the current register_operand/vector_operand could lose
> some optimization opportunities.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for tru
On Tue, Apr 22, 2025 at 10:30 AM Hongtao Liu wrote:
>
> On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
> >
> > Hi,
> > this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> > MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_E
On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
>
> Hi,
> this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and
> ABSU_EXPR
> but it was only correct for FP variant (where it corresponds to andss clea
On Mon, Apr 21, 2025 at 4:30 PM H.J. Lu wrote:
>
> On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote:
> >
> > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
> > >
> > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> > > >
> > > >
On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
>
> On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> >
> > For all different modes of all 0s/1s vectors, we can use the single widest
> > all 0s/1s vector register for all 0s/1s vector uses in the whole function.
> > Add a pass to generate a single wi
On Tue, Apr 8, 2025 at 3:52 AM H.J. Lu wrote:
>
> Simplify memcpy and memset inline strategies to avoid branches for
> -mtune=generic:
>
> 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
>    load and store for up to 16 * 16 (256) bytes when the data size is
>    fixed and kn
sysv abi, the argument should go in esi
+/* { dg-final { scan-assembler-times "movl\[\\t \]*\\\$20,\[\\t \[]*%esi" 2 } } */
+
+
ditto.
--
Best regards,
LIU Hao
On Mon, Apr 14, 2025 at 8:56 PM H.J. Lu wrote:
>
> On Mon, Apr 14, 2025 at 2:39 AM Uros Bizjak wrote:
> >
> > On Mon, Apr 14, 2025 at 8:54 AM Hongtao Liu wrote:
> > >
> > > On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
> > > >
> >
On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
>
> Don't use red-zone when there are no caller-saved registers and APX is
> enabled since 128-byte red-zone is too small for 31 GPRs.
>
> gcc/
>
> PR target/119784
> * config/i386/i386.cc (ix86_using_red_zone): Don't use red-zone
>
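The size argument in the message above can be checked with a little
arithmetic. The sketch below assumes 8 bytes per 64-bit GPR and takes the 128
byte red zone and the count of 31 GPRs from the message; the helper names
gpr_save_area_bytes and save_area_fits_red_zone are hypothetical and are not
the code in ix86_using_red_zone.

```c
#include <assert.h>
#include <stddef.h>

enum {
    RED_ZONE_BYTES = 128, /* red zone size quoted in the message */
    GPR_BYTES = 8,        /* assumed size of one 64-bit GPR */
    APX_GPR_COUNT = 31    /* GPR count quoted in the message */
};

/* Bytes needed to save NREGS general-purpose registers. */
static size_t gpr_save_area_bytes(size_t nregs)
{
    return nregs * (size_t) GPR_BYTES;
}

/* Hypothetical helper: 1 if the save area fits in the red zone, else 0. */
static int save_area_fits_red_zone(size_t nregs)
{
    assert(nregs <= 32);
    return gpr_save_area_bytes(nregs) <= (size_t) RED_ZONE_BYTES;
}
```

With APX_GPR_COUNT registers the save area is 31 * 8 = 248 bytes, which does
not fit in 128 bytes, whereas 16 registers (128 bytes) would just fit.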
> -Original Message-
> From: Uros Bizjak
> Sent: Tuesday, April 1, 2025 5:24 PM
> To: Hongtao Liu
> Cc: Wang, Hongyu ; gcc-patches@gcc.gnu.org; Liu,
> Hongtao
> Subject: Re: [PATCH] APX: add nf counterparts for rotl split pattern [PR
> 119539]
>
> O
On Mon, Mar 31, 2025 at 9:52 PM Richard Biener wrote:
>
> On Mon, 31 Mar 2025, Jakub Jelinek wrote:
>
> > On Mon, Mar 31, 2025 at 03:33:34PM +0200, Richard Biener wrote:
> > > On Mon, 31 Mar 2025, Jakub Jelinek wrote:
> > >
> > > > On Mon, Mar 31, 2025 at 03:12:56PM +0200, Richard Biener wrote:
>
On Wed, Apr 2, 2025 at 2:58 PM Hongyu Wang wrote:
>
> > Can we just change the output in original pattern, I think combine
> > will still match the pattern even w/ clobber flags.
>
> Yes, adjusted and updated the patch in attachment.
Ok.
>
> Liu, Ho
On Tue, Apr 1, 2025 at 4:40 PM Hongyu Wang wrote:
>
> Hi,
>
> For the splitter after 3_mask, it now splits the pattern
> to *3_mask, so the splitter doesn't generate the
> nf variant. Add the corresponding nf counterpart for define_insn_and_split
> so that the splitter also works for nf insns.
>
> Bootstra
On Tue, Apr 1, 2025 at 3:56 PM Jakub Jelinek wrote:
>
> On Tue, Apr 01, 2025 at 01:36:23PM +0800, Hongtao Liu wrote:
> > >Changing ix86_valid_target_attribute_inner_p might be even better because
> > >OPT_msse4 is RejectNegative option, so !value for it looks weird.
On Fri, Mar 28, 2025 at 1:55 PM Hu, Lin1 wrote:
>
> For vaes patterns with jm constraint and gpr16 attr, the "isa" attr is
> required to distinguish avx/avx512 alternatives in ix86_memory_address_reg_class.
> Also add missing type and mode attributes for those vaes patterns.
Ok.
>
> gcc/ChangeLog:
>
>
On Fri, Mar 28, 2025 at 4:22 PM Haochen Jiang wrote:
>
> Hi all,
>
> For -march= handling, PTA_AVX10_1 will not imply PTA_AVX10_1_256,
> resulting in TARGET_AVX10_1 becoming true while TARGET_AVX10_1_256
> false. Since we will check TARGET_AVX10_1_256 in GCC 15 for AVX512
> feature enabling for AV
This is a minor change, bootstrapped on x86_64-w64-mingw32.
--
Best regards,
LIU Hao
From 83c3e90432f9ebc97785d81be7a94066d9923920 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 29 Mar 2025 22:47:54 +0800
Subject: [PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit
targets
On Wed, Mar 26, 2025 at 9:50 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch ensures that each alternative with constraint "jm" sets
> addr "gpr16"; otherwise an ICE may be raised in the reload pass.
>
> Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk?
Ok.
>
> BRs,
> Lin
>
> -Original Message-
> From: Hu, Lin1
> Sent: Tuesday, March 25, 2025 4:23 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting
> Intrinsics.
>
> Mor
On Thu, Mar 20, 2025 at 3:14 PM Hu, Lin1 wrote:
>
> Hi,
>
> res_ref will be modified after MASK_ZERO, init res_ref2 for rounding
> control intrinsics.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,-m64}, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.t
> -Original Message-
> From: Liu, Hongtao
> Sent: Thursday, March 20, 2025 9:29 AM
> To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com
> Subject: RE: [PATCH 0/4] Fix AVX10.2 SAT CVT.
>
>
>
> > -Original Message-
> > From:
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, March 19, 2025 3:38 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH 00/27] Use avx10.x as the only option for AVX10 with 512 bit
> vector support while remove a