>= 0 always yields true (it's unsigned on
Windows)
--
Best regards,
LIU Hao
On Thu, May 29, 2025 at 4:56 PM Hu, Lin1 wrote:
>
> Hi,
>
> The patch aims to optimize
> movb (%rdi), %al
> movq %rdi, %rbx
> xorl %esi, %eax, %edx
> movb %dl, (%rdi)
> cmpb %sil, %al
> jne
> to
> xorb %sil, (%rdi)
>
On Mon, May 26, 2025 at 4:55 PM Hu, Lin1 wrote:
>
> Hi, all
>
> Enabling -mapxf changes some of the adc/sbb patterns.
>
> Hence GCC will emit an extra mov like
> movq 8(%rdi), %rax
> adcq %rax, 8(%rsi), %rax
> movq %rax, 8(%rdi)
> rather than
> movq
On 2025-5-16 16:50, LIU Hao wrote:
This is a leftover of d6d7afcdbc04adb0ec42a44b2d7e05600945af42. After this change, configuration files of
all three thread models are in 'libgcc/config/mingw/'.
The patch has been bootstrapped on {x86_64,i686}-w64-mingw32. ARM64 port is still working i
On 2025-5-13 17:18, LIU Hao wrote:
Hello,
Attached is a patch for PR 53929, but is also required by PR 80881.
Ping.
Also, I just noticed that Clang also quotes mangled MSVC++ symbols in this way, at least since Clang 3.5,
so it's accepted by both GAS and LLVM:
(https://gcc.godbolt.
On Wed, May 14, 2025 at 3:29 PM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v2 patch to remove -mavx10.1/256-512 and -mno-evex512. I suppose
> this time all the patches will not be held due to size.
>
> As mentioned in GCC 15, we will remove -mavx10.1-256/512 and -mno-evex512
> options in GCC
NWIND_INFO in
gcc/config/i386/cygming.h
diff --git a/libgcc/config/i386/t-mingw-mcfgthread
b/libgcc/config/mingw/t-mingw-mcfgthread
similarity index 100%
rename from libgcc/config/i386/t-mingw-mcfgthread
rename to libgcc/config/mingw/t-mingw-mcfgthread
--
2.49.0
From b48e41b58158d6311906010954c987
It's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119181
On Fri, May 16, 2025 at 10:02 AM liuhongt wrote:
>
> The patch tries to fix the missed vectorization for the case below.
>
> void
> foo (int* a, int* restrict b)
> {
> b[0] = a[0] * a[64];
> b[1] = a[65] * a[1];
> b[2] = a[2] * a[66];
>
On Fri, Apr 18, 2025 at 7:10 PM H.J. Lu wrote:
>
> Add preserve_none attribute which is similar to no_callee_saved_registers
> attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
Could you split preserve_none into a separate patch,
It looks like it's different from clang's p
On Wed, May 14, 2025 at 9:22 AM liuhongt wrote:
>
> The Intel Decimal Floating-Point Math Library is available as open-source on
> Netlib[1].
>
> [1] https://www.netlib.org/misc/intel/
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready to push to trunk.
>
> libgcc/config/libbid/Ch
syntax, as some Linux headers contain inline assembly with
only AT&T templates. It is however possible to bootstrap GCC on {i686,x86_64}-w64-mingw32.
--
Best regards,
LIU Hao
From d733676c742f9af9b9ab34317433db242128e53d Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 22 Feb 2025 13
On Thu, May 8, 2025 at 2:40 PM liuhongt wrote:
>
> The only part I changed is related to the size_cost of sse_to_integer, as below
>
> 114+ /* Under TARGET_SSE4_1, it's vmovd + vpextrd/vpinsrd.
> 115+ W/o it, it's movd + psrlq/unpckldq + movd. */
> 116+ else if (!TARGET_64BIT && smode != SImod
On 2025-5-10 20:48, Jonathan Yong wrote:
On 5/9/25 4:26 PM, LIU Hao wrote:
On 2025-5-3 20:52, LIU Hao wrote:
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't ha
On 2025-5-3 20:52, LIU Hao wrote:
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
I suddenly realized the previous patch was for G
On Wed, May 7, 2025 at 9:06 AM H.J. Lu wrote:
>
> On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote:
> >
> > On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote:
> > >
> > > On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote:
> > > >
> > > >
On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote:
>
> On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote:
> >
> >
> >
> > > -Original Message-
> > > From: H.J. Lu
> > > Sent: Tuesday, May 6, 2025 2:16 PM
> > > To: Liu, Hongtao
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, May 6, 2025 2:16 PM
> To: Liu, Hongtao
> Cc: GCC Patches ; Uros Bizjak
>
> Subject: Re: [PATCH] x86: Skip if the mode size is smaller than its natural
> size
>
> On Tue, May 6, 2025 at
> -Original Message-
> From: H.J. Lu
> Sent: Thursday, May 1, 2025 6:39 AM
> To: GCC Patches ; Uros Bizjak
> ; Liu, Hongtao
> Subject: [PATCH] x86: Skip if the mode size is smaller than its natural size
>
> When generating a SUBREG from V16QI to V2HF, validate_
On 2025-4-28 14:43, LIU Hao wrote:
Hello, I'm sending this patch again after GCC 15 has been released.
This patch was sent in February, but there were no comments:
https://patchwork.sourceware.org/project/gcc/patch/eca6660c-6578-4e39-8aa9-be9fdd013...@126.com/
Ping.
--
Best regards
On 2025-4-28 15:05, LIU Hao wrote:
This is a response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c57
The patch was submitted to MSYS2 for testing in 2022-5. No issue reports have
been received so far:
* https://github.com/msys2/MINGW-packages/blob
On 2025-5-2 01:25, LIU Hao wrote:
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
I suddenly realized the previous patch was for GCC 15 branch. Here's
ly an ABI break
for code that uses `__thread`, `_Thread_local` or `thread_local`.
Other than that, this patch seems mostly fine.
--
Best regards,
LIU Hao
Remove `STACK_REALIGN_DEFAULT` for this target, because now the default value of
`incoming_stack_boundary` equals `MIN_STACK_BOUNDARY` and it doesn't have an effect any more.
--
Best regards,
LIU Hao
From eeb30bf621baa3af1a73e8e91bff297ef478 Mon Sep 17 00:00:00 2001
From: LIU Hao
not always aligned to 16 bytes, but I don't
have any system with such a configuration, so can't test that for now.
--
Best regards,
LIU Hao
From 1c101f4903a9be7d56efa8d97be603284f6bd4d4 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Tue, 29 Apr 2025 10:43:06 +0800
Subject: [PATCH] i3
> -Original Message-
> From: Jan Hubicka
> Sent: Wednesday, April 30, 2025 4:11 AM
> To: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
> ro...@nextmovesoftware.com; ubiz...@gmail.com
> Subject: Make ix86 cost of VEC_SELECT equivalent to SUBREG same as of
> SUBREG
On 2025-4-29 13:03, LIU Hao wrote:
This fixes a long-standing issue that GCC used to assume 16-byte stack alignment on i686-w64-mingw32,
which is not always the case for callbacks from system libraries.
CC Zeb Figura
This patch looks a bit risky. The overall effect of `__attribute__
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, April 29, 2025 2:59 PM
> To: Hongtao Liu
> Cc: GCC Patches ; Liu, Hongtao
> ; Uros Bizjak
> Subject: [PATCH v3] x86: Add a pass to remove redundant all 0s/1s vector
> load
>
> On Tue, Apr 29, 2
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, April 29, 2025 1:58 PM
> To: Hongtao Liu
> Cc: GCC Patches ; Uros Bizjak
> ; Liu, Hongtao
> Subject: Re: [PATCH] i386: Add
> ix86_expand_unsigned_small_int_cst_argument
>
> On Tue, Apr 29,
This fixes a long-standing issue that GCC used to assume 16-byte stack alignment on i686-w64-mingw32,
which is not always the case for callbacks from system libraries.
--
Best regards,
LIU Hao
From 1b92f8105dbece1694dd3ab398cfb5e3ce2c15d9 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Tue
On Sun, Apr 27, 2025 at 10:58 AM H.J. Lu wrote:
>
> When passing 0xff as an unsigned char function argument with the C frontend
> promotion, expand_normal used to get
>
> constant
> 255>
>
> and returned the rtx value using the sign-extended representation:
>
> (const_int 255 [0xff])
>
> But aft
On Mon, Apr 28, 2025 at 5:07 PM H.J. Lu wrote:
>
> On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote:
> >
>
> > > > This is what my patch does:
> > > But it iterates through vector_insns, using a def-ref chain to find
> > > those insns. I think we can just record those single_set with src as
> > > co
-Allow-a-PCH-to-be-mapped-to-a-different-addr.patch
--
Best regards,
LIU Hao
From 5239275bb4df0e79bc4b2af57d90c2d10ad44863 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Wed, 11 May 2022 22:42:53 +0800
Subject: [PATCH] Allow a PCH to be mapped to a different address
First, try mapping the PCH
Hello, I'm sending this patch again after GCC 15 has been released.
This patch was sent in February, but there were no comments:
https://patchwork.sourceware.org/project/gcc/patch/eca6660c-6578-4e39-8aa9-be9fdd013...@126.com/
--
Best regards,
LIU Hao
, it's always necessary to
realign the stack, as what Solaris does.
Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=07#c14
Signed-off-by: LIU Hao
gcc/ChangeLog:
PR target/07
* config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.
---
gcc/config
>
> I am not so sure about this when it comes to relatively common
> instructions. Hiding things in unspec prevents combine and other RTL
> passes from doing their job. I would say that it only makes sense for
> situations where the RTL equivalent is very inconvenient.
>
In the direction of using gener
On Fri, Apr 25, 2025 at 1:26 PM Jan Hubicka wrote:
>
> > On Thu, Apr 24, 2025 at 6:27 PM Jan Hubicka wrote:
> > >
> > > > Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> > > > or vpandn.
> > > > Current register_operand/vector_operand could lose some optimization
> > >
> -Original Message-
> From: Jan Hubicka
> Sent: Friday, April 25, 2025 12:27 AM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org; crazy...@gmail.com; hjl.to...@gmail.com
> Subject: Re: [PATCH] Accept allones or 0 operand for vcond_mask op1.
>
> > Since
On Thu, Apr 24, 2025 at 12:54 AM Jan Hubicka wrote:
>
> > From: "hongtao.liu"
> >
> > When FMA is available, N-R step can be rewritten with
> >
> > a / b = (a - (rcp(b) * a * b)) * rcp(b) + rcp(b) * a
> >
> > which generates 2 FMAs.[1]
> >
> > [1] https://bugs.llvm.org/show_bug.cgi?id=21385
>
On Thu, Apr 24, 2025 at 12:50 AM Jan Hubicka wrote:
>
> > In some benchmarks, I noticed that STV failed because the cost was
> > unprofitable: the igain is inside the loop, but the sse<->integer
> > conversion is outside the loop, and the current cost model doesn't
> > consider the frequency of those gains/costs.
> >
On Mon, Apr 21, 2025 at 2:52 PM liuhongt wrote:
>
> Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> or vpandn.
> Current register_operand/vector_operand could lose some optimization
> opportunity.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for tru
On Tue, Apr 22, 2025 at 10:30 AM Hongtao Liu wrote:
>
> On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
> >
> > Hi,
> > this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> > MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_E
On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
>
> Hi,
> this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and
> ABSU_EXPR
> but it was only correct for the FP variant (where it corresponds to andss clea
On Mon, Apr 21, 2025 at 4:30 PM H.J. Lu wrote:
>
> On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote:
> >
> > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
> > >
> > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> > > >
> > > >
On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
>
> On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> >
> > For all different modes of all 0s/1s vectors, we can use the single widest
> > all 0s/1s vector register for all 0s/1s vector uses in the whole function.
> > Add a pass to generate a single wi
On Tue, Apr 8, 2025 at 3:52 AM H.J. Lu wrote:
>
> Simplify memcpy and memset inline strategies to avoid branches for
> -mtune=generic:
>
> 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
>    load and store for up to 16 * 16 (256) bytes when the data size is
>    fixed and kn
sysv abi, the argument should go in esi
+/* { dg-final { scan-assembler-times "movl\[\\t \]*\\\$20,\[\\t \[]*%esi" 2 }
} */
+
+
ditto.
--
Best regards,
LIU Hao
On Mon, Apr 14, 2025 at 8:56 PM H.J. Lu wrote:
>
> On Mon, Apr 14, 2025 at 2:39 AM Uros Bizjak wrote:
> >
> > On Mon, Apr 14, 2025 at 8:54 AM Hongtao Liu wrote:
> > >
> > > On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
> > > >
> >
On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
>
> Don't use red-zone when there are no caller-saved registers and APX is
> enabled since 128-byte red-zone is too small for 31 GPRs.
>
> gcc/
>
> PR target/119784
> * config/i386/i386.cc (ix86_using_red_zone): Don't use red-zone
>
> -Original Message-
> From: Uros Bizjak
> Sent: Tuesday, April 1, 2025 5:24 PM
> To: Hongtao Liu
> Cc: Wang, Hongyu ; gcc-patches@gcc.gnu.org; Liu,
> Hongtao
> Subject: Re: [PATCH] APX: add nf counterparts for rotl split pattern [PR
> 119539]
>
> O
On Mon, Mar 31, 2025 at 9:52 PM Richard Biener wrote:
>
> On Mon, 31 Mar 2025, Jakub Jelinek wrote:
>
> > On Mon, Mar 31, 2025 at 03:33:34PM +0200, Richard Biener wrote:
> > > On Mon, 31 Mar 2025, Jakub Jelinek wrote:
> > >
> > > > On Mon, Mar 31, 2025 at 03:12:56PM +0200, Richard Biener wrote:
>
On Wed, Apr 2, 2025 at 2:58 PM Hongyu Wang wrote:
>
> > Can we just change the output in original pattern, I think combine
> > will still match the pattern even w/ clobber flags.
>
> Yes, adjusted and updated the patch in attachment.
Ok.
>
> Liu, Ho
On Tue, Apr 1, 2025 at 4:40 PM Hongyu Wang wrote:
>
> Hi,
>
> For the splitter after 3_mask, it now splits the pattern
> to *3_mask, so the splitter doesn't generate the
> nf variant. Add the corresponding nf counterpart for define_insn_and_split
> to make the splitter also work for nf insns.
>
> Bootstra
On Tue, Apr 1, 2025 at 3:56 PM Jakub Jelinek wrote:
>
> On Tue, Apr 01, 2025 at 01:36:23PM +0800, Hongtao Liu wrote:
> > >Changing ix86_valid_target_attribute_inner_p might be even better because
> > >OPT_msse4 is RejectNegative option, so !value for it looks weird.
On Fri, Mar 28, 2025 at 1:55 PM Hu, Lin1 wrote:
>
> For vaes patterns with the jm constraint and gpr16 attr, the "isa" attr is
> required to distinguish avx/avx512 alternatives in ix86_memory_address_reg_class.
> Also adds missing type and mode attributes for those vaes patterns.
Ok.
>
> gcc/ChangeLog:
>
>
On Fri, Mar 28, 2025 at 4:22 PM Haochen Jiang wrote:
>
> Hi all,
>
> For -march= handling, PTA_AVX10_1 will not imply PTA_AVX10_1_256,
> resulting in TARGET_AVX10_1 becoming true while TARGET_AVX10_1_256
> false. Since we will check TARGET_AVX10_1_256 in GCC 15 for AVX512
> feature enabling for AV
This is a minor change, bootstrapped on x86_64-w64-mingw32.
--
Best regards,
LIU Hao
From 83c3e90432f9ebc97785d81be7a94066d9923920 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 29 Mar 2025 22:47:54 +0800
Subject: [PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit
targets
On Wed, Mar 26, 2025 at 9:50 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to ensure that each alternative with constraint "jm" sets
> addr "gpr16"; otherwise it may raise an ICE in the reload pass.
>
> Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk?
Ok.
>
> BRs,
> Lin
>
> -Original Message-
> From: Hu, Lin1
> Sent: Tuesday, March 25, 2025 4:23 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting
> Intrinsics.
>
> Mor
On Thu, Mar 20, 2025 at 3:14 PM Hu, Lin1 wrote:
>
> Hi,
>
> res_ref will be modified after MASK_ZERO, init res_ref2 for rounding
> control intrinsics.
>
> Bootstrapped and regtested on x86-64-pc-linux-gnu{-m32,-m64}, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.t
> -Original Message-
> From: Liu, Hongtao
> Sent: Thursday, March 20, 2025 9:29 AM
> To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com
> Subject: RE: [PATCH 0/4] Fix AVX10.2 SAT CVT.
>
>
>
> > -Original Message-
> > From:
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, March 19, 2025 3:38 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH 00/27] Use avx10.x as the only option for AVX10 with 512 bit
> vector support while remove a
> -Original Message-
> From: Hu, Lin1
> Sent: Wednesday, March 19, 2025 3:49 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH 0/4] Fix AVX10.2 SAT CVT.
>
> Hi, all
>
> This series of patches fixes three issues in
On Tue, Mar 11, 2025 at 2:29 PM Haochen Jiang wrote:
>
> Hi all,
>
> After commit r15-4510, the following testcases also do not need XFAIL.
>
> Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512f-pr103750-1.c: Remove XFAIL.
> * gcc.target
On Wed, Mar 5, 2025 at 3:23 PM Haochen Jiang wrote:
>
> Hi all,
>
> For bf8 -> pf16 convert, when dst is 256 bit, the mask should be
> 16 bit since 16*16=256, not the 8 bit in the current intrin. In
> 512 bit intrin, the mask bit is also halved. This patch will fix
> both of them.
>
> Ok for trunk
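The mask-width arithmetic in the message above can be sketched as follows. The helper name `mask_bits` is hypothetical and is not taken from the patch or from any intrinsic header; it only restates the rule that a mask carries one bit per destination element.

```c
/* Illustrative helper (an assumption, not from the patch):
   one mask bit is needed per destination element. */
static unsigned mask_bits(unsigned vector_bits, unsigned element_bits)
{
    return vector_bits / element_bits;
}
```

For a 256-bit destination with 16-bit elements this gives 16 mask bits, matching the "16*16=256" reasoning in the message; a 512-bit destination would need 32.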
On Tue, Mar 4, 2025 at 6:31 PM Richard Biener
wrote:
>
> On Tue, Mar 4, 2025 at 11:18 AM Richard Sandiford
> wrote:
> >
> > Richard Sandiford writes:
> > > Jan Hubicka writes:
> > >>>
> > >>> Thanks for running these. I saw poor results for perlbench with my
> > >>> initial aarch64 hooks becau
On Mon, Feb 17, 2025 at 9:51 AM Hongtao Liu wrote:
>
> On Thu, Feb 13, 2025 at 4:08 PM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > According to the previous feedback on our RFC for AVX10 option adjustment
> > and discussion with LLVM, we finalized how we a
On Wed, Feb 26, 2025 at 6:01 AM H.J. Lu wrote:
>
> Move the TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P target hook from
> i386.h to i386.cc.
Ok for the patch, looks obvious.
>
> * config/i386/i386.h (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P):
> Moved to ...
> * config/i386/i386.cc (TARGET_SMALL_REGI
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, February 26, 2025 4:18 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids
> similar as Sapphire R
Patch v2:
The special treatment about + and - seems to be specific to `ASM_OUTPUT_SYMBOL_REF()`. Neither
operator is passed to `ASM_OUTPUT_LABELREF()`, so it's not necessary to check for them in
`ix86_asm_output_labelref()`.
--
Best regards,
LIU Hao
model, both patched to use Intel syntax by default. I have
also bootstrapped on x86_64-linux-gnu with default AT&T syntax, and verified that it produces
expected assembly with `-masm=intel`.
--
Best regards,
LIU Hao
From 07baacbc7de1f5dc5db9e834b030c1b642774a37 Mon Sep 17 00:00:00 2001
On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
>
> Hi,
> this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
> and -O2 -flto. For non -Os and no Windows ABI it should be practically the
> same as your variant that was simply returning mem_cost - 2.
>
I've tested O2/(Ofast march
On Thu, Feb 13, 2025 at 4:08 PM Haochen Jiang wrote:
>
> Hi all,
>
> According to the previous feedback on our RFC for AVX10 option adjustment
> and discussion with LLVM, we finalized how we are going to handle that.
>
> The overall direction is to re-alias avx10.x alias to 512 bit and only
> usin
On Fri, Feb 14, 2025 at 9:56 AM Haochen Jiang wrote:
>
> Hi all,
>
> When AVX512 is not explicitly set, we should not take EVEX512 bit into
> consideration when checking vector size. It will solve the intrin header
> file reporting warnings when compiling with -Wsystem-headers.
>
> However, there
On Tue, Feb 11, 2025 at 4:27 PM H.J. Lu wrote:
>
> On Tue, Feb 11, 2025 at 4:13 PM Hongtao Liu wrote:
> >
> > > PR117081 is about regression in povray. The reduced testcase:
> > Just for clarification. PR117081 is not about regression in povray.
> > it's re
> PR117081 is about regression in povray. The reduced testcase:
Just for clarification. PR117081 is not about regression in povray.
it's related to FAIL: gcc.target/i386/pr91384.c scan-assembler-not
testl
The pr91384.c is added by r12-7417 which is peephole optimization
expecting some specific ins
> -Original Message-
> From: Jiang, Haochen
> Sent: Monday, February 10, 2025 2:10 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Fix AVX512BW intrin header with __OPTIMIZE__ [PR
> 118813]
>
> Hi all,
>
On Mon, Feb 10, 2025 at 1:43 PM liuhongt wrote:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108707#c9
>
> >Pranav Gorantla 2025-02-06 04:30:05 UTC
> >Facing similar issue in gcc-13. Is it possible to backport the fix of this
> >Bug 108707 and Bug 109610 to gcc-13, gcc-12 as well.
>
> This se
On Fri, Feb 7, 2025 at 1:57 PM H.J. Lu wrote:
>
> For
>
> ---
> int f(int);
>
> int advance(int dz)
> {
> if (dz > 0)
> return (dz + dz) * dz;
> else
> return dz * f(dz);
> }
> ---
>
> Before r15-1619-g3b9b8d6cfdf593
>
> advance(int):
> pushrbx
> mov
> -Original Message-
> From: Jakub Jelinek
> Sent: Friday, February 7, 2025 4:08 PM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] i386: Fix ICE with conditional QI/HI vector maxmin
> [PR118776]
>
> Hi!
>
> The following testcas
On Wed, Jan 22, 2025 at 11:13 AM Haochen Jiang wrote:
>
> Hi all,
>
> These two testcases were missed in the previous addition of
> -march=x86-64-v3 to silence warnings for -march=native tests.
>
> Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/vnniint16
On Tue, Jan 21, 2025 at 4:42 PM Haochen Jiang wrote:
>
> Hi all,
>
> Recently, DMR ISAs got lots of changes in mnemonics. The detailed changes
> are:
>
> - NE would be removed for all AVX10.2 new insns
> - VCOMSBF16 -> VCOMISBF16
> - P for packed omitted for AI data types (BF16, TF32, FP8)
>
> -Original Message-
> From: Jiang, Haochen
> Sent: Friday, January 3, 2025 4:55 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Change mnemonics from TCVTROWPS2PBF16[H,L] to
> TCVTROWPS2BF16[H,L]
>
> Hi
> -Original Message-
> From: Gerald Pfeifer
> Sent: Wednesday, December 25, 2024 11:40 AM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com
> Subject: Re: [PATCH] Document refactoring of the option -fcf-protection=x.
>
> On Fri, 12 Jan 20
On Thu, Dec 19, 2024 at 12:01 AM Richard Sandiford
wrote:
>
> In a later patch, I need to add "@" to a pattern that uses subst
> iterators. This combination is problematic for two reasons:
>
> (1) define_substs are applied and filtered at a later stage than the
> handling of "@" patterns, so
On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
>
> For all different modes of all 0s/1s vectors, we can use the single widest
> all 0s/1s vector register for all 0s/1s vector uses in the whole function.
> Add a pass to generate a single widest all 0s/1s vector set instruction at
> entry of the near
On 2024-11-29 23:50, Jonathan Wakely wrote:
It looks like your patch is against gcc-14 not trunk, the
GLIBCXX_15.1.0 version is already there.
Sorry, I mean GLIBCXX_3.4.34 for 15.1.0
Oops that's what I used to test the patch. Reapplied to master now.
--
Best regards,
LIU Hao
#if.. ?
Please add full stops (periods) to the ChangeLog entry, to make
complete sentences.
Is "PR libstdc++/, target/" valid like that? I don't think it
is, it should be two separate lines:
PR libstdc++/
PR target/
Fixed now.
--
Best regard
--
Best regards,
LIU Hao
From 78ae9cacdfea8bab4fcc8a18068ad30401eb588d Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Fri, 29 Nov 2024 17:17:01 +0800
Subject: [PATCH] libstdc++: Hide TLS variables in `std::call_once`
This is a transitional change for PR80881, because on Windows, thread-local
On Thu, Nov 28, 2024 at 4:57 PM Richard Biener
wrote:
>
> On Thu, Nov 28, 2024 at 3:04 AM Hongtao Liu wrote:
> >
> > On Wed, Nov 27, 2024 at 9:43 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 27, 2024 at 4:26 AM liuhongt wrote:
> > >
On Wed, Nov 27, 2024 at 8:50 PM Richard Biener wrote:
>
> On Wed, 27 Nov 2024, Jakub Jelinek wrote:
>
> > Hi!
> >
> > The r15-4833-ge9ab41b79933 patch had among tons of config/i386
> > specific changes also important change to the generic code, allowing
> > also 2 as valid value of the second argu
On Wed, Nov 27, 2024 at 9:43 PM Richard Biener
wrote:
>
> On Wed, Nov 27, 2024 at 4:26 AM liuhongt wrote:
> >
> > When a loop requires any kind of versioning which could increase register
> > pressure too much, and it's in a deeply nested big loop, don't do
> > vectorization.
> >
> > I tested the pat
On Mon, Nov 25, 2024 at 2:32 PM Kong, Lingling wrote:
>
> Hi,
>
> LGTM.
> Now Hongyu and Hongtao are working on APX.
Ok.
>
> Thanks,
> Lingling
>
> > -Original Message-
> > From: Gregory Kanter
> > Sent: Saturday, November 23, 2024 8:16 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Kong, Lin
On Fri, Nov 22, 2024 at 4:08 PM Haochen Jiang wrote:
>
> Hi all,
>
> Under FP8, we should not use AVX512F_LEN_HALF to get the mask size since
> it will get 16 instead of 8 and fall into the wrong if condition. Correct
> the usage for the vcvtneph2[b,h]f8[,s] runtime test.
>
> Tested under sde. Ok for trun
On Wed, Nov 20, 2024 at 8:03 PM Cui, Lili wrote:
>
> Hi, all
>
> This patch aims to handle certain vector shuffle operations using pand, pandn
> and por more efficiently.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Although it's stage 3, I think this one is low risk, so O
On Sun, Nov 24, 2024 at 8:05 PM Richard Biener wrote:
>
>
>
> > Am 24.11.2024 um 09:17 schrieb Hongtao Liu :
> >
> > On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote:
> >>
> >> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
On Fri, Nov 22, 2024 at 9:16 PM Richard Biener wrote:
>
> On Fri, 22 Nov 2024, liuhongt wrote:
>
> > It could cause weird spills in RA when register pressure is high.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > BTW, It's difficult to get a decent tes
> -Original Message-
> From: Li, Pan2
> Sent: Monday, November 25, 2024 10:01 AM
> To: gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com; Liu, Hongtao ; Li, Pan2
>
> Subject: [PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector
> pattern
>
> Fro
On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote:
>
> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
> an extra 128bit SSE vector epilogue when doing 512bit AVX512
> vectorization in the main loop the following allows a 64bit SSE
> vector epilogue to be generated when the pr
On Fri, Nov 22, 2024 at 2:40 PM Haochen Jiang wrote:
>
> Hi all,
>
> When -avx10.2 meets -march with AVX512 enabled, it will report a warning
> for the vector size conflict. The warning will prevent the tests from running
> on GCC with an arch native build on those platforms when
> check_effective_target.
>
> Remo
On Thu, Nov 21, 2024 at 2:40 PM Haochen Jiang wrote:
>
> Hi all,
>
> Under -fno-omit-frame-pointer, %ebp will be used, which is the
> Solaris/x86 default. Check both %ebp and %esp to avoid an error there.
>
> Tested under -m32 w/ and w/o -fno-omit-frame-pointer. Ok for trunk?
Ok.
>
> Thx,
> Haochen