On Mon, Apr 21, 2025 at 2:52 PM liuhongt wrote:
>
> Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> or vpandn.
> Current register_operand/vector_operand could lose some optimization
> opportunity.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for tru
On Tue, Apr 22, 2025 at 10:30 AM Hongtao Liu wrote:
>
> On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
> >
> > Hi,
> > this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> > MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_E
On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote:
>
> Hi,
> this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
> MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and
> ABSU_EXPR
> but it was only correct for FP variant (where it corresponds to andss clea
On Mon, Apr 21, 2025 at 4:30 PM H.J. Lu wrote:
>
> On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote:
> >
> > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
> > >
> > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> > > >
> > > >
On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote:
>
> On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
> >
> > For all different modes of all 0s/1s vectors, we can use the single widest
> > all 0s/1s vector register for all 0s/1s vector uses in the whole function.
> > Add a pass to generate a single wi
On Tue, Apr 8, 2025 at 3:52 AM H.J. Lu wrote:
>
> Simplify memcpy and memset inline strategies to avoid branches for
> -mtune=generic:
>
> 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
>    load and store for up to 16 * 16 (256) bytes when the data size is
>    fixed and kn
sysv abi, the argument should go in esi
+/* { dg-final { scan-assembler-times "movl\[\\t \]*\\\$20,\[\\t \[]*%esi" 2 } } */
+
+
ditto.
--
Best regards,
LIU Hao
On Mon, Apr 14, 2025 at 8:56 PM H.J. Lu wrote:
>
> On Mon, Apr 14, 2025 at 2:39 AM Uros Bizjak wrote:
> >
> > On Mon, Apr 14, 2025 at 8:54 AM Hongtao Liu wrote:
> > >
> > > On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
> > > >
> >
On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote:
>
> Don't use red-zone when there are no caller-saved registers and APX is
> enabled since 128-byte red-zone is too small for 31 GPRs.
>
> gcc/
>
> PR target/119784
> * config/i386/i386.cc (ix86_using_red_zone): Don't use red-zone
>
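As a rough sanity check of the size argument above (a sketch, not GCC code): assuming each GPR is saved as a full 8-byte register, saving 31 GPRs needs 248 bytes, which does not fit in a 128-byte red zone. The helper names below are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Sizes taken from the message above; GPR_BYTES assumes 64-bit registers.  */
enum { RED_ZONE_BYTES = 128, GPR_BYTES = 8, APX_GPRS_TO_SAVE = 31 };

/* Hypothetical helper: bytes needed to save N_GPRS general-purpose
   registers.  */
static inline int
gpr_save_bytes (int n_gprs)
{
  return n_gprs * GPR_BYTES;
}

/* Hypothetical helper: true if the save area fits in the red zone.  */
static inline bool
fits_in_red_zone (int n_gprs)
{
  return gpr_save_bytes (n_gprs) <= RED_ZONE_BYTES;
}
```

Under these assumptions, fits_in_red_zone (APX_GPRS_TO_SAVE) is false, while 16 registers (128 bytes) would still just fit.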
> -Original Message-
> From: Uros Bizjak
> Sent: Tuesday, April 1, 2025 5:24 PM
> To: Hongtao Liu
> Cc: Wang, Hongyu ; gcc-patches@gcc.gnu.org; Liu,
> Hongtao
> Subject: Re: [PATCH] APX: add nf counterparts for rotl split pattern [PR
> 119539]
>
> O
On Mon, Mar 31, 2025 at 9:52 PM Richard Biener wrote:
>
> On Mon, 31 Mar 2025, Jakub Jelinek wrote:
>
> > On Mon, Mar 31, 2025 at 03:33:34PM +0200, Richard Biener wrote:
> > > On Mon, 31 Mar 2025, Jakub Jelinek wrote:
> > >
> > > > On Mon, Mar 31, 2025 at 03:12:56PM +0200, Richard Biener wrote:
>
On Wed, Apr 2, 2025 at 2:58 PM Hongyu Wang wrote:
>
> > Can we just change the output in original pattern, I think combine
> > will still match the pattern even w/ clobber flags.
>
> Yes, adjusted and updated the patch in attachment.
Ok.
>
> Liu, Ho
On Tue, Apr 1, 2025 at 4:40 PM Hongyu Wang wrote:
>
> Hi,
>
> For the splitter after 3_mask, it now splits the pattern
> to *3_mask, so the splitter doesn't generate the
> nf variant. Add the corresponding nf counterpart for define_insn_and_split
> so that the splitter also works for nf insns.
>
> Bootstra
On Tue, Apr 1, 2025 at 3:56 PM Jakub Jelinek wrote:
>
> On Tue, Apr 01, 2025 at 01:36:23PM +0800, Hongtao Liu wrote:
> > >Changing ix86_valid_target_attribute_inner_p might be even better because
> > >OPT_msse4 is RejectNegative option, so !value for it looks weird.
On Fri, Mar 28, 2025 at 1:55 PM Hu, Lin1 wrote:
>
> For vaes patterns with jm constraint and gpr16 attr, it requires "isa"
> attr to distinguish avx/avx512 alternatives in ix86_memory_address_reg_class.
> Also adds missing type and mode attributes for those vaes patterns.
Ok.
>
> gcc/ChangeLog:
>
>
On Fri, Mar 28, 2025 at 4:22 PM Haochen Jiang wrote:
>
> Hi all,
>
> For -march= handling, PTA_AVX10_1 will not imply PTA_AVX10_1_256,
> resulting in TARGET_AVX10_1 becoming true while TARGET_AVX10_1_256
> false. Since we will check TARGET_AVX10_1_256 in GCC 15 for AVX512
> feature enabling for AV
This is a minor change, bootstrapped on x86_64-w64-mingw32.
--
Best regards,
LIU Hao
From 83c3e90432f9ebc97785d81be7a94066d9923920 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 29 Mar 2025 22:47:54 +0800
Subject: [PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit
targets
On Wed, Mar 26, 2025 at 9:50 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to ensure each alternative with constraint "jm" should
> set addr "gpr16", otherwise it may raise an ICE in the reload pass.
>
> Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk?
Ok.
>
> BRs,
> Lin
>
> -Original Message-
> From: Hu, Lin1
> Sent: Tuesday, March 25, 2025 4:23 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting
> Intrinsics.
>
> Mor
On Thu, Mar 20, 2025 at 3:14 PM Hu, Lin1 wrote:
>
> Hi,
>
> res_ref will be modified after MASK_ZERO, init res_ref2 for rounding
> control intrinsics.
>
> Bootstrapped and regtested on x86-64-pc-linux-gnu{-m32,-m64}, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.t
> -Original Message-
> From: Liu, Hongtao
> Sent: Thursday, March 20, 2025 9:29 AM
> To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com
> Subject: RE: [PATCH 0/4] Fix AVX10.2 SAT CVT.
>
>
>
> > -Original Message-
> > From:
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, March 19, 2025 3:38 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH 00/27] Use avx10.x as the only option for AVX10 with 512 bit
> vector support while remove a
> -Original Message-
> From: Hu, Lin1
> Sent: Wednesday, March 19, 2025 3:49 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH 0/4] Fix AVX10.2 SAT CVT.
>
> Hi, all
>
> This series of patches fixes three issues in
On Tue, Mar 11, 2025 at 2:29 PM Haochen Jiang wrote:
>
> Hi all,
>
> After commit r15-4510, the following testcases also do not need XFAIL.
>
> Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512f-pr103750-1.c: Remove XFAIL.
> * gcc.target
On Wed, Mar 5, 2025 at 3:23 PM Haochen Jiang wrote:
>
> Hi all,
>
> For bf8 -> pf16 convert, when dst is 256 bit, the mask should be
> 16 bit since 16*16=256, not the 8 bit in the current intrin. In
> 512 bit intrin, the mask bit is also halved. This patch will fix
> both of them.
>
> Ok for trunk
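The mask-width arithmetic in the message above can be sketched as follows, assuming one mask bit per 16-bit destination element. The helper name is hypothetical and is not part of the intrinsic headers.

```c
/* Hypothetical helper: number of mask bits needed for a vector of
   VECTOR_BITS bits holding ELEMENT_BITS-bit elements, assuming one
   mask bit per element.  */
static inline int
mask_bits (int vector_bits, int element_bits)
{
  return vector_bits / element_bits;
}
```

With this assumption, mask_bits (256, 16) gives the 16-bit mask mentioned for the 256-bit destination, and mask_bits (512, 16) gives 32 for the 512-bit case.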
On Tue, Mar 4, 2025 at 6:31 PM Richard Biener
wrote:
>
> On Tue, Mar 4, 2025 at 11:18 AM Richard Sandiford
> wrote:
> >
> > Richard Sandiford writes:
> > > Jan Hubicka writes:
> > >>>
> > >>> Thanks for running these. I saw poor results for perlbench with my
> > >>> initial aarch64 hooks becau
On Mon, Feb 17, 2025 at 9:51 AM Hongtao Liu wrote:
>
> On Thu, Feb 13, 2025 at 4:08 PM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > According to the previous feedback on our RFC for AVX10 option adjustment
> > and discussion with LLVM, we finalized how we a
On Wed, Feb 26, 2025 at 6:01 AM H.J. Lu wrote:
>
> Move the TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P target hook from
> i386.h to i386.cc.
Ok for the patch, looks obvious.
>
> * config/i386/i386.h (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P):
> Moved to ...
> * config/i386/i386.cc (TARGET_SMALL_REGI
> -Original Message-
> From: Jiang, Haochen
> Sent: Wednesday, February 26, 2025 4:18 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids
> similar as Sapphire R
Patch v2:
The special treatment about + and - seems to be specific to `ASM_OUTPUT_SYMBOL_REF()`. Neither
operator is passed to `ASM_OUTPUT_LABELREF()`, so it's not necessary to check for them in
`ix86_asm_output_labelref()`.
--
Best regards,
LIU Hao
model, both patched to use Intel syntax by default. I have
also bootstrapped on x86_64-linux-gnu with default AT&T syntax, and verified that it produces
expected assembly with `-masm=intel`.
--
Best regards,
LIU Hao
From 07baacbc7de1f5dc5db9e834b030c1b642774a37 Mon Sep 17 00:00:00 2001
On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
>
> Hi,
> this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
> and -O2 -flto. For non -Os and no Windows ABI it should be practically the
> same as your variant that was simply returning mem_cost - 2.
>
I've tested O2/(Ofast march
On Thu, Feb 13, 2025 at 4:08 PM Haochen Jiang wrote:
>
> Hi all,
>
> According to the previous feedback on our RFC for AVX10 option adjustment
> and discussion with LLVM, we finalized how we are going to handle that.
>
> The overall direction is to re-alias avx10.x alias to 512 bit and only
> usin
On Fri, Feb 14, 2025 at 9:56 AM Haochen Jiang wrote:
>
> Hi all,
>
> When AVX512 is not explicitly set, we should not take EVEX512 bit into
> consideration when checking vector size. It will solve the intrin header
> file reporting warnings when compiling with -Wsystem-headers.
>
> However, there
On Tue, Feb 11, 2025 at 4:27 PM H.J. Lu wrote:
>
> On Tue, Feb 11, 2025 at 4:13 PM Hongtao Liu wrote:
> >
> > > PR117081 is about regression in povray. The reduced testcase:
> > Just for clarification. PR117081 is not about regression in povray.
> > it's re
> PR117081 is about regression in povray. The reduced testcase:
Just for clarification. PR117081 is not about regression in povray.
it's related to FAIL: gcc.target/i386/pr91384.c scan-assembler-not
testl
The pr91384.c is added by r12-7417 which is peephole optimization
expecting some specific ins
> -Original Message-
> From: Jiang, Haochen
> Sent: Monday, February 10, 2025 2:10 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Fix AVX512BW intrin header with __OPTIMIZE__ [PR
> 118813]
>
> Hi all,
>
On Mon, Feb 10, 2025 at 1:43 PM liuhongt wrote:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108707#c9
>
> >Pranav Gorantla 2025-02-06 04:30:05 UTC
> >Facing similar issue in gcc-13. Is it possible to backport the fix of this
> >Bug 108707 and Bug 109610 to gcc-13, gcc-12 as well.
>
> This se
On Fri, Feb 7, 2025 at 1:57 PM H.J. Lu wrote:
>
> For
>
> ---
> int f(int);
>
> int advance(int dz)
> {
>   if (dz > 0)
>     return (dz + dz) * dz;
>   else
>     return dz * f(dz);
> }
> ---
>
> Before r15-1619-g3b9b8d6cfdf593
>
> advance(int):
>         push    rbx
> mov
> -Original Message-
> From: Jakub Jelinek
> Sent: Friday, February 7, 2025 4:08 PM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] i386: Fix ICE with conditional QI/HI vector maxmin
> [PR118776]
>
> Hi!
>
> The following testcas
On Wed, Jan 22, 2025 at 11:13 AM Haochen Jiang wrote:
>
> Hi all,
>
> These two testcases are misses on previous addition for
> -march=x86-64-v3 to silence warning for -march=native tests.
>
> Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/vnniint16
On Tue, Jan 21, 2025 at 4:42 PM Haochen Jiang wrote:
>
> Hi all,
>
> Recently, DMR ISAs got lots of changes in mnemonics. The detailed changes
> are:
>
> - NE would be removed for all AVX10.2 new insns
> - VCOMSBF16 -> VCOMISBF16
> - P for packed omitted for AI data types (BF16, TF32, FP8)
>
> -Original Message-
> From: Jiang, Haochen
> Sent: Friday, January 3, 2025 4:55 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Change mnemonics from TCVTROWPS2PBF16[H,L] to
> TCVTROWPS2BF16[H,L]
>
> Hi
> -Original Message-
> From: Gerald Pfeifer
> Sent: Wednesday, December 25, 2024 11:40 AM
> To: Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com
> Subject: Re: [PATCH] Document refactoring of the option -fcf-protection=x.
>
> On Fri, 12 Jan 20
On Thu, Dec 19, 2024 at 12:01 AM Richard Sandiford
wrote:
>
> In a later patch, I need to add "@" to a pattern that uses subst
> iterators. This combination is problematic for two reasons:
>
> (1) define_substs are applied and filtered at a later stage than the
> handling of "@" patterns, so
On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote:
>
> For all different modes of all 0s/1s vectors, we can use the single widest
> all 0s/1s vector register for all 0s/1s vector uses in the whole function.
> Add a pass to generate a single widest all 0s/1s vector set instruction at
> entry of the near
On 2024-11-29 23:50, Jonathan Wakely wrote:
It looks like your patch is against gcc-14 not trunk, the
GLIBCXX_15.1.0 version is already there.
Sorry, I mean GLIBCXX_3.4.34 for 15.1.0
Oops that's what I used to test the patch. Reapplied to master now.
--
Best regards,
LIU Hao
#if.. ?
Please add full stops (periods) to the ChangeLog entry, to make
complete sentences.
Is "PR libstdc++/, target/" valid like that? I don't think it
is, it should be two separate lines:
PR libstdc++/
PR target/
Fixed now.
--
Best regard
--
Best regards,
LIU Hao
From 78ae9cacdfea8bab4fcc8a18068ad30401eb588d Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Fri, 29 Nov 2024 17:17:01 +0800
Subject: [PATCH] libstdc++: Hide TLS variables in `std::call_once`
This is a transitional change for PR80881, because on Windows, thread-local
On Thu, Nov 28, 2024 at 4:57 PM Richard Biener
wrote:
>
> On Thu, Nov 28, 2024 at 3:04 AM Hongtao Liu wrote:
> >
> > On Wed, Nov 27, 2024 at 9:43 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 27, 2024 at 4:26 AM liuhongt wrote:
> > >
On Wed, Nov 27, 2024 at 8:50 PM Richard Biener wrote:
>
> On Wed, 27 Nov 2024, Jakub Jelinek wrote:
>
> > Hi!
> >
> > The r15-4833-ge9ab41b79933 patch had among tons of config/i386
> > specific changes also important change to the generic code, allowing
> > also 2 as valid value of the second argu
On Wed, Nov 27, 2024 at 9:43 PM Richard Biener
wrote:
>
> On Wed, Nov 27, 2024 at 4:26 AM liuhongt wrote:
> >
> > When loop requires any kind of versioning which could increase register
> > pressure too much, and it's in a deeply nested big loop, don't do
> > vectorization.
> >
> > I tested the pat
On Mon, Nov 25, 2024 at 2:32 PM Kong, Lingling wrote:
>
> Hi,
>
> LGTM.
> Now Hongyu and Hongtao are working on APX.
Ok.
>
> Thanks,
> Lingling
>
> > -Original Message-
> > From: Gregory Kanter
> > Sent: Saturday, November 23, 2024 8:16 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Kong, Lin
On Fri, Nov 22, 2024 at 4:08 PM Haochen Jiang wrote:
>
> Hi all,
>
> Under FP8, we should not use AVX512F_LEN_HALF to get the mask size since
> it will get 16 instead of 8 and drop into wrong if condition. Correct
> the usage for vcvtneph2[b,h]f8[,s] runtime test.
>
> Tested under sde. Ok for trun
On Wed, Nov 20, 2024 at 8:03 PM Cui, Lili wrote:
>
> Hi, all
>
> This patch aims to handle certain vector shuffle operations using pand, pandn
> and por more efficiently.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Although it's stage 3, I think this one is low risk, so O
On Sun, Nov 24, 2024 at 8:05 PM Richard Biener wrote:
>
>
>
> > On 24.11.2024 at 09:17, Hongtao Liu wrote:
> >
> > On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote:
> >>
> >> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
On Fri, Nov 22, 2024 at 9:16 PM Richard Biener wrote:
>
> On Fri, 22 Nov 2024, liuhongt wrote:
>
> > It could cause weird spill in RA when register pressure is high.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > BTW, It's difficult to get a decent tes
> -Original Message-
> From: Li, Pan2
> Sent: Monday, November 25, 2024 10:01 AM
> To: gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com; Liu, Hongtao ; Li, Pan2
>
> Subject: [PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector
> pattern
>
> Fro
On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote:
>
> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
> an extra 128bit SSE vector epilogue when doing 512bit AVX512
> vectorization in the main loop the following allows a 64bit SSE
> vector epilogue to be generated when the pr
On Fri, Nov 22, 2024 at 2:40 PM Haochen Jiang wrote:
>
> Hi all,
>
> When -avx10.2 meets -march with AVX512 enabled, it will report a warning
> for vector size conflict. The warning will prevent the test from running on
> GCC with arch native build on those platforms when
> check_effective_target.
>
> Remo
On Thu, Nov 21, 2024 at 2:40 PM Haochen Jiang wrote:
>
> Hi all,
>
> Under -fno-omit-frame-pointer, %ebp will be used, which is the
> Solaris/x86 default. Both check %ebp and %esp to avoid error on that.
>
> Tested under -m32 w/ and w/o -fno-omit-frame-pointer. Ok for trunk?
Ok.
>
> Thx,
> Haochen
> -Original Message-
> From: Mayshao-oc
> Sent: Wednesday, November 20, 2024 2:43 PM
> To: Hongtao Liu
> Cc: Liu, Hongtao ; Xi Ruoyao ;
> gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com;
> richard.guent...@gmail.com; Tim Hu(WH-RD) ; Silvia
> Zhao
On Wed, Nov 13, 2024 at 10:00 AM Hongyu Wang wrote:
>
> Hi,
>
> For cstorebf4 it uses comparison_operator for BFmode compare, which is
> incorrect when directly uses ix86_expand_setcc as it does not canonicalize
> the input comparison to correct the compare code by swapping operands.
> Since the o
On Wed, Nov 13, 2024 at 8:29 AM H.J. Lu wrote:
>
> On Wed, Nov 13, 2024 at 5:57 AM H.J. Lu wrote:
> >
> > On Tue, Nov 12, 2024 at 9:30 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 12, 2024 at 1:49 PM H.J. Lu wrote:
> > > >
> > > > When passing 0xff as an unsigned char function argument,
On Mon, Nov 11, 2024 at 8:20 PM Richard Biener wrote:
>
> The following adds X86_TUNE_AVX512_TWO_EPILOGUES tuning and directs the
> vectorizer to produce both a vector AVX2 and SSE epilogue for AVX512
> vectorized loops when set. The tuning is enabled by default for Zen4
> and Zen5 where I benchm
On Fri, Nov 8, 2024 at 10:33 AM liuhongt wrote:
>
> The hw instruction doesn't raise exceptions, turns sNAN into qNAN quietly,
> and always rounds to nearest (even). Output denormals are always
> flushed to zero and input denormals are always treated as zero. MXCSR
> is not consulted nor updated.
> W/o
On Fri, Nov 8, 2024 at 3:18 PM Uros Bizjak wrote:
>
> On Fri, Nov 8, 2024 at 6:52 AM Hongtao Liu wrote:
>
> > > > > PR target/117418
> > > > > * config/i386/i386-options.cc
> > > > > (ix86_option_override_internal): raise
On Fri, Nov 8, 2024 at 1:21 PM Hongtao Liu wrote:
>
> On Fri, Nov 8, 2024 at 12:18 PM H.J. Lu wrote:
> >
> > On Fri, Nov 8, 2024 at 10:41 AM Hu, Lin1 wrote:
> > >
> > > Hi, all
> > >
> > > -maddress-mode=long will let Pmode = DI_mode, but -
On Fri, Nov 8, 2024 at 12:18 PM H.J. Lu wrote:
>
> On Fri, Nov 8, 2024 at 10:41 AM Hu, Lin1 wrote:
> >
> > Hi, all
> >
> > -maddress-mode=long will let Pmode = DI_mode, but -mx32 request x32 ABI.
> > So raise an error to avoid ICE.
> >
> > Bootstrapped and regtested, OK for trunk?
> >
> > BRs,
>
On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote:
>
> > > -Original Message-
> > > From: Xi Ruoyao
> > > Sent: Thursday, November 7, 2024 1:12 PM
> > > To: Liu, Hongtao ; Mayshao-oc > > o...@zhaoxin.com>; Hongtao Liu
> > > Cc: g
On Fri, Nov 8, 2024 at 1:58 AM Robin Dapp wrote:
>
> From: Robin Dapp
>
> gcc/ChangeLog:
>
> * config/i386/sse.md (maskload):
> Call maskload..._1.
> (maskload_1): Rename.
Ok for x86 part.
> ---
> gcc/config/i386/sse.md | 21 ++---
> 1 file changed, 18 ins
On Thu, Nov 7, 2024 at 3:52 PM Jakub Jelinek wrote:
>
> On Thu, Nov 07, 2024 at 01:57:21PM +0800, Hongtao Liu wrote:
> > > Does it turn the sNaNs into infinities or qNaNs silently?
> > Yes.
>
> Into infinities?
Into qNaNs(Sorry, I didn't see it clea
> -Original Message-
> From: Hu, Lin1
> Sent: Thursday, November 7, 2024 2:35 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Modify regexp of pr117304-1.c
>
> OK, so just modify the regexp.
>
> Since the
On Thu, Nov 7, 2024 at 2:04 PM Hu, Lin1 wrote:
>
> > -Original Message-
> > From: Liu, Hongtao
> > Sent: Thursday, November 7, 2024 11:41 AM
> > To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
> > Cc: ubiz...@gmail.com
> > Subject: RE: [PATCH
> -Original Message-
> From: Xi Ruoyao
> Sent: Thursday, November 7, 2024 1:12 PM
> To: Liu, Hongtao ; Mayshao-oc o...@zhaoxin.com>; Hongtao Liu
> Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com;
> richard.guent...@gmail.com; Tim Hu(WH-RD) ; Silvia
On Tue, Nov 5, 2024 at 5:19 PM Jakub Jelinek wrote:
>
> On Tue, Nov 05, 2024 at 05:12:56PM +0800, Hongtao Liu wrote:
> > Yes, there's a mismatch between scalar and vector code, I assume users
> > may not care much about precision/NAN/INF/denormal behaviors for
> >
> -Original Message-
> From: Mayshao-oc
> Sent: Thursday, November 7, 2024 11:13 AM
> To: Hongtao Liu
> Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; Liu, Hongtao
> ; ubiz...@gmail.com; richard.guent...@gmail.com;
> Tim Hu(WH-RD) ; Silvia Zhao(BJ-RD)
> ; Louis
> -Original Message-
> From: Hu, Lin1
> Sent: Thursday, November 7, 2024 11:03 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Add -mavx512vl for pr117304-1.c
>
> Hi, all
>
> Testing pr117304-1.c in a ma
On Thu, Nov 7, 2024 at 10:29 AM MayShao-oc wrote:
>
> Hi all:
>    For Zhaoxin, I find no improvement when enabling pass_align_tight_loops,
> and a performance drop in some cases.
>    This patch adds a new tunable to bypass pass_align_tight_loops on Zhaoxin.
>
>    Bootstrapped X86_64.
>    Ok fo
On Wed, Nov 6, 2024 at 4:59 PM Jakub Jelinek wrote:
>
> On Fri, Oct 18, 2024 at 02:05:59PM -0400, Antoni Boucher wrote:
> > PR target/116725
> > * gcc.target/i386/pr116725.c: Add test using those AVX builtins.
>
> This test FAILs for me, as I don't have the latest gas aroun
> -Original Message-
> From: H.J. Lu
> Sent: Wednesday, November 6, 2024 4:17 PM
> To: Liu, Hongtao ; GCC Patches patc...@gcc.gnu.org>; Uros Bizjak
> Subject: [PATCH] avx10_2-comibf-2.c: Require AVX10.2 support
>
> Since avx10_2-comibf-2.c is a run test,
On Wed, Nov 6, 2024 at 10:35 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to add OPTION_MASK_ISA2_EVEX512 for all avx512 512-bits
> builtin functions, raise error when these builtin functions are used with
> -mno-evex512.
>
> Bootstrapped and Regtested on x86-64-pc-linux-gnu, OK for trunk an
On Tue, Nov 5, 2024 at 5:50 PM Mayshao-oc wrote:
>
>
> >
> >
> > On Tue, Nov 5, 2024 at 2:34 PM Liu, Hongtao wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: MayShao-oc
> > > > Sent:
On Wed, Nov 6, 2024 at 8:19 AM H.J. Lu wrote:
>
> Since x32 uses (%edi), instead of (%rdi), also scan (%edi).
>
> * gcc.target/i386/apx-ndd.c: Also scan (%edi).
Ok.
>
> --
> H.J.
--
BR,
Hongtao
On Wed, Nov 6, 2024 at 8:21 AM H.J. Lu wrote:
>
> Since x32 uses (%reg32), instead of (%r.x), also scan (%e.x).
>
> * gcc.target/i386/avx10_2-512-movrs-1.c: Also scan (%e.x).
> * gcc.target/i386/avx10_2-movrs-1.c: Likewise.
> * gcc.target/i386/movrs-1.c: Likewise.
Ok.
>
> --
> H.J.
--
BR,
Hong
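To illustrate the idea of scanning for both (%r.x) and (%e.x), here is a minimal sketch using a POSIX extended regular expression. The pattern and the helper name are assumptions made for illustration; they are not the dg-final directives used in the testsuite.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if TEXT contains a memory operand whose base
   register looks like either (%r.x) or (%e.x).  The pattern is an
   illustrative POSIX ERE, not the one used in the testsuite.  */
static inline bool
matches_r_or_e_base (const char *text)
{
  regex_t re;
  if (regcomp (&re, "\\(%[re].x\\)", REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

The single character class [re] is what lets one pattern cover both the LP64 form and the x32 form of the address.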
On Tue, Nov 5, 2024 at 5:33 PM Richard Biener
wrote:
>
> On Tue, Nov 5, 2024 at 8:12 AM Hongtao Liu wrote:
> >
> > On Tue, Nov 5, 2024 at 2:34 PM Liu, Hongtao wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > >
On Tue, Nov 5, 2024 at 4:46 PM Jakub Jelinek wrote:
>
> On Tue, Oct 29, 2024 at 07:19:38PM -0700, liuhongt wrote:
> > Generate native instruction whenever possible, otherwise use vector
> > permutation with odd indices.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready pu
On Tue, Nov 5, 2024 at 2:34 PM Liu, Hongtao wrote:
>
>
>
> > -Original Message-
> > From: MayShao-oc
> > Sent: Tuesday, November 5, 2024 11:20 AM
> > To: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; Liu, Hongtao
> > ; ubiz...@gmail.com
> > Cc: ti
> -Original Message-
> From: MayShao-oc
> Sent: Tuesday, November 5, 2024 11:20 AM
> To: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; Liu, Hongtao
> ; ubiz...@gmail.com
> Cc: ti...@zhaoxin.com; silviaz...@zhaoxin.com; loui...@zhaoxin.com;
> cobec...@zhaoxin.com
>
On Tue, Nov 5, 2024 at 2:41 PM Hu, Lin1 wrote:
>
> > -Original Message-
> > From: Hu, Lin1
> > Sent: Tuesday, November 5, 2024 1:34 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Liu, Hongtao ; ubiz...@gmail.com
> > Subject: [PATC
On Tue, Nov 5, 2024 at 10:52 AM Hu, Lin1 wrote:
>
> Hi, all
>
> __builtin_ia32_prefetch's op1 should be between 0 and 2. So add an error
> handler.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu. There is an unrelated FAIL
> whose root cause has yet to be found; just sending the patch for review.
>
On Fri, Nov 1, 2024 at 11:24 AM Haochen Jiang wrote:
>
> Hi all,
>
> I have just landed new ISA patches on trunk. The next step will
> be the arch support for ISE055 mentioned CPUs.
>
> There are two changes in ISE055 on CPUs:
>
> - A new model number is added for Arrow Lake.
> - Diamond Rapid
On Fri, Nov 1, 2024 at 8:33 AM Hongyu Wang wrote:
>
> From: Levy Hsu
>
> This patch enables the use of the VCOMSBF16 instruction from AVX10.2 for
> efficient BF16 comparisons.
>
> Bootstrapped & regtested on x86-64-pc-linux-gnu.
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386/i38
On Sat, Nov 2, 2024 at 8:58 PM Robin Dapp wrote:
>
> From: Robin Dapp
>
> This patch adds a zero else operand to masked loads, in particular the
> masked gather load builtins that are used for gather vectorization.
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (ix86_expand_special_a
On Thu, Jul 4, 2024 at 11:00 AM Hongtao Liu wrote:
>
> On Tue, Jul 2, 2024 at 11:24 AM Hongyu Wang wrote:
> >
> > Hi,
> >
> > According to APX spec, the pushp/popp pairs should be matched,
> > otherwise the PPX hint cannot take effect and ca
On Fri, Oct 18, 2024 at 10:23 PM Robin Dapp wrote:
>
> This patch adds a zero else operand to masked loads, in particular the
> masked gather load builtins that are used for gather vectorization.
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (ix86_expand_special_args_builtin):
>
On Tue, Oct 29, 2024 at 5:04 PM Haochen Jiang wrote:
>
> Hi all,
>
> Since Binutils haven't fully merged all AVX10.2 insts, only testing
> one inst/intrin in AVX10.2 is never sufficient for check_effective_target.
> Like APX_F, use inline asm to do the target check.
>
> Tested w/ and w/o Binutils
On Tue, Oct 22, 2024 at 2:31 PM Haochen Jiang wrote:
>
> Hi all,
>
> ISE054 has just been released and you can find doc from here:
>
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Diamond Rapids features are added in this ISE, including AMX
> related instructions, SM4 EVEX extension and MO
On Fri, Oct 25, 2024 at 12:19 AM Antoni Boucher wrote:
>
> Thanks.
> Did you review the new patch?
> Can I push it to master?
Ok.
>
> On 2024-10-20 at 22:01, Hongtao Liu wrote:
> > On Sat, Oct 19, 2024 at 2:06 AM Antoni Boucher wrote:
> >>
> >> Than
On Sat, Oct 19, 2024 at 2:06 AM Antoni Boucher wrote:
>
> Thanks for the review.
> Here's the updated patch.
>
> On 2024-10-17 at 21:50, Hongtao Liu wrote:
> > On Fri, Oct 18, 2024 at 9:08 AM Antoni Boucher wrote:
> >>
> >> Hi.
> >> This i