On Tue, Dec 12, 2023 at 1:47 PM Jiang, Haochen via Gcc-regression
wrote:
>
> > -Original Message-
> > From: Jiang, Haochen
> > Sent: Tuesday, December 12, 2023 9:11 AM
> > To: Andrew Pinski (QUIC) ; haochen.jiang
> > ; gcc-regress...@gcc.gnu.org; gcc-
> > patc...@gcc.gnu.org
> > Subject: R
On Tue, Dec 12, 2023 at 10:38 PM Jan Hubicka wrote:
>
> Hi,
> this patch disables use of FMA in matrix multiplication loop for generic (for
> x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
>
> For Intel this is neutral both on the matrix multiplication microbenchmark
> (att
On Wed, Dec 13, 2023 at 4:44 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following patch fixes ICE on the testcase in similar way to how
> other folded builtins are handled in ix86_gimple_fold_builtin when
> they don't have a lhs; these builtins are const or pure, so normally
> DCE would remove them l
On Tue, Nov 7, 2023 at 10:27 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to fix the wrong isa attribute which caused a regression
> on PR111907.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> PR target/111907
> * config/i386/
On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
wrote:
>
> On Tue, Nov 7, 2023 at 7:08 AM liuhongt wrote:
> >
> > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > loop invariant which is not the case when it's INTEGER_CST.
> >
> > Bootstrapped and regtested on x86_64-pc-li
On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
wrote:
>
> On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu wrote:
> >
> > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt wrote:
> > >
On Tue, Nov 7, 2023 at 3:33 PM Hongyu Wang wrote:
>
> Hi,
>
> When APX EGPR is enabled, the TImode move pattern *movti_internal allows
> moves between gpr and sse reg using constraint pair ("r","Yd"). Then a
> post-reload splitter transforms such a move to vec_extractv2di, while under
> -msse4.1 -mno-avx
On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
wrote:
>
> On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote:
> >
> > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu wrote:
> > >
On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
>
> This patch aims to avoid generating vblendps with ymm16+, and has been
> bootstrapped and tested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/112435
> * config/i386/sse.md: Adding constraints to restr
On Fri, Nov 10, 2023 at 10:11 AM Andrew Pinski wrote:
>
> On Thu, Nov 9, 2023 at 5:52 PM liuhongt wrote:
> >
> > While working on PR112443, I noticed some misoptimizations: after we
> > fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend fails to combine it
> > back to v
On Sat, Nov 11, 2023 at 4:11 AM Jakub Jelinek wrote:
>
> On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote:
> > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
> > >
> > > > This patch aims to avoid generating vblendps with ymm16+, and has been
> > > boo
On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
wrote:
>
> On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512
> > support, it is a lot easier to add them compared to the August version.
> > Deta
On Fri, Nov 10, 2023 at 2:14 PM liuhongt wrote:
>
> While working on PR112443, I noticed some misoptimizations:
> after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend
> fails to combine it back to v{,p}blendv{v,ps,pd} since the pattern is
> too complicated, so I think maybe
On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
wrote:
>
> On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu wrote:
> >
> > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote:
> > >
On Mon, Nov 13, 2023 at 4:45 PM Jakub Jelinek wrote:
>
> On Mon, Nov 13, 2023 at 02:27:35PM +0800, Hongtao Liu wrote:
> > > 1) if it isn't better to use separate alternative instead of
> > >x86_evex_reg_mentioned_p, like in the patch below
> > vblendps doe
On Mon, Nov 13, 2023 at 7:25 PM Richard Biener
wrote:
>
> On Mon, Nov 13, 2023 at 7:58 AM Hongtao Liu wrote:
> >
> > On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang
> > >
On Tue, Nov 14, 2023 at 5:01 PM Lehua Ding wrote:
>
> Hi,
>
> This little patch adjusts the assertion in the apx-spill_to_egprs-1.c testcase.
> The -mapxf compilation option allows more registers to be used, which in
> turn eliminates the need for local variables to be stored in stack memory.
> Therefore,
On Wed, Nov 15, 2023 at 5:43 PM Hongyu Wang wrote:
>
> Hi,
>
> vextract/insert{if}128 cannot adopt EGPR in their memory operand, so all
> related patterns should be adjusted to disable EGPR usage on them.
> Also fix a wrong gpr16 attr for insertps.
>
> Bootstrapped/regtested on x86-64-pc-linu
On Fri, Nov 10, 2023 at 9:42 AM Haochen Jiang wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Add avx10_set and version and detect avx10.1.
> (cpu_indicator_init): Handle avx10.1-512.
> * common/config/i386/i386-common.cc
>
On Fri, Nov 17, 2023 at 3:26 PM Hongyu Wang wrote:
>
> Intel APX PPX feature has been released in [1].
>
> PPX stands for Push-Pop Acceleration. PUSH/PUSH2 and its corresponding POP
> can be marked with a 1-bit hint to indicate that the POP reads the
> value written by the PUSH from the stack. The
.
>
> Yes, such change also worked and no cfa adjustment required then,
> thanks for the suggestion.
> Updated patch with just 1 new UNSPEC and removed cfa handling.
LGTM.
>
> On Mon, Nov 20, 2023 at 14:46, Hongtao Liu wrote:
> >
> > On Fri, Nov 17, 2023 at 3:26 PM Hongyu Wang wrote:
On Wed, Nov 22, 2023 at 11:31 AM Hongyu Wang wrote:
>
> Hi,
>
> The push2/pop2 operand order does not match the binutils implementation
> for AT&T syntax that it will first push operands[2] then operands[1].
> Correct it by reversing the operand order for AT&T syntax.
>
> Bootstrapped/regtested on x86-6
On Thu, Nov 23, 2023 at 2:10 PM Haochen Jiang wrote:
>
> Hi all,
>
> This patch should be able to fix the current issue mentioned in PR112643.
>
> Also, I fixed some legacy issues in code related to AVX512/AVX10.
>
> Ok for trunk?
Ok
>
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> PR target/1126
On Tue, Nov 28, 2023 at 9:51 PM Hongyu Wang wrote:
>
> Hi,
>
> On Linux x86-64, -fomit-frame-pointer is enabled by default, so the
> push2pop2 tests' cfi scans are based on it. On other targets with
> -fno-omit-frame-pointer the cfi scan will be wrong, as the frame pointer
> is pushed first. Add -
On Wed, Nov 29, 2023 at 9:23 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to fix the wrong CPUID of USER_MSR. Its correct CPUID is
> (0x7, 0x1).EDX[15], but I set it as (0x7, 0x0).EDX[15]. The patch also modifies
> the testcase to give the user a better example.
>
> It has been bootstrapped and
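Editorial sketch (not part of the thread): the fix above amounts to reading bit 15 of EDX from CPUID leaf 0x7, subleaf 0x1 rather than subleaf 0x0. The Python model below only illustrates that lookup. The helper names and the toy `fake_cpuid` table are hypothetical and are not GCC code; only the leaf, subleaf and bit numbers come from the message.

```python
# Leaf/subleaf/bit values quoted in the message; everything else is illustrative.
USER_MSR_LEAF = 0x7
USER_MSR_SUBLEAF = 0x1   # the corrected subleaf (the buggy code used 0x0)
USER_MSR_EDX_BIT = 15


def has_bit(reg_value: int, bit: int) -> bool:
    """Return True if the given bit is set in a 32-bit register value."""
    return (reg_value >> bit) & 1 == 1


def user_msr_supported(cpuid) -> bool:
    """cpuid is any callable (leaf, subleaf) -> (eax, ebx, ecx, edx)."""
    _, _, _, edx = cpuid(USER_MSR_LEAF, USER_MSR_SUBLEAF)
    return has_bit(edx, USER_MSR_EDX_BIT)


def fake_cpuid(leaf: int, subleaf: int):
    """Toy CPUID table (hypothetical): the feature bit is set only in subleaf 1."""
    table = {
        (0x7, 0x0): (0, 0, 0, 0),
        (0x7, 0x1): (0, 0, 0, 1 << 15),
    }
    return table.get((leaf, subleaf), (0, 0, 0, 0))
```

With this toy table, querying subleaf 0x0 would report the feature as absent even though subleaf 0x1 advertises it, which is the kind of misdetection the patch describes.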
On Wed, Nov 29, 2023 at 3:47 PM Richard Biener
wrote:
>
> On Tue, Nov 28, 2023 at 8:54 AM liuhongt wrote:
> >
> > For vec_construct, the components must be live at the same time if
> > they're not loaded from memory, when the number of those components
> > exceeds available registers, spill happen
Any comments?
On Wed, Nov 22, 2023 at 12:17 PM liuhongt wrote:
>
> From: "Zhang, Annita"
>
> Avoid_fma_chain was enabled in m_SAPPHIRERAPIDS, m_ALDERLAKE and
> m_CORE_HYBRID. It can also be enabled in m_GENERIC to improve the
> performance of -march=x86-64-v3/v4 with -mtune=generic set by
> defa
On Thu, Sep 21, 2023 at 3:22 PM Hu, Lin1 wrote:
>
> Hi all,
>
> After previous discussion, instead of supporting option -mavx10.1, we
> will first introduce option -m[no-]evex512, which will enable/disable
> 512-bit registers and 64-bit mask registers.
>
> It will not change the current option behav
On Fri, Sep 22, 2023 at 6:56 PM Hongyu Wang wrote:
>
> Hi,
>
> This is a v2 patch for APX support which follows-up previous discussion in
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628904.html
>
> As discussed in previous thread, the inverse approach to extend base/index reg
> support
On Thu, Jun 13, 2024 at 3:32 PM Alexandre Oliva wrote:
>
>
> The first patch for PR113719 regressed gcc.dg/ipa/iinline-attr.c on
> toolchains configured to --enable-frame-pointer, because the
> optimization node created within handle_optimize_attribute had
> flag_omit_frame_pointer incorrectly set
On Thu, Jun 27, 2024 at 9:23 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to refactor vcvttps2qq/vcvtqq2ps patterns to remove the redundant
> round_*_modev8sf_condition.
>
> Bootstrapped and regtested on x86-64-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> * config/
On Thu, Jun 27, 2024 at 4:29 PM Roger Sayle wrote:
>
>
> This patch is another round of refinements to fine tune the new ternlog
> infrastructure in i386's sse.md. This patch tweaks ix86_ternlog_idx
> to allow multiple MEM/CONST_VECTOR/VEC_DUPLICATE operands prior to
> splitting (before reload),
On Sun, Jun 30, 2024 at 7:29 PM Roger Sayle wrote:
>
>
> This patch fixes the 4 FAILs of gcc.target/i386/pr192464-vrndscaleph.c
> with --target_board='unix{-m32}' on RedHat 7.x. The issue is that this
> AVX512 test includes the system math.h, and on older systems this provides
> inline versions o
On Mon, Jul 1, 2024 at 6:14 AM Roger Sayle wrote:
>
>
> As promised here's the final ternlog clean-up, that deletes the now
> obsolete legacy patterns and mode iterators from sse.md. It also updates
> the surviving ternlog patterns to consistently use decimal immediate
> operands (instead of hexa
> >
> > gcc/testsuite/ChangeLog
> > * gcc.target/i386/pr100711-6.c: Update to check for decimal
> > immediate operand in ternlog, not hexadecimal.
> I got an ICE when bootstrapped with --enable-checking=yes,rtl,extra
>
The ICE can be worked around with 2 separate define_predicates,
On Mon, Jul 1, 2024 at 4:51 PM kong lingling wrote:
>
> Add some missing APX NF and NDD support for imul and mul.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for trunk?
Ok.
>
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (*imulhizu): Added APX
> NF support.
On Wed, Jul 3, 2024 at 2:10 AM Andi Kleen wrote:
>
> liuhongt writes:
>
> > From: "H.J. Lu"
> >
> > According to Intel® 64 and IA-32 Architectures Optimization Reference
> > Manual[1], Branch Hint is updated for Redwood Cove.
> >
> > cut from [1]-
> > Starting wit
On Thu, Jul 4, 2024 at 6:17 AM H.J. Lu wrote:
>
>
> On Wed, Jul 3, 2024, 9:37 PM Richard Biener
> wrote:
>>
>> On Wed, Jul 3, 2024 at 9:25 AM liuhongt wrote:
>> >
>> > The patch can avoid SIGILL on non-AVX512 machine due to kmovd is
>> > generated in dynamic check.
>> >
>> > Committed as an obv
On Thu, Jul 4, 2024 at 9:41 AM H.J. Lu wrote:
>
>
> On Thu, Jul 4, 2024, 9:12 AM Hongtao Liu wrote:
>>
>> On Thu, Jul 4, 2024 at 6:17 AM H.J. Lu wrote:
>> >
>> >
>> > On Wed, Jul 3, 2024, 9:37 PM Richard Biener
>> > wrote:
>
On Tue, Jul 2, 2024 at 11:24 AM Hongyu Wang wrote:
>
> Hi,
>
> According to APX spec, the pushp/popp pairs should be matched,
> otherwise the PPX hint cannot take effect and cause performance loss.
>
> In the ix86_expand_epilogue, there are several optimizations that may
> cause the epilogue using
On Fri, Jul 5, 2024 at 2:54 AM Roger Sayle wrote:
>
>
> This patch fixes a problem with splitting of complex AVX512 ternlog
> instructions on x86_64. A recent change allows the ternlog pattern
> to have multiple mem-like operands prior to reload, by emitting any
> "reloads" as necessary during sp
On Fri, Jul 5, 2024 at 8:06 AM Hongtao Liu wrote:
>
> On Fri, Jul 5, 2024 at 2:54 AM Roger Sayle wrote:
> >
> >
> > This patch fixes a problem with splitting of complex AVX512 ternlog
> > instructions on x86_64. A recent change allows the ternlog pattern
> >
On Sun, Jul 7, 2024 at 5:00 PM Roger Sayle wrote:
>
>
> Hi Hongtao,
> This should address concerns about the remaining use of force_reg.
>
51@@ -25793,15 +25792,20 @@ ix86_expand_ternlog_binop (enum rtx_code
code, machine_mode mode,
52 if (GET_MODE (op1) != mode)
53 op1 = gen_lowpart (mod
On Thu, Jul 4, 2024 at 9:30 AM liuhongt wrote:
>
> From: "H.J. Lu"
>
> >The above reads like it would be worth splitting branch_prediction_hints
> >into branch_prediction_hints_taken and branch_prediction_hints_not_taken
> >given not-taken is the default and thus will just increase code size?
> >Ac
On Thu, Jul 4, 2024 at 11:24 AM Levy Hsu wrote:
>
> This patch extends support for BF16 vector operations in GCC, including
> bitwise AND, ANDNOT, ABS, NEG, COPYSIGN, and XORSIGN for V8BF, V16BF, and
> V32BF modes.
> Bootstrapped and tested on x86_64-linux-gnu. ok for trunk?
>
> gcc/ChangeLog:
>
On Wed, Jul 10, 2024 at 10:10 PM Victor Do Nascimento
wrote:
>
> Following the migration of the dot_prod optab from a direct to a
> conversion-type optab, ensure all back-end patterns incorporate the
> second machine mode into pattern names.
The patch LGTM. BTW you can use existing instead of
new
strap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
Ok.
>
>
> 2024-07-11 Roger Sayle
> Hongtao Liu
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (ix86_broadcast_from_con
On Sat, Jul 13, 2024 at 3:44 PM Hongyu Wang wrote:
>
> Hi,
>
> According to the instruction spec of AVX512BF16, the convert from float
> to BF16 is not a simple truncation. It has special handling for
> denormal/nan, even for normal float it will add an extra bias according
> to the least signific
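Editorial sketch (not part of the thread): the difference between plain truncation and the rounding the message describes can be modelled on the raw float32 bit pattern. The code below assumes the commonly documented round-to-nearest-even scheme (add 0x7FFF plus the lowest kept bit, then drop the low 16 bits), treats denormal inputs as zero and quiets NaNs. Those special-case details are assumptions made for illustration, and the function names are hypothetical.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bf16_truncate(x: float) -> int:
    """Naive conversion: keep the upper 16 bits, discard the rest."""
    return f32_bits(x) >> 16


def bf16_round_nearest_even(x: float) -> int:
    """Conversion with a rounding bias based on the least significant kept bit."""
    bits = f32_bits(x)
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0xFF and mantissa != 0:
        # NaN: keep sign and exponent, force the quiet bit (assumed behaviour).
        return (bits >> 16) | 0x0040
    if exponent == 0:
        # Zero or denormal input: return a signed zero (assumed behaviour).
        return (bits >> 16) & 0x8000
    lsb = (bits >> 16) & 1          # lowest bit that survives the conversion
    bits += 0x7FFF + lsb            # bias: ties round to the even result
    return (bits >> 16) & 0xFFFF
```

For example, 1.005859375 (bit pattern 0x3F80C000) truncates to 0x3F80 but rounds to 0x3F81, so a pattern that models the conversion as a truncation would not match the instruction's result.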
On Mon, Jul 15, 2024 at 10:21 AM Hongyu Wang wrote:
>
> > Could you just git revert 6d0b7b69d143025f271d0041cfa29cf26e6c343b?
>
> We can still deal with BFmode permutation the same way as HFmode, so
> the change in ix86_vectorize_vec_perm_const can be preserved.
>
> Hongt
On Wed, Jul 10, 2024 at 2:46 PM Hongyu Wang wrote:
>
> Hi,
>
> For APX ccmp, current infrastructure will always generate cstore for
> the ccmp flag user, like
>
>   cmpe    %rcx, %r8
>   ccmpnel %rax, %rbx
>   seta    %dil
>   add     %rcx, %r9
>   add     %r9, %rdx
>
On Thu, Jul 11, 2024 at 9:07 PM Alexandre Oliva wrote:
>
> On Jul 4, 2024, Alexandre Oliva wrote:
>
> > On Jul 3, 2024, Rainer Orth wrote:
>
> > Hmm, I wonder if leaf frame pointer has to do with that.
>
> It did, in a way.
>
>
>
> The first two patches for PR113719 have each regressed
>
On Mon, Jul 15, 2024 at 1:39 PM Hu, Lin1 wrote:
>
> Hi, all
>
> Based on actual usage, trunc{128}2{16,32,64} use some instructions from
> sse/sse3, so extend their scope to extend the scope of optimization.
>
> Bootstrapped and regtested on x86-64-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/C
On Mon, Jul 15, 2024 at 7:24 PM Paul-Antoine Arras wrote:
>
> This trivially fixes an incorrectly encoded character in the DejaGnu
> scan pattern.
>
> OK for trunk?
Ok.
> --
> PA
--
BR,
Hongtao
On Mon, May 13, 2024 at 5:57 AM Roger Sayle wrote:
>
>
> This patch improves the way that the x86 backend recognizes and
> expands AVX512's bitwise ternary logic (vpternlog) instructions.
I like the patch.
1 file changed, 25 insertions(+), 1 deletion(-)
gcc/config/i386/i386-expand.cc | 26 +++
On Mon, May 13, 2024 at 3:40 PM Richard Biener
wrote:
>
> On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
> >
> > As testcase in the PR, O3 cunrolli may prevent vectorization for the
> > innermost loop and increase register pressure.
> > The patch removes the 1/3 reduction of unr_insn for innermo
C -std=gnu++14 LP64 note (test for
> >
> > g++warnings, line 56)
> >
> > g++: g++.dg/warn/Warray-bounds-20.C -std=gnu++14 note (test for
> >
> > g++warnings, line 66)
> >
> > g++: g++.dg/warn/Warray-bounds-20.C -std=gnu++17 LP64 note (test for
> >
> > g++warnings, line 56)
> >
> > g++: g++.dg/wa
On Thu, May 16, 2024 at 10:40 PM Victor Do Nascimento
wrote:
>
> From: Victor Do Nascimento
>
> At present, the compiler offers the `{u|s|us}dot_prod_optab' direct
> optabs for dealing with vectorizable dot product code sequences. The
> consequence of using a direct optab for this is that backen
> >
> Sorry to chime in, for x86 backend, we defined usdot_prodv16hi, and
> 2-way dot_prod operations can be generated
>
This is the link: https://godbolt.org/z/hcWr64vx3. x86 defines
udot_prodv16qi/udot_prod8hi, and both 2-way and 4-way dot_prod
instructions are generated.
--
BR,
Hongtao
On Fri, May 17, 2024 at 3:55 PM Uros Bizjak wrote:
>
> Rename _3 expander to a standard ssadd,
> usadd, sssub and ussub name to enable corresponding optab expansion.
>
> Also add named expander for MMX modes.
LGTM.
>
> PR middle-end/112600
>
> gcc/ChangeLog:
>
> * config/i386/mmx.md (3): N
On Wed, May 15, 2024 at 11:30 AM Jiang, Haochen wrote:
>
> Also cc Honza and Richard since we touched generic tune.
>
> Thx,
> Haochen
>
> > -Original Message-
> > From: Haochen Jiang
> > Sent: Wednesday, May 15, 2024 11:04 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Liu, Hongtao ; ubiz...
On Wed, May 15, 2024 at 5:24 PM Richard Biener
wrote:
>
> On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote:
> >
> > On Mon, May 13, 2024 at 3:40 PM Richard Biener
> > wrote:
> > >
> > > On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
> > >
On Tue, May 21, 2024 at 2:16 PM Haochen Jiang wrote:
>
> Hi all,
>
> Since vpermq is really slow, we should avoid using it when it is
> the only instruction that could be used for ix86_expand_vecop_qihi2.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?
Please add a testcase for it.
On Tue, May 21, 2024 at 3:14 PM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v2 patch to fix PR115069. The new testcase has passed.
>
> Changes in v2:
> - Added a testcase.
> - Change the comment for the early exit.
>
> Thx,
> Haochen
>
> Since vpermq is really slow, we should avoid using
On Wed, May 22, 2024 at 1:07 PM liuhongt wrote:
>
> >> Hard to find a default value satisfying all testcases.
> >> some require loop unroll with 7 insns increment, some don't want loop
> >> unroll w/ 5 insn increment.
> >> The original 2/3 reduction happened to meet all those testcases(or the
> >>
On Wed, May 22, 2024 at 3:59 PM Jakub Jelinek wrote:
>
> On Wed, May 22, 2024 at 09:46:41AM +0200, Richard Biener wrote:
> > On Wed, May 22, 2024 at 3:58 AM liuhongt wrote:
> > >
> > > According to the IEEE standard, for conversions from floating point to
> > > integer, when a NaN or infinite operand
On Thu, May 23, 2024 at 2:38 PM Hu, Lin1 wrote:
>
> gcc/ChangeLog:
>
> PR 107432
> * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f):
> New function to generate a series of suitable insns.
> * config/i386/i386-protos.h (ix86_expand_trunc_with_avx2
On Thu, May 23, 2024 at 3:17 PM Hu, Lin1 wrote:
>
> > -Original Message-
> > From: Hongtao Liu
> > Sent: Thursday, May 23, 2024 2:42 PM
> > To: Hu, Lin1
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
> > ubiz...@gmail.com; rguent...@suse.de
>
CC for review.
On Tue, May 21, 2024 at 1:12 PM liuhongt wrote:
>
> When mask is ((1 << (prec - imm)) - 1), which is used to clear the upper bits
> of A, then it can be simplified to LSHIFTRT.
>
> i.e. simplify
> (and:v8hi
> (ashiftrt:v8hi A 8)
> (const_vector 0xff x8))
> to
> (lshiftrt:v8hi A 8)
>
> Bo
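Editorial sketch (not part of the thread): the simplification above rests on the identity that masking an arithmetic right shift with the low (prec - imm) bits gives the same value as a logical right shift. The Python model below checks this for one 16-bit lane, matching the v8hi example. The function names are hypothetical and this is not GCC's simplify-rtx code.

```python
PREC = 16  # bits per lane in a v8hi vector


def ashiftrt16(x: int, imm: int) -> int:
    """Arithmetic right shift of a 16-bit lane (sign bit is replicated)."""
    x &= 0xFFFF
    signed = x - 0x10000 if x & 0x8000 else x
    return (signed >> imm) & 0xFFFF


def lshiftrt16(x: int, imm: int) -> int:
    """Logical right shift of a 16-bit lane (zeros are shifted in)."""
    return (x & 0xFFFF) >> imm


def low_mask(imm: int) -> int:
    """Mask that keeps the low (PREC - imm) bits, i.e. clears the sign copies."""
    return (1 << (PREC - imm)) - 1


def and_of_ashiftrt(x: int, imm: int) -> int:
    """The original form: (and (ashiftrt x imm) mask)."""
    return ashiftrt16(x, imm) & low_mask(imm)
```

For imm = 8 the mask is 0xff, as in the RTL above: the only bits in which the two shifts differ are the upper imm bits, and the mask clears exactly those.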
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652231.html
Ok for this.
--
BR,
Hongtao
On Mon, May 20, 2024 at 11:15 AM Hongtao Liu wrote:
>
> On Wed, May 15, 2024 at 11:30 AM Jiang, Haochen
> wrote:
> >
> > Also cc Honza and Richard since we touched generic tune.
> >
> > Thx,
> > Haochen
> >
> > > -Original Message-
On Thu, May 23, 2024 at 2:38 PM Hu, Lin1 wrote:
>
> gcc/ChangeLog:
>
> PR target/107432
> * config/i386/mmx.md (truncv4hiv4qi2): New define_insn.
>
> gcc/testsuite/ChangeLog:
>
> PR target/107432
> * gcc.target/i386/pr107432-6.c: Add test.
> ---
> gcc/config/i386/mmx.md
On Tue, May 21, 2024 at 5:46 AM Alexander Monakov wrote:
>
>
> Hello!
>
> I looked at ternlog a bit last year, so I'd like to offer some drive-by
> comments. If you want to tackle them in a follow-up patch, or leave for
> someone else to handle, please let me know.
>
> On Fri, 17 May 2024, Roger S
On Sat, May 18, 2024 at 4:10 AM Roger Sayle wrote:
>
>
> Hi Hongtao,
> Many thanks for the review, bug fixes and suggestions for improvements.
> This revised version of the patch, implements all of your corrections. In
> theory
> the "ternlog idx" should guarantee that some operands are non-null
On Mon, May 27, 2024 at 2:48 PM Hongtao Liu wrote:
>
> On Sat, May 18, 2024 at 4:10 AM Roger Sayle
> wrote:
> >
> >
> > Hi Hongtao,
> > Many thanks for the review, bug fixes and suggestions for improvements.
> > This revised version of the patch,
On Thu, May 16, 2024 at 5:15 PM Hongyu Wang wrote:
>
> On Thu, May 16, 2024 at 15:05, Richard Biener wrote:
>
> >
> > On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote:
> > >
> > > Hi,
> > >
> > > In ix86_override_options_after_change, calls to ix86_default_align
> > > and ix86_recompute_optlev_based_flags wil
On Wed, May 29, 2024 at 4:56 PM Hu, Lin1 wrote:
>
> Exclude add TARGET_MMX_WITH_SSE, I merge two patterns.
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR target/107432
> * config/i386/mmx.md
> (VI2_32_64): New mode iterator.
> (mmxhalfmode): New mode attr.
> (mmxhalfmodelower):
On Wed, May 29, 2024 at 5:00 PM Hu, Lin1 wrote:
>
> According to hongtao's suggestion, I support some trunc in mmx.md under
> x86-64-v3, and optimize ix86_expand_trunc_with_avx2_noavx512f.
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR 107432
> * config/i386/i386-expand.cc (ix86_expa
On Tue, May 28, 2024 at 4:00 PM Hu, Lin1 wrote:
>
> Hi all,
>
> This patch aims to achieve EQ/NE comparison between avx512 kmask and -1
> by using kortest and checking CF.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
>
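Editorial sketch (not part of the thread): the idea in the message above can be modelled with the flag behaviour of the AVX-512 KORTEST instruction, which ORs two mask registers and sets CF when the result is all ones and ZF when it is all zeros. Comparing a mask with -1 then reduces to testing CF after OR-ing the mask with itself. The Python helpers below are hypothetical names for illustration, not the GCC pattern.

```python
def kortest(k1: int, k2: int, width: int = 16) -> dict:
    """Model of KORTEST flags for `width`-bit mask registers."""
    all_ones = (1 << width) - 1
    result = (k1 | k2) & all_ones
    return {"ZF": result == 0, "CF": result == all_ones}


def mask_equals_minus_one(k: int, width: int = 16) -> bool:
    """k == -1 (all mask bits set) is exactly the CF result of kortest k, k."""
    return kortest(k, k, width)["CF"]
```

The NE case is the same test with the CF condition inverted.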
On Wed, May 15, 2024 at 4:24 PM Hongyu Wang wrote:
>
> The APX CCMP feature implements a conditional compare, which executes the
> compare when EFLAGS matches a certain condition.
>
> CCMP introduces a default flags value (dfv): when the conditional compare does
> not execute, it will directly set the flags accordin
On Wed, May 15, 2024 at 4:21 PM Hongyu Wang wrote:
>
> The ccmp insn itself doesn't support fp compare, but x86 has an fp comi
> insn that changes EFLAGS, which can be the scc input to ccmp. Allow
> scalar fp compare in ix86_gen_ccmp_first except ORDERED/UNORDERED
> compare which can not be identified i
On Wed, May 29, 2024 at 1:11 PM Kong, Lingling wrote:
>
> Hi, compared with v2, these patches restored the original lea pattern position
> and addressed hongtao's comment.
>
> The APX NF (no flags) feature suppresses the update of status flags
> for arithmetic operations.
Ok for the patch an
On Wed, May 29, 2024 at 11:05 AM Haochen Jiang wrote:
>
> Hi all,
>
> Since AVX10 is the first major ISA introduced after AVX-512, we propose
> to add target_clones support for it.
>
> Although AVX10.1-256 won't cover the 512-bit part of AVX512F, since
> it is only for priority but not for implica
On Mon, Aug 12, 2024 at 6:59 AM H.J. Lu wrote:
>
> On Thu, Aug 8, 2024 at 6:53 PM H.J. Lu wrote:
> >
> > When we emit .p2align to align BB_HEAD, we must update BB_HEAD. Otherwise
> > ENDBR will be inserted at the wrong place.
> >
> > gcc/
> >
> > PR target/116174
> > * config/i38
On Thu, Aug 1, 2024 at 3:50 PM Haochen Jiang wrote:
>
> Hi all,
>
> AVX10.2 tech details have just been published on July 31st at the
> following link:
>
> https://cdrdv2.intel.com/v1/dl/getContent/828965
>
> For new features and instructions, we could divide them into two parts.
> One is ymm round
On Mon, Aug 12, 2024 at 10:10 PM liuhongt wrote:
>
> > Are there any assumptions that BB_HEAD must be a note or label?
> > Maybe we should move ix86_align_loops into a separate pass and insert
> > the pass just before pass_final.
> The patch inserts .p2align after endbr pass, it can also fix the i
On Mon, Aug 12, 2024 at 3:10 PM kong lingling wrote:
>
> For APX instruction with an NDD, the destination GPR will get the
> instruction’s result in bits [OSIZE-1:0] and, if OSIZE < 64b, have its upper
> bits [63:OSIZE] zeroed. Now supporting other NDD instructions.
>
>
> Bootstrapped and regtes
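Editorial sketch (not part of the thread): the NDD rule quoted above says the destination receives the result in bits [OSIZE-1:0] and, for OSIZE below 64, zeros in bits [63:OSIZE]. The Python model below contrasts that with a legacy two-operand 8/16-bit write, which preserves the destination's upper bits. The function names are hypothetical, and the legacy behaviour is included only for contrast.

```python
MASK64 = (1 << 64) - 1


def ndd_write(result: int, osize: int) -> int:
    """Value of the 64-bit destination GPR after an APX NDD operation."""
    if osize not in (8, 16, 32, 64):
        raise ValueError("unsupported operand size")
    # Bits [63:osize] are zeroed, bits [osize-1:0] hold the result.
    return result & ((1 << osize) - 1)


def legacy_write(old_reg: int, result: int, osize: int) -> int:
    """Value of the destination after a legacy (non-NDD) write, for contrast."""
    if osize not in (8, 16, 32, 64):
        raise ValueError("unsupported operand size")
    mask = (1 << osize) - 1
    if osize >= 32:
        # 32-bit writes zero-extend; 64-bit writes replace the whole register.
        return result & mask
    # 8- and 16-bit writes keep the old upper bits.
    return (old_reg & ~mask & MASK64) | (result & mask)
```

Because the NDD form already guarantees zeroed upper bits, a following zero-extension of an 8- or 16-bit result is redundant, which is what the *_zext patterns in this series appear to exploit.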
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote:
>
> gcc/ChangeLog:
>
>        PR target/113729
>        * config/i386/i386.md (*subqi_1_zext): New
>        define_insn.
>        (*subhi_1_zext): Ditto.
>        (*addqi3_carry_zext): Ditto.
>
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote:
>
> gcc/ChangeLog:
>
>        PR target/113729
>        * config/i386/i386.md (*andqi_1_zext):
>        New define_insn.
>        (*andhi_1_zext): Ditto.
>        (*qi_1_zext): Ditto.
>
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote:
>
> gcc/ChangeLog:
>
>        PR target/113729
>        * config/i386/i386.md (*ashlqi3_1_zext):
>        New define_insn.
>        (*ashlhi3_1_zext): Ditto.
>        (*qi3_1_zext): Ditto.
>
On Wed, Aug 14, 2024 at 4:23 PM Kong, Lingling wrote:
>
>
>
> -Original Message-
> From: Kong, Lingling
> Sent: Wednesday, August 14, 2024 4:20 PM
> To: Kong, Lingling
> Subject: [PATCH v2] i386: Fix some vex insns that prohibit egpr
>
> Although these vex insns have evex counterparts,
On Thu, Aug 15, 2024 at 3:27 PM liuhongt wrote:
>
> It results in 2 failures for x86_64-pc-linux-gnu{\
> -march=cascadelake};
>
> gcc: gcc.target/i386/extendditi3-1.c scan-assembler cqt?o
> gcc: gcc.target/i386/pr113560.c scan-assembler-times \tmulq 1
>
> For pr113560.c, now GCC generates mulx ins
On Wed, Aug 14, 2024 at 5:07 PM Haochen Jiang wrote:
>
> Hi all,
>
> The initial patch for AVX10.2 has been merged this week.
>
> For the upcoming patches, we will first upstream ymm rounding control part.
>
> In ymm rounding part, ALL the instructions in AVX512 with 512-bit rounding
> control wil
On Tue, Aug 20, 2024 at 2:12 PM HAO CHEN GUI wrote:
>
> Hi,
> Add Hongtao Liu as the patch affects x86.
>
> On 2024/8/20 6:32, Richard Sandiford wrote:
> > HAO CHEN GUI writes:
> >> Hi,
> >> This patch adds const0 move checking for CLEAR_BY_PIECES.
On Tue, Aug 20, 2024 at 6:25 PM liuhongt wrote:
>
> From [1]
[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660575.html
> > > It's not obvious to me why movv16qi requires a nonimmediate_operand
> > > source, especially since ix86_expand_vector_mode does have code to
> > > cope with con
On Tue, Aug 20, 2024 at 2:50 PM Hongtao Liu wrote:
>
> On Tue, Aug 20, 2024 at 2:12 PM HAO CHEN GUI wrote:
> >
> > Hi,
> > Add Hongtao Liu as the patch affects x86.
> >
> > On 2024/8/20 6:32, Richard Sandiford wrote:
> > > HAO CHEN GUI writes:
>
On Wed, Aug 21, 2024 at 4:49 PM Richard Biener
wrote:
>
> On Wed, Aug 21, 2024 at 7:40 AM liuhongt wrote:
> >
> > When none of mprefer-vector-width, avx256_optimal/avx128_optimal,
> > avx256_store_by_pieces/avx512_store_by_pieces is specified, GCC will
> > set ix86_{move_max,store_max} as max ava
On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI wrote:
>
> Hi Hongtao,
>
> On 2024/8/21 11:21, Hongtao Liu wrote:
> > r15-3058-gbb42c551905024 support const0 operand for movv16qi, please
> > rebase your patch and see if there's still the regressions.
>
> There
On Fri, Aug 23, 2024 at 11:03 AM HAO CHEN GUI wrote:
>
> Hi Hongtao,
>
> On 2024/8/23 9:47, Hongtao Liu wrote:
> > On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI wrote:
> >>
> >> Hi Hongtao,
> >>
> >> On 2024/8/21 11:21, Hongtao Liu wrote:
> >>
On Mon, Aug 19, 2024 at 4:57 PM Haochen Jiang wrote:
>
> Hi all,
>
> The AVX10.2 ymm rounding patches have been merged to trunk around
> 6 hours ago. As mentioned before, the next step will be AVX10.2 new
> instruction support.
>
> This patch series could be divided into three parts.
>
> The first patch
On Fri, Aug 23, 2024 at 5:46 PM HAO CHEN GUI wrote:
>
> Hi Hongtao,
>
> On 2024/8/23 11:47, Hongtao Liu wrote:
> > On Fri, Aug 23, 2024 at 11:03 AM HAO CHEN GUI wrote:
> >>
> >> Hi Hongtao,
> >>
> >> On 2024/8/23 9:47, Hongtao Liu wrote:
> >>