Re: [PATCH] APX: add nf counterparts for rotl split pattern [PR 119539]

2025-04-05 Thread Hongyu Wang
> > Sent: Tuesday, April 1, 2025 5:24 PM > > To: Hongtao Liu > > Cc: Wang, Hongyu ; gcc-patches@gcc.gnu.org; Liu, > > Hongtao > > Subject: Re: [PATCH] APX: add nf counterparts for rotl split pattern [PR > > 119539] > > > > On Tue, Apr 1, 2025 at 10:

[PATCH] APX: add nf counterparts for rotl split pattern [PR 119539]

2025-04-01 Thread Hongyu Wang
Hi, For the splitter after 3_mask, it now splits the pattern to *3_mask, so the splitter doesn't generate the nf variant. Add the corresponding nf counterpart for define_insn_and_split so that the splitter also works for nf insns. Bootstrapped & regtested on x86-64-pc-linux-gnu. Ok for trunk? gcc/Change

Re: [PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-20 Thread Hongyu Wang
Richard Biener wrote on Mon, Jan 20, 2025 at 15:45: > > On Mon, 20 Jan 2025, Hongyu Wang wrote: > > > Thanks Richard for being willing to review this part. It is true that the > > try_cmove_arith logic adds quite a lot of special handling for > > optimization, so I reduce the logic in e

Re: [PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-19 Thread Hongyu Wang
to review the ifcvt changes? Thanks in advance! Richard Sandiford wrote on Thu, Jan 16, 2025 at 19:06: > > Hongyu Wang writes: > > From: Lingling Kong > > > > Hi, > > > > Thanks to Richard's review, the v5 patch contains the changes below: > > >

Re: [PATCH] i386: Fix wrong insn generated by shld/shrd ndd split [PR118510]

2025-01-17 Thread Hongyu Wang
Uros Bizjak wrote on Fri, Jan 17, 2025 at 15:05: > Is there a reason to have operand 0 with "nonimmediate_operand" > predicate? If you have to generate a register temporary and then > unconditionally copy it to the output, it is better to use > "register_operand" predicate and leave middle end to do the copy f

[PATCH] i386: Fix wrong insn generated by shld/shrd ndd split [PR118510]

2025-01-16 Thread Hongyu Wang
Hi, For the shld/shrd_ndd_2 insn, the splitter outputs a wrong pattern that mixes the clobber and the set in one parallel. Separate the set to dest out of the parallel to fix it. Bootstrapped & regtested on x86-64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: PR target/118510 * config/i386/i386.md (

[PATCH 2/2] [APX CFCMOV] Support APX CFCMOV in backend

2025-01-08 Thread Hongyu Wang
From: Lingling Kong gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_int_cfmovcc): Expand to cfcmov pattern. * config/i386/i386-opts.h (enum apx_features): New. * config/i386/i386-protos.h (ix86_expand_int_cfmovcc): Define. * config/i386/i386.cc (

[PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-08 Thread Hongyu Wang
From: Lingling Kong Hi, Thanks to Richard's review, the v5 patch contains the changes below: 1. Separate the maskload/maskstore emit out from noce_emit_cmove, add a new function emit_mask_load_store in optabs.cc. 2. Follow the operand order of maskload and maskstore optab and take cond as pre

Re: [PATCH] i386: Add br_mispredict_scale in cost table.

2025-01-07 Thread Hongyu Wang
h have no impact on the test in [2]. We will keep monitoring those cmove issues. Uros Bizjak wrote on Tue, Jan 7, 2025 at 17:34: > > On Tue, Jan 7, 2025 at 8:37 AM Hongyu Wang wrote: > > > > Hi, > > > > For later processors, the pipeline went deeper so the penalty for >

[PATCH] i386: Add br_mispredict_scale in cost table.

2025-01-06 Thread Hongyu Wang
Hi, For later processors, the pipeline is deeper, so the penalty for an untaken branch can be larger than before. Add a new parameter br_mispredict_scale to describe the penalty, and adopt it in the noce_max_ifcvt_seq_cost hook to allow longer sequences to be converted with cmove. This improves cpu2017 544

Re: [PATCH v4 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2024-11-27 Thread Hongyu Wang
Ping^2 Hongyu Wang wrote on Thu, Nov 21, 2024 at 11:04: > > Gentle ping, it would be appreciated if anyone can help review this. > We hope this patch will not miss GCC15 for complete support on APX. > > Kong, Lingling wrote on Thu, Nov 14, 2024 at 09:50: > > > > > Hi, > > > >

Re: [PATCH v4 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2024-11-20 Thread Hongyu Wang
Gentle ping, it would be appreciated if anyone can help review this. We hope this patch will not miss GCC15 for complete support on APX. Kong, Lingling wrote on Thu, Nov 14, 2024 at 09:50: > > Hi, > > Many thanks to Richard for the suggestion that conditional load is like a > scalar instance of maskload_opta

[PATCH] PR target/117669 - RISC-V:The 'VEEWTRUNC4' iterator 'RVVMF2BF' type condition error

2024-11-19 Thread Feng Wang
This patch fixes the wrong condition for RVVMF2BF. It should be TARGET_VECTOR_ELEN_BF_16. gcc/ChangeLog: PR target/117669 * config/riscv/vector-iterators.md: Signed-off-by: Feng Wang --- gcc/config/riscv/vector-iterators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[PATCH] RISC-V:Fix wrong condition for vector-bfloat16

2024-11-19 Thread Feng Wang
This patch fixes the wrong condition for RVVMF2BF. It should be TARGET_VECTOR_ELEN_BF_16. gcc/ChangeLog: * config/riscv/vector-iterators.md: Modify condition. Signed-off-by: Feng Wang --- gcc/config/riscv/vector-iterators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

Re: [PATCH] i386: Rewrite ieee_minmax pattern with if_then_else

2024-11-15 Thread Hongyu Wang
Jakub Jelinek wrote on Fri, Nov 15, 2024 at 16:20: > > On Fri, Nov 15, 2024 at 04:04:55PM +0800, Hongyu Wang wrote: > > Following the discussion in pr116738, the insn for UNSPEC_IEEE_MAXMIN > > actually matches the behavior of if_then_else, so remove the UNSPEC and > > rewr

[PATCH] i386: Rewrite ieee_minmax pattern with if_then_else

2024-11-15 Thread Hongyu Wang
Hi, Following the discussion in pr116738, the insn for UNSPEC_IEEE_MAXMIN actually matches the behavior of if_then_else, so remove the UNSPEC and rewrite the related patterns with if_then_else. Bootstrapped & regtested on x86-64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/i386/i386-

[PATCH] i386: Fix cstorebf4 fp comparison operand [PR117495]

2024-11-12 Thread Hongyu Wang
Hi, cstorebf4 uses comparison_operator for the BFmode compare, which is incorrect when it directly uses ix86_expand_setcc, as that does not canonicalize the input comparison to correct the compare code by swapping operands. Since the original code without AVX10.2 calls emit_store_flag_force, which actu

Re: [PATCH] i386: Support cstorebf4 with native bf16 comi

2024-11-07 Thread Hongyu Wang
Uros Bizjak wrote on Thu, Nov 7, 2024 at 15:22: > > On Thu, Nov 7, 2024 at 6:58 AM Hongyu Wang wrote: > > > > Hi, > > > > We recently added support for cbranchbf4 with AVX10_2 native bf16 comi > > instructions, so do the same for cstorebf4. > > > > Bootstrapped &

[PATCH] i386: Support cstorebf4 with native bf16 comi

2024-11-06 Thread Hongyu Wang
Hi, We recently added support for cbranchbf4 with AVX10_2 native bf16 comi instructions, so do the same for cstorebf4. Bootstrapped & regtested on x86_64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/i386/i386.md (cstorebf4): Use vcomsbf16 under TARGET_AVX10_2_256 and -fno-trapping-m

[PATCH] i386: Utilize VCOMSBF16 for BF16 Comparisons with AVX10.2

2024-10-31 Thread Hongyu Wang
From: Levy Hsu This patch enables the use of the VCOMSBF16 instruction from AVX10.2 for efficient BF16 comparisons. Bootstrapped & regtested on x86-64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_branch): Handle BFmode when TARGET_AVX10_2

[PATCH] RISC-V: override alignment of function/jump/loop

2024-10-22 Thread Wang Pengcheng
Just like what AArch64 has done. Signed-off-by: Wang Pengcheng gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_tune_param): Add new tune options. (riscv_override_options_internal): Override the default alignment when not optimizing for size. --- gcc/config/riscv

[PATCH v2] RISC-V:Auto vect for vector-bfloat16

2024-10-18 Thread Feng Wang
/ChangeLog: * gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c: New test. * gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c: New test. * gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c: New test. Signed-off-by: Feng Wang --- gcc/config/riscv/autovec-opt.md

[PATCH] RISC-V: override alignment of function/jump/loop

2024-10-17 Thread Wang Pengcheng
Just like what AArch64 has done. Signed-off-by: Wang Pengcheng gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_tune_param): Add new tune options. (riscv_override_options_internal): Override the default alignment when not optimizing for size. --- gcc/config/riscv/riscv.cc | 15

[PATCH] RISC-V:Auto vect for vector bf16

2024-10-16 Thread Feng Wang
-vect.c: New test. Signed-off-by: Feng Wang --- gcc/config/riscv/vector-bfloat16.md | 144 -- .../riscv/rvv/autovec/vfncvt-auto-vect.c | 19 +++ .../riscv/rvv/autovec/vfwcvt-auto-vect.c | 19 +++ .../riscv/rvv/autovec/vfwmacc-auto-vect.c | 14 ++ 4 files

[PATCH v2] RISC-V: Add auto-vect pattern for vector rotate shift

2024-08-07 Thread Feng Wang
/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vrolr-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vrolr-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/vrolr-template.h: New test. Signed-off-by: Feng Wang --- gcc/config/riscv/autovec.md | 16

[PATCH] RISC-V: Add auto-vect pattern for vector rotate shift

2024-08-07 Thread Feng Wang
This patch adds the vector rotate shift pattern for auto-vect. With this patch, the scalar rotate shift can be automatically vectorized into a vector rotate shift. Signed-off-by: Feng Wang gcc/ChangeLog: * config/riscv/autovec-opt.md (v3): Add define_expand for vector
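As a hedged illustration of what "scalar rotate shift" means here, the sketch below shows the kind of C loop an auto-vectorizer could turn into vector rotates. The names `rotl32` and `rotl_array` are hypothetical and are not taken from the patch or its tests; whether a given compiler actually vectorizes this loop depends on the target and options.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits; masking keeps every shift
   count in the range 0..31 so no shift is undefined. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Element-wise rotate over an array: the scalar loop shape that a
   vector rotate pattern is meant to cover. */
void rotl_array(uint32_t *dst, const uint32_t *src, size_t len, unsigned n)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = rotl32(src[i], n);
}
```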

Re: [PATCH 0/1] Initial support for AVX10.2

2024-08-04 Thread Hongyu Wang
Andi Kleen wrote on Mon, Aug 5, 2024 at 06:31: > > > BTW, I noticed that in LLVM there is FP8 support for ARM currently > > under way. I will have a look at it to see if everything is mature. > > There's even FP8 work for ARM under way for gcc, see > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/6

Re: [PATCH] i386: Mark target option with optimization when enabled with opt level [PR116065]

2024-07-29 Thread Hongyu Wang
Richard Biener wrote on Fri, Jul 26, 2024 at 19:45: > > On Fri, Jul 26, 2024 at 10:50 AM Hongyu Wang wrote: > > > > Hi, > > > > When introducing munroll-only-small-loops, the option was marked as > > Target Save and added to -O2 default which makes attribute(optimize)

[PATCH] i386: Mark target option with optimization when enabled with opt level [PR116065]

2024-07-26 Thread Hongyu Wang
Hi, When munroll-only-small-loops was introduced, the option was marked as Target Save and added to the -O2 defaults, which makes attribute(optimize) reset the target option and causes an error when the command line has O1 and the function attribute has O2 and other target options. Mark this option as Optimization to fix it.

Re: [PATCH] AVX512BF16: Do not allow permutation with vcvtne2ps2bf16 [PR115889]

2024-07-14 Thread Hongyu Wang
> Could you just git revert 6d0b7b69d143025f271d0041cfa29cf26e6c343b? We can still deal with BFmode permutation the same way as HFmode, so the change in ix86_vectorize_vec_perm_const can be preserved. Hongtao Liu wrote on Mon, Jul 15, 2024 at 09:40: > > On Sat, Jul 13, 2024 at 3:44 PM Hongyu Wa

[PATCH] AVX512BF16: Do not allow permutation with vcvtne2ps2bf16 [PR115889]

2024-07-13 Thread Hongyu Wang
Hi, According to the instruction spec of AVX512BF16, the conversion from float to BF16 is not a simple truncation. It has special handling for denormal/NaN, and even for a normal float it adds an extra bias according to the least significant bit of the bf number. This means we cannot use the vcvtne2ps2bf1
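To make the difference between truncation and a biased conversion concrete, here is a minimal sketch in C. It assumes the common round-to-nearest-even scheme (add 0x7FFF plus the least significant kept bit, then drop the low 16 bits). The function names are hypothetical, the NaN branch is a simple placeholder, and denormal handling is omitted, so this is not a bit-exact model of the instruction.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit IEEE-754 pattern. */
uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Plain truncation: keep the upper 16 bits, which is what a
   permutation of the high halves would produce. */
uint16_t f32_to_bf16_trunc(float f)
{
    return (uint16_t)(f32_bits(f) >> 16);
}

/* Round-to-nearest-even sketch: the bias depends on the least
   significant bit of the would-be bf16 result. */
uint16_t f32_to_bf16_rne(float f)
{
    uint32_t u = f32_bits(f);
    if ((u & 0x7FFFFFFFu) > 0x7F800000u)        /* NaN: keep it a NaN (assumption) */
        return (uint16_t)((u >> 16) | 0x0040u);
    uint32_t bias = 0x7FFFu + ((u >> 16) & 1u);
    return (uint16_t)((u + bias) >> 16);
}
```

For an input such as 1.01171875f (bit pattern 0x3F818000) the two functions disagree, which is why a truncating permutation cannot stand in for the conversion.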

[PATCH 3/3 v3] RISC-V: Add md files for vector BFloat16

2024-07-11 Thread Feng Wang
v3: Add BFloat16 vector insn in generic-vector-ooo.md v2: Rebase According to the BFloat16 spec, some vector iterators and new patterns are added in md files. Signed-off-by: Feng Wang gcc/ChangeLog: * config/riscv/generic-vector-ooo.md: Add def_insn_reservation for vector BFloat16

[PATCH 2/3 v3] RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic

2024-07-11 Thread Feng Wang
v3: Modify warning message in riscv.cc v2: Rebase According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic functions are added by this patch. Signed-off-by: Feng Wang gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16

[PATCH 1/3 v3] RISC-V: Add vector type of BFloat16 format

2024-07-11 Thread Feng Wang
v3: Rebase v2: Rebase The vector type of BFloat16 format is added in this patch, subsequent extensions to zvfbfmin and zvfwma need to be based on this patch. Signed-off-by: Feng Wang gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (bfloat16_type): Generate bf16

[PATCH] [APX NF] Add a pass to convert legacy insn to NF insns

2024-07-09 Thread Hongyu Wang
Hi, For APX ccmp, the current infrastructure will always generate cstore for the ccmp flag user, like cmpe %rcx, %r8 ccmpnel %rax, %rbx seta %dil add %rcx, %r9 add %r9, %rdx testb %dil, %dil je .L2 For such case, the legacy

Re: [PATCH] [APX PPX] Avoid generating unmatched pushp/popp in pro/epilogue

2024-07-02 Thread Hongyu Wang
apx spec, the mismatched pushp/popp pair does confuse the fast-forwarding logic and turns off the PPX optimization. We just need to make sure every pushp for a certain reg has a corresponding popp for that reg. Richard Biener wrote on Tue, Jul 2, 2024 at 16:18: > > On Tue, Jul 2, 2024 at 5:24 AM Hongyu Wan

[PATCH] [APX PPX] Avoid generating unmatched pushp/popp in pro/epilogue

2024-07-01 Thread Hongyu Wang
Hi, According to the APX spec, the pushp/popp pairs should be matched; otherwise the PPX hint cannot take effect, which causes a performance loss. In ix86_expand_epilogue, there are several optimizations that may cause the epilogue to use mov to restore the regs. Check if PPX applied and prevent usage o

[PATCH 1/3 v2] RISC-V: Add vector type of BFloat16 format

2024-06-27 Thread Feng Wang
v2: Rebase. The vector type of BFloat16 format is added in this patch, subsequent extensions to zvfbfmin and zvfwma need to be based on this patch. gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (bfloat16_type): Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_I

[PATCH 3/3 v2] RISC-V: Add md files for vector BFloat16

2024-06-27 Thread Feng Wang
v2: Rebase. According to the BFloat16 spec, some vector iterators and new patterns are added in md files. gcc/ChangeLog: * config/riscv/riscv.md: Add new insn name for vector BFloat16. * config/riscv/vector-iterators.md: Add some iterators for vector BFloat16. * config/risc

[PATCH 2/3 v2] RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic

2024-06-27 Thread Feng Wang
v2: Rebase. According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic functions are added by this patch. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16_f): Add 'Zvfbfmin' intrinsic in bases. (class vfwcvtbf16_f): Ditto.

Re: Re: [PATCH] RISC-V: Support -m[no-]unaligned-access

2024-06-24 Thread Wang Pengcheng
riscv.opt: Add option alias. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/predef-align-10.c: New test. >> * gcc.target/riscv/predef-align-7.c: New test. >> * gcc.target/riscv/predef-align-8.c: New test. >> * gcc.target/riscv/predef-align-9.c: New test.

[PATCH] Always -lntdll for all cygming targets [PR113501]

2024-06-22 Thread Shengdun Wang
From: Shengdun Wang The mcf thread has already linked to -lntdll, and it's confirmed that even Windows 95 includes ntdll.dll. Additionally, if users do not use any functions from ntdll directly, the inclusion of -lntdll does not result in linking to it. Therefore, I propose making

[PATCH] libstdc++: Fix --disable-libstdcxx-verbose abi break [PR115585]

2024-06-22 Thread Shengdun Wang
__glibcxx_assert_fail is not defined when we disable the libstdcxx-verbose. This causes ABI break when a binary is compiled with verbose enabled. libstdc++-v3/ChangeLog: * src/c++11/assert_fail.cc: --- libstdc++-v3/src/c++11/assert_fail.cc | 13 + 1 file changed, 9 insertions

[PATCH] libstdc++: Fix --disable-libstdcxx-verbose abi break [PR115585]

2024-06-22 Thread Shengdun Wang
From: Shengdun Wang __glibcxx_assert_fail is not defined when we disable the libstdcxx-verbose. This causes ABI break when a binary is compiled with verbose enabled. libstdc++-v3/ChangeLog: * src/c++11/assert_fail.cc: --- libstdc++-v3/src/c++11/assert_fail.cc | 13 + 1

[PATCH] libstdc++: Fix --disable-libstdcxx-verbose abi break [PR115585]

2024-06-22 Thread Shengdun Wang
From: Shengdun Wang __glibcxx_assert_fail is not defined when we disable the libstdcxx-verbose. This causes ABI break when a binary is compiled with verbose enabled. libstdc++-v3/ChangeLog: * src/c++11/assert_fail.cc: --- libstdc++-v3/src/c++11/assert_fail.cc | 13 + 1

[PATCH 1/3] RISC-V: Add vector type of BFloat16 format

2024-06-20 Thread Feng Wang
The vector type of BFloat16 format is added in this patch, subsequent extensions to zvfbfmin and zvfwma need to be based on this patch. gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (bfloat16_type): Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX.

[PATCH 3/3] RISC-V: Add md files for vector BFloat16

2024-06-20 Thread Feng Wang
According to the BFloat16 spec, some vector iterators and new patterns are added in md files. All these changes passed the rvv test and rvv-intrinsic test for bfloat16. gcc/ChangeLog: * config/riscv/riscv.md: Add new insn name for vector BFloat16. * config/riscv/vector-iterators.m

[PATCH 2/3] RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic

2024-06-20 Thread Feng Wang
According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic functions are added by this patch. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16_f): Add 'Zvfbfmin' intrinsic in bases. (class vfwcvtbf16_f): Ditto. (class

[PATCH] i386: Fix some ISA bit test in option_override

2024-06-19 Thread Hongyu Wang
Hi, This patch adjusts several new feature checks in ix86_option_override_internal that directly use TARGET_* instead of TARGET_*_P (opts->ix86_isa_flags), which caused the command-line option to override the target_attribute isa flag. Bootstrapped && regtested on x86_64-pc-linux-gnu. Ok for trunk? gcc/ChangeLo

Re: [PATCH] Add targetm.have_ccmp hook [PR115370]

2024-06-13 Thread Hongyu Wang
Thanks, this is the patch I'm going to check in. Richard Sandiford wrote on Thu, Jun 13, 2024 at 17:04: > > Hongyu Wang writes: > > Hi, > > > > In cfgexpand, there is an optimization for branch which tests > > targetm.gen_ccmp_first == NULL. However for target like x86-64,

Re: [PATCH] [i386] restore recompute to override opts after change [PR113719]

2024-06-13 Thread Hongyu Wang
Sorry for breaking the original logic, and I really appreciate your patch!! It does make the logic clearer on top of opts and opts_set. I think the function name can be something like ix86_unroll_flag_adjust instead of ix86_override_options_after_change_1, like the previous 2 functions which declare th

Re: [PATCH] [APX CCMP] Use ctestcc when comparing to const 0

2024-06-12 Thread Hongyu Wang
> Perhaps the constraint can be slightly optimized to avoid repeating > (,) pairs. > > ",m," > "C ,," Yes, will check in with this change. Thanks! Uros Bizjak wrote on Thu, Jun 13, 2024 at 14:06: > > On Thu, Jun 13, 2024 at 3:44 AM Hongyu Wang wrote: > &

[PATCH] Add targetm.have_ccmp hook [PR115370]

2024-06-12 Thread Hongyu Wang
Hi, In cfgexpand, there is an optimization for branches which tests targetm.gen_ccmp_first == NULL. However, for targets like x86-64, the hook is implemented but that does not indicate that ccmp is enabled. Add a new target hook TARGET_HAVE_CCMP and replace the middle-end check for the existence of ge

Re: [PATCH] [APX CCMP] Use ctestcc when comparing to const 0

2024-06-12 Thread Hongyu Wang
Thanks for the advice, updated patch in attachment. Bootstrapped/regtested on x86-64-pc-linux-gnu. Ok for trunk? Uros Bizjak wrote on Wed, Jun 12, 2024 at 18:12: > > On Wed, Jun 12, 2024 at 12:00 PM Uros Bizjak wrote: > > > > On Wed, Jun 12, 2024 at 5:12 AM Hongyu Wang wro

[PATCH] [APX CCMP] Use ctestcc when comparing to const 0

2024-06-11 Thread Hongyu Wang
Hi, For CTEST, we don't have a conditional AND, so there's no optimization opportunity to write a new ctest pattern. Emit ctest when ccmp compares against const 0 to save bytes. Bootstrapped & regtested under x86-64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/i386/i386.md (@ccmp)

Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-06-06 Thread Hongyu Wang
ns first. The costs are not + meaningful for failed expansions. */ + + if (ret2 && (!ret || cost2 < cost1)) { *prep_seq = prep_seq_2; *gen_seq = gen_seq_2; -- 2.31.1 Richard Sandiford wrote on Wed, Jun 5, 2024 at 17:21: > > Hongyu Wang writes: > > CC'd R

[PATCH] [APX] Adjust target-support check [PR 115341]

2024-06-05 Thread Hongyu Wang
The current target apxf check does not specify the sub-features that the assembler supports, so the check with older binutils will fail at the assemble stage for new APX features like NF, CCMP or CFCMOV. Adjust the assembler check for the latest APX sub-features. Bootstrapped & regtested on x86-64-pc-linux-gnu with bin

Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-29 Thread Hongyu Wang
Gentle ping :) Hi Richard, Is it OK to adopt the ccmp change? Or do you know who can help to review this part? Thanks. Hongyu Wang wrote on Thu, May 23, 2024 at 16:27: > > Gentle ping for this :) > Hi Richard, Is it OK to adopt the ccmp change? Or do you know who can > help to review this pa

Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-23 Thread Hongyu Wang
Gentle ping for this :) Hi Richard, Is it OK to adopt the ccmp change? Or do you know who can help to review this part? Thanks. Hongyu Wang wrote on Wed, May 15, 2024 at 16:25: > > CC'd Richard for ccmp part as previously it is added only for aarch64. > The original logic will not interr

Re: [PATCH] i386: Fix ix86_option override after change [PR 113719]

2024-05-16 Thread Hongyu Wang
Richard Biener wrote on Thu, May 16, 2024 at 15:05: > > On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote: > > > > Hi, > > > > In ix86_override_options_after_change, calls to ix86_default_align > > and ix86_recompute_optlev_based_flags will cause mism

[PATCH] i386: Fix ix86_option override after change [PR 113719]

2024-05-15 Thread Hongyu Wang
Hi, In ix86_override_options_after_change, calls to ix86_default_align and ix86_recompute_optlev_based_flags will cause mismatched target opt_set when doing cl_optimization_restore. Move them back to ix86_option_override_internal to solve the issue. Bootstrapped & regtested on x86_64-pc-linux-gnu

Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-15 Thread Hongyu Wang
t cmp supports but ccmp not, so ret/ret2 will all be valid when comparing cost. Thanks in advance. Hongyu Wang wrote on Wed, May 15, 2024 at 16:22: > > For general ccmp scenario, the tree sequence is like > > _1 = (a < b) > _2 = (c < d) > _3 = _1 & _2 > > current ccmp expandin

[PATCH 1/3] [APX CCMP] Support APX CCMP

2024-05-15 Thread Hongyu Wang
The APX CCMP feature implements a conditional compare, which executes the compare when EFLAGS matches a certain condition. CCMP introduces a default flags value (dfv): when the conditional compare does not execute, it directly sets the flags according to dfv. The instruction goes like ccmpeq {dfv=sf,of,cf,zf}
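The behavior described above can be sketched as a tiny software model in C. This is only an illustration under stated assumptions: the `flags` struct tracks just ZF and CF rather than all the flags in dfv, and `cmp_u64` and `ccmp_model` are hypothetical names, not part of GCC or of the APX specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified flag state: only ZF and CF are modeled (assumption). */
struct flags {
    bool zf;
    bool cf;
};

/* Ordinary unsigned compare: ZF when equal, CF when a is below b. */
struct flags cmp_u64(uint64_t a, uint64_t b)
{
    struct flags f = { a == b, a < b };
    return f;
}

/* Conditional compare: if the condition on the incoming flags holds,
   perform the compare; otherwise load the flags from dfv. */
struct flags ccmp_model(bool cond_holds, uint64_t a, uint64_t b,
                        struct flags dfv)
{
    return cond_holds ? cmp_u64(a, b) : dfv;
}
```

The point of dfv in this model is that the flag consumer after the ccmp always sees a well-defined value, whether or not the compare ran.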

[PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-15 Thread Hongyu Wang
For the general ccmp scenario, the tree sequence is like _1 = (a < b) _2 = (c < d) _3 = _1 & _2 The current ccmp expansion will try to swap the compare order for _1 and _2, compare the cost/cost2 of comparing _1 or _2 first, then return the sequence with the lower cost. For x86 ccmp, we don't support FP com
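For reference, the tree sequence quoted above corresponds to C source of roughly the following shape. The function name `both_less` is hypothetical, and whether a compiler emits a ccmp sequence for it depends on the target and options in use, so this is only a sketch of the input pattern.

```c
/* Two independent comparisons combined with a non-short-circuit AND,
   matching _1 = (a < b), _2 = (c < d), _3 = _1 & _2. */
int both_less(int a, int b, int c, int d)
{
    int t1 = a < b;   /* _1 */
    int t2 = c < d;   /* _2 */
    return t1 & t2;   /* _3 */
}
```

Because neither comparison has side effects, either one may be evaluated first, which is what gives the expander room to pick the cheaper order.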

[PATCH 3/3] [APX CCMP] Support ccmp for float compare

2024-05-15 Thread Hongyu Wang
The ccmp insn itself doesn't support fp compare, but x86 has the fp comi insn that changes EFLAGS, which can be the scc input to ccmp. Allow scalar fp compare in ix86_gen_ccmp_first except ORDERED/UNORDERED compare, which cannot be identified in ccmp. gcc/ChangeLog: * config/i386/i386-expand.cc

[PATCH 0/3] Support Intel APX CCMP

2024-05-15 Thread Hongyu Wang
html Hongyu Wang (3): [APX CCMP] Support APX CCMP [APX CCMP] Adjust startegy for selecting ccmp candidates [APX CCMP] Support ccmp for float compare gcc/ccmp.cc| 12 +- gcc/config/i386/i386-expand.cc | 164 + gcc/config/i386/

[PATCH] Prohibit SHA/KEYLOCKER usage of EGPR when APX enabled

2024-04-09 Thread Hongyu Wang
The latest APX spec announced removal of SHA/KEYLOCKER evex promotion [1], which means the SHA/KEYLOCKER insn does not support EGPR when APX enabled. Update the corresponding constraints to their EGPR-disabled counterparts. Bootstrapped and regtested on x86-64-pc-linux-gnu. Ok for trunk? [1].htt

Re: [PATCH] x86: Properly implement AMX-TILE load/store intrinsics

2024-02-25 Thread Hongyu Wang
Thanks for fixing this! Didn't notice that the pointer conversion can cause this issue... Is it possible to use a local array like char a[64] = (char *)p __asm__ volatile ("ldtilecfg\t%X0" :: "m" (a))); If not, for the two patterns we can use "m" instead of "jm" as APX supports EGPR extension for

[PATCH v2] RISC-V: remove param riscv-vector-abi. [PR113538]

2024-01-25 Thread yanzhang . wang
From: Yanzhang Wang Also adjust some of the tests for scan-assembly. The behavior is the same as --param=riscv-vector-abi before. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_get_arg_info): Remove the flag. (riscv_fntype_abi): Ditto. * config/riscv/riscv.opt: Ditto

[PATCH] RISC-V: remove param riscv-vector-abi. [PR113538]

2024-01-25 Thread yanzhang . wang
From: Yanzhang Wang Also adjust some of the tests for scan-assembly. The behavior is the same as --param=riscv-vector-abi before. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_get_arg_info): Remove the flag. (riscv_fntype_abi): Ditto. * config/riscv/riscv.opt: Ditto

[PATCH] RISC-V: remove param riscv-vector-abi. [PR113538]

2024-01-24 Thread yanzhang . wang
From: Yanzhang Wang Ran a full test to adjust some of the tests for scan-assembly. The behavior is the same as --param=riscv-vector-abi before. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_get_arg_info): Remove the flag. (riscv_fntype_abi): Ditto. * config/riscv

Re: [wwwdocs][PATCH] gcc-14/changes: Update APX inline asm behavior for x86_64

2024-01-15 Thread Hongyu Wang
I'm going to check this in if there is no objection. Hongyu Wang wrote on Tue, Jan 9, 2024 at 15:14: > > Hi, > > This patch adds missing description for inline asm behavior and related > compiler switch for APX. > > Ok for gcc-wwwdocs? > > --- > htdocs/gcc-14/changes.html | 6 +++

[PATCH 2/2] RISC-V: delete vector abi checking in all relevant tests.

2024-01-14 Thread yanzhang . wang
From: Yanzhang Wang gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: Delete the -Wno-psabi. * gcc.target/riscv/rvv/base/abi-call-args-1.c: Ditto. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: Ditto. * gcc.target/riscv/rvv

[PATCH 1/2] RISC-V: delete all the vector psabi checking.

2024-01-14 Thread yanzhang . wang
From: Yanzhang Wang Thanks to https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/389, we no longer need to maintain the psabi checking. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_arg_has_vector): Delete. (riscv_pass_in_vector_p): Delete

Re: Re: [PATCH] RISC-V: Modify ABI-name length of vfloat16m8_t

2024-01-11 Thread Feng Wang
Committed, thanks. From: juzhe.zh...@rivai.ai Date: 2024-01-12 09:38 To: wangfeng; gcc-patches CC: kito.cheng; jeffreyalaw; wangfeng Subject: Re: [PATCH] RISC-V: Modify ABI-name length of vfloat16m8_t Good catch. LGTM. juzhe.zh...@rivai.ai From: Feng Wang Date: 2024-01-12 09:35 To: gcc

[PATCH] RISC-V: Modify ABI-name length of vfloat16m8_t

2024-01-11 Thread Feng Wang
The length of vfloat16m8_t ABI-name should be 17. gcc/ChangeLog: * config/riscv/riscv-vector-builtins.def (vfloat16m8_t):Modify ABI-name length of vfloat16m8_t --- gcc/config/riscv/riscv-vector-builtins.def | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/r

Re: [PATCH] i386: [APX] Document inline asm behavior and new switch for APX

2024-01-10 Thread Hongyu Wang
Thanks, this is the patch I'm going to check in. Hongtao Liu wrote on Wed, Jan 10, 2024 at 16:02: > > On Tue, Jan 9, 2024 at 3:09 PM Hongyu Wang wrote: > > > > Hi, > > > > For APX, the inline asm behavior was not mentioned in any document > > before. Add description

[wwwdocs][PATCH] gcc-14/changes: Update APX inline asm behavior for x86_64

2024-01-08 Thread Hongyu Wang
Hi, This patch adds missing description for inline asm behavior and related compiler switch for APX. Ok for gcc-wwwdocs? --- htdocs/gcc-14/changes.html | 6 ++ 1 file changed, 6 insertions(+) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index e3a68998..73a90d30 1006

[PATCH] i386: [APX] Document inline asm behavior and new switch for APX

2024-01-08 Thread Hongyu Wang
Hi, For APX, the inline asm behavior was not mentioned in any document before. Add description for it. Ok for trunk? gcc/ChangeLog: * config/i386/i386.opt: Adjust document. * doc/invoke.texi: Add description for -mapx-inline-asm-use-gpr32. --- gcc/config/i386/i386.opt |

Reply: Re: [PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-08 Thread Feng Wang
Committed, thanks Juzhe. From: 钟居哲 Sent: 2024-01-09 07:02 To: wangfeng; gcc-patches Cc: kito.cheng; Jeff Law; wangfeng Subject: Re: [PATCH v7 1/2] RISC-V: Add crypto vector builtin function. LGTM. juzhe.zh...@rivai.ai From: Feng Wang Date: 2024-01-08 17:12 To: gcc-patches CC: kito.cheng

Reply: Re: [PATCH v8 2/2] RISC-V: Add crypto vector api-testing cases.

2024-01-08 Thread Feng Wang
Committed, thanks Juzhe. From: 钟居哲 Sent: 2024-01-09 07:02 To: wangfeng; gcc-patches Cc: kito.cheng; Jeff Law; wangfeng Subject: Re: [PATCH v8 2/2] RISC-V: Add crypto vector api-testing cases. LGTM. juzhe.zh...@rivai.ai From: Feng Wang Date: 2024-01-08 17:12 To: gcc-patches CC: kito.cheng

[PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-08 Thread Feng Wang
Patch v7:Resubmit after fixing the rtl-checking issue. Passed all the riscv regression test. Patch v6:Remove unused code. Patch v5:Rebase. Patch v4:Merge crypto vector function.def into vector. Patch v3:Define a shape for vaesz and merge vector-crypto-types.def into riscv-vector-builtins-types.d

[PATCH v8 2/2] RISC-V: Add crypto vector api-testing cases.

2024-01-08 Thread Feng Wang
Patch v8: Resubmit after fixing the rtl-checking issue. Passed all the riscv regression test. Patch v7: Add newline at the end of file. Patch v6: Move intrinsic tests into rvv/base. Patch v5: Rebase Patch v4: Add some RV32 vx constraint testcase. Patch v3: Refine crypto vector api-testing cases. Patc

[PATCH] i386: [APX] Add missing document for APX

2024-01-07 Thread Hongyu Wang
Hi, The supported sub-features for APX were missing from the option documentation and the target attribute section. Add the missing ones. Ok for trunk? gcc/ChangeLog: * config/i386/i386.opt: Add supported sub-features. * doc/extend.texi: Add description for target attribute. --- gcc/config/i

[committed] RISC-V: Fix avl-type operand index error for ZVBC

2024-01-07 Thread Feng Wang
This patch fixes the rtl-checking error for crypto vector. The root cause is that the avl-type index of the zvbc insns is wrong; it should be operand[8], not operand[5]. gcc/ChangeLog: * config/riscv/vector.md: Modify avl_type operand index of zvbc ins. --- gcc/config/riscv/vector.md | 4 ++-- 1 file ch

[PATCH] RISC-V: Fix avl-type operand index error for ZVBC

2024-01-05 Thread Feng Wang
This patch fixes the rtl-checking error for crypto vector. The root cause is that the avl-type index of the zvbc insns is wrong; it should be operand[8], not operand[5]. gcc/ChangeLog: * config/riscv/vector.md: Modify avl_type operand index of zvbc ins. --- gcc/config/riscv/vector.md | 4 ++-- 1 file ch

Re: Re: [PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread Feng Wang
s >of vclmul and vclmulh instructions". > > > >juzhe.zh...@rivai.ai > OK. Will separate it. >From: Feng Wang >Date: 2024-01-05 16:51 >To: gcc-patches >CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang >Subject: [PATCH v7 1/2] RISC-

[PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread Feng Wang
Patch v7: Fix avl_type operand index of zvbc ins. Patch v6: Remove unused code. Patch v5: Rebase. Patch v4: Merge crypto vector function.def into vector. Patch v3: Define a shape for vaesz and merge vector-crypto-types.def into riscv-vector-builtins-types.def. Patch v2: Optimize function_shape c

Re: Re: [committed] RISC-V: Add crypto vector builtin function.

2024-01-04 Thread Feng Wang
; Kito.cheng Subject: Re: Re: [committed] RISC-V: Add crypto vector builtin function. We (Kito and I) have reviewed vector-crypto. I believe Wang Feng ran and passed the regression tests (without RTL checking); I guess he just didn't enable RTL checking. (By default, RTL check is disabled

[committed] RISC-V: Add crypto vector api-testing cases.

2024-01-04 Thread Feng Wang
This patch adds crypto vector api-testing cases based on https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvbb-intrinsic.c: New test. * gcc.target/riscv/rvv/base/zvbb_vandn_vx

[committed] RISC-V: Add crypto vector builtin function.

2024-01-04 Thread Feng Wang
This patch adds the intrinsic functions of crypto vector based on the intrinsic doc (https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto/intrinsic_funcs.md). Co-Authored by: Songhe Zhu Co-Authored by: Ciyan Pan gcc/ChangeLog: * config/ri

[PATCH v7 2/2] RISC-V: Add crypto vector api-testing cases.

2024-01-02 Thread Feng Wang
Patch v7: Add newline at the end of file. Patch v6: Move intrinsic tests into rvv/base. Patch v5: Rebase. Patch v4: Add some RV32 vx constraint testcases. Patch v3: Refine crypto vector api-testing cases. Patch v2: Update march info according to the change of riscv-common.c. This patch adds crypto vec

[PATCH v6 2/2] RISC-V: Add crypto vector api-testing cases.

2024-01-02 Thread Feng Wang
Patch v6: Move intrinsic tests into rvv/base. Patch v5: Rebase. Patch v4: Add some RV32 vx constraint testcases. Patch v3: Refine crypto vector api-testing cases. Patch v2: Update march info according to the change of riscv-common.c. This patch adds crypto vector api-testing cases based on https://git

Re: Re: [committed] RISC-V: Modify copyright year of vector-crypto.md

2024-01-02 Thread Feng Wang
2024-01-03 00:32 Jeff Law wrote: > > >On 1/1/24 19:25, Feng Wang wrote: >> gcc/ChangeLog: >> * config/riscv/vector-crypto.md: Modify copyright year. >> --- >>   gcc/config/riscv/vector-crypto.md | 2 +- >>   1 file changed, 1 insertion(+),

[PATCH v6 1/2] RISC-V: Add crypto vector builtin function.

2024-01-02 Thread Feng Wang
Patch v6: Remove unused code. Patch v5: Rebase. Patch v4: Merge crypto vector function.def into vector. Patch v3: Define a shape for vaesz and merge vector-crypto-types.def into riscv-vector-builtins-types.def. Patch v2: Optimize function_shape class for crypto_vector. This patch adds the intri

Re: Re: [PATCH v5 1/2] RISC-V: Add crypto vector builtin function.

2024-01-02 Thread Feng Wang
int (*avail) (void); >+}; > >What is this used for ? Will delete it. > > >juzhe.zh...@rivai.ai > >From: Feng Wang >Date: 2024-01-02 15:47 >To: gcc-patches >CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang >Subject: [PATCH v5 1/2] RISC-V: Add crypto vector

[PATCH v5 2/2] RISC-V: Add crypto vector api-testing cases.

2024-01-01 Thread Feng Wang
Patch v5: Rebase. Patch v4: Add some RV32 vx constraint testcases. Patch v3: Refine crypto vector api-testing cases. Patch v2: Update march info according to the change of riscv-common.c. This patch adds crypto vector api-testing cases based on https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob

[PATCH v5 1/2] RISC-V: Add crypto vector builtin function.

2024-01-01 Thread Feng Wang
Patch v5: Rebase. Patch v4: Merge crypto vector function.def into vector. Patch v3: Define a shape for vaesz and merge vector-crypto-types.def into riscv-vector-builtins-types.def. Patch v2: Optimize function_shape class for crypto_vector. This patch adds the intrinsic functions of crypto vecto

[committed] RISC-V: Modify copyright year of vector-crypto.md

2024-01-01 Thread Feng Wang
gcc/ChangeLog: * config/riscv/vector-crypto.md: Modify copyright year. --- gcc/config/riscv/vector-crypto.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/riscv/vector-crypto.md b/gcc/config/riscv/vector-crypto.md index e40b1543954..9625014e45e 100755 ---

[committed] RISC-V: Add crypto machine descriptions

2024-01-01 Thread Feng Wang
Co-Authored by: Songhe Zhu Co-Authored by: Ciyan Pan gcc/ChangeLog: * config/riscv/iterators.md: Add rotate insn name. * config/riscv/riscv.md: Add new insns name for crypto vector. * config/riscv/vector-iterators.md: Add new iterators for crypto vector. * config/
