[PATCH] [x86] Use x instead of v for alternative 2 (v, BH) in mov<mode>_internal.

2023-06-13 Thread liuhongt via Gcc-patches
Since there's no evex version for vpcmpeq ymm, ymm, ymm. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk and backport to GCC13. gcc/ChangeLog: PR target/110227 * config/i386/sse.md (mov<mode>_internal): Use x instead of v for alter

[r14-1805 Regression] FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 45) on Linux/x86_64

2023-06-14 Thread haochen.jiang via Gcc-patches
t for excess errors) FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 45) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-1805/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with

Re: [PATCHv3, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-06-14 Thread Kewen.Lin via Gcc-patches
apped and tested on powerpc64-linux BE and LE with no regressions. > > Thanks > Gui Haochen > > ChangeLog > 2023-05-26 Haochen Gui > > gcc/ > PR target/104124 > * config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename > to... > (

Re: [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/6/15 04:37, Carl Love wrote: > Kewen, GCC maintainers: > > Version 4, added missing cases for new xxexpqp, xsxexpdp and xsxsigqp > cases to rs6000_expand_builtin. Merged the new define_insn definitions > with the existing definitions. Renamed the builtins

PING^3 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen > >> >> on 2022/11/24 17:15, Kewen Lin wrote: >>> Hi, >>> >>> Following Segher's suggestion, this patch series is to rework >>>

PING^2 [PATCH v2] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gentle ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html BR, Kewen > on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> By addressing Alexander's comments, against v1 this >> patch v2 mainly

PING^1 [PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html BR, Kewen on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote: > Hi, > > As Honza pointed out in [1], the current uses of function > optimize_function_for_speed_p in rs6000_option_override_in

Re: [PATCH 4/4] rs6000: build constant via li/lis;rldic

2023-06-15 Thread guojiufu via Gcc-patches
On 2023-06-13 17:18, Jiufu Guo via Gcc-patches wrote: Hi David, Thanks for your valuable comments! David Edelsohn writes: ... Do you have any measurement of how expensive it is to test all of these additional methods to generate a constant? How much does this affect the compile time

[PATCH 1/2] Reimplement packuswb/packusdw with UNSPEC_US_TRUNCATE instead of original us_truncate.

2023-06-15 Thread liuhongt via Gcc-patches
-2.c execution test FAIL: gcc.target/i386/sse2-packuswb-1.c execution test Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: PR target/110235 * config/i386/i386-expand.cc (ix86_split_mmx_pack): Use UNSPEC_US_TRUNCATE instead of original

[PATCH 2/2] Refined 256/512-bit vpacksswb/vpackssdw patterns.

2023-06-15 Thread liuhongt via Gcc-patches
test. Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: PR target/110235 * config/i386/sse.md (_packsswb): Split to below 3 new define_insns. (sse2_packsswb): New define_insn. (avx2_packsswb): Ditto. (avx512bw_pac

Re: Re: [PATCH V5] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread juzhe.zhong--- via Gcc-patches
Thanks Richard. I have addressed all comments on V7 patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624220.html Drop vlse/vsse codegen optimization in RISC-V backend, instead I will support LEN_MASK_STRIDED_LOAD/LEN_MASK_STRIDE_STORE in the future. Thanks. juzhe.zh...@rivai.ai

[IRA] Skip empty register classes in setup_reg_class_relations

2023-07-12 Thread SenthilKumar.Selvaraj--- via Gcc-patches
_p (, ) is always true, so ira_reg_class_subset[ALL_REGS][NO_REGS] ends up being set to cl3 = NO_LD_REGS. Adding a continue if hard_reg_set_empty_p (temp_hard_regset) fixes the problem for me. Does the below patch look ok? Bootstrapping and regression testing passed on x86_64. Regards Se

[r14-2407 Regression] FAIL: g++.dg/vect/pr110557.cc -std=c++98 (test for excess errors) on Linux/x86_64

2023-07-12 Thread haochen.jiang via Gcc-patches
.cc -std=c++14 (test for excess errors) FAIL: g++.dg/vect/pr110557.cc -std=c++17 (test for excess errors) FAIL: g++.dg/vect/pr110557.cc -std=c++20 (test for excess errors) FAIL: g++.dg/vect/pr110557.cc -std=c++98 (test for excess errors) with GCC configured with ../../gcc/configure --prefix

[PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-07-12 Thread yanzhang.wang--- via Gcc-patches
From: Yanzhang Wang gcc/ChangeLog: * config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf when enabling -mno-omit-leaf-frame-pointer (riscv_option_override): Override omit-frame-pointer. (riscv_frame_pointer_required): Save s0 for non-leaf function

Re: [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/12 02:06, Carl Love wrote: > GCC maintainers: > > Ver 4, Removed extra space in subject line. Added comment to commit > log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define. Changed > Added to Add and Renamed to Rename in ChangeLog. Update

Re: [PATCH ver 3] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/8 04:18, Carl Love wrote: > > GCC maintainers: > > Version 3, added code to altivec_resolve_overloaded_builtin so the > correct instruction is selected for the size of the second argument. > This restores the instruction counts to the original values w

[PATCH] Implement Bit-field lowering

2023-07-13 Thread naveenh--- via Gcc-patches
From: Naveen H S This patch adds lowering bit-field and opposite endian accesses pass. The patch addresses many issues in:- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466 2023-07-14 Andrew Pinski Co-authored-by: Naveen H S gcc/ChangeLog: * Makefile.in (OBJS): Add gimple-lower

[PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-16 Thread Kewen.Lin via Gcc-patches
ported variable 'new_temp' as NULL_TREE. Confirmed this patch fixed the reported issue in PR110652 (with the same configuration). Is it ok for trunk? BR, Kewen - PR tree-optimization/110652 gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Initialize new_

[PATCH] Remove # from one_cmpl2 assemble output.

2023-07-17 Thread liuhongt via Gcc-patches
-gnu{-m32,} Ready to push to trunk. libpng/pngread.c: In function ‘png_read_image’: libpng/pngread.c:786:1: internal compiler error: in final_scan_insn_1, at final.cc:2813 786 | } | ^ 0x73ac3d final_scan_insn_1 ../../gcc/final.cc:2813 0xb3420b final_scan_insn(rtx_insn*, _IO_FILE*, int

Re: [PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-17 Thread Kewen.Lin via Gcc-patches
on 2023/7/17 14:39, Richard Biener wrote: > On Mon, Jul 17, 2023 at 4:22 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR110652 and its duplicate PRs show, there could be one >> build error >> >> error: 'new_temp' may be used uninitialized >> >> for some build configurations. It's a false positive war

[PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-07-18 Thread yanzhang.wang--- via Gcc-patches
From: Yanzhang Wang gcc/ChangeLog: * config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf when enabling -mno-omit-leaf-frame-pointer (riscv_option_override): Override omit-frame-pointer. (riscv_frame_pointer_required): Save s0 for non-leaf function

Re: [PATCH, rs6000] Generate mfvsrwz for all platforms and remove redundant zero extend [PR106769]

2023-07-18 Thread Kewen.Lin via Gcc-patches
vsrwz has lower latency than xxextractuw. So it should be generated Nice, it also has lower latency than vextuw[lr]x. > even with p9 vector enabled if possible. Also the instruction is > already zero extended. A combine pattern is needed to eliminate > redundant zero extend instruct

Re: rs6000: Fix expected counts powerpc/p9-vec-length-full

2023-07-18 Thread Kewen.Lin via Gcc-patches
needs some justification why it changes like that and the change is expected. BR, Kewen on 2023/7/18 23:39, Carl Love wrote: > Ping > > On Thu, 2023-06-01 at 16:11 -0700, Carl Love wrote: >> GCC maintainers: >> >> The following patch updates the expected instruction coun

Re: [PATCH v7, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Any recommendations? Thanks a lot. Sorry for the late review, this patch is okay for trunk with the below nit tweaked or not. Thanks! > > ChangeLog > 2022-09-26 Haochen Gui > > gcc/ > PR target/103605 > * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fol

Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-07-18 Thread Kewen.Lin via Gcc-patches
it is > not > PCREL and also when the user explicitly requests TOC or fixed. If the register > r2 is fixed, it is made as non-volatile. Changes in register preservation > roles > can be accomplished with the help of available target hooks > (TARGET_CONDITIONAL_REGISTER_USAGE).

Re: PING^2 [PATCH] Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]

2023-07-19 Thread Kewen.Lin via Gcc-patches
Hi Fangrui, on 2023/7/19 14:33, Fangrui Song wrote: > On Thu, Nov 24, 2022 at 7:26 PM Kewen.Lin via Gcc-patches > wrote: >> >> Hi Richard, >> >> on 2022/11/23 00:08, Richard Sandiford wrote: >>> "Kewen.Lin" writes: >>>> Hi Richard, >

[PATCH V2] rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

2023-07-19 Thread jeevitha via Gcc-patches
and movxo pattern to disallow these types of addresses, which assists LRA in resolving this issue. Furthermore, the mode size 16 check has been removed in vsx_quad_dform_memory_operand to allow OOmode and quad_address_p already handles less than size 16. 2023-07-19 Jeevitha Palanisamy gcc

[PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2023-07-19 Thread jeevitha via Gcc-patches
Palanisamy gcc/ PR target/110411 * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields to hold PTImode type. * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node for PTImode type. gcc/testsuite/ PR target/106895

[PATCH] Fix fp16 related testcase failure for i686.

2023-07-19 Thread liuhongt via Gcc-patches
utable > +FAIL: gcc.target/i386/float16-7.c (test for errors, line 7) > > Perhaps we need to tweak > gcc/testsuite/lib/target-supports.exp (add_options_for_float16) > so that it adds -msse2 for i?86-*-* x86_64-*-* (that would likely > fix up floatn-convert) and for the others

[PATCH] Optimize vlddqu to vmovdqu for TARGET_AVX

2023-07-20 Thread liuhongt via Gcc-patches
rapped and regtested on x86_64-pc-linux-gnu{-m32,}. If AMD also like such optimization, Ok for trunk? gcc/ChangeLog: * config/i386/sse.md (_lddqu): Change to define_expand, expand as simple move when TARGET_AVX && ( == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD).

[PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
the function_section. As Fangrui suggested[1], this patch is to add a bit more test coverage. I didn't find a good way to check all linked_to symbols are different, so I checked for LPFE[012] here. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html Tested well on x86_64-r

[PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
110744. This patch is to fix the related handlings with the correct index. Bootstrapped and regress-tested on x86_64-redhat-linux, powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10. Is it ok for trunk? BR, Kewen - PR tree-optimization/110744 gcc/ChangeLog: * tree-ssa

[r14-2629 Regression] FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++20 (test for excess errors) on Linux/x86_64

2023-07-20 Thread haochen.jiang via Gcc-patches
: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++17 (test for excess errors) FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++20 (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2629/usr --enable-clocale=gnu

[r14-2639 Regression] FAIL: gcc.dg/vect/bb-slp-pr95839-v8.c scan-tree-dump slp2 "optimized: basic block" on Linux/x86_64

2023-07-20 Thread haochen.jiang via Gcc-patches
-pr95839-v8.c -flto -ffat-lto-objects scan-tree-dump slp2 "optimized: basic block" FAIL: gcc.dg/vect/bb-slp-pr95839-v8.c scan-tree-dump slp2 "optimized: basic block" with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r1

Re: [PATCH 1/2] rs6000, add argument to function find_instance

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:19, Carl Love wrote: > > GCC maintainers: > > The rs6000 function find_instance assumes that it is called for built- > ins with only two arguments. There is no checking for the actual > number of arguments used in the built-in. This patch ad

Re: [PATCH 2/2 ver 4] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:20, Carl Love wrote: > GCC maintainers: > > Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case statement > rs6000/rs6000-c.cc. The existing REPLACE_ELT iterator name was changed > to REPLACE_ELT_V along with the associated define_mode_attr. Rena

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
way to check all linked_to >> symbols are different, so I checked for LPFE[012] here. >> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html >> >> Tested well on x86_64-redhat-linux, powerpc64-linux-gnu >> P7/P8/P9 and powerpc64le-linux-gnu

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
-linux, >> powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10. >> >> Is it ok for trunk? >> >> BR, >> Kewen >> - >> PR tree-optimization/110744 >> >> gcc/ChangeLog: >> >> * tree-ssa-sccvn.cc (vn_reference_lookup_3): Correct the index of bias >> operand for ifn IFN_LEN_STORE. > > OK, thanks. > Thanks Richard! Pushed as r14-2694. BR, Kewen

[PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-20 Thread Kewen.Lin via Gcc-patches
cc.dg/vect/slp-perm-{1,5,6,7}.c Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu, powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10. Is it ok for trunk? BR, Kewen - PR tree-optimization/110740 gcc/ChangeLog: * tree-vect-lo

[r14-2655 Regression] FAIL: g++.dg/gomp/pr58567.C -std=c++98 (test for excess errors) on Linux/x86_64

2023-07-23 Thread haochen.jiang via Gcc-patches
(test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++17 (test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++20 (test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++98 (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj

[r14-2709 Regression] FAIL: gcc.target/i386/pr93089-3.c scan-assembler vmulps[^\n\r]*zmm on Linux/x86_64

2023-07-23 Thread haochen.jiang via Gcc-patches
]*zmm FAIL: gcc.target/i386/pr93089-3.c scan-assembler vmulps[^\n\r]*zmm with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2709/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c

Re: [PATCH 4/9] vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Iain, on 2023/7/22 23:58, Iain Sandoe wrote: > Hi Kewen, > > This patch breaks bootstrap on powerpc-darwin (which has Altivec, but not > VSX) while building libgfortran. > >> On 3 Jul 2023, at 04:19, Kewen.Lin via Gcc-patches >> wrote: > > Please

Re: [PATCH 1/2 ver 2] rs6000, add argument to function find_instance

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 2: Updated a number of formatting and spacing issues. Added > the NARGS description to the header comment for function find_instance. > This patch was tested on Power 8 LE/BE, Power 9 LE/BE and Power

Re: [PATCH 2/2 ver 5] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 5, Fixed patch description, the first argument should be of > type vector. Fixed comment in vsx.md to say "Vector and scalar > extract_elt iterator/attr ". Removed a few of t

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-23 Thread Kewen.Lin via Gcc-patches
> which can help eliminate redundant zero extend. > > Compared to last version, the main change is to add a new expand for V4SI > and separate "vsx_extract_si" to 2 insn patterns. > https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html > > Bootstrap

Re: [PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-23 Thread Kewen.Lin via Gcc-patches
on 2023/7/21 19:49, Richard Biener wrote: > On Fri, Jul 21, 2023 at 8:08 AM Kewen.Lin wrote: >> >> Hi, >> >> The function vect_update_epilogue_niters which has been >> removed by r14-2281 has some code taking care of that if >> there is only one scalar iteration left for epilogue then >> we won't

[r14-2754 Regression] FAIL: gfortran.dg/gomp/pr99226.f90 -O (test for excess errors) on Linux/x86_64

2023-07-24 Thread haochen.jiang via Gcc-patches
AIL: gfortran.dg/gomp/pr99226.f90 -O (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2754/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fort

[PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-25 Thread Kewen.Lin via Gcc-patches
9/P10. Is it ok for trunk? BR, Kewen - Co-authored-by: Richard Biener PR tree-optimization/110776 gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Always cost VMAT_ELEMENTWISE as scalar load. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr110776.c

[PATCH] rs6000: Correct vsx operands output for xxeval [PR110741]

2023-07-25 Thread Kewen.Lin via Gcc-patches
nu P9/P10. I'll push this soon and backport to release branches after a week or so. BR, Kewen - PR target/110741 gcc/ChangeLog: * config/rs6000/vsx.md (define_insn xxeval): Correct vsx operands output with "x". gcc/testsuite/ChangeLog:

Re: [PATCH] Fix typo in insn name.

2023-07-25 Thread Kewen.Lin via Gcc-patches
, IBM 128-bit long double > * Power9, LE, --with-cpu=power9, IEEE 128-bit long double > * Power9, LE, --with-cpu=power9, 64-bit default long double > * Power9, BE, --with-cpu=power9, IBM 128-bit long double > * Power8, BE, --with-cpu=power8, IBM 128-bit long d

Re: [PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-26 Thread Kewen.Lin via Gcc-patches
on 2023/7/26 18:02, Richard Biener wrote: > On Wed, Jul 26, 2023 at 4:52 AM Kewen.Lin wrote: >> >> Hi, >> >> PR110776 exposes one issue that we could query unaligned >> load for vector type but actually no unaligned vector load >> is supported there. The reason is that the costed load is >> with

[r14-2786 Regression] FAIL: g++.target/i386/pr98218-1.C -std=gnu++98 scan-assembler-times cmpltps 3 on Linux/x86_64

2023-07-27 Thread haochen.jiang via Gcc-patches
cmpltps 3 FAIL: g++.target/i386/pr98218-1.C -std=gnu++98 scan-assembler-times pcmpgtd 2 with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2786/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable

[PATCH] [x86] Add UNSPEC_MASKOP to vpbroadcastm pattern.

2023-07-27 Thread liuhongt via Gcc-patches
Prevent rtl optimization of vec_duplicate + zero_extend to vpbroadcastm since there could be an extra kmov after RA. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} Ready to push to trunk. gcc/ChangeLog: PR target/110788 * config/i386/sse.md (avx512cd_maskb_vec_dup

[r14-2797 Regression] FAIL: 23_containers/vector/bool/110807.cc (test for excess errors) on Linux/x86_64

2023-07-27 Thread haochen.jiang via Gcc-patches
/vector/bool/110807.cc (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2797/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet

Re: [PATCH] Optimize vec_splats of vec_extract for V2DI/V2DF (PR target/99293)

2023-07-28 Thread Kewen.Lin via Gcc-patches
; > vector long long > splat_dup_l_0 (vector long long v) > { > return __builtin_vec_splats (__builtin_vec_extract (v, 0)); > } > > would generate: > > mfvsrld 9,34 > mtvsrdd 34,9,9 > blr > > With this patch, GC

Re: [PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-07-28 Thread Kewen.Lin via Gcc-patches
't need a redundant vector extraction at all. > is a memory operand. Only one 'stxsi[hb]x' instruction is enough. > > The V4SImode is fixed in a previous patch. > https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html > > Bootstrapped and tested on powe

[PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-28 Thread yanzhang.wang--- via Gcc-patches
below vmv. vsetvli zero,a2,e32,m1,ta,ma vmv.v.i v1,0 vs1r.v v1,0(a0) It will eliminate the mul with const 0 instruction to the simple mov instruction. Signed-off-by: Yanzhang Wang gcc/ChangeLog: * config/riscv/autovec-opt.md: Add a split pattern. gcc/testsuite/ChangeLog

[r14-2834 Regression] FAIL: gcc.target/i386/pr87007-5.c scan-assembler-times vxorps[^\n\r]*xmm[0-9] 1 on Linux/x86_64

2023-07-28 Thread haochen.jiang via Gcc-patches
[^\n\r]*xmm[0-9] 1 FAIL: gcc.target/i386/pr87007-5.c scan-assembler-times vxorps[^\n\r]*xmm[0-9] 1 with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2834/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-30 Thread Kewen.Lin via Gcc-patches
srwz > which helps eliminate redundant zero extend. > > Compared to last version, the main change is to move "vsx_extract_v4si_w1" > and "*mfvsrwz" to the front of "*vsx_extract__di_p9". Also some insn > conditions are changed to assertions. > https://gc

Re: [PATCH] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-07-30 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/28 23:00, Carl Love wrote: > GCC maintainers: > > The following patch cleans up the definition for the > __builtin_altivec_vcmpnet. The current implementation implies that the s/__builtin_altivec_vcmpnet/__builtin_altivec_vcmpne[bhw]/ > built-in is only suppo

[PATCH] Adjust testcase for more optimal codegen.

2023-07-31 Thread liuhongt via Gcc-patches
vpextrd $3, %xmm0, %eax vmovddup %xmm3, %xmm0 vrndscalepd $9, %xmm0, %xmm0 vunpckhpd %xmm0, %xmm0, %xmm3 for vrndscalepd, no need to insert pxor since it reuses input register xmm0 to avoid partial sse dependence. Pushed to trunk. gcc/testsuite/ChangeLog: * gcc.target/i386

[PATCH] Support vec_fmaddsub/vec_fmsubadd for vector HFmode.

2023-08-01 Thread liuhongt via Gcc-patches
AVX512FP16 supports vfmaddsubXXXph and vfmsubaddXXXph. Also remove scalar mode from fmaddsub/fmsubadd pattern since there's no scalar instruction for that. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk. gcc/ChangeLog: PR target/81904 * c

[PATCH] Optimize vlddqu + inserti128 to vbroadcasti128

2023-08-01 Thread liuhongt via Gcc-patches
to load port compared to vbroadcasti128. From a latency perspective, vbroadcasti is no worse than vlddqu + vinserti128. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625122.html Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: * config/i386/sse.md (*avx

[r13-1357 Regression] FAIL: g++.dg/warn/Warray-bounds-16.C -std=gnu++98 pr102690 (test for bogus messages, line 22) on Linux/x86_64

2022-06-30 Thread skpandey--- via Gcc-patches
-std=gnu++98 pr102690 (test for bogus messages, line 22) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1357/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for the insightful comments! on 2022/7/1 16:40, Richard Biener wrote: > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: >> >> Hi, >> >> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html >> >> BR, >> Kew

[r13-1395 Regression] FAIL: gfortran.dg/check_bits_2.f90 -O1 output pattern test on Linux/x86_64

2022-07-01 Thread skpandey--- via Gcc-patches
.f90 -O1 output pattern test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1395/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl

[r13-1405 Regression] FAIL: gcc.dg/analyzer/allocation-size-4.c warning at line 31 (test for warnings, line 28) on Linux/x86_64

2022-07-02 Thread skpandey--- via Gcc-patches
at line 32 (test for warnings, line 28) FAIL: gcc.dg/analyzer/allocation-size-4.c note at line 33 (test for warnings, line 28) FAIL: gcc.dg/analyzer/allocation-size-4.c warning at line 31 (test for warnings, line 28) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork

[r13-1450 Regression] FAIL: 25_algorithms/find_end/constrained.cc (test for excess errors) on Linux/x86_64

2022-07-04 Thread skpandey--- via Gcc-patches
(test for excess errors) FAIL: gcc.dg/auto-init-uninit-4.c (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1450/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable

[r13-1460 Regression] FAIL: gcc.dg/tree-ssa/alias-access-path-13.c scan-tree-dump-times fre1 "return 123" 1 on Linux/x86_64

2022-07-04 Thread skpandey--- via Gcc-patches
-access-path-13.c scan-tree-dump-times fre1 "return 123" 1 with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1460/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c

[r13-1509 Regression] FAIL: gcc.target/i386/pr65105-5.c scan-assembler ptest on Linux/x86_64

2022-07-05 Thread skpandey--- via Gcc-patches
.c scan-assembler pandn FAIL: gcc.target/i386/pr65105-5.c scan-assembler ptest with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1509/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable

[PATCH] rs6000: Preserve REG_EH_REGION when replacing load/store [PR106091]

2022-07-07 Thread Kewen.Lin via Gcc-patches
f it looks good to you. Thanks! - PR target/106091 gcc/ChangeLog: * config/rs6000/rs6000-p8swap.cc (replace_swapped_aligned_store): Copy REG_EH_REGION when replacing one store insn having it. (replace_swapped_aligned_load): Likewise. gcc/t

Re: [PATCH] rs6000: Preserve REG_EH_REGION when replacing load/store [PR106091]

2022-07-07 Thread Kewen.Lin via Gcc-patches
the question, I'm not sure :(, when I was drafting this patch, I wondered if there is one function passing/copying reg_note REG_EH_REGION for this kind of need, so I went through almost all the places related to REG_EH_REGION, but nothing desired was found (though I may miss sth

[PATCH] Fix tree-opt/PR106087: ICE with inline-asm with multiple output and assigned only static vars

2022-07-07 Thread apinski--- via Gcc-patches
the statement was only defining one ssa name. OK? Bootstrapped and tested on x86_64 with no regressions. PR tree-optimization/106087 gcc/ChangeLog: * tree-ssa-dce.cc (simple_dce_from_worklist): Check to make sure the statement is only defining one operand. gcc/testsuite

Re: [PATCH/RFC] combine_completed global variable.

2022-07-08 Thread Kewen.Lin via Gcc-patches
ailed. I checked the unrecognizable pattern and the original patch, I guessed it needs a tiny adjustment like below: diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index dde123e87b8..0a089f12510 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -42

[r13-1573 Regression] FAIL: gcc.dg/pr106063.c (test for excess errors) on Linux/x86_64

2022-07-08 Thread skpandey--- via Gcc-patches
xcess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1573/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-l

[COMMITTED] Fix tree-opt/PR106087: ICE with inline-asm with multiple output and assigned only static vars

2022-07-08 Thread apinski--- via Gcc-patches
the statement was only defining one ssa name. Committed as approved after bootstrapping and testing on x86_64 with no regressions. PR tree-optimization/106087 gcc/ChangeLog: * tree-ssa-dce.cc (simple_dce_from_worklist): Check to make sure the statement is only defining one

[PATCH] Allocate general register(memory/immediate) for 16/32/64-bit vector bit_op patterns.

2022-07-10 Thread liuhongt via Gcc-patches
) Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/106038 * config/i386/mmx.md (3): Expand with (clobber (reg:CC flags_reg)) under TARGET_64BIT (mmx_code>3): Ditto. (*mmx_3_1): New define_insn, add post_rel

Re: [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-10 Thread Kewen.Lin via Gcc-patches
pproved it. I guessed that thread escaped from your mail radar somehow, it started from [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597595.html BR, Kewen

Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-07-10 Thread Kewen.Lin via Gcc-patches
cgraph node for it, w/o this patch function >>> optimize_function_for_speed_p returns true eventually, while it >>> returns false with this patch. Since the command line option -Os >>> is specified, there is no reason to interpret it as "for speed". >>>

[PATCH] [RFC]Support vectorization for Complex type.

2022-07-10 Thread liuhongt via Gcc-patches
x-gnu{-m32,}. Also tested the patch on SPEC2017 and found complex type vectorization in 510/549 (but no performance impact). Any comments? gcc/ChangeLog: PR tree-optimization/106010 * tree-vect-data-refs.cc (vect_get_data_access_cost): Pass complex_p to vect_get_

Re: [PATCH] HIGH part of symbol ref is invalid for constant pool

2022-07-13 Thread Kewen.Lin via Gcc-patches
Hi Jeff, Thanks for the patch, one question is inlined below. on 2022/7/4 14:58, Jiufu Guo wrote: > The high part of the symbol address is invalid for the constant pool. In > function rs6000_cannot_force_const_mem, we already return true for > "HIGH with UNSPEC" rtx. During

Re: [PATCH, rs6000] Additional cleanup of rs6000_builtin_mask

2022-07-13 Thread Kewen.Lin via Gcc-patches
at function can also be safely removed. > The TargetVariable rs6000_builtin_mask in rs6000.opt is useless, it seems it can be removed together? > I have tested this on current systems (P8,P9,P10) without regressions. > > OK for trunk? > > > Thanks, > -

[PATCH] Extend 64-bit vector bit_op patterns with ?r alternative

2022-07-13 Thread liuhongt via Gcc-patches
inux-gnu{-m32,}. No big impact on SPEC2017 (mostly same binaries). Ok for trunk? gcc/ChangeLog: PR target/106038 * config/i386/mmx.md (3): Expand with (clobber (reg:CC flags_reg)) under TARGET_64BIT (mmx_code>3): Ditto. (*mmx_3_gpr): New define_insn, add po

[PATCH] Extend 16/32-bit vector bit_op patterns with (m, 0, i)(vertical) alternative.

2022-07-17 Thread liuhongt via Gcc-patches
And split it after reload. >IMO, the only case it is worth adding is a direct immediate store to >memory, which HJ recently added. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/106038 * config/i386/mmx.md (3): Extend to A

[PATCH V2] [RFC]Support vectorization for Complex type.

2022-07-17 Thread liuhongt via Gcc-patches
ore). Any comments? gcc/ChangeLog: PR tree-optimization/106010 * tree-vect-data-refs.cc (vect_get_data_access_cost): Pass complex_p to vect_get_num_copies to avoid ICE. (vect_analyze_data_refs): Support vectorization for Complex type with vector sc

[PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-18 Thread liuhongt via Gcc-patches
e combine/forwprop will do optimization. > Please use if (!register_operand (operands[2], mode)) instead. Changed. Update patch. gcc/ChangeLog: PR target/106038 * config/i386/mmx.md (3): New define_expand, it's original "3". (*3): New define_insn, i

Re: [PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask

2022-07-19 Thread Kewen.Lin via Gcc-patches
as been tested as before, so this patch is OK. Thanks! > gcc/ > * config/rs6000/rs6000-c.cc: Update comments. > (rs6000_target_modify_macros): Remove bu_mask references. > (rs6000_define_or_undefine_macro): Replace bu_mask reference > wit

[PATCH] Move pass_cse_sincos after vectorizer.

2022-07-19 Thread liuhongt via Gcc-patches
_sincos additionally expands pow&cabs; this patch splits that part into a separate pass named pass_expand_powcabs, which remains at the old pass position. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Observe more libmvec sin/cos vectorization in specfp, but no big performance impact. Ok fo

gcc-patches@gcc.gnu.org

2022-07-19 Thread liuhongt via Gcc-patches
PLEX_CST for rhs. And it will enable vectorization for pr106010-8a.c. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? 2022-07-20 Richard Biener Hongtao Liu gcc/ChangeLog: PR tree-optimization/106010 * tree-complex.cc (init_dont_simulate_a

[PATCH] libgo: make match.sh POSIX-shell compatible

2022-07-19 Thread soeren--- via Gcc-patches
From: Sören Tempel The `(( expression ))` syntax is a Bash extension and not supported by POSIX shell [1]. However, the arithmetic expressions used by the gobuild() function can also be expressed using arithmetic POSIX expansions with `$(( expression ))` [2]. Contrary to the Bash extension, arit
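
The difference between the two forms can be sketched as follows. This is a minimal illustration and not code from match.sh or from the patch; the variable names `count` and `result` are hypothetical. Bash's `(( expression ))` is a command whose exit status reflects the expression, whereas the POSIX `$(( expression ))` expands to the numeric result (1 or 0 for a comparison), so the result has to be tested explicitly:

```shell
#!/bin/sh
# Hypothetical variable, chosen only for this illustration.
count=3

# Bash-only form (not POSIX):
#     if (( count > 2 )); then ...
#
# POSIX form: $(( ... )) expands to 1 when the comparison holds and 0
# otherwise, so compare the expanded value with test/[.
if [ "$(( count > 2 ))" -ne 0 ]; then
    result="greater"
else
    result="not greater"
fi

echo "$result"
```

Because the expansion yields a value rather than an exit status, the same pattern also works for plain arithmetic such as `total=$(( count + 1 ))`.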

[PATCH] rs6000: Suggest unroll factor for loop vectorization

2022-07-20 Thread Kewen.Lin via Gcc-patches
-linux-gnu P7 and P8, and powerpc64le-linux-gnu P9. Bootstrapped on powerpc64le-linux-gnu P10, but one failure was exposed during regression testing there; it has been identified as a missed optimization and can be reproduced without this support. PR106365 was opened for further tracking. Is it fo

[PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-20 Thread Kewen.Lin via Gcc-patches
ailures on gcc.target/powerpc/pr92398.p9-.c fixed, I can see it helps to bring back some testing coverage like: NA->PASS: gcc.target/powerpc/pr92398.p9+.c NA->PASS: gcc.target/powerpc/pr93453-1.c I'll push this soon if no objections. BR, Kewen - PR testsuite/106345 gc

[PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-20 Thread Kewen.Lin via Gcc-patches
original -mdejagnu-cpu when it's required) accordingly. Tested on powerpc64-linux-gnu P7 and P8 and powerpc64le-linux-gnu P9 and P10, also with explicit p10 tune setting for configuration. I'll push this soon if no objections. BR, Kewen - PR testsuite/106345 gcc/testsuite

[r13-1762 Regression] FAIL: gcc.dg/pr56837.c scan-tree-dump-times optimized "memset ..c, 68, 16384.; " 1 on Linux/x86_64

2022-07-20 Thread skpandey--- via Gcc-patches
c.dg/pr23911.c scan-tree-dump-times dce3 "__complex__ \\(1.0e\\+0, 0.0\\)" 2 FAIL: gcc.dg/pr56837.c scan-tree-dump-times optimized "memset ..c, 68, 16384.;" 1 with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1762/u

[PATCH V3] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-20 Thread liuhongt via Gcc-patches
And split it after reload. gcc/ChangeLog: PR target/106038 * config/i386/mmx.md (3): New define_expand, it's original "3". (*3): New define_insn, it's original "3" be extended to handle memory and immediate operan

Re: [PATCH] Teach VN about masked/len stores

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2022/7/21 17:01, Richard Biener via Gcc-patches wrote: > The following teaches VN to handle reads from .MASK_STORE and > .LEN_STORE. For this push_partial_def is extended first for > convenience so we don't have to handle the full def case in the > caller (possibl

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the comments! on 2022/7/22 06:09, Segher Boessenkool wrote: > On Wed, Jul 20, 2022 at 05:32:01PM +0800, Kewen.Lin wrote: >> As the failure of test case gcc.target/powerpc/pr92398.p9-.c in >> PR106345 shows, some test sources for some powerpc effective >> targets use empty tr

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi! on 2022/7/22 09:02, Segher Boessenkool wrote: > Hi! > > On Fri, Jul 22, 2022 at 08:41:43AM +0800, Kewen.Lin wrote: >> Hi Segher, >> >> Thanks for the comments! > > Always. > This patch is to fix empty TUs with one dummy variable definition accordingly. >>> >>> You can also use >>>

[r13-1786 Regression] FAIL: gcc.dg/analyzer/stdarg-3.c (test for excess errors) on Linux/x86_64

2022-07-21 Thread skpandey--- via Gcc-patches
FAIL: gcc.dg/analyzer/stdarg-3.c (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1786/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran

[PATCH] Adjust testcase.

2022-07-21 Thread liuhongt via Gcc-patches
r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lower complex type move to scalars, but testcase pr23911 is supposed to scan __complex__ constant which is never available, so adjust testcase to scan IMAGPART/REALPART_EXPR constants separately. Pushed as obvious patch. gcc/testsuite/ChangeLog
