[PATCH] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread juzhe . zhong
From: Juzhe-Zhong This patch is adding mask logic auto-vectorization. define the pattern as "define_insn_and_split" to allow combine PASS easily combine series instructions. For example: combine vmxor.mm + vmnot.m into vmxnor.mm Build success and regression PASS Ok for trunk ? gcc/ChangeLog:
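As an illustration of the kind of source loop this affects, here is a minimal sketch of an element-wise XNOR over boolean arrays. The function name `mask_xnor` is hypothetical, and whether GCC really emits a single vmxnor.mm for it depends on the target flags and compiler version, which the snippet above does not state.

```cpp
#include <cassert>
#include <cstddef>

// Element-wise XNOR over boolean arrays: out[i] = !(a[i] ^ b[i]).
// Written as XOR followed by NOT, i.e. the two operations that the
// patch description says combine can merge (vmxor.mm + vmnot.m).
// Assumption: the vectorizer treats these bool arrays as mask data.
void mask_xnor(const bool *a, const bool *b, bool *out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = !(a[i] ^ b[i]);
}
```

The XOR and the NOT are kept as separate source-level operations on purpose, so that the merge is left to the compiler rather than done by hand.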

Re: [PATCH] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread Kito Cheng via Gcc-patches
Just one comment: define_insn_and_split should be used in this scenario rather than define_insn_and_rewrite since you are not really rewriting. You can commit after updating to define_insn_and_split :) On Wed, May 24, 2023 at 3:04 PM wrote: > > From: Juzhe-Zhong > > This patch is adding mask lo

Re: Re: [PATCH] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread juzhe.zh...@rivai.ai
Thanks Kito, I will change it to define_insn_and_split and send V2 soon. juzhe.zh...@rivai.ai From: Kito Cheng Date: 2023-05-24 15:18 To: juzhe.zhong CC: gcc-patches; kito.cheng; palmer; palmer; jeffreyalaw; rdapp.gcc Subject: Re: [PATCH] RISC-V: Add RVV mask logic auto-vectorization Just one comment:

[V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread juzhe . zhong
From: Juzhe-Zhong This patch is adding mask logic auto-vectorization. define the pattern as "define_insn_and_split" to allow combine PASS easily combine series instructions. For example: combine vmxor.mm + vmnot.m into vmxnor.mm Build success and regression PASS And committed. --- gcc/config

Re: [V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread Kito Cheng via Gcc-patches
LGTM, just one comment in git comment, no need v3, just commit with the fix is fine :) On Wed, May 24, 2023 at 3:28 PM wrote: > > From: Juzhe-Zhong > > This patch is adding mask logic auto-vectorization. > define the pattern as "define_insn_and_split" to allow don't forgot to update here ^

Re: Re: [V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread juzhe.zh...@rivai.ai
> > From: Juzhe-Zhong > > This patch is adding mask logic auto-vectorization. > define the pattern as "define_insn_and_split" to allow >don't forgot to update here ^ I notice I missed changeLog here. Is that you want me to fix in the commit log? juzhe.zh...@rivai.ai From: Kito Cheng D

Re: Re: [V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread Kito Cheng via Gcc-patches
Oh, never mind, I mean you updated to use define_insn_and_split but comment still define_insn_and_split, but just ignore that if already committed On Wed, May 24, 2023 at 3:42 PM juzhe.zh...@rivai.ai wrote: > > > > > From: Juzhe-Zhong > > > > This patch is adding mask logic auto-vectorization. >

RE: Re: [V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectorization

2023-05-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito. Pan -Original Message- From: Kito Cheng Sent: Wednesday, May 24, 2023 4:08 PM To: juzhe.zh...@rivai.ai Cc: gcc-patches ; Kito.cheng ; palmer ; palmer ; jeffreyalaw ; Robin Dapp ; Li, Pan2 Subject: Re: Re: [V2 COMMITTED] RISC-V: Add RVV mask logic auto-vectori

Re: [PATCH] early-remat: Resync with new DF postorders [PR109940]

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, 24 May 2023, Richard Sandiford wrote: > When I wrote early-remat, the DF_FORWARD block order was a postorder > of a reverse/backward walk (i.e. of the inverted cfg), rather than a > reverse postorder of a forward walk. A postorder of a backward walk > lacked the important property that do

[PATCH][committed] arm: PR target/109939 Correct signedness of return type of __ssat intrinsics

2023-05-24 Thread Kyrylo Tkachov via Gcc-patches
Hi all, As the PR says we shouldn't be using qualifier_unsigned for the return type of the __ssat intrinsics. UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS already exists for that. This was just a thinko. This patch fixes this and the warning with -Wconversion goes away. Bootstrapped and tested on

Re: [PATCH] [libstdc++] [testsuite] xfail to_chars/long_double on x86-vxworks

2023-05-24 Thread Jonathan Wakely via Gcc-patches
On Wed, 24 May 2023 at 06:52, Alexandre Oliva via Libstdc++ < libstd...@gcc.gnu.org> wrote: > > Just as on aarch64, x86's wider long double experiences loss of > precision with from_chars implemented in terms of double. Expect the > execution fail. > > Bootstrapped on x86_64-linux-gnu. Also test

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Xi Ruoyao via Gcc-patches
Wang Lei raised some concerns about Itanium C++ ABI, so let's ask a C++ expert here... Jonathan: AFAIK the standard and the Itanium ABI treats an empty class as size 1 in order to guarantee unique address, so for the following: class Empty {}; class Test { Empty empty; double a, b; }; When we pa

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Lulu Cheng
On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote: On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote: An empty struct type that is not non-trivial for the purposes of calls will be treated as though it were the following C type: struct { char c; }; Before this patch was added, a structure parameter cont

Re: [PATCH] PR middle-end/109840: Preserve popcount/parity type in match.pd.

2023-05-24 Thread Richard Biener via Gcc-patches
On Tue, May 23, 2023 at 8:30 PM Roger Sayle wrote: > > > PR middle-end/109840 is a regression introduced by my recent patch to > fold popcount(bswap(x)) as popcount(x). When the bswap and the popcount > have the same precision, everything works fine, but this optimization also > allowed a zero-ex
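The identity behind the fold mentioned above can be checked with a small sketch. It uses the GCC/Clang builtins `__builtin_bswap32` and `__builtin_popcount`; the wrapper names are hypothetical, and the example covers only the same-precision case that the message says works fine.

```cpp
#include <cassert>
#include <cstdint>

// Byte swapping permutes the bytes of x but never changes how many
// bits are set, so at matching precision the two functions agree.
int popcount_of_bswap32(std::uint32_t x)
{
  return __builtin_popcount(__builtin_bswap32(x));
}

int popcount32(std::uint32_t x)
{
  return __builtin_popcount(x);
}
```

This does not reproduce the regression itself, which per the message involves a case where the two operations do not have the same precision.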

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Jonathan Wakely via Gcc-patches
On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote: > Wang Lei raised some concerns about Itanium C++ ABI, so let's ask a C++ > expert here... > > Jonathan: AFAIK the standard and the Itanium ABI treats an empty class > as size 1 Only as a complete object, not as a subobject. > in order to guarant
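The size question raised in this exchange can be illustrated with a layout-only sketch. It reuses `Empty` and `Test` from the quoted example; `Derived` is a hypothetical addition for contrast. It says nothing about how LoongArch passes these types in registers, and the concrete sizes assume a typical Itanium-ABI target.

```cpp
#include <cassert>
#include <cstddef>

class Empty {};

// Empty as an ordinary data member, as in the example from the thread:
// the member still occupies storage, and padding follows it.
struct Test { Empty empty; double a, b; };

// Empty as a base class: the base subobject may occupy no storage at all.
struct Derived : Empty { double a, b; };

constexpr std::size_t empty_size = sizeof(Empty);
constexpr std::size_t test_size = sizeof(Test);
constexpr std::size_t derived_size = sizeof(Derived);
```

On common LP64 targets `sizeof(Test)` is 24 and `sizeof(Derived)` is 16, but those exact numbers are ABI-dependent.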

Re: [PATCH] Dump if a pattern fails after having printed applying it

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 1:16 AM Andrew Pinski via Gcc-patches wrote: > > While trying to understand how to use the ! operand for match > patterns, I noticed that the debug dumps would print out applying > a pattern but nothing when it was rejected in the end. This was confusing > me. > This adds t

Re: [PATCH] [testsuite] tsvc: skip include malloc.h when unavailable

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:17 AM Alexandre Oliva via Gcc-patches wrote: > > > tsvc tests all fail on systems that don't offer a malloc.h, other than > those that explicitly rule that out. Use the preprocessor to test for > malloc.h's availability. > > tsvc.h also expects a definition for struct ti

Re: [PATCH] [testsuite] require pic for pr103074.c

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:19 AM Alexandre Oliva via Gcc-patches wrote: > > > Fix test that uses -fPIC without stating the requirement for PIC > support. > > Bootstrapped on x86_64-linux-gnu. Also tested on ppc- and x86-vx7r2 > with gcc-12. OK. > for gcc/testsuite/ChangeLog > > * gcc.ta

Re: [PATCH] [testsuite] require pthread for openmp

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:20 AM Alexandre Oliva via Gcc-patches wrote: > > > Fix test that uses -fopenmp without declaring requirement for pthread > support. > > Bootstrapped on x86_64-linux-gnu. Also tested on ppc- and x86-vx7r2 > with gcc-12. OK > for gcc/testsuite/ChangeLog > > * g+

Re: [PATCH] [testsuite] require profiling for -pg

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:21 AM Alexandre Oliva via Gcc-patches wrote: > > > Fix two tests that use -pg but don't declare their requirement for > profiling support. > > Bootstrapped on x86_64-linux-gnu. Also tested on ppc- and x86-vx7r2 > with gcc-12. OK. > for gcc/testsuite/ChangeLog > >

Re: [PATCH v2] [PR100106] Reject unaligned subregs when strict alignment is required

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:40 AM Alexandre Oliva via Gcc-patches wrote: > > On May 5, 2022, Alexandre Oliva wrote: > > > for gcc/ChangeLog > > > PR target/100106 > > * emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that > > requires stricter alignment than MEM's. > > >

Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 7:47 AM Alexandre Oliva via Gcc-patches wrote: > > > MOVE_MAX on x86* used to accept up to 16 bytes, even without SSE, > which enabled inlining of small memmove by loading and then storing > the entire range. After the "x86: Update piecewise move and store" > r12-2666 chan

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Xi Ruoyao via Gcc-patches
On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote: > > On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote: > > On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote: > > > An empty struct type that is not non-trivial for the purposes of > > > calls > > > will be treated as though it were the following C type: > > >

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-24 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 22 May 2023 at 14:18, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > Hi Richard, > > Thanks for the suggestions. Does the attached patch look OK ? > > Boostrap+test in progress on aarch64-linux-gnu. > > Like I say, please wait for the tests to complete before sending an RFA.

Re: [patch]: Implement PR104327 for avr

2023-05-24 Thread Richard Biener via Gcc-patches
On Tue, May 23, 2023 at 2:56 PM Georg-Johann Lay wrote: > > PR target/104327 not only affects s390 but also avr: > The avr backend pre-sets some options depending on optimization level. > The inliner then thinks that always_inline functions are not eligible > for inlining and terminates with an er

[PATCH v3 0/9] MIPS: Add MIPS16e2 ASE instrucions.

2023-05-24 Thread Jie Mei
Patch V2: adds a new patch. Patch V3: `%{mmips16e2} \` was put in the wrong place in the first patch; V3 fixes it. The MIPS16e2 ASE is an enhancement to the MIPS16e ASE, which includes all MIPS16e instructions, with some additions. This series of patches adds all instructions from MIPS16E2 ASE with correspondin

[PATCH v3 4/9] MIPS: Add bitwise instructions for mips16e2

2023-05-24 Thread Jie Mei
There are shortened bitwise instructions in the mips16e2 ASE, for instance, ANDI, ORI/XORI, EXT, INS, etc. This patch adds these instructions with corresponding tests. gcc/ChangeLog: * config/mips/constraints.md(Yz): New constraints for mips16e2. * config/mips/mips-protos.h(mips_

[PATCH v3 5/9] MIPS: Add LUI instruction for mips16e2

2023-05-24 Thread Jie Mei
This patch adds LUI instruction from mips16e2 with corresponding test. gcc/ChangeLog: * config/mips/mips.cc(mips_symbol_insns_1): Generates LUI instruction. (mips_const_insns): Same as above. (mips_output_move): Same as above. (mips_output_function_prologue): Same

[PATCH v3 2/9] MIPS: Add MOVx instructions support for mips16e2

2023-05-24 Thread Jie Mei
This patch adds MOVx instructions from mips16e2 (movn,movz,movtn,movtz) with corresponding tests. gcc/ChangeLog: * config/mips/mips.h(ISA_HAS_CONDMOVE): Add condition for ISA_HAS_MIPS16E2. * config/mips/mips.md(*mov_on_): Add logics for MOVx insts. (*mov_on__mips16e2): G

[PATCH v3 1/9] MIPS: Add basic support for mips16e2

2023-05-24 Thread Jie Mei
The MIPS16e2 ASE is an enhancement to the MIPS16e ASE, which includes all MIPS16e instructions, with some additions. It defines new special instructions for increasing code density (e.g. Extend, PC-relative instructions, etc.). This patch adds basic support for mips16e2 used by the following series

[PATCH v3 6/9] MIPS: Add load/store word left/right instructions for mips16e2

2023-05-24 Thread Jie Mei
This patch adds LWL/LWR, SWL/SWR instructions with their corresponding tests. gcc/ChangeLog: * config/mips/mips.cc(mips_expand_ins_as_unaligned_store): Add logics for generating instruction. * config/mips/mips.h(ISA_HAS_LWL_LWR): Add clause for ISA_HAS_MIPS16E2. *

[PATCH v3 3/9] MIPS: Add instruction about global pointer register for mips16e2

2023-05-24 Thread Jie Mei
The mips16e2 ASE uses eight general-purpose registers from mips32, with some special-purpose registers, these registers are GPRs: s0-1, v0-1, a0-3, and special registers: t8, gp, sp, ra. As mentioned above, the special register gp is used in mips16e2, which is the global pointer register, it is us

[PATCH v3 7/9] MIPS: Use ISA_HAS_9BIT_DISPLACEMENT for mips16e2

2023-05-24 Thread Jie Mei
The MIPS16e2 ASE has PREF, LL and SC instructions, they use 9 bits immediate, like mips32r6. The MIPS32 PRE-R6 uses 16 bits immediate. gcc/ChangeLog: * config/mips/mips.h(ISA_HAS_9BIT_DISPLACEMENT): Add clause for ISA_HAS_MIPS16E2. (ISA_HAS_SYNC): Same as above. (I

[PATCH v3 8/9] MIPS: Add CACHE instruction for mips16e2

2023-05-24 Thread Jie Mei
This patch adds CACHE instruction from mips16e2 with corresponding tests. gcc/ChangeLog: * config/mips/mips.c(mips_9bit_offset_address_p): Restrict the address register to M16_REGS for MIPS16. (BUILTIN_AVAIL_MIPS16E2): Defined a new macro. (AVAIL_MIPS16E2_OR_NON_MI

[PATCH v3 9/9] MIPS: Make mips16e2 generating ZEB/ZEH instead of ANDI under certain conditions

2023-05-24 Thread Jie Mei
This patch makes mips16e2 behave under -O0 the same as with -O1~3, generating ZEB/ZEH instead of ANDI, which shrinks the code size. gcc/ChangeLog: * config/mips/mips.md(*and3_mips16): Generates ZEB/ZEH instructions. --- gcc/config/mips/mips.md | 30 +
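For context, the masks that a zero-extend-byte or zero-extend-halfword instruction can replace look like this at the source level. This is a minimal sketch with hypothetical function names; which instruction the compiler actually selects for them is not shown here and depends on the target options.

```cpp
#include <cassert>
#include <cstdint>

// AND with 0xff keeps the low byte and clears the rest, which is the
// same result as zero-extending a byte (the operation ZEB performs).
std::uint32_t zero_extend_byte(std::uint32_t x)
{
  return x & 0xffu;
}

// AND with 0xffff is likewise a zero-extension of the low halfword (ZEH).
std::uint32_t zero_extend_half(std::uint32_t x)
{
  return x & 0xffffu;
}
```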

Re: [PATCH V3] RISC-V: Add RVV comparison autovectorization

2023-05-24 Thread Richard Biener via Gcc-patches
On Tue, May 23, 2023 at 5:05 PM wrote: > > From: Juzhe-Zhong > > This patch enable RVV auto-vectorization including floating-point > unorder and order comparison. > > The testcases are leveraged from Richard. > So include Richard as co-author. > > Co-Authored-By: Richard Sandiford > > gcc/Change

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-05-24 Thread Richard Sandiford via Gcc-patches
Thanks for the update. Mostly LGTM, just some minor things left below. Oluwatamilore Adebayo writes: > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc > index > a49b09539776c0056e77f99b10365d0a8747fbc5..3a2248263cf67834a1cb41167a1783a3b6400014 > 100644 > --- a/gcc/tree-vect-

[PATCH] Fix artificial overflow during GENERIC folding

2023-05-24 Thread Eric Botcazou via Gcc-patches
Hi, on the attached testcase, the Ada compiler gives a bogus warning: storage_offset1.ads:16:52: warning: Constraint_Error will be raised at run time [enabled by default] This directly comes from the GENERIC folding setting a bogus TREE_OVERFLOW on an INTEGER_CST during the (T)P - (T)(P + A) ->

Re: [PATCH V3] RISC-V: Add RVV comparison autovectorization

2023-05-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, May 23, 2023 at 5:05 PM wrote: >> >> From: Juzhe-Zhong >> >> This patch enable RVV auto-vectorization including floating-point >> unorder and order comparison. >> >> The testcases are leveraged from Richard. >> So include Richard as co-author. >> >> Co-Authored-B

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Lulu Cheng
On 2023/5/24 at 5:25 PM, Xi Ruoyao wrote: On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote: On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote: On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote: An empty struct type that is not non-trivial for the purposes of calls will be treated as though it were the following

Re: [PATCH] tree-optimization/109849 - missed code hoisting

2023-05-24 Thread Christophe Lyon via Gcc-patches
Hi Richard, On Tue, 23 May 2023 at 11:55, Richard Biener via Gcc-patches < gcc-patches@gcc.gnu.org> wrote: > The following fixes code hoisting to properly consider ANTIC_OUT instead > of ANTIC_IN. That's a bit expensive to re-compute but since we no > longer iterate we're doing this only once pe

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 22 May 2023 at 14:18, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > Hi Richard, >> > Thanks for the suggestions. Does the attached patch look OK ? >> > Boostrap+test in progress on aarch64-linux-gnu. >> >> Like I say, please wait for the

[PATCH] target/109944 - avoid STLF fail for V16QImode CTOR expansion

2023-05-24 Thread Richard Biener via Gcc-patches
The following dispatches to V2DImode CTOR expansion instead of using sets of (subreg:DI (reg:V16QI 146) [08]) which causes LRA to spill DImode and reload V16QImode. The same applies for V8QImode or V4HImode construction from SImode parts which happens during 32bit libgcc build. Bootstrapped and te

Re: [PATCH] tree-optimization/109849 - missed code hoisting

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, 24 May 2023, Christophe Lyon wrote: > Hi Richard, > > On Tue, 23 May 2023 at 11:55, Richard Biener via Gcc-patches < > gcc-patches@gcc.gnu.org> wrote: > > > The following fixes code hoisting to properly consider ANTIC_OUT instead > > of ANTIC_IN. That's a bit expensive to re-compute but

[PATCH] libstdc++: Fix SFINAE for __is_intrinsic_type on ARM

2023-05-24 Thread Matthias Kretz via Gcc-patches
OK for master and all branches? (this issue only surfaced because of the new test) 8< - On ARM NEON doesn't support double, so __is_intrinsic_type_v should say false (instead of being ill-formed). Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR l

Re: [PATCH] libstdc++: Fix SFINAE for __is_intrinsic_type on ARM

2023-05-24 Thread Jonathan Wakely via Gcc-patches
On Wed, 24 May 2023 at 11:59, Matthias Kretz via Libstdc++ < libstd...@gcc.gnu.org> wrote: > OK for master and all branches? (this issue only surfaced because of the > new > test) > OK. > > 8< - > > On ARM NEON doesn't support double, so __is_intrinsic_type_v whate

Re: [PATCH] libstdc++: Add missing constexpr to simd_neon

2023-05-24 Thread Jonathan Wakely via Gcc-patches
On Tue, 23 May 2023 at 22:57, Matthias Kretz via Libstdc++ < libstd...@gcc.gnu.org> wrote: > > Signed-off-by: Matthias Kretz > > libstdc++-v3/ChangeLog: > > PR libstdc++/109261 > * include/experimental/bits/simd_neon.h (_S_reduce): Add > constexpr and make NEON implementat

Re: [PATCH] Fix artificial overflow during GENERIC folding

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 11:56 AM Eric Botcazou via Gcc-patches wrote: > > Hi, > > on the attached testcase, the Ada compiler gives a bogus warning: > storage_offset1.ads:16:52: warning: Constraint_Error will be raised at run > time [enabled by default] > > This directly comes from the GENERIC fold

[PATCH] tree-optimization/109849 - fix fallout of PRE hoisting change

2023-05-24 Thread Richard Biener via Gcc-patches
The PR109849 fix made us no longer hoist some memory loads because of the expression set intersection. We can still avoid to compute the union by simply taking the first sets expressions and leave the pruning of expressions with values not suitable for hoisting to sorted_array_from_bitmap_set. Bo

[PATCH] RISC-V: Add FRM_ prefix to dynamic rounding mode enum

2023-05-24 Thread juzhe . zhong
From: Juzhe-Zhong An obvious fix to make all enum naming consistent. gcc/ChangeLog: * config/riscv/riscv-protos.h (enum frm_field_enum): Add FRM_ prefix. --- gcc/config/riscv/riscv-protos.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/riscv/riscv-prot

Re: [PATCH] RISC-V: Add FRM_ prefix to dynamic rounding mode enum

2023-05-24 Thread Kito Cheng via Gcc-patches
ok On Wed, May 24, 2023 at 7:20 PM wrote: > > From: Juzhe-Zhong > > An obvious fix to make all enum naming consistent. > > gcc/ChangeLog: > > * config/riscv/riscv-protos.h (enum frm_field_enum): Add FRM_ prefix. > > --- > gcc/config/riscv/riscv-protos.h | 2 +- > 1 file changed, 1 inser

Re: [PATCH V3] RISC-V: Add RVV comparison autovectorization

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 11:57 AM Richard Sandiford wrote: > > Richard Biener writes: > > On Tue, May 23, 2023 at 5:05 PM wrote: > >> > >> From: Juzhe-Zhong > >> > >> This patch enable RVV auto-vectorization including floating-point > >> unorder and order comparison. > >> > >> The testcases are

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review. I needed some time to go through this patch and surrounding code to understand it, and to understand why it wasn't structured the way I was expecting. I've got some specific comments below, and then a general comment about how I think we should structure this. juzhe.zh

[PATCH] RISC-V: Remove FRM_REGNUM dependency for rtx conversions

2023-05-24 Thread juzhe . zhong
From: Juzhe-Zhong According to RVV ISA: The conversions use the dynamic rounding mode in frm, except for the rtz variants, which round towards zero. So rtz conversion patterns should not have FRM dependency. We can't support mode switching for FRM yet since rvv intrinsic doc is not updated bu

RE: [PATCH] RISC-V: Add FRM_ prefix to dynamic rounding mode enum

2023-05-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito. Pan -Original Message- From: Gcc-patches On Behalf Of Kito Cheng via Gcc-patches Sent: Wednesday, May 24, 2023 7:21 PM To: juzhe.zh...@rivai.ai Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com; jeffreya...@gmail.com Su

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, Richard. It's quite complicated for me and I am not sure whether I can catch up with you. So I will rather split the work step by step to implement the decrement IV For the first step you mentioned: >> (1) In vect_set_loop_condition_partial_vectors, for the first iteration of: >> FOR_EA

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, Richard. For step 1, I have written this patch. Could you take a look at it? Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-24 19:23 To: juzhe.zhong CC: gcc-patches; rguenther Subject: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount supp

[PATCH] Use expandargv on gcc-ar [PR77576]

2023-05-24 Thread Pekka Seppänen
Call expandargv prior attempting to prepend a dash to the first argument. When using response files the first character is never a dash but an at-sign. PR gcc/77576 gcc/ChangeLog: * gcc-ar.cc (main): Call expandargv. --- gcc/gcc-ar.cc | 2 ++ 1 file changed, 2 insertions(+) d
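The ordering problem described above can be sketched with a small stand-in. `prepend_dash` is a hypothetical helper written for this illustration, not the code in gcc-ar.cc, and the file name `@args.rsp` is made up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: prepend '-' to the first argument when it does
// not already start with one, mimicking the step the patch refers to.
std::vector<std::string> prepend_dash(std::vector<std::string> args)
{
  if (!args.empty() && (args[0].empty() || args[0][0] != '-'))
    args[0] = "-" + args[0];
  return args;
}
```

With a plain first argument such as `rc` the result is `-rc`. With an unexpanded response file the first argument starts with `@`, so the same step yields `-@args.rsp`, which is no longer a response-file reference. Expanding the response file before this step avoids that.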

Re: [PATCH] target/109944 - avoid STLF fail for V16QImode CTOR expansion

2023-05-24 Thread Uros Bizjak via Gcc-patches
On Wed, May 24, 2023 at 12:13 PM Richard Biener wrote: > > The following dispatches to V2DImode CTOR expansion instead of > using sets of (subreg:DI (reg:V16QI 146) [08]) which causes > LRA to spill DImode and reload V16QImode. The same applies for > V8QImode or V4HImode construction from SImode

Re: [PATCH] Fix artificial overflow during GENERIC folding

2023-05-24 Thread Eric Botcazou via Gcc-patches
> I don't like littering the patterns with this and it's likely far from the > only cases we have? Maybe, but that's the only problematic case we have in Ada. It occurs only on mainline because we have streamlined address calculations there, from out-of- line to inline expansion, i.e. from run t

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Sorry, I realised later that I had an implicit assumption here: if there are multiple rgroups, it's better to have a single IV for the smallest rgroup and scale that up to bigger rgroups. E.g. if the loop control IV is taken from an N-control rgroup and has a step S, an N*M-control rgroup would be

[COMMITTED 1/3] PR tree-optimization/109695 - Choose better initial values for ranger.

2023-05-24 Thread Andrew MacLeod via Gcc-patches
Instead of defaulting to an initial value of VARYING before resolving cycles, try folding the statement using available global values instead. This can give us a much better initial approximation, especially in cases where there are no dependencies, i.e. f_45 = 77 This implements suggestion

[COMMITTED 3/3] PR tree-optimization/109695 - Only update global value if it changes.

2023-05-24 Thread Andrew MacLeod via Gcc-patches
This patch implements suggestion 1) from the PR: 1) We unconditionally write the new value calculated to the global cache once the dependencies are resolved.  This gives it a new timestamp, and thus makes any other values which used it out of date when they really aren't.   This cause

[COMMITTED 2/3] PR tree-optimization/109695 - Use negative values to reflect always_current in the, temporal cache.

2023-05-24 Thread Andrew MacLeod via Gcc-patches
This implements suggestion 3) from the PR: 3) When we first set the initial value for _1947 and give it the ALWAYS_CURRENT timestamp, we lose the context of when the initial value was set. So even with 1) & 2) implemented, we *still* need to set a timestamp for it when it's finally

Re: [PATCH] tree-optimization/109849 - missed code hoisting

2023-05-24 Thread Christophe Lyon via Gcc-patches
On Wed, 24 May 2023 at 12:41, Richard Biener wrote: > On Wed, 24 May 2023, Christophe Lyon wrote: > > > Hi Richard, > > > > On Tue, 23 May 2023 at 11:55, Richard Biener via Gcc-patches < > > gcc-patches@gcc.gnu.org> wrote: > > > > > The following fixes code hoisting to properly consider ANTIC_OUT

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, 24 May 2023, Richard Sandiford wrote: > Sorry, I realised later that I had an implicit assumption here: > if there are multiple rgroups, it's better to have a single IV > for the smallest rgroup and scale that up to bigger rgroups. > > E.g. if the loop control IV is taken from an N-contro

[PATCH] doc: clarify semantics of vector bitwise shifts

2023-05-24 Thread Alexander Monakov via Gcc-patches
Explicitly say that bitwise shifts for narrow types work similar to element-wise C shifts with integer promotions, which coincides with OpenCL semantics. gcc/ChangeLog: * doc/extend.texi (Vector Extensions): Clarify bitwise shift semantics. --- gcc/doc/extend.texi | 7 ++- 1
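The scalar behaviour that the patch uses as its reference point can be sketched as follows. This shows only an ordinary shift on a narrow scalar type with integer promotion; it does not exercise GCC's vector extension, and the function name `shl_u8` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar shift on a narrow type: x is promoted to int, shifted there,
// and the result is truncated when converted back to 8 bits.
// Assumes n is smaller than the width of int.
std::uint8_t shl_u8(std::uint8_t x, unsigned n)
{
  return static_cast<std::uint8_t>(x << n);
}
```

Because the shift happens at `int` width, a shift count of 8 on an 8-bit value is well defined here and simply produces 0 after truncation.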

Re: [V7][PATCH 2/2] Update documentation to clarify a GCC extension [PR77650]

2023-05-24 Thread Qing Zhao via Gcc-patches
Joseph, Thanks a lot for the review. And sorry for my late reply (just came back from a short vacation). > On May 19, 2023, at 5:12 PM, Joseph Myers wrote: > > On Fri, 19 May 2023, Qing Zhao via Gcc-patches wrote: > >> +GCC extension accepts a structure containing an ISO C99 @dfn{flexible arr

Re: [PATCH] Fix artificial overflow during GENERIC folding

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 2:39 PM Eric Botcazou wrote: > > > I don't like littering the patterns with this and it's likely far from the > > only cases we have? > > Maybe, but that's the only problematic case we have in Ada. It occurs only on > mainline because we have streamlined address calculatio

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-05-24 Thread Richard Biener via Gcc-patches
On Wed, May 24, 2023 at 2:54 PM Alexander Monakov via Gcc-patches wrote: > > Explicitly say that bitwise shifts for narrow types work similar to > element-wise C shifts with integer promotions, which coincides with > OpenCL semantics. Do we need to clarify that v << w with v being a vector of sho

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
>> In other words, why is this different from what >>vect_set_loop_controls_directly would do? Oh, I see. You are confused that why I do not make multiple-rgroup vec_trunk handling inside "vect_set_loop_controls_directly". Well. Frankly, I just replicate the handling of ARM SVE: unsigned int nmas

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
OK. Thanks. I am gonna refine the patch following Richard's idea and test it. Thanks both Richard and Richi. juzhe.zh...@rivai.ai From: Richard Biener Date: 2023-05-24 20:51 To: Richard Sandiford CC: 钟居哲; gcc-patches Subject: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by va

[PATCH][committed] aarch64: PR target/99195 Annotate vector shift patterns for vec-concat-zero

2023-05-24 Thread Kyrylo Tkachov via Gcc-patches
Hi all, Continuing the series of straightforward annotations, this one handles the normal (not widening or narrowing) vector shifts. Tests included. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. Pushing to trunk. Thanks, Kyrill gcc/ChangeLog: PR target/9919

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: >>> In other words, why is this different from what >>>vect_set_loop_controls_directly would do? > Oh, I see. You are confused that why I do not make multiple-rgroup vec_trunk > handling inside "vect_set_loop_controls_directly". > > Well. Frankly, I just replicate the handling of ARM

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
>> Both approaches are fine. I'm not against one or the other. >> What I didn't understand was why your patch only reuses existing IVs >> for max_nscalars_per_iter == 1. Was it to avoid having to do a >> multiplication (well, really a shift left) when moving from one >> rgroup to another? E.g.

Re: [PATCH] Provide an API for ipa_vr.

2023-05-24 Thread Martin Jambor
Hello, On Wed, May 17 2023, Aldy Hernandez wrote: > This patch encapsulates the ipa_vr internals into an API. It also > makes it type agnostic, in preparation for upcoming changes to IPA. > > Interestingly, there's a 0.44% improvement to IPA-cp, which I'm sure > we'll soak up with future changes

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
>> Actually, I just want to handle multi-rgroup for non-SLP here, I am trying >> to avoid multiplication and I think >> scalar multiplication (not cost too much) is fine in modern CPU. Sorry for the typo. I didn't try to avoid multiplication and I think multiplication is fine. juzhe.zh.

[COMMITTED] i386: Add vv4qi3 expander

2023-05-24 Thread Uros Bizjak via Gcc-patches
Also, move vv8qi3 expander to a better place and enable it with TARGET_MMX_WITH_SSE. Remove handling of V8QImode from ix86_expand_vecop_qihi2 since all partial QI->HI vector modes expand via ix86_expand_vecop_qihi_partial. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vecop_qihi2)

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-05-24 Thread Alexander Monakov via Gcc-patches
On Wed, 24 May 2023, Richard Biener wrote: > On Wed, May 24, 2023 at 2:54 PM Alexander Monakov via Gcc-patches > wrote: > > > > Explicitly say that bitwise shifts for narrow types work similar to > > element-wise C shifts with integer promotions, which coincides with > > OpenCL semantics. > >
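
As a small illustration of what "C shifts with integer promotions" means for a narrow type, here is a scalar sketch in plain C. The function names are made up for this example, and it deliberately says nothing about how any particular target implements vector shifts; it only shows the scalar rule that the patch description refers to.

```c
#include <stdint.h>

/* Shift one 8-bit value the way a C expression would: the operand is
   promoted to int, shifted at int width, then truncated back to 8 bits.
   Callers are assumed to keep count <= 23 so the promoted int cannot
   overflow for 8-bit inputs. */
static uint8_t shl_u8_promoted(uint8_t value, unsigned count)
{
    int promoted = value;               /* integer promotion to int */
    return (uint8_t)(promoted << count);
}

/* Apply the same scalar rule to each element of an array. */
static void shl_u8_elementwise(uint8_t *dst, const uint8_t *src,
                               const uint8_t *counts, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = shl_u8_promoted(src[i], counts[i]);
}
```

Because the shift happens at int width, a count of 8 on an 8-bit value is well defined in scalar C and simply shifts every bit out of the low byte.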

Re: [V7][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-05-24 Thread Qing Zhao via Gcc-patches
Bernhard, Thanks a lot for your comments. > On May 19, 2023, at 7:11 PM, Bernhard Reutner-Fischer > wrote: > > On Fri, 19 May 2023 20:49:47 + > Qing Zhao via Gcc-patches wrote: > >> GCC extension accepts the case when a struct with a flexible array member >> is embedded into another stru

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Oh, I just realized that the flow you designed works well for vec_pack_trunk too. I will send the V13 patch soon. Thanks. juzhe.zh...@rivai.ai From: 钟居哲 Date: 2023-05-24 22:10 To: richard.sandiford CC: gcc-patches; rguenther Subject: Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control

[PATCH V13] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch is supporting decrement IV by following the flow designed by Richard: (1) In vect_set_loop_condition_partial_vectors, for the first iteration of: call vect_set_loop_controls_directly. (2) vect_set_loop_controls_directly calculates "step" as in your patch. If rg

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, Richard. I have sent V13: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619475.html It looks more reasonable now. Could you review it again? Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-24 22:01 To: 钟居哲 CC: gcc-patches; rguenther Subject: Re: [PATCH V12

Re: [PATCH] Fix artificial overflow during GENERIC folding

2023-05-24 Thread Eric Botcazou via Gcc-patches
> But nobody is going to understand why the INTEGER_CST case goes the > other way. I can add a fat comment to that effect of course. :-) > As you say we don't have a good way to say we're doing > this to avoid undefined behavior, but then a view-convert back would > be a good way to indicate that

[PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch is supporting decrement IV by following the flow designed by Richard: (1) In vect_set_loop_condition_partial_vectors, for the first iteration of: call vect_set_loop_controls_directly. (2) vect_set_loop_controls_directly calculates "step" as in your patch. If rg
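
For readers unfamiliar with the idea in the subject line, the following is a scalar C sketch of a loop controlled by an IV that is decremented by a variable amount each iteration. It is not the GCC implementation and not taken from the patch; `select_len`, `add_one_decrement_iv` and the `vf` parameter are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many elements this iteration handles,
   i.e. MIN (remaining, vf). */
static size_t select_len(size_t remaining, size_t vf)
{
    return remaining < vf ? remaining : vf;
}

/* Add 1 to every element of data[0..n), counting the IV down by the
   number of elements actually processed.  Returns the iteration count. */
static size_t add_one_decrement_iv(int *data, size_t n, size_t vf)
{
    assert(vf > 0);
    size_t iterations = 0;
    size_t remaining = n;               /* the decrementing IV */
    while (remaining > 0) {
        size_t len = select_len(remaining, vf);
        /* This inner loop stands in for one length-controlled vector op. */
        for (size_t k = 0; k < len; k++)
            data[k] += 1;
        data += len;
        remaining -= len;               /* decrement by a variable amount */
        iterations++;
    }
    return iterations;
}
```

With n = 10 and vf = 4 the loop runs three times, processing 4, 4 and 2 elements; the final short iteration needs no separate epilogue because the length is recomputed from the remaining count each time.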

Re: [PATCH V13] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Forget about V13. Please review V14 directly: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619478.html Thanks. juzhe.zh...@rivai.ai From: juzhe.zhong Date: 2023-05-24 22:29 To: gcc-patches CC: richard.sandiford; rguenther; Ju-Zhe Zhong Subject: [PATCH V13] VECT: Add decrement IV iterat

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread Xi Ruoyao via Gcc-patches
On Wed, 2023-05-24 at 18:07 +0800, Lulu Cheng wrote: > > On 2023/5/24 at 5:25 PM, Xi Ruoyao wrote: > > On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote: > > > On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote: > > > > On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote: > > > > > An empty struct type that is not non-tr

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: >>> Both approaches are fine. I'm not against one or the other. > >>> What I didn't understand was why your patch only reuses existing IVs >>> for max_nscalars_per_iter == 1. Was it to avoid having to do a >>> multiplication (well, really a shift left) when moving from one >>> rgroup

Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Yeah. Thanks. I have sent V14: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619478.html in which I found that there is no distinction between SLP and non-SLP. Could you review it? I think it's more reasonable now. Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-24 22:57 To:

[PATCH] libstdc++: Fix type of first argument to vec_cntm call

2023-05-24 Thread Matthias Kretz via Gcc-patches
OK for master and backports? (also a long-standing bug that didn't surface until the new constexpr test was added) tested on powerpc64le-linux-gnu - 8< - Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109949 * include/experiment

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Thanks for trying it. I'm still surprised that no multiplication is needed though. Does the patch work for: short x[100]; int y[200]; void f() { for (int i = 0, j = 0; i < 100; i += 2, j += 4) { x[i + 0] += 1; x[i + 1] += 2; y[j + 0] += 1; y[j + 1] += 2; y[j + 2] += 3;

Re: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, the .optimized dump is like this: [local count: 21045336]: ivtmp.26_36 = (unsigned long) &x; ivtmp.27_3 = (unsigned long) &y; ivtmp.30_6 = (unsigned long) &MEM [(void *)&y + 16B]; ivtmp.31_10 = (unsigned long) &MEM [(void *)&y + 32B]; ivtmp.32_14 = (unsigned long) &MEM [(void *

Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions

2023-05-24 Thread Carl Love via Gcc-patches
On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote: > on 2023/5/24 06:30, Peter Bergner wrote: > > On 5/23/23 12:24 AM, Kewen.Lin wrote: > > > on 2023/5/23 01:31, Carl Love wrote: > > > > The builtins were requested for use in GLibC. As of version > > > > 2.31 they > > > > were added as inline asm

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, the .optimized dump is like this: > >[local count: 21045336]: > ivtmp.26_36 = (unsigned long) &x; > ivtmp.27_3 = (unsigned long) &y; > ivtmp.30_6 = (unsigned long) &MEM [(void *)&y + 16B]; > ivtmp.31_10 = (unsigned long) &MEM [(void *)&y + 32B]; > ivtmp.32_14 = (u

Re: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, Richard. After analyzing it, I think it can work. Let's take a look at the code: void f() { for (int i = 0, j = 0; i < 100; i += 2, j += 4) { x[i + 0] += 1; x[i + 1] += 2; y[j + 0] += 1; y[j + 1] += 2; y[j + 2] += 3; y[j + 3] += 4; } } For "x", each scalar iteration
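
For anyone who wants to try the test loop from this subthread, here is the same loop laid out as a compilable C file. The array declarations are the ones quoted earlier in the thread (`short x[100]; int y[200];`); only the `void` parameter list and the comments are added here. Whether a given compiler vectorizes it, and with which options, is not something this listing shows.

```c
/* Arrays as declared in the thread; globals start out zeroed. */
short x[100];
int y[200];

/* Each scalar iteration touches two elements of x and four of y,
   so the two accesses advance at different rates. */
void f(void)
{
    for (int i = 0, j = 0; i < 100; i += 2, j += 4) {
        x[i + 0] += 1;
        x[i + 1] += 2;
        y[j + 0] += 1;
        y[j + 1] += 2;
        y[j + 2] += 3;
        y[j + 3] += 4;
    }
}
```

After one call on zeroed arrays, x should repeat the pattern 1, 2 and y the pattern 1, 2, 3, 4, which gives a simple way to compare a scalar build against a vectorized one.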

Re: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Hi, Richard. I still don't understand it. Sorry about that. >> loop_len_48 = MIN_EXPR ; >> _74 = loop_len_34 * 2 - loop_len_48; I have already run the tests. We have a MIN_EXPR to calculate the total elements: loop_len_34 = MIN_EXPR ; I think "8" is already multiplied by 2? Why do we n

Re: [patch]: Implement PR104327 for avr

2023-05-24 Thread Georg-Johann Lay
On 24.05.23 at 11:38, Richard Biener wrote: On Tue, May 23, 2023 at 2:56 PM Georg-Johann Lay wrote: PR target/104327 not only affects s390 but also avr: The avr backend pre-sets some options depending on optimization level. The inliner then thinks that always_inline functions are not eligi

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richard. I still don't understand it. Sorry about that. > >>> loop_len_48 = MIN_EXPR ; > >> _74 = loop_len_34 * 2 - loop_len_48; > > I have the tests already tested. > We have a MIN_EXPR to calculate the total elements: > loop_len_34 = MIN_EXPR ; > I think "8" is already mul

Re: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread 钟居哲
Oh, I see. Thank you so much for pointing this out. Could you tell me what I should change in the code? It seems that I should adjust it in vect_adjust_loop_lens_control by multiplying by some factor. Is multiplying by max_nscalars_per_iter correct? Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Oh. I see. Thank you so much for pointing this. > Could you tell me what I should do in the codes? > It seems that I should adjust it in > vect_adjust_loop_lens_control > > muliply by some factor ? Is this correct multiply by max_nscalars_per_iter > ? max_nscalars_per_iter * factor
