[PATCH 1/8] vect: Add a function to check lane-reducing stmt

2024-06-16 Thread Feng Xue OS
This patch series is meant to support multiple lane-reducing reduction statements. Since the original patches conflicted with the new single-lane SLP node patches, I have reworked most of them and split them into pieces as small as possible, which may make code review easier. In the 1st one, I ad

[PATCH 2/8] vect: Remove duplicated check on reduction operand

2024-06-16 Thread Feng Xue OS
In vectorizable_reduction, one check on a reduction operand via index is subsumed by another check via pointer, so remove the former. Thanks, Feng --- gcc/ * tree-vect-loop.cc (vectorizable_reduction): Remove the duplicated check. --- gcc/tree-vect-loop.cc | 6 ++

[PATCH 3/8] vect: Use one reduction_type local variable

2024-06-16 Thread Feng Xue OS
Two local variables were defined to refer to the same STMT_VINFO_REDUC_TYPE; it is better to keep only one. Thanks, Feng --- gcc/ * tree-vect-loop.cc (vectorizable_reduction): Remove v_reduc_type, and replace it with another local variable reduction_type. --- gcc/tree-vect-loop.cc | 8

[PATCH 4/8] vect: Determine input vectype for multiple lane-reducing

2024-06-16 Thread Feng Xue OS
The input vectype of a reduction PHI statement must be determined before vect cost computation for the reduction. Since a lane-reducing operation has a different input vectype from a normal one, we need to traverse all reduction statements to find the input vectype with the least lanes, and set tha

[PATCH 5/8] vect: Use an array to replace 3 relevant variables

2024-06-16 Thread Feng Xue OS
It's better to place 3 relevant independent variables into an array, since we need to access them via an index in the following patch. At the same time, this change may make some duplicated code more compact. Thanks, Feng --- gcc/ * tree-vect-loop.cc (vect_transform_reduction):

[PATCH 6/8] vect: Tighten an assertion for lane-reducing in transform

2024-06-16 Thread Feng Xue OS
According to the logic of the code near the assertion, no lane-reducing operation should appear, not just DOT_PROD_EXPR. Since "use_mask_by_cond_expr_p" treats SAD_EXPR the same as DOT_PROD_EXPR, and WIDEN_SUM_EXPR should not be allowed by the following assertion "gcc_assert (commutative_binary_op_p (.

[PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-16 Thread Feng Xue OS
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction, the current vectorizer can only handle the pattern if the reduction chain does not contain any other operation, whether normal or lane-reducing. Actually, to allow multiple arbitrary lane-reducing operations, we need t
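
As a rough illustration of the kind of loop this entry is about, here is a minimal C sketch of one reduction chain that mixes several candidate lane-reducing operations. The function name `mixed_lane_reducing_sum` and its parameters are hypothetical and do not come from the patch. Whether GCC actually turns these statements into dot-product or widening-sum operations depends on the target and the options in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: a single accumulator fed by two multiply-accumulate
   statements on narrow inputs and one widening add.  Each statement is a
   candidate lane-reducing operation, and all three share one reduction
   chain through `sum`.  */
int mixed_lane_reducing_sum(const signed char *a, const signed char *b,
                            const signed char *c, const signed char *d,
                            const short *w, size_t n)
{
    assert(a && b && c && d && w);

    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];   /* dot-product candidate (char * char -> int) */
        sum += c[i] * d[i];   /* second dot-product candidate in the chain */
        sum += w[i];          /* widening-sum candidate (short -> int) */
    }
    return sum;
}
```

The scalar semantics are just three additions per iteration; the point is that the narrow inputs and the wide accumulator differ in lane count, which is what makes the vectorized form "lane-reducing".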

[PATCH 8/8] vect: Optimize order of lane-reducing statements in loop def-use cycles

2024-06-16 Thread Feng Xue OS
When transforming multiple lane-reducing operations in a loop reduction chain, originally, corresponding vectorized statements are generated into def-use cycles starting from 0. The def-use cycle with a smaller index would contain more statements, which means more instruction dependency. For example

[pushed] wwwdocs: news: Update link to our ACM SIGPLAN award

2024-06-16 Thread Gerald Pfeifer
This isn't just http to https, also the anchor has changed. (Not sure why anyone would go for #2014_The_GNU_Compiler_Collection_(GCC) - but so be it.) Pushed. Gerald --- htdocs/news.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/news.html b/htdocs/news.html index

[RFC PATCH] ARM: thumb1: Use LDMIA/STMIA for DI/DF loads/stores

2024-06-16 Thread Siarhei Volkau
If the address register is dead after the load/store operation, it looks beneficial to use LDMIA/STMIA instead of a pair of LDR/STR instructions, at least when optimizing for size. E.g. ldr r0, [r3, #0] ldr r1, [r3, #4] @ r3 is dead after will be replaced by ldmia r3!, {r0, r1} also for reused reg is

[PATCH] LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc

2024-06-16 Thread Xi Ruoyao
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Dedup and sort the comment describing modifiers. --- It's a non-functional change thus I've not tested it. Ok for trunk? gcc/config/loongarch/loongarch.cc | 10 +- 1 file changed, 1 insertio

[to-be-committed][RISC-V] Improve variable bit set for rv64

2024-06-16 Thread Jeff Law
Another case of being able to safely use bset for 1 << n. In this case the (1 << n) is explicitly zero extended from SI to DI. Two things to keep in mind. The (1 << n) is done in SImode. So it doesn't directly define bits 32..63 and those bits are cleared by the explicit zero extension.
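
The source-level shape described in this entry can be sketched in C as below. This is only an illustration under assumptions: the function name `single_bit_zext` is hypothetical, the shift count is restricted to 0..31 so the 32-bit shift is well defined, and nothing here claims which instructions a given compiler version emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: compute (1 << n) in 32 bits, then explicitly
   zero-extend the result to 64 bits.  The 32-bit shift says nothing about
   bits 32..63; the conversion to uint64_t is what clears them.  */
uint64_t single_bit_zext(unsigned n)
{
    assert(n < 32);                      /* keep the 32-bit shift well defined */
    uint32_t narrow = UINT32_C(1) << n;  /* shift performed in 32 bits */
    return (uint64_t) narrow;            /* explicit zero extension */
}
```

Because exactly one bit in 0..31 is set and the upper half is known to be zero, the whole expression is equivalent to setting bit n in a zeroed 64-bit register.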

[PATCH] aarch64: Fix reg_is_wrapped_separately array size [PR100211]

2024-06-16 Thread Andrew Pinski
Currently the size of the array reg_is_wrapped_separately is LAST_SAVED_REGNUM. But LAST_SAVED_REGNUM could be a regno that is being saved. So the size needs to be `LAST_SAVED_REGNUM + 1` like aarch64_frame->reg_offset is. Committed as obvious after a bootstrap/test for aarch64-linux-gnu. gcc/Chan
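
The off-by-one described here is the usual inclusive-upper-bound sizing rule: an array indexed by every value from 0 through LAST needs LAST + 1 elements. A minimal C sketch follows, with hypothetical names (`LAST_REGNUM`, `wrapped`, `mark_wrapped`, `is_wrapped`) that are stand-ins rather than the aarch64 backend's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering: valid indices are 0..LAST_REGNUM,
   and LAST_REGNUM itself is a register that may need a slot.  */
enum { LAST_REGNUM = 7 };

/* Sized LAST_REGNUM + 1 so that index LAST_REGNUM is in bounds.
   Declaring it with only LAST_REGNUM elements would make the last
   register's slot an out-of-bounds access.  */
static bool wrapped[LAST_REGNUM + 1];

void mark_wrapped(unsigned regno)
{
    assert(regno <= LAST_REGNUM);
    wrapped[regno] = true;
}

bool is_wrapped(unsigned regno)
{
    assert(regno <= LAST_REGNUM);
    return wrapped[regno];
}
```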

Re: [Fortran, Patch, PR 96992] Fix Class arrays of different ranks are rejected as storage association argument

2024-06-16 Thread Harald Anlauf
Hi Andre, On 14.06.24 at 17:05, Andre Vehreschild wrote: Hi all, I somehow got assigned to this PR so I fixed it. GFortran was ICEing because of the ASSUME_RANK in a derived to class conversion. After fixing this, storage association was producing segfaults. The "shape conversion" of the class

libbacktrace patch committed: OK if zero backward bits

2024-06-16 Thread Ian Lance Taylor
I've committed this libbacktrace patch to not fail on the case where there are no bits available when looking backward. This can happen at the very end of the frame if no bits are actually required. The test case is long and may be proprietary, so not including it. Bootstrapped and ran libbacktra

Re: [PATCH] tree-optimization/115254 - don't account single-lane SLP against discovery limit

2024-06-16 Thread YunQiang Su
On Thu, Jun 6, 2024 at 14:20, Richard Biener wrote: > > On Thu, 6 Jun 2024, YunQiang Su wrote: > > > On Tue, May 28, 2024 at 17:47, Richard Biener wrote: > > > > > > The following avoids accounting single-lane SLP to the discovery > > > limit. As the two testcases show this makes discovery fail, > > > unfortunately even not

[PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

2024-06-16 Thread pan2 . li
From: Pan Li When investigating the vectorization of .SAT_ADD, we noticed there are 2 additional forms, namely forms 7 and 8, for .SAT_ADD. Form 7: #define DEF_SAT_U_ADD_FMT_7(T) \ T __attribute__((noinline)) \ sat_u_add_##T##_fmt_7 (T x, T y)\ {

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-16 Thread Hu, Lin1
Ping this thread. BRs, Lin -Original Message- From: Hu, Lin1 Sent: Tuesday, June 11, 2024 2:49 PM To: gcc-patches@gcc.gnu.org Cc: Liu, Hongtao ; ubiz...@gmail.com; rguent...@suse.de Subject: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <->

Re: [PATCH 0/3] [APX CFCMOV] Support APX CFCMOV

2024-06-16 Thread Hongtao Liu
On Sat, Jun 15, 2024 at 1:22 AM Jeff Law wrote: > > > > On 6/14/24 11:10 AM, Alexander Monakov wrote: > > > > On Fri, 14 Jun 2024, Kong, Lingling wrote: > > > >> APX CFCMOV[1] feature implements conditionally faulting which means that > >> all memory faults are suppressed > >> when the condition

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-16 Thread Hongtao Liu
On Fri, Jun 14, 2024 at 10:53 PM Hongtao Liu wrote: > > On Fri, Jun 14, 2024 at 6:31 PM Richard Biener wrote: > > > > The following retires vcond{,u,eq} optabs by stopping to use them > > from the middle-end. Targets instead (should) implement vcond_mask > > and vec_cmp{,u,eq} optabs. The PR th

Re: [PATCH] i386: Refine all cvtt* instructions with UNSPEC instead of FIX/UNSIGNED_FIX.

2024-06-16 Thread Hongtao Liu
On Thu, Jun 13, 2024 at 3:13 PM Hu, Lin1 wrote: > > Hi, all > > This patch aims to refine all cvtt* instructions with UNSPEC instead of > FIX/UNSIGNED_FIX. Because the intrinsics should behave as documented. > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Ok. > > BRs, > Lin >

Re: [PATCH] x86: Emit cvtne2ps2bf16 for odd increasing perm in __builtin_shufflevector

2024-06-16 Thread Hongtao Liu
On Fri, Jun 14, 2024 at 9:35 AM Levy Hsu wrote: > > This patch updates the GCC x86 backend to efficiently handle > odd, incrementally increasing permutations of BF16 vectors > using the cvtne2ps2bf16 instruction. > It modifies ix86_vectorize_vec_perm_const to support these operations > and adds a

[PATCH] fsra: gimple final sra pass for parameters and returns

2024-06-16 Thread Jiufu Guo
Hi, There are a few PRs (meta-bug PR101926) about accessing aggregate params/returns which are passed through registers. We could use the current SRA pass in a special mode right before RTL expansion for the incoming/outgoing part, as discussed at: https://gcc.gnu.org/pipermail/gcc-patches/2023-N

Re: [Patch, Fortran, 90076] 1/3 Fix Polymorphic Allocate on Assignment Memory Leak

2024-06-16 Thread Paul Richard Thomas
Hi Andre, The patch is OK for mainline. Please change the subject line to have [PR90076] at the end. I am not sure that the contents of the first square brackets are especially useful in the commit. Thanks for the fix Paul On Tue, 11 Jun 2024 at 13:57, Andre Vehreschild wrote: > Hi all, > >

Re: [PATCH] Enhance if-conversion for automatic arrays

2024-06-16 Thread Richard Biener
On Fri, 14 Jun 2024, Andrew Pinski wrote: > On Fri, Jun 14, 2024 at 5:54 AM Richard Biener wrote: > > > > Automatic arrays that are not address-taken should not be subject to > > store data races. > > That seems conservative enough. Though I would think if the array > never escaped the function

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-16 Thread Richard Biener
On Mon, 17 Jun 2024, Kewen.Lin wrote: > Hi Richi, > > on 2024/6/14 18:31, Richard Biener wrote: > > The following retires vcond{,u,eq} optabs by stopping to use them > > from the middle-end. Targets instead (should) implement vcond_mask > > and vec_cmp{,u,eq} optabs. The PR this change refers t

Re: [COMMITTED] Do not assume LHS of call is an ssa-name.

2024-06-16 Thread Richard Biener
On Fri, Jun 14, 2024 at 9:20 PM Andrew MacLeod wrote: > > gimple_range_fold makes an assumption that if there is a LHS on a call > that it is an ssa_name. Especially later in compilation that may not be > true. It's always true if the LHS is of register type (is_gimple_reg_type) and never true w

[pushed] wwwdocs: readings: Drop 1750a section

2024-06-16 Thread Gerald Pfeifer
We dropped support for 1750a back in 2002. Pushed. Gerald --- htdocs/readings.html | 6 -- 1 file changed, 6 deletions(-) diff --git a/htdocs/readings.html b/htdocs/readings.html index 0f6032c2..784a3bd7 100644 --- a/htdocs/readings.html +++ b/htdocs/readings.html @@ -632,12 +632,6 @@ Below