This patch series is meant to support multiple lane-reducing reduction
statements. Since the original patches conflicted with the new single-lane SLP
node patches, I have reworked most of them and split them into pieces as small
as possible, which should make code review easier.
In the 1st one, I ad
In vectorizable_reduction, a check on a reduction operand via index is already
covered by another check via pointer, so remove the former.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vectorizable_reduction): Remove the duplicated
check.
---
gcc/tree-vect-loop.cc | 6 ++
Two local variables were defined to refer to the same STMT_VINFO_REDUC_TYPE;
it is better to keep only one.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vectorizable_reduction): Remove v_reduc_type, and
replace it with another local variable reduction_type.
---
gcc/tree-vect-loop.cc | 8
The input vectype of the reduction PHI statement must be determined before
vect cost computation for the reduction. Since a lane-reducing operation has a
different input vectype from a normal one, we need to traverse all reduction
statements to find out the input vectype with the least lanes, and set tha
It is better to place the 3 relevant independent variables into an array, since
a following patch needs to access them via an index. At the same time, this
change may make some duplicated code more compact.
Thanks,
Feng
---
gcc/
* tree-vect-loop.cc (vect_transform_reduction):
According to the logic of the code near the assertion, no lane-reducing
operation should appear, not just DOT_PROD_EXPR. Since "use_mask_by_cond_expr_p"
treats SAD_EXPR the same as DOT_PROD_EXPR, and WIDEN_SUM_EXPR should not be
allowed
by the following assertion "gcc_assert (commutative_binary_op_p (.
For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction, the
current vectorizer can only handle the pattern if the reduction chain does not
contain any other operation, whether that other one is normal or lane-reducing.
Actually, to allow multiple arbitrary lane-reducing operations, we need t
When transforming multiple lane-reducing operations in a loop reduction chain,
the corresponding vectorized statements were originally generated into def-use
cycles starting from 0. A def-use cycle with a smaller index would contain
more statements, which means more instruction dependencies. For example
This isn't just http to https, also the anchor has changed.
(Not sure why anyone would go for #2014_The_GNU_Compiler_Collection_(GCC)
- but so be it.)
Pushed.
Gerald
---
htdocs/news.html | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/htdocs/news.html b/htdocs/news.html
index
If the address register is dead after a load/store operation, it looks
beneficial to use LDMIA/STMIA instead of a pair of LDR/STR instructions,
at least when optimizing for size.
E.g.
ldr r0, [r3, #0]
ldr r1, [r3, #4] @ r3 is dead after
will be replaced by
ldmia r3!, {r0, r1}
also for reused reg is
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_print_operand_reloc):
Dedup and sort the comment describing modifiers.
---
It's a non-functional change, so I've not tested it. Ok for trunk?
gcc/config/loongarch/loongarch.cc | 10 +-
1 file changed, 1 insertio
Another case of being able to safely use bset for 1 << n. In this case
the (1 << n) is explicitly zero extended from SI to DI. Two things to
keep in mind. The (1 << n) is done in SImode. So it doesn't directly
define bits 32..63 and those bits are cleared by the explicit zero
extension.
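
The point about bits 32..63 can be checked in plain C. The sketch below is only
an illustration under stated assumptions: the helper names are hypothetical, it
models the SImode shift with a 32-bit unsigned type, and it does not model the
RTL pattern or the hardware's handling of out-of-range shift counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute 1 << n in 32 bits (standing in for SImode),
   then explicitly zero extend the result to 64 bits (standing in for DImode). */
uint64_t shift_si_zext_di(unsigned n)
{
    assert(n < 32);  /* shifting a 32-bit value by 32 or more is undefined in C */
    uint32_t narrow = UINT32_C(1) << n;
    return (uint64_t) narrow;  /* unsigned conversion clears bits 32..63 */
}

/* Hypothetical helper: the value obtained by setting only bit n in a zeroed
   64-bit word. */
uint64_t single_bit_di(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}
```

For every n in 0..31 the two helpers agree, including n = 31, where a sign
extension would instead have set the upper 32 bits.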
Currently the size of the array reg_is_wrapped_separately is LAST_SAVED_REGNUM.
But LAST_SAVED_REGNUM could be a regno that is being saved. So the size needs
to be `LAST_SAVED_REGNUM + 1`, like aarch64_frame->reg_offset is.
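
A minimal sketch of the sizing rule, assuming a made-up register numbering: the
names prefixed with demo_/DEMO_ are hypothetical and are not the aarch64
backend's definitions. An array indexed by register number needs one slot more
than the highest valid index.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical highest register number that may be saved (illustration only). */
enum { DEMO_LAST_SAVED_REGNUM = 7 };

/* One slot per register number 0..DEMO_LAST_SAVED_REGNUM inclusive, hence the
   "+ 1".  Without it, index DEMO_LAST_SAVED_REGNUM would be out of bounds. */
static bool demo_reg_is_wrapped_separately[DEMO_LAST_SAVED_REGNUM + 1];

void demo_mark_wrapped(unsigned regno)
{
    assert(regno <= DEMO_LAST_SAVED_REGNUM);
    demo_reg_is_wrapped_separately[regno] = true;
}

bool demo_is_wrapped(unsigned regno)
{
    assert(regno <= DEMO_LAST_SAVED_REGNUM);
    return demo_reg_is_wrapped_separately[regno];
}
```

With this size, marking the last register itself stays in bounds.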
Committed as obvious after a bootstrap/test for aarch64-linux-gnu.
gcc/Chan
Hi Andre,
On 14.06.24 at 17:05, Andre Vehreschild wrote:
Hi all,
I somehow got assigned to this PR so I fixed it. GFortran was ICEing because of
the ASSUME_RANK in a derived to class conversion. After fixing this, storage
association was producing segfaults. The "shape conversion" of the class
I've committed this libbacktrace patch to not fail on the case where
there are no bits available when looking backward. This can happen at
the very end of the frame if no bits are actually required. The test
case is long and may be proprietary, so not including it.
Bootstrapped and ran libbacktra
Richard Biener wrote on Thu, Jun 6, 2024 at 14:20:
>
> On Thu, 6 Jun 2024, YunQiang Su wrote:
>
> > Richard Biener wrote on Tue, May 28, 2024 at 17:47:
> > >
> > > The following avoids accounting single-lane SLP to the discovery
> > > limit. As the two testcases show this makes discovery fail,
> > > unfortunately even not
From: Pan Li
When investigating the vectorization of .SAT_ADD, we noticed there
are 2 additional forms, namely form 7 and 8, for .SAT_ADD.
Form 7:
#define DEF_SAT_U_ADD_FMT_7(T) \
T __attribute__((noinline)) \
sat_u_add_##T##_fmt_7 (T x, T y)\
{
Ping this thread.
BRs,
Lin
-----Original Message-----
From: Hu, Lin1
Sent: Tuesday, June 11, 2024 2:49 PM
To: gcc-patches@gcc.gnu.org
Cc: Liu, Hongtao ; ubiz...@gmail.com; rguent...@suse.de
Subject: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int,
float -> float and int <->
On Sat, Jun 15, 2024 at 1:22 AM Jeff Law wrote:
>
>
>
> On 6/14/24 11:10 AM, Alexander Monakov wrote:
> >
> > On Fri, 14 Jun 2024, Kong, Lingling wrote:
> >
> >> APX CFCMOV[1] feature implements conditionally faulting which means that
> >> all memory faults are suppressed
> >> when the condition
On Fri, Jun 14, 2024 at 10:53 PM Hongtao Liu wrote:
>
> On Fri, Jun 14, 2024 at 6:31 PM Richard Biener wrote:
> >
> > The following retires vcond{,u,eq} optabs by stopping to use them
> > from the middle-end. Targets instead (should) implement vcond_mask
> > and vec_cmp{,u,eq} optabs. The PR th
On Thu, Jun 13, 2024 at 3:13 PM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to refine all cvtt* instructions with UNSPEC instead of
> FIX/UNSIGNED_FIX, because the intrinsics should behave as documented.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
On Fri, Jun 14, 2024 at 9:35 AM Levy Hsu wrote:
>
> This patch updates the GCC x86 backend to efficiently handle
> odd, incrementally increasing permutations of BF16 vectors
> using the cvtne2ps2bf16 instruction.
> It modifies ix86_vectorize_vec_perm_const to support these operations
> and adds a
Hi,
There are a few PRs (meta-bug PR101926) about accessing aggregate
param/returns which are passed through registers.
We could use the current SRA pass in a special mode right before
RTL expansion for the incoming/outgoing part, as discussed at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-N
Hi Andre,
The patch is OK for mainline. Please change the subject line to have
[PR90076] at the end. I am not sure that the contents of the first square
brackets are especially useful in the commit.
Thanks for the fix
Paul
On Tue, 11 Jun 2024 at 13:57, Andre Vehreschild wrote:
> Hi all,
>
>
On Fri, 14 Jun 2024, Andrew Pinski wrote:
> On Fri, Jun 14, 2024 at 5:54 AM Richard Biener wrote:
> >
> > Automatic arrays that are not address-taken should not be subject to
> > store data races.
>
> That seems conservative enough. Though I would think if the array
> never escaped the function
On Mon, 17 Jun 2024, Kewen.Lin wrote:
> Hi Richi,
>
> on 2024/6/14 18:31, Richard Biener wrote:
> > The following retires vcond{,u,eq} optabs by stopping to use them
> > from the middle-end. Targets instead (should) implement vcond_mask
> > and vec_cmp{,u,eq} optabs. The PR this change refers t
On Fri, Jun 14, 2024 at 9:20 PM Andrew MacLeod wrote:
>
> gimple_range_fold makes an assumption that if there is a LHS on a call
> that it is an ssa_name. Especially later in compilation that may not be
> true.
It's always true if the LHS is of register type (is_gimple_reg_type) and
never true w
We dropped support for 1750a back in 2002.
Pushed.
Gerald
---
htdocs/readings.html | 6 --
1 file changed, 6 deletions(-)
diff --git a/htdocs/readings.html b/htdocs/readings.html
index 0f6032c2..784a3bd7 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -632,12 +632,6 @@ Below