-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947, 115130
Target Milestone: ---
consider the following loop:
#define N 512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286
--- Comment #9 from Tamar Christina ---
(In reply to Thomas Schwinge from comment #8)
> Tamar, thanks! I confirm all fixed -- but one:
>
> (In reply to myself from comment #1)
> > ..., and similarly -- but not identical! -- for '-march=gfx1100
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119858
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
Tamar Christina changed:
What|Removed |Added
Priority|P1 |P2
--- Comment #23 from Tamar Christi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
Tamar Christina changed:
What|Removed |Added
Target Milestone|15.0|14.3
Summary|[15 Regressio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
Tamar Christina changed:
What|Removed |Added
Keywords|needs-reduction,|
|needs-source
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #18 from Tamar Christina ---
(In reply to Richard Biener from comment #17)
> I wonder if we can use
>
> BIT_FIELD_REF
>
> as the "reduction" step.
Yeah that's the same comment Richard S suggested when we were talking to avoid
th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #16 from Tamar Christina ---
Ok, found the bug and c-vise is running for a testcase.
The issue is as follows:
For early break we need to know which value to start the scalar loop with if we
take an early exit.
Historically this me
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #15 from Tamar Christina ---
The following example reproduces the CFG but not the bad codegen:
https://godbolt.org/z/Thzo7hz8P
This generates the actual code I expected:
_55 = {_2, _2, _2, _2};
_56 = {_11, _11, _11, _11};
_57
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #14 from Tamar Christina ---
There seems to be an off-by-one error in the pre-header when calculating the initial
vector IV.
The starting values are calculated as:
sub z27.s, z23.s, z31.s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187
--- Comment #8 from Tamar Christina ---
(In reply to ktkachov from comment #7)
> Could this be extended to scale Neon intrinsics code to SVE by
> re-vectorising and treating the 128-bit Neon lane as a Q-word element of a
> wider SVE vector? I t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119577
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> IIRC it depends on the "kind" of early break whether we need the
> first IV (scalar IV possible) or the last, but I don't remember exactly.
First is always
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #13 from Tamar Christina ---
Sorry had a week off, looking into this again today.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
--- Comment #18 from Tamar Christina ---
(In reply to Pavol Rusnak from comment #17)
> Is the fix going to be backported from master to 14.x release? Possibly
> targeting 14.3.0 release?
Yep
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #9 from Tamar Christina ---
---
static bool next_ci(int dimYY, int numCells, int nth, int ci_block, int* ci_x,
int* ci_y, int* ci_b, int* ci)
{
while (*ci >= *ci_x * dimYY + *ci_y + 1)
{
*ci_y += 1;
if (*ci_y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450
--- Comment #11 from Tamar Christina ---
(In reply to Richard Biener from comment #10)
> Can anybody still reproduce this?
I can't. I can reproduce the failure with the original commit but cannot with
today's trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #8 from Tamar Christina ---
Looking at it some more, I think the loop is valid to vectorize. But we don't
seem to vectorize the reduction jumping back to the outerloop:
;; basic block 384, loop depth 3, count 8598980 (estimated lo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402
--- Comment #3 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #2)
> Started with r14-5673-g33c2b70dbabc02788caabcbc66b7baeafeb95bcf
> With -O2 -mtune=generic it is fine even on the current trunk.
Seems like it's due to missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #12 from Tamar Christina ---
Sorry for the slow response, had a few days off.
The regression here can be reproduced through this example loop:
https://godbolt.org/z/jnGe5x4P7
for the current loop in snappy what you want is -UALIGNE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #7 from Tamar Christina ---
Sorry for the delay, had a few days off.
So looking at this again, it's happening when next_ci gets inlined into
nbnxn_make_pairlist_part, the while loop
while (next_ci(iGrid, nth, ci_block, &ci_x, &ci_y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119393
--- Comment #3 from Tamar Christina ---
Confirmed.
|1
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
--- Comment #4 from Tamar Christina ---
While looking at the codegen it looks like GROMACS has a lot of loops that get
vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #6 from Tamar Christina ---
(In reply to ktkachov from comment #5)
> (In reply to Tamar Christina from comment #4)
> > While looking at the codegen it looks like GROMACS has a lot of loops that
> > get vectorized now and it's showing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286
--- Comment #5 from Tamar Christina ---
Still have one to fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
--- Comment #9 from Tamar Christina ---
(In reply to Hongtao Liu from comment #8)
> (In reply to Tamar Christina from comment #7)
> > (In reply to Hongtao Liu from comment #6)
> > > I noticed some double-counting of cost in group-candidate (reg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #24 from Tamar Christina ---
Hi,
Yeah vectorization was one of the reasons for the slowdown.
Do note however it's not entirely safe to backport that patch, as it exposes
another bug which has a large fix.
At least the top two comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #3 from Tamar Christina ---
Confirmed, able to reproduce it now.
Taking a look. -march=armv8-a+sve is enough FWIW.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
>
> for (...)
>   a[32*i] = ..;
>   a[32*i+1] = ..;
> ...
>   a[32*i + 31] = ...;
>
> to match the number of lanes in a HW vector. It shares some of the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974
--- Comment #3 from Tamar Christina ---
and using the SVE CC regs:
.L6:
ldr q30, [x2, x0]
cmple p15.s, p7/z, z30.s, #0
b.none .L2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #11 from Tamar Christina ---
Actually I just realized that loop uses two pointers, and we can only peel for
one unknown misalignment atm. This loop will instead be versioned, and because
of the manual misalignment in the caller I don
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #10 from Tamar Christina ---
(In reply to Matthew Malcomson from comment #9)
> (In reply to Tamar Christina from comment #8)
> > Ok, so having looked at this I'm not sure the compiler is at fault here.
> >
> > Similar to the SVN cas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > There is another bug report for a similar thing but with SSE and AVX2.
>
> yes PR 95960.
Ah yeah, I guess I w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #8 from Tamar Christina ---
Ok, so having looked at this I'm not sure the compiler is at fault here.
Similar to the SVN case the snappy code is misaligning the loads intentionally
and loading 64-bits at a time from the 8-bit pointe
-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Today there's a lot of code written as intrinsics for older microarchitectures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118464
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855
Tamar Christina changed:
What|Removed |Added
Summary|[14/15 Regression] Unsafe |[14 Regression] Unsafe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119145
--- Comment #1 from Tamar Christina ---
The vectorizer seems confused. Vectorization fails, but seems to fail during
SLP transform so the ifc loop is kept, but the statements not transformed.
it then produces broken SSA:
note: * Analysis
-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
The following testcase:
typedef short Quantum;
Quantum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #6 from Tamar Christina ---
Ok, now really confirmed :)
Interestingly the behavior on other uarches suggests this may be cost
modelling.
On Neoverse-V1 we get (without LTO):
BM_UFlat/0/1 -4.60251
BM_UFlat/0/2 -2.34742
BM_UFlat/3/1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #5 from Tamar Christina ---
Ah... It looks like somehow the build for
/data/gcc/gcc-with-68326d5d1a5-install/ failed and it was silently picking up
the distro compiler instead.
Hence the difference in memmove only!
I'll clean every
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108
--- Comment #4 from Tamar Christina ---
(In reply to Matthew Malcomson from comment #3)
> I only looked into VecSource/5/2, and unfortunately I looked into it on an
> internal setup that compiles slightly differently.
>
> In that slightly diffe
||2025-03-05
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
Ever confirmed|0 |1
--- Comment #2 from Tamar Christina ---
Confirmed.
The only early break vectorization is in the reporting harness in
benchmark
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
--- Comment #13 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #12)
> E.g. the i386 backend usually uses force_reg in this case. If the operand
> is a REG, it does nothing, if it is a SUBREG, it is forced into a temporary
> an
||tnfchris at gcc dot gnu.org
Ever confirmed|0 |1
Last reconfirmed||2025-02-27
Status|UNCONFIRMED |NEW
--- Comment #1 from Tamar Christina ---
The late-combine pass was supposed to handle
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
--- Comment #10 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #9)
> On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
> >
> > -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
--- Comment #8 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #7)
> On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote:
>
> > Because of the scalar code doing DI mode loads, and the misalignment being
&g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
--- Comment #6 from Tamar Christina ---
At the start of the second iteration len = 2, so start becomes misaligned at
0x7fffe2f2
but the peeling iteration code checks (0x7fffe2f2 / 8) & 1 which is 0, so
it doesn't peel to align it.
Inde
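The arithmetic quoted in this comment can be checked with a small C sketch. The helper names below are hypothetical and exist only for illustration; this is not the vectorizer's peeling code, just the two expressions applied to the address given in the comment.

```c
#include <stdint.h>

/* Hypothetical helper: byte offset of an address within an 8-byte block.
   A non-zero result means the address is not 8-byte aligned. */
static unsigned misalignment_in_bytes(uintptr_t addr)
{
    return (unsigned)(addr & 7u);
}

/* Hypothetical helper: the check as quoted in the comment, (addr / 8) & 1.
   It only looks at bit 3 of the address, so the low three bits are ignored. */
static unsigned check_from_comment(uintptr_t addr)
{
    return (unsigned)((addr / 8u) & 1u);
}
```

For the address 0x7fffe2f2, misalignment_in_bytes gives 2 while check_from_comment gives 0, which matches the observation that the pointer is misaligned but no peeling is requested.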
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
Tamar Christina changed:
What|Removed |Added
Priority|P3 |P1
Last reconfirmed|
|tree-optimization
Last reconfirmed||2025-02-24
CC||tnfchris at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #11 from Tamar Christina ---
Confirmed.
As Kyrill mentioned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
-valid
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: arm*
The following intrinsics incorrectly take a pointer to unsigned rather
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
--- Comment #11 from Tamar Christina ---
(In reply to Richard Sandiford from comment #10)
> (In reply to Tamar Christina from comment #9)
> > I swear that was something that was fixed. But in any case, the simplest
> > fix is to force it into a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
--- Comment #9 from Tamar Christina ---
(In reply to Andrew Pinski from comment #8)
> (In reply to Tamar Christina from comment #7)
> >
> > But operand1 is marked as `register_operand` which means whatever did the
> > expansion didn't honor the
: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following example:
# pragma STDC FP_CONTRACT ON
# if __GNUC__ >= 4
# pragma GCC optimize ("no-fast-math,fp-contract=on")
# endif
# ifdef __FAST_MATH__
#
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 117270, which changed state.
Bug 117270 Summary: [15 Regression] 9% exec time slowdown of 538.imagick_r on
aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270
What|Removed |Adde
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270
Tamar Christina changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 118691, which changed state.
Bug 118691 Summary: [15 Regression] gcc_r in SPECCPU 2017 miscompare on train
dataset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
What|Removed |Adde
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118464
--- Comment #14 from Tamar Christina ---
Still being worked on, I'll send v3 of the patch today or tomorrow.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611
--- Comment #8 from Tamar Christina ---
Yeah, that makes sense. Thanks for working on it!
We've been trying to reduce the different cases where we see this happening in
the hopes to provide more data to tune any possible heuristics.
So the pa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611
Tamar Christina changed:
What|Removed |Added
CC||acoplan at gcc dot gnu.org
--- Commen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118852
--- Comment #4 from Tamar Christina ---
(In reply to ktkachov from comment #3)
> FWIW I see this also on aarch64
I filed the AArch64 bug weeks ago
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691, there we don't need
-fprofile-generate to tr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118800
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118211
Bug 118211 depends on bug 118754, which changed state.
Bug 118754 Summary: [15 Regression] FAIL: gcc.target/i386/pr106010-8c.c by
r15-6807-g68326d5d1a593d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118753
Bug 118753 depends on bug 118754, which changed state.
Bug 118754 Summary: [15 Regression] FAIL: gcc.target/i386/pr106010-8c.c by
r15-6807-g68326d5d1a593d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754
Tamar Christina changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
||tnfchris at gcc dot gnu.org
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
Last reconfirmed||2025-02-08
Status|UNCONFIRMED |ASSIGNED
--- Comment #1 from Tamar Christina ---
Arg, wonder
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118756
Tamar Christina changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754
--- Comment #3 from Tamar Christina ---
As for vect-tail-nomask-1.c and pr106010-8c.c they are testisms that I had
fixed but it seems like I never updated the final patch with.
The result checking loops just need a
#pragma GCC novector.
Will s
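A minimal sketch of what such a result-checking loop might look like with the pragma applied. The function name, array size, and element type are assumptions made for illustration and are not taken from vect-tail-nomask-1.c or pr106010-8c.c; the pragma itself is available in GCC 14 and later, and compilers that do not know it are expected to ignore it.

```c
#define N 64  /* assumed array length, for illustration only */

/* Hypothetical result check: returns the number of mismatching elements. */
static int count_mismatches(const int *got, const int *expected)
{
    int bad = 0;
    /* Ask GCC not to vectorize the checking loop, so that the check itself
       does not add vectorized loops to what the test scans for. */
#pragma GCC novector
    for (int i = 0; i < N; i++)
        if (got[i] != expected[i])
            bad++;
    return bad;
}
```

Keeping the check scalar means only the loop under test contributes to the vectorizer dump that the testcase inspects.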
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118756
--- Comment #2 from Tamar Christina ---
Ah, indeed it's unused now.
I'll send a cleanup patch then. Thanks for catching it!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
--- Comment #7 from Tamar Christina ---
(In reply to Richard Biener from comment #6)
> Works for me on x86_64-linux with -Ofast -march=znver4
Yeah still failing here. I'll track down the change in code this week. It's on
my list for the week.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #17 from Tamar Christina ---
(In reply to Xi Ruoyao from comment #16)
> (In reply to Tamar Christina from comment #15)
> > (In reply to Xi Ruoyao from comment #13)
> > > For example for the original gcc.dg/pr108692.c:
> > >
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #15 from Tamar Christina ---
(In reply to Xi Ruoyao from comment #13)
> For example for the original gcc.dg/pr108692.c:
>
> a.0_4 = (unsigned char) a_14;
> _5 = (int) a.0_4;
> b.1_6 = (unsigned char) b_16;
> _7 = (int) b.1_6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #11 from Tamar Christina ---
(In reply to Xi Ruoyao from comment #10)
> The difference from AArch64 and LoongArch64 is AArch64 has WIDEN_ABD, and
> (with GCC 14.2):
>
> t.c:10:17: note: abd pattern recognized: patt_29 = (int) patt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #8 from Tamar Christina ---
That change was made in g:aec90c8bf30cbd66e4febae2c78622dc217f3918, but no real
explanation as to why.
patt_40 = (signed char) a.0_4;
patt_41 = SAD_EXPR ;
would be ok if it was
patt_41 = SAD_EXPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
Tamar Christina changed:
What|Removed |Added
Summary|[15 Regression] gcc_r in|[15 Regression] gcc_r in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
--- Comment #4 from Tamar Christina ---
(In reply to Andrew Pinski from comment #3)
> Isn't this a dup of bug 115450 ?
Don't believe so. This is only showing up with PGO for me, but it's only during
training, so I suspect -fprofile-generate is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #14 from Tamar Christina ---
Should be fixed now on trunk and GCC 14 and 13, leaving it open for Iain's
patch introducing the cores in aarch64-cores.def which would give us the right
architecture too.
However this should unblock the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901
Tamar Christina changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> Please add -fno-strict-aliasing and try again.
Already on. Full options are:
-fprofile-generate -mcpu=neoverse-v1 -Ofast -fomit-frame-pointer -flto=auto -g
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
During the training part the run fails with:
200.c: In function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > I think this is the same as PR 82237.
>
> Or at least related.
I'm not sure, in this one the instructions hav
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
The following example:
#include
float32x4_t
bad (float32x4_t x, float32x4_t c0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118273
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113425
--- Comment #6 from Tamar Christina ---
(In reply to Torbjorn SVENSSON from comment #5)
> @Tamar: You can see the same fails with 14.2.Rel1 that is available for
> download from the Arm webpage.
>
> I see the following in my gcc.log for Cortex-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118273
--- Comment #2 from Tamar Christina ---
It seems that the nmasks is wrong here:
unsigned nmasks
= exact_div (ncopies * bestn->simdclone->simdlen,
TYPE_VECTOR_SUBPARTS (vecty
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118529
--- Comment #7 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #6)
> On Fri, 17 Jan 2025, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118529
> >
> > -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118529
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
> Confirmed.
> OTOH the initial choice of mask mode for the compare by the vectorizer
> is a bit odd. We get there from vect_recog_bool_pattern handling
>
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Tamar Christina changed:
What|Removed |Added
Version|14.0|13.0
--- Comment #11 from Tamar Chris
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118451
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118472
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118472
--- Comment #3 from Tamar Christina ---
reducer:
typedef int a;
typedef struct {
a b __attribute__((__vector_size__(8)))
} c;
typedef a d __attribute__((__vector_size__(8)));
c e, f, g;
d h, j;
void k() {
c l;
l.b[1] = 0;
c m = l;
|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> Confirmed with -O3 -fopenmp-simd. The operand_equal_p code isn't good:
>
> 3745 /* BIT_INSERT_EXPR has an implict opera