from:"tnfchris at gcc dot gnu.org"

[Bug tree-optimization/120922] [16 Regression] RISC-V: ICE during GIMPLE pass: vect in verify_range with -mrvv-max-lmul=m8

2025-07-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/120922] [16 Regression] RISC-V: ICE during GIMPLE pass: vect in verify_range with -mrvv-max-lmul=m8

2025-07-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922 --- Comment #9 from Tamar Christina --- (In reply to Robin Dapp from comment #6) > (In reply to Tamar Christina from comment #5) > > Question, can I count on > > > > -march=rv64gcv_zvl1024b -mrvv-vector-bits=zvl -mrvv-max-lmul=m8 > > > > alway

[Bug tree-optimization/120980] Vectorizer (early exit) introduces out-of-bounds memory access

2025-07-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980 --- Comment #10 from Tamar Christina --- Could we perhaps emit additional annotation into gimple to describe what the vectorizer thinks is safe? And the tool verifies the claims?

[Bug tree-optimization/120922] [16 Regression] RISC-V: ICE during GIMPLE pass: vect in verify_range with -mrvv-max-lmul=m8

2025-07-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922 --- Comment #5 from Tamar Christina --- Question, can I count on -march=rv64gcv_zvl1024b -mrvv-vector-bits=zvl -mrvv-max-lmul=m8 always being available as a codegen option for RVV? or do I need some require-effective-target checks?

[Bug tree-optimization/120980] Vectorizer (early exit) introduces out-of-bounds memory access

2025-07-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > > Now, the testcase shows a missed optimization - we are unnecessarily > using a large VF because of > > t.c:2:21: note: ==> examining phi: ivtmp_21 = PHI

[Bug target/120922] [16 Regression] RISC-V: ICE during GIMPLE pass: vect in verify_range with -mrvv-max-lmul=m8

2025-07-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/120922] [16 Regression] RISC-V: ICE during GIMPLE pass: vect in verify_range with -mrvv-max-lmul=m8

2025-07-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-07-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #17 from Tamar Christina --- (In reply to Richard Biener from comment #16) > No, that cannot be required for correct operation. I think DSE is wrong in > assessing that the store covers more than 5 bytes. The following fixes it > f

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-07-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #15 from Tamar Christina --- (In reply to Richard Biener from comment #13) > (In reply to Tamar Christina from comment #12) > > Looks like the problem is that during ao_ref_init_from_ptr_and_range when > > initializing vectp_target.1

[Bug tree-optimization/120980] Vectorizer (early exit) introduces out-of-bounds memory access

2025-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980 --- Comment #1 from Tamar Christina --- I'm not sure that I'd draw the same conclusion. I view it as the vectorizer has put a 32-byte alignment requirement on the object and so I'd consider the object itself to be 32-bytes sized. So to the not

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #12 from Tamar Christina --- Looks like the problem is that during ao_ref_init_from_ptr_and_range when initializing vectp_target.14_54 = &targetD.4595 + _55; we don't enter the block splitting apart POINTER_PLUS_EXPR. So it ends up

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #11 from Tamar Christina --- (In reply to Richard Biener from comment #10) > (In reply to Tamar Christina from comment #8) > > C testcase > > > > typedef struct { > > int _M_current; > > } __normal_iterator; > > > > typedef str

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 --- Comment #3 from Tamar Christina --- before the bounds variable didn't have any range attached to it. e.g. bnd.704_180 = _181 - _132; but now it shows # RANGE [irange] unsigned int [1, 2147483647] bnd.704_180 = _181 - _132; For some reas

[Bug target/120959] [16 Regression] 9% slowdown of 549.fotonik3d_r on Zen5 since r16-1645-g309dbcea2cabb3

2025-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120959 Tamar Christina changed: What|Removed |Added Assignee|tnfchris at gcc dot gnu.org|unassigned at gcc dot gnu.org

[Bug target/120959] [16 Regression] 9% slowdown of 549.fotonik3d_r on Zen5 since r16-1645-g309dbcea2cabb3

2025-07-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120959 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #9 from Tamar Christina --- So the key to triggering it here is the pass by value. Testing a patch.

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 Tamar Christina changed: What|Removed |Added Target|aarch64-linux-gnu |aarch64-* Build|x86_64-l

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #7 from Tamar Christina --- (In reply to Richard Biener from comment #6) > A C testcase would be really nice. Have not been able to reproduce it as C yet, but here's a cut-down C++ version #include #include #include extern int

[Bug tree-optimization/120817] [13/14/15/16 regression] Wrong code when compiled with -O1 -ftree-loop-vectorize for AArch64 target

2025-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817 --- Comment #4 from Tamar Christina --- Confirmed, bisecting and taking a look.

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org

[Bug target/115842] [15/16 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-06-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 --- Comment #12 from Tamar Christina --- (In reply to Hongtao Liu from comment #11) > (In reply to Tamar Christina from comment #9) > > (In reply to Hongtao Liu from comment #8) > > > (In reply to Tamar Christina from comment #7) > > > > (In rep

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #10 from Tamar Christina --- (In reply to ktkachov from comment #9) > (In reply to Tamar Christina from comment #8) > > (In reply to ktkachov from comment #7) > > > Could this be extended to scale Neon intrinsics code to SVE by > > >

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2025-06-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 116855, which changed state. Bug 116855 Summary: [14 Regression] Unsafe early-break vectorization https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 What|Removed |Added

[Bug tree-optimization/116855] [14 Regression] Unsafe early-break vectorization

2025-06-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 Tamar Christina changed: What|Removed |Added Resolution|--- |WONTFIX Status|NEW

[Bug target/114860] [14/15/16 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2025-06-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 Tamar Christina changed: What|Removed |Added Resolution|--- |WONTFIX Status|UNCONFIRME

[Bug target/120447] [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447 --- Comment #5 from Tamar Christina --- I could be mistaken, but VNx4QI is a partial vector, so every QI element occupies 32-bits (so we'd use a widening load here). I'm not sure this operation is valid for partial vectors as it means you're ta

[Bug tree-optimization/120357] [14/15/16 Regression] ICE in vect "error: definition in block 9 does not dominate use in block 3" with early break

2025-05-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357 --- Comment #9 from Tamar Christina --- (In reply to Richard Biener from comment #8) > The following fixes this. I'm not 100% convinced but it does seem "obvious" > (but for the "peeled" case we seem to eventually create duplicate COND > reduct

[Bug target/120447] [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447 Tamar Christina changed: What|Removed |Added Status|WAITING |NEW Keywords|needs-source

[Bug target/120447] New: [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

Status: UNCONFIRMED Keywords: aarch64-sve, ice-on-valid-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org CC: jschmitz at gcc dot

[Bug tree-optimization/120383] Improving early break unrolled sequences with Adv. SIMD

2025-05-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > Sure, I'm OK with an optab for it. So it's like (half-type)((unsigned)(a + > b) >> (sizeof(a)*4))? Yeah, and I was planning on if an optab was acceptable to
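
For readers skimming the archive, the expression quoted above, (half-type)((unsigned)(a + b) >> (sizeof(a)*4)), can be read as "add, then keep the high half of the sum". A minimal scalar sketch in C, assuming 32-bit unsigned inputs and a 16-bit result; the helper name add_high_half_u32 is hypothetical and is not a GCC optab or API:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only (not a GCC interface).
   Mirrors the quoted expression for a = uint32_t, half-type = uint16_t. */
static inline uint16_t add_high_half_u32(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;                       /* unsigned add, wraps modulo 2^32 */
    return (uint16_t)(sum >> (sizeof(a) * 4));  /* sizeof(a)*4 == 16: keep the top half */
}
```

With these assumptions, a carry out of the low 16 bits is visible in the result, while a carry out of bit 31 is discarded by the wrapping add.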

[Bug tree-optimization/120357] [14/15/16 Regression] ICE in vect "error: definition in block 9 does not dominate use in block 3" with early break

2025-05-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357 --- Comment #7 from Tamar Christina --- (In reply to Richard Biener from comment #5) > Confirmed on trunk. I'll eventually have a look. Sorry I'm on holiday till Tuesday, I'm happy to take a look then if you prefer. I did not mean to dump my b

[Bug tree-optimization/120383] Improving early break unrolled sequences with Adv. SIMD

2025-05-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/120383] New: Improving early break unrolled sequences with Adv. SIMD

2025-05-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Blocks: 53947, 115130 Target Milestone: --- Target: aarch64* Today if we unroll an early break loop such as: #define N 640 long

[Bug middle-end/120352] New: scalar epiloque not needed for early break when exit block is invariant

2025-05-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

: missed-optimization Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Blocks: 53947, 115130 Target Milestone: --- The following sequence #define N 4 int a[N

[Bug tree-optimization/116855] [14 Regression] Unsafe early-break vectorization

2025-05-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 --- Comment #14 from Tamar Christina --- (In reply to Richard Biener from comment #13) > Too late for backporting to 14.3 IMO, also not sure how important it is - we > did not have an actual case where this caused problems AFAIK. early-break >

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #9 from Tamar Christina --- (In reply to rguent...@suse.de from comment #8) > On Thu, 8 May 2025, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 > > > > -

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #7 from Tamar Christina --- (In reply to Richard Biener from comment #6) > (In reply to Tamar Christina from comment #5) > > The given example is an easy one to drop, but I wonder what would happen if > > the block had other instruct

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > Note with "vectorizing" prefetches I meant adjusting the prefetched address, > "vectorizing" it as an induction but only prefetching on the first (or > last?)

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #3 from Tamar Christina --- (In reply to Tamar Christina from comment #2) > (In reply to Richard Biener from comment #1) > > As of today this is a job for the vectorizer if-conversion pass then. > > > > OTOH I believe we should work

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > As of today this is a job for the vectorizer if-conversion pass then. > > OTOH I believe we should work towards vectorizing the prefetches themselves > rathe

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 --- Comment #5 from Tamar Christina --- (In reply to ktkachov from comment #4) > > Ah indeed, -msve-vector-bits= does do what I expected. Feel free to close > > this if it's not tracking anything new then. > > Ok. FWIW the original testcase for

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigne

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 --- Comment #1 from Tamar Christina --- (In reply to ktkachov from comment #0) > Not sure if this is a target-specific issue or not. For input: > int f11(float *x, float val, int n) > { > int i; > for (i = 0; i < n; i++) { > if (

[Bug libstdc++/116140] [15/16 Regression] 5-10% slowdown of 483.xalancbmk and 523.xalancbmk_r since r15-2356-ge69456ff9a54ba

2025-05-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116140 --- Comment #20 from Tamar Christina --- We're currently working on it. The improvements come from architectures where the code vectorized. The performance losses come from those where it didn't vectorize, or the vectorizer generated inefficien

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/118892] [14 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-04-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/119921] [12/13/14/15/16 Regression] ICE building SVE ACLE in varasm

2025-04-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119921 Tamar Christina changed: What|Removed |Added Version|13.3.1 |16.0 Target Milestone|---

[Bug target/119921] New: [12/13/14/15/16 Regression] ICE building SVE ACLE in varasm

2025-04-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Target Milestone: --- Target: aarch64* Created attachment 61193 --> https://gcc.gnu.org/bugzi

[Bug tree-optimization/119881] support a large number of pointers in alias versioning

2025-04-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119881 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > I wonder where this matters in practice and my usual stance is educating > users > about __restrict or #pragma GCC ivdep or OMP simd safelen is better than >

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 --- Comment #10 from Tamar Christina --- (In reply to rguent...@suse.de from comment #9) > On Mon, 21 Apr 2025, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 > > > > -

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 --- Comment #8 from Tamar Christina --- (In reply to Richard Biener from comment #7) > Please make sure to not "fix" something where the input is already wrong - > see the various issues where SCEV produces an invalid CHREC - forming a chrec > i

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

|ASSIGNED Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org Last reconfirmed||2025-04-21 --- Comment #6 from Tamar Christina --- Thanks for the report. This is happening because we're using the affine tree at the

[Bug tree-optimization/119881] New: support alias analysis for large number of pointers

2025-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Blocks: 53947, 115130 Target Milestone: --- Consider the following example: void foo (int *a1, int *a2

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2025-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #27 from Tamar Christina --- (In reply to Tianyang Chou from comment #26) > (In reply to Tamar Christina from comment #0) > > Hi Tamar, > After reading the whole discussion, I still confused about how does the > immediate offset

[Bug tree-optimization/119860] New: needless vector unrolling causes less profitable vectorization

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Blocks: 53947, 115130 Target Milestone: --- consider the following loop: #define N 512

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 --- Comment #9 from Tamar Christina --- (In reply to Thomas Schwinge from comment #8) > Tamar, thanks! I confirm all fixed -- but one: > > (In reply to myself from comment #1) > > ..., and similarly -- but not identical! -- for '-march=gfx1100

[Bug tree-optimization/119858] [15/16 Regression] GCN vs. "middle-end: Fix incorrect codegen with PFA and VLS [PR119351]"

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119858 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Priority|P1 |P2 --- Comment #23 from Tamar Christi

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Target Milestone|15.0|14.3 Summary|[15 Regressio

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Keywords|needs-reduction,| |needs-source

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #18 from Tamar Christina --- (In reply to Richard Biener from comment #17) > I wonder if we can use > > BIT_FIELD_REF > > as the "reduction" step. Yeah that's the same comment Richard S suggested when we were talking to avoid th

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #16 from Tamar Christina --- Ok, found the bug and c-vise is running for a testcase. The issue is as follows: For early break we need to know which value to start the scalar loop with if we take an early exit. Historically this me

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #15 from Tamar Christina --- The following example reproduces the CFG but not the bad codegen: https://godbolt.org/z/Thzo7hz8P This generates the actual code I expected: _55 = {_2, _2, _2, _2}; _56 = {_11, _11, _11, _11}; _57

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #14 from Tamar Christina --- There seems to be an off-by-one error in the pre-header when calculating the initial vector IV. The starting values are calculated as: sub z27.s, z23.s, z31.s

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #8 from Tamar Christina --- (In reply to ktkachov from comment #7) > Could this be extended to scale Neon intrinsics code to SVE by > re-vectorising and treating the 128-bit Neon lane as a Q-word element of a > wider SVE vector? I t

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2025-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug middle-end/119577] RISC-V: Redundant vector IV roundtrip.

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119577 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > IIRC it depends on the "kind" of early break whether we need the > first IV (scalar IV possible) or the last, but I don't remember exactly. First is always

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #13 from Tamar Christina --- Sorry had a week off, looking into this again today.

[Bug target/118892] [14 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 --- Comment #18 from Tamar Christina --- (In reply to Pavol Rusnak from comment #17) > Is the fix going to be backported from master to 14.x release? Possibly > targeting 14.3.0 release? Yep

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #9 from Tamar Christina --- --- static bool next_ci(int dimYY, int numCells, int nth, int ci_block, int* ci_x, int* ci_y, int* ci_b, int* ci) { while (*ci >= *ci_x * dimYY + *ci_y + 1) { *ci_y += 1; if (*ci_y

[Bug tree-optimization/115450] [15 Regression] cpu2017 502.gcc runtime miscompute on aarch64 with SVE since r15-1006-gd93353e6423eca

2025-03-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450 --- Comment #11 from Tamar Christina --- (In reply to Richard Biener from comment #10) > Can anybody still reproduce this? I can't. I can reproduce the failure with the original commit but cannot with today's trunk.

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #8 from Tamar Christina --- Looking at it some more, I think the loop is valid to vectorize. But we don't seem to vectorize the reduction jumping back to the outerloop: ;; basic block 384, loop depth 3, count 8598980 (estimated lo

[Bug tree-optimization/119402] [14/15 Regression] `((-bool) & _6) & (~_6)` is not optimized to 0 on some targets since r14-5673

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402 --- Comment #3 from Tamar Christina --- (In reply to Jakub Jelinek from comment #2) > Started with r14-5673-g33c2b70dbabc02788caabcbc66b7baeafeb95bcf > With -O2 -mtune=generic it is fine even on the current trunk. Seems like it's due to missing

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #12 from Tamar Christina --- Sorry for the slow response, had a few days off. The regression here can be reproduced through this example loop: https://godbolt.org/z/jnGe5x4P7 for the current loop in snappy what you want is -UALIGNE

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #7 from Tamar Christina --- Sorry for the delay, had a few days off. So looking at this again, it's happening When next_ci gets inlined into nbnxn_make_pairlist_part, the while loop while (next_ci(iGrid, nth, ci_block, &ci_x, &ci_y

[Bug tree-optimization/119393] [15 Regression] Worse vectorization of imagick_r hot loop on aarch64 since r15-5024-g2a2e6784074e1f

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119393 --- Comment #3 from Tamar Christina --- Confirmed.

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

|1 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org --- Comment #4 from Tamar Christina --- While looking at the codegen it looks like GROMACS has a lot of loops that get vectorized

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #6 from Tamar Christina --- (In reply to ktkachov from comment #5) > (In reply to Tamar Christina from comment #4) > > While looking at the codegen it looks like GROMACS has a lot of loops that > > get vectorized now and it's showing

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 --- Comment #5 from Tamar Christina --- Still have one to fix.

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 --- Comment #9 from Tamar Christina --- (In reply to Hongtao Liu from comment #8) > (In reply to Tamar Christina from comment #7) > > (In reply to Hongtao Liu from comment #6) > > > I noticed some double-counting of cost in group-candidate (reg

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #24 from Tamar Christina --- Hi, Yeah vectorization was one of the reasons for the slowdown. Do note however it's not entirely safe to backport that patch, as it exposes another bug which has a large fix. At least the top two comm

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #3 from Tamar Christina --- Confirmed, able to reproduce it now. Taking a look. -march=armv8-a+sve is enough FWIW.

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-03-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > > for (...) >a[32*i] = ..; >a[32*i+1] = ..; > ... >a[32*i + 31] = ...; > > to match the number of lanes in a HW vector. It shares some of the

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug target/118974] Use SVE cbranch sequence for Neon modes when TARGET_SVE

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974 --- Comment #3 from Tamar Christina --- and using the SVE CC regs: .L6: ldr q30, [x2, x0] cmple p15.s, p7/z, z30.s, #0 b.none .L2

[Bug target/118974] Use SVE cbranch sequence for Neon modes when TARGET_SVE

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #11 from Tamar Christina --- Actually I just realized that loop uses two pointers, and we can only peel for one unknown misalignment atm. This loop will instead be versioned, and because of the manual misalignment in the caller I don

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #10 from Tamar Christina --- (In reply to Matthew Malcomson from comment #9) > (In reply to Tamar Christina from comment #8) > > Ok, so having looked at this I'm not sure the compiler is at fault here. > > > > Similar to the SVN cas

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > (In reply to Andrew Pinski from comment #1) > > There is another bug report for a similar thing but with SSE and AVX2. > > yes PR 95960. Ah yeah, I guess I w

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #8 from Tamar Christina --- Ok, so having looked at this I'm not sure the compiler is at fault here. Similar to the SVN case the snappy code is misaligning the loads intentionally and loading 64-bits at a time from the 8-bit pointe

[Bug tree-optimization/119187] New: vectorizer should be able to SLP already vectorized code

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Target Milestone: --- Today there's a lot of code written as intrinsics for older microarchitectures

[Bug tree-optimization/118464] [15 Regression] gcc-15.0.0_pre20250112 ICE with opencv-4.10.0 using -O2/-ftree-loop-vectorize: memory_descriptor_ref.cpp:94:19: internal compiler error: in exact_div, at

2025-03-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118464 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/116855] [14 Regression] Unsafe early-break vectorization

2025-03-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 Tamar Christina changed: What|Removed |Added Summary|[14/15 Regression] Unsafe |[14 Regression] Unsafe

[Bug middle-end/119145] [15 Regression] ICE in expanding IFN_MASK_CALL from vector math

2025-03-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119145 --- Comment #1 from Tamar Christina --- The vectorizer seems confused. Vectorization fails, but seems to fail during SLP transform so the ifc loop is kept, but the statements are not transformed. It then produces broken SSA: note: * Analysis

[Bug middle-end/119145] New: [15 Regression] ICE in expanding IFN_MASK_CALL from vector math

2025-03-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Target Milestone: --- Target: aarch64* The following testcase: typedef short Quantum; Quantum

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #6 from Tamar Christina --- Ok, now really confirmed :) Interestingly the behavior on other uarches suggests this may be cost modelling. On Neoverse-V1 we get (without LTO): BM_UFlat/0/1 -4.60251 BM_UFlat/0/2 -2.34742 BM_UFlat/3/1

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #5 from Tamar Christina --- Ah... It looks like somehow the build for /data/gcc/gcc-with-68326d5d1a5-install/ failed and it was silently picking up the distro compiler instead. Hence the difference in memmove only! I'll clean every

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #4 from Tamar Christina --- (In reply to Matthew Malcomson from comment #3) > I only looked into VecSource/5/2, and unfortunately I looked into it on an > internal setup that compiles slightly differently. > > In that slightly diffe

1 - 100 of 1303 matches