[Bug rtl-optimization/69377] [6 Regression] wrong code at -O2 on x86_64-linux-gnu (in 32-bit mode)

2016-01-20 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69377 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug target/69442] [6 Regression] wrong code with -Og and 64bit modulo @ armv7a

2016-01-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69442 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug target/65768] sub-optimimal code for constant Uses in loop

2015-06-02 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65768 --- Comment #3 from kugan at gcc dot gnu.org --- Author: kugan Date: Tue Jun 2 22:53:15 2015 New Revision: 224048 URL: https://gcc.gnu.org/viewcvs?rev=224048&root=gcc&view=rev Log: gcc/ChangeLog: 2015-06-03 Kugan Vivekana

[Bug target/65768] sub-optimimal code for constant Uses in loop

2015-06-02 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65768 kugan at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug target/66554] [4.9 Regression] ICE (in expand_fix, at optabs.c:5365) on aarch64-linux-gnu

2015-06-16 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66554 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug target/66554] [4.9 Regression] ICE (in expand_fix, at optabs.c:5365) on aarch64-linux-gnu

2015-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66554 --- Comment #2 from kugan at gcc dot gnu.org --- can_fix_p is returning CODE_FOR_nothing for converting from tomode=V4SImode to frommode=V4SFmode with branch 4.9. With trunk it is returning CODE_FOR_fix_truncv4sfv4si2.

[Bug target/66554] [4.9 Regression] ICE (in expand_fix, at optabs.c:5365) on aarch64-linux-gnu

2015-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66554 --- Comment #3 from kugan at gcc dot gnu.org --- Correction: with 4.9, when it ICEs, we have: Breakpoint 1, expand_fix (to=to@entry=0x765b5480, from=0x765b2000, unsignedp=unsignedp@entry=0) at /home/kugan/work/sources/gcc-fsf/4.9/gcc

[Bug target/66554] [4.9 Regression] ICE (in expand_fix, at optabs.c:5365) on aarch64-linux-gnu

2015-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66554 --- Comment #6 from kugan at gcc dot gnu.org --- -fno-tree-forwprop works. forwprop propagates: vect__11.22_96 = (vector(4) float) vect_c.21_94; vect__13.24_98 = (vector(4) signed int) vect__11.22_96; into: vect__13.24_98 = (vector(4) signed

[Bug target/66554] [4.9 Regression] ICE (in expand_fix, at optabs.c:5365) on aarch64-linux-gnu

2015-06-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66554 --- Comment #8 from kugan at gcc dot gnu.org --- Starting bisect now.

[Bug tree-optimization/64130] vrp: handle non zero constant divided by range cannot be zero.

2015-06-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64130 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/64130] vrp: handle non zero constant divided by range cannot be zero.

2015-06-19 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64130 --- Comment #10 from kugan at gcc dot gnu.org --- (In reply to Marc Glisse from comment #6) > (In reply to kugan from comment #5) > > I think it should be in from front-end? > > ? Sorry for the confusing terminology. for the c

[Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32

2015-06-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375 kugan at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution

[Bug tree-optimization/64130] vrp: handle non zero constant divided by range cannot be zero.

2015-06-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64130 --- Comment #11 from kugan at gcc dot gnu.org --- Author: kugan Date: Mon Jun 29 00:15:41 2015 New Revision: 225108 URL: https://gcc.gnu.org/viewcvs?rev=225108&root=gcc&view=rev Log: gcc/ChangeLog: 2015-06-29 Kugan Vivekana

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-01 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #1 from kugan at gcc dot gnu.org --- Created attachment 35888 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35888&action=edit untested prototype patch Hi Jeff, Here is a patch (without debug dumps and not tested ful

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #3 from kugan at gcc dot gnu.org --- > really you should handle more > than two arguments to phis. I am not sure how we can handle phi stmt with more than two arguments here. Any hints please?

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #5 from kugan at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #4) > (In reply to kugan from comment #3) > > > really you should handle more > > > than two arguments to phis. > > I am not sure

[Bug tree-optimization/66759] [6 Regression] ICE in generic-match.c on 456.hmmer

2015-07-04 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66759 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #8 from kugan at gcc dot gnu.org --- Author: kugan Date: Sun Jul 12 11:22:42 2015 New Revision: 225722 URL: https://gcc.gnu.org/viewcvs?rev=225722&root=gcc&view=rev Log: gcc/testsuite/ChangeLog: 2015-07-12 Kugan Vivekana

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-13 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #11 from kugan at gcc dot gnu.org --- Thanks for reporting. This test case is valid for targets that have a branch cost greater than 1. One way to handle this is by disabling this for conversions involving bool that are part of branch

[Bug c/66865] wine segfaults from gcc in trunk (r225757) (regression)

2015-07-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66865 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2015-07-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #12 from kugan at gcc dot gnu.org --- Created attachment 35976 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35976&action=edit patch for tree-ssa-reassoc Here is a prototype patch (to fix comment 9) that makes tree-ssa-

[Bug rtl-optimization/66865] [6 Regression] wine segfaults from gcc in trunk (r225757) (regression)

2015-07-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66865 --- Comment #4 from kugan at gcc dot gnu.org --- Ok, I downloaded wine-1.7.47.tar.bz2 and built it with trunk gcc. What do I have to do to reproduce this please?

[Bug rtl-optimization/66865] [6 Regression] wine64 segfaults from gcc in trunk (r225757) (regression)

2015-07-15 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66865 --- Comment #12 from kugan at gcc dot gnu.org --- (In reply to austinenglish from comment #5) > (In reply to kugan from comment #4) > > Ok, I downloaded wine-1.7.47.tar.bz2 and built it with trunk gcc. What do I > > have to do to

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2016-07-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #19 from kugan at gcc dot gnu.org --- Author: kugan Date: Sun Jul 24 12:47:29 2016 New Revision: 238695 URL: https://gcc.gnu.org/viewcvs?rev=238695&root=gcc&view=rev Log: gcc/ChangeLog: 2016-07-24 Kugan Vivekana

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994 --- Comment #2 from kugan at gcc dot gnu.org --- Patch to fix this is posted for review at https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01680.html

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994 --- Comment #4 from kugan at gcc dot gnu.org --- Author: kugan Date: Wed Jul 27 22:45:46 2016 New Revision: 238802 URL: https://gcc.gnu.org/viewcvs?rev=238802&root=gcc&view=rev Log: gcc/testsuite/ChangeLog: 2016-07-28 Kugan Vivekana

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994 --- Comment #5 from kugan at gcc dot gnu.org --- Author: kugan Date: Wed Jul 27 23:02:44 2016 New Revision: 238803 URL: https://gcc.gnu.org/viewcvs?rev=238803&root=gcc&view=rev Log: gcc/testsuite/ChangeLog: 2016-07-28 Kugan Vivekana

[Bug rtl-optimization/68217] Wrong constant folding

2016-07-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68217 --- Comment #4 from kugan at gcc dot gnu.org --- Author: kugan Date: Fri Jul 29 00:35:23 2016 New Revision: 238846 URL: https://gcc.gnu.org/viewcvs?rev=238846&root=gcc&view=rev Log: gcc/ChangeLog: 2016-07-29 Kugan Vivekana

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-08-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-08-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835 --- Comment #4 from kugan at gcc dot gnu.org --- Looks like it was a latent issue. In rewrite_expr_tree, when re-associating operands, we should reset range_info for the LHS. We don’t do that now. The following patch fixes the test case. diff --git

[Bug tree-optimization/61839] More optimize opportunity for VRP

2016-08-19 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61839 --- Comment #3 from kugan at gcc dot gnu.org --- Author: kugan Date: Sat Aug 20 01:18:09 2016 New Revision: 239637 URL: https://gcc.gnu.org/viewcvs?rev=239637&root=gcc&view=rev Log: gcc/testsuite/ChangeLog: 2016-08-20 Kugan Vivekana

[Bug tree-optimization/77387] New: Value range not computed in some cases for ABS_EXPR

2016-08-25 Thread kugan at gcc dot gnu.org
Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- For testcase: int foo (int i) { int x = i; x = __builtin_abs (i); x >>= 24; if (x > 256) return 0; return x; } vrp1 dump is: Val

[Bug tree-optimization/77387] Value range not computed in some cases for ABS_EXPR

2016-08-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77387 --- Comment #1 from kugan at gcc dot gnu.org --- With : diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index e4d789b..2d1f4c8 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -3416,6 +3416,17 @@ extract_range_from_unary_expr_1 (value_range *vr

[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-02-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698 --- Comment #4 from kugan at gcc dot gnu.org --- Thanks for looking into this. The performance issue we were seeing turned out to be mainly due to the glibc malloc issue in https://sourceware.org/bugzilla/show_bug.cgi?id=30945

[Bug middle-end/111683] [11/12/13/14 Regression] Incorrect answer when using SSE2 intrinsics with -O3 since r7-3163-g973625a04b3d9351f2485e37f7d3382af2aed87e

2024-03-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111683 --- Comment #5 from kugan at gcc dot gnu.org --- -O3 -fno-tree-vectorize and -O3 -fno-tree-vrp both work. I looked at the evrp dump and it is not doing anything suspicious. Looks like range_info usage in the vectoriser is causing the problem.

[Bug middle-end/116337] New: Reverse iterated loops has redundant code compared to clang

2024-08-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- For: extern __attribute__((aligned(64))) int a[32000],b[32000]; void s1112(void) { for (int i = 32000 - 1; i >= 0

[Bug tree-optimization/116338] New: GCC is not vectoring TSVC s255 while clang can

2024-08-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- reduced test case: typedef float real_t; extern __attribute__((aligned(64))) real_t a[32000], b[32000]; void s255() { real_t x, y; x = b[32000 -1

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-08-13 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #20 from kugan at gcc dot gnu.org --- (In reply to Richard Sandiford from comment #19) > (In reply to Richard Biener from comment #14) > > Usually targets do have a limit on the actual length

[Bug tree-optimization/116338] GCC is not vectoring TSVC s255 while clang can

2024-08-20 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116338 --- Comment #3 from kugan at gcc dot gnu.org --- (In reply to Richard Biener from comment #2) > The issue is the recurrence > >[local count: 10737416]: > x_10 = b[31999]; > y_11 = b[31998]; > >[local count: 10

[Bug tree-optimization/116338] GCC is not vectoring TSVC s255 while clang can

2024-08-20 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116338 --- Comment #5 from kugan at gcc dot gnu.org --- (In reply to Richard Biener from comment #4) > You can try to see whether adding a SSA copy would make this supported, it > seems not allowing a PHI is simply a missed feature. We now f

[Bug tree-optimization/116528] New: Not vectoring TSVC s318 loop

2024-08-28 Thread kugan at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- See: typedef float real_t; extern __attribute__((aligned(64))) real_t a[32000]; real_t not_woring(struct args_t *func_args) { int k, index; int inc = 1; real_t max, chksum

[Bug middle-end/116562] New: wrong cost of gather load preventing loop from vectored

2024-09-01 Thread kugan at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- typedef int real_t; extern __attribute__((aligned(64))) real_t a[32000],b[32000],c[32000],d[32000]; void s4117() { for (int i = 0; i

[Bug middle-end/116626] New: ICE while VLA vectorisation

2024-09-05 Thread kugan at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 59057 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59057&action=edit testcase For partially reduced code, I am seeing: t.cpp:350:12: internal compile

[Bug middle-end/116626] ICE while VLA vectorisation

2024-09-05 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116626 --- Comment #1 from kugan at gcc dot gnu.org --- Looks duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116569

[Bug middle-end/114653] New: Not vectoring the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 57910 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57910&action=edit testcase Main loop in the attached test case is not vectoriz

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #2 from kugan at gcc dot gnu.org --- Thanks. I see the following in the log: test.cpp:33:53: missed: not vectorized: relevant stmt not supported: _54 = .MASK_LOAD (_53, 32B, _171); test.cpp:22:19: missed: bad operation or

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #3 from kugan at gcc dot gnu.org --- For SVE mode in vect_analyze_loop_2, we have (gdb) p min_vf $15 = {coeffs = {4, 4}} (gdb) p max_vf $16 = 16 Thus maybe_lt (max_vf, min_vf) is false. This results in bad data dependence.

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #4 from kugan at gcc dot gnu.org --- This particular loop has loop->safelen set to 16. Does this mean this can never be loop vectorized for VLA?

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #5 from kugan at gcc dot gnu.org --- ddd for the : ref_a: _57 = D.4803[_20]; ref_b: D.4803[_20] = _ifc__174; We get DDR_ARE_DEPENDENT (ddr) == chrec_dont_know. Hence apply_safelen ().

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 kugan at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |DUPLICATE Status

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 114653, which changed state. Bug 114653 Summary: Not vectorizing the loop with openmp reduction. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 What|Removed |Added -

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #9 from kugan at gcc dot gnu.org --- Looking at the options, it looks to me that making loop->safelen a poly_int is the way to go. (In reply to Jakub Jelinek from comment #4) > The OpenMP safelen clause argument is a scalar integ

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #10 from kugan at gcc dot gnu.org --- Created attachment 57946 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57946&action=edit patch patch to make loop->safelen a poly_int

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #12 from kugan at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #11) > (In reply to kugan from comment #9) > > Looking at the options, looks to me that making loop->safelen a poly_int is > > the way t

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #18 from kugan at gcc dot gnu.org --- Also, can we set INT_MAX when there is no explicit safelen specified in OMP. Something like: --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -6975,14 +6975,11 @@ lower_rec_input_clauses (tree

[Bug tree-optimization/115383] New: ICE with TCVC_2 build

2024-06-07 Thread kugan at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Patch [PATCH 1/4] Relax COND_EXPR reduction vectorization SLP restriction seem to cause ICE while building TSVC_2 Reduced test: cat tsvc_vec.i void dummy(); void s331() { int j

[Bug tree-optimization/115383] [15 Regression] ICE with TCVC_2 build since r15-1053-g28edeb1409a7b8

2024-06-07 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115383 --- Comment #5 from kugan at gcc dot gnu.org --- (In reply to Richard Biener from comment #4) > Created attachment 58378 [details] > patch > > I'm testing this, but I do not have hardware to test correctness (and qemu > not

[Bug tree-optimization/115383] [15 Regression] ICE with TCVC_2 build since r15-1053-g28edeb1409a7b8

2024-06-07 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115383 --- Comment #6 from kugan at gcc dot gnu.org --- (In reply to kugan from comment #5) > (In reply to Richard Biener from comment #4) > > Created attachment 58378 [details] > > patch > > > > I'm testing this, bu

[Bug libgomp/113698] New: GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-01-31 Thread kugan at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Created attachment 57275 --> https://gcc.gnu.

[Bug tree-optimization/115450] New: cpu2017 502.gcc runtime miscompute

2024-06-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- 502.gcc is miscompiled for aarch64 with -O3 -Wl,-z,muldefs -lm -fallow-argument-mismatch -fpermissive -fstack-arrays -flto -Wl,--sort-section=name -fno-strict-aliasing

[Bug tree-optimization/115450] [15 Regression] cpu2017 502.gcc runtime miscompute on aarch64 with SVE since r15-1006-gd93353e6423eca

2024-06-16 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450 --- Comment #2 from kugan at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #1) > >[r15-1006-gd93353e6423eca] Do single-lane SLP discovery for reductions > > > Interesting because PR 115256 bisect it to an earlier p

[Bug tree-optimization/116785] [15 Regression] RAJAPerf REDUCE_SUM regresses with r15-792-gf0a02467bbc35a

2024-09-30 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116785 --- Comment #14 from kugan at gcc dot gnu.org --- (In reply to Richard Biener from comment #13) > Did it help? Thanks for the quick fix. This commit recovers most of the regression. Please note that the current trunk seems to be broken

[Bug tree-optimization/117050] [15 Regression] ice in vect_build_slp_tree_2

2024-10-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117050 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug target/115258] [14 Regression] register swaps for vector perm in some cases after r14-6290

2024-09-17 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115258 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug tree-optimization/116785] [15 Regression] RAJAPerf REDUCE_SUM regresses with r15-792-gf0a02467bbc35a

2024-09-24 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116785 --- Comment #10 from kugan at gcc dot gnu.org --- Created attachment 59186 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59186&action=edit reduced test (second attempt) Sorry about the test case. Here is another attempt at reducing.

[Bug tree-optimization/116785] New: RAJAPerf REDUCE_SUM regresses with commit f0a02467bbc35a478eb82f5a8a7e8870827b51fc

2024-09-19 Thread kugan at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Some of the loops in RAJAPerf are not vectored with the change. This results in ~64% regression for this

[Bug tree-optimization/116785] RAJAPerf REDUCE_SUM regresses with commit g:f0a02467bbc35a478eb82f5a8a7e8870827b51fc

2024-09-19 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116785 --- Comment #2 from kugan at gcc dot gnu.org --- Created attachment 59155 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59155&action=edit creduce reduced file

[Bug tree-optimization/116785] RAJAPerf REDUCE_SUM regresses with commit f0a02467bbc35a478eb82f5a8a7e8870827b51fc

2024-09-19 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116785 --- Comment #1 from kugan at gcc dot gnu.org --- Created attachment 59154 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59154&action=edit preprocessed file

[Bug ipa/117782] template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-27 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117782 --- Comment #9 from kugan at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #8) > Can you try again now that PR 117350 has actually been pushed? Thanks. This fixes it.

[Bug ipa/117782] template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-27 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117782 kugan at gcc dot gnu.org changed: What|Removed |Added Status|WAITING |RESOLVED Resolution

[Bug ipa/117782] template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-26 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117782 --- Comment #6 from kugan at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #5) > Specifically see > https://inbox.sourceware.org/gcc-patches/20241031204043.3231740-1-ak@linux. > intel.com/T/#u . > > You need to

[Bug c++/117782] New: template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-25 Thread kugan at gcc dot gnu.org via Gcc-bugs
: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 59704 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59704&action=edit testcase (gdb

[Bug c++/117782] template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-25 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117782 --- Comment #1 from kugan at gcc dot gnu.org --- Created attachment 59705 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59705&action=edit profile gcov

[Bug c++/117782] template ICE in write_unscoped_name while using autofda bootstrap on aarch64

2024-11-25 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117782 --- Comment #2 from kugan at gcc dot gnu.org --- --- a/gcc/cp/mangle.cc +++ b/gcc/cp/mangle.cc @@ -1194,6 +1194,7 @@ write_unscoped_name (const tree decl) in a local function scope. A lambda can also be mangled in the scope of

[Bug target/118320] New: [aarch64] internal compiler error: Segmentation fault in aarch64-ldp-fusion.cc

2025-01-06 Thread kugan at gcc dot gnu.org via Gcc-bugs
: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 60058 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60058&action=edit testcase ICE

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #8 from kugan at gcc dot gnu.org --- (In reply to Jan Hubicka from comment #6) > Also BTW, I think it is useful to do the dumps wth -details-blocks since > that also dumps BB count inconsistencies caused by AutoFDO th

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #7 from kugan at gcc dot gnu.org --- (In reply to Jan Hubicka from comment #6) > Also BTW, I think it is useful to do the dumps wth -details-blocks since > that also dumps BB count inconsistencies caused by AutoFDO th

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-11 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #10 from kugan at gcc dot gnu.org --- (In reply to Jan Hubicka from comment #9) > > > as mentioned by Andrew, it is important to clone and also resolve indirect > > > calls. Those auto-FDO 0 may prevent it from happ

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-12 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #11 from kugan at gcc dot gnu.org --- This specific ICE seems to be fixed with e416c8097fc87513e05c2d104c63488f733758c0 Thanks for the fix. I am now seeing one in: x264_src/common/mc.c: In function 'mc_weight_w16.part.0'

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-12 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #12 from kugan at gcc dot gnu.org --- (In reply to kugan from comment #11) > This specific ICE seems to be fixed with > e416c8097fc87513e05c2d104c63488f733758c0 > Thanks for the fix. > > I am now seeing one in: >

[Bug gcov-profile/120614] New: 525.x264_r is ~30% slower with AutoFDO

2025-06-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
-profile Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- 525.x264_r is ~30% slower with AutoFDO compared to PGO. Functions that contribute most to the regressions are: x264_pixel_sad_x4_16x16.lto_priv.0 x264_pixel_sad_x4_8x8

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #3 from kugan at gcc dot gnu.org --- Created attachment 61610 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61610&action=edit x264_pixel_sad_x4_16x16.diff

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614 --- Comment #4 from kugan at gcc dot gnu.org --- x264_pixel_sad_x4_16x16.diff is at -O3 without -flto. Function level profiling is same even with -flto. x264_pixel_sad_x4_16x16 total:18508 head:4627 0: 4627 0.1: 0 0.2: 0 0.3: 0 0.4
