Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-08-19 Thread Kugan Vivekanandarajah
Ping? https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00872.html Thanks, Kugan On 11 August 2016 at 09:09, kugan wrote: > Hi, > > > On 10/08/16 20:28, Richard Biener wrote: >> >> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek wrote: >>> >>> On Wed, Aug

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-08-22 Thread Kugan Vivekanandarajah
Hi, On 19 August 2016 at 21:41, Richard Biener wrote: > On Tue, Aug 16, 2016 at 9:45 AM, kugan > wrote: >> Hi Richard, >> >> On 12/08/16 20:43, Richard Biener wrote: >>> >>> On Wed, Aug 3, 2016 at 3:17 AM, kugan >>> wrote: >> >> &g

Re: [RFC][IPA-VRP] Add support for IPA VRP in ipa-cp/ipa-prop

2016-08-29 Thread Kugan Vivekanandarajah
ant here. The analysis phase should > not determine > anything if function is reachable non-locally. Removed it. >> +/* Info about value ranges. */ >> + >> +struct GTY(()) ipa_vr >> +{ >> + /* The data fields below are valid only if known is true. */ >

Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-09-02 Thread Kugan Vivekanandarajah
Hi Richard, On 25 August 2016 at 22:24, Richard Biener wrote: > On Thu, Aug 11, 2016 at 1:09 AM, kugan > wrote: >> Hi, >> >> >> On 10/08/16 20:28, Richard Biener wrote: >>> >>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek wrote: >>&

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-09-02 Thread Kugan Vivekanandarajah
Ping ? Thanks, Kugan On 23 August 2016 at 12:11, Kugan Vivekanandarajah wrote: > Hi, > > On 19 August 2016 at 21:41, Richard Biener wrote: >> On Tue, Aug 16, 2016 at 9:45 AM, kugan >> wrote: >>> Hi Richard, >>> >>> On 12/08/16 20:43, Richard Bie

[RFC][SSA] Iterator to visit SSA

2016-09-04 Thread Kugan Vivekanandarajah
follow some consistent usage here. It might also be good to have a FOR_EACH_SSAVAR iterator as we do in other cases. Here is an attempt to do this based on what is done in other places. Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions. Is this OK? Thanks, Kugan gcc
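The idea of a FOR_EACH-style iterator over a table with unused slots can be sketched outside GCC as follows. This is only an illustration under stated assumptions: ssa_name_model, FOR_EACH_NON_NULL and sum_versions are hypothetical names invented here, and the table is modelled as a plain std::vector of pointers rather than GCC's real SSA name storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SSA name; not a GCC type.
struct ssa_name_model { unsigned version; };

// Hypothetical macro: walk TABLE by index I and run the following
// statement only for slots that hold a non-null pointer, bound to VAR.
#define FOR_EACH_NON_NULL(TABLE, I, VAR)              \
  for ((I) = 0; (I) < (TABLE).size (); ++(I))         \
    if (((VAR) = (TABLE)[(I)]) != nullptr)

// Example user: add up the versions of all live entries.
unsigned sum_versions (const std::vector<ssa_name_model *> &table)
{
  std::size_t i;
  ssa_name_model *name;
  unsigned total = 0;
  FOR_EACH_NON_NULL (table, i, name)
    total += name->version;
  return total;
}
```

The point of such a macro is that every caller skips released (null) slots the same way instead of repeating the index loop and the null check by hand.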

Re: [RFC][SSA] Iterator to visit SSA

2016-09-05 Thread Kugan Vivekanandarajah
Hi Richard, On 5 September 2016 at 17:57, Richard Biener wrote: > On Mon, Sep 5, 2016 at 7:26 AM, Kugan Vivekanandarajah > wrote: >> Hi All, >> >> While looking at gcc source, I noticed that we are iterating over SSA >> variable from 0 to num_ssa_names in some case

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:08, Richard Biener wrote: > On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 5 September 2016 at 17:57, Richard Biener >> wrote: >>> On Mon, Sep 5, 2016 at 7:26 AM, Kugan V

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:57, Richard Biener wrote: > On Tue, Sep 6, 2016 at 11:33 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 6 September 2016 at 19:08, Richard Biener >> wrote: >>> On Tue, Sep 6, 2016 at 2:24 AM, Kug

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:08, Richard Biener wrote: > On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 5 September 2016 at 17:57, Richard Biener >> wrote: >>> On Mon, Sep 5, 2016 at 7:26 AM, Kugan V

[PR71252][PR71269] Fix trunk errors due to stmt_to_insert

2016-05-25 Thread Kugan Vivekanandarajah
insert. 3. In rewrite_expr_tree_parallel, build_and_add_sum relies on either of the operands being inserted. If that is not the case, we have to insert the stmt_to_insert before calling build_and_add_sum. 4. I also moved all the other stmt_to_insert insertions to after the use stmts are created. Also regression t

Re: [PR71252][PR71269] Fix trunk errors due to stmt_to_insert

2016-05-26 Thread Kugan Vivekanandarajah
Hi Jakub, On 26 May 2016 at 18:18, Jakub Jelinek wrote: > On Thu, May 26, 2016 at 02:17:56PM +1000, Kugan Vivekanandarajah wrote: >> --- a/gcc/tree-ssa-reassoc.c >> +++ b/gcc/tree-ssa-reassoc.c >> @@ -3767,8 +3767,10 @@ swap_ops_for_binary_stmt (vec ops, >>

[PATCH1][PR71252] Fix missing swap to stmt_to_insert

2016-05-27 Thread Kugan Vivekanandarajah
trunk if the testing is fine ? Thanks, Kugan gcc/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * tree-ssa-reassoc.c (swap_ops_for_binary_stmt): Fix swap such that all fields including stmt_to_insert are swapped. gcc/testsuite/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * gcc.dg/tree-ssa

[PATCH2][PR71252] Fix insertion point of stmt_to_insert

2016-05-27 Thread Kugan Vivekanandarajah
PRs Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * gcc.dg/tree-ssa/pr71269.c: New test. gcc/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so that inserted stmt will not dominate

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-10-30 Thread Kugan Vivekanandarajah
Ping ? I see that Jim has clarified the comments from Andrew. Thanks, Kugan On 13 October 2017 at 08:48, Jim Wilson wrote: > On Fri, 2017-09-22 at 14:11 -0700, Andrew Pinski wrote: >> On Fri, Sep 22, 2017 at 11:39 AM, Jim Wilson >> wrote: >> > >> > On Fri,

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-10-31 Thread Kugan Vivekanandarajah
Hi Jim, On 1 November 2017 at 03:12, Jim Wilson wrote: > On Tue, 2017-10-31 at 14:35 +1100, Kugan Vivekanandarajah wrote: >> Ping ? >> >> I see that Jim has clarified the comments from Andrew. > > Andrew also suggested that we add a testcase to the testsuite. I >

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-11-01 Thread Kugan Vivekanandarajah
Hi, On 1 November 2017 at 03:12, Jim Wilson wrote: > On Tue, 2017-10-31 at 14:35 +1100, Kugan Vivekanandarajah wrote: >> Ping ? >> >> I see that Jim has clarified the comments from Andrew. > > Andrew also suggested that we add a testcase to the testsuite. I >

[AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-06 Thread Kugan Vivekanandarajah
Hi, Attached patch implements the vld1_*_x2 intrinsics as defined by the neon document. Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is this OK for trunk if no regressions? Thanks, Kugan gcc/ChangeLog: 2017-11-06 Kugan Vivekanandarajah * config/aarch64/aarch64

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-07-21 Thread Kugan Vivekanandarajah
Ping ? Thanks, Kugan On 27 June 2017 at 11:20, Kugan Vivekanandarajah wrote: > https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this > workaround to get kernel building with when TARGET_FIX_ERR_A53_843419 > is enabled. > > This was added to support building

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-10 Thread Kugan Vivekanandarajah
Ping^2? Thanks, Kugan On 21 July 2017 at 20:12, Kugan Vivekanandarajah wrote: > Ping ? > > Thanks, > Kugan > > On 27 June 2017 at 11:20, Kugan Vivekanandarajah > wrote: >> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this >> workaround

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-28 Thread Kugan Vivekanandarajah
ping^3 Thanks, Kugan On 11 August 2017 at 16:09, Kugan Vivekanandarajah wrote: > Ping^2? > > Thanks, > Kugan > > On 21 July 2017 at 20:12, Kugan Vivekanandarajah > wrote: >> Ping ? >> >> Thanks, >> Kugan >> >> On 27 June 2017 at 11:20,

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-29 Thread Kugan Vivekanandarajah
Hi James, On 29 August 2017 at 21:31, James Greenhalgh wrote: > On Tue, Jun 27, 2017 at 11:20:02AM +1000, Kugan Vivekanandarajah wrote: >> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this >> workaround to get kernel building with when TARGET_FIX_ERR_A53_84341

Re: [PATCH GCC][4/5]Improve loop distribution to handle hmmer

2017-06-04 Thread Kugan Vivekanandarajah
g the internal function for this to some extent, but for some cases we should be able to say while in loop distribution itself that the control flow will not result in the loop being vectorized. Btw, did you run SPEC2006 with this? Any notable changes? Thanks, Kugan On 2 June 2017 at 21:51, Bin Cheng

[RFC][PATCH 0/5] Loop unrolling and memory load streams

2017-09-14 Thread Kugan Vivekanandarajah
. Thanks, Kugan

[RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds separate params for the RTL unroller so that they can be tuned accordingly. The default values I have are based on some testing on aarch64. I am happy to leave them at the current values and set them in the back-end. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah

[RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds number of hw prefetchers available to cpu_prefetch_tune so it can be used in loop unrolling decisions. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64-protos.h (struct cpu_prefetch_tune): Add new field hw_prefetchers_avail

[RFC][PATCH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch prevents the tree unroller from completely unrolling inner loops if that results in excessive strided loads in the outer loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64.c (count_mem_load_streams): New. (aarch64_ok_to_unroll): New

[RFC][PATCH 4/5] Change iv_analyze_result to take const_rtx.

2017-09-14 Thread Kugan Vivekanandarajah
Change iv_analyze_result to take const_rtx. This is just to make the next patch compile. No functional changes. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_result): Change 2nd param from rtx to const_rtx. * df-core.c (df_find_def

[RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_biv): export. * loop-iv.c: Likewise. * config/aarch64/aarch64.c (strided_load_p

Re: [RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-16 Thread Kugan Vivekanandarajah
Hi Andrew, On 15 September 2017 at 13:20, Andrew Pinski wrote: > On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah > wrote: >> This patch adds number of hw prefetchers available to >> cpu_prefetch_tune so it can be used in loop unrolling decisions. > > Can yo

Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-16 Thread Kugan Vivekanandarajah
Hi Ramana On 15 September 2017 at 18:40, Ramana Radhakrishnan wrote: > On Fri, Sep 15, 2017 at 2:33 AM, Kugan Vivekanandarajah > wrote: >> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling >> in rtl based on strided-loads in loop. >> >> Thanks

Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-17 Thread Kugan Vivekanandarajah
Hi Andrew, On 15 September 2017 at 13:36, Andrew Pinski wrote: > On Thu, Sep 14, 2017 at 6:33 PM, Kugan Vivekanandarajah > wrote: >> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling >> in rtl based on strided-loads in loop. > > Can you expand on th

Re: [RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-17 Thread Kugan Vivekanandarajah
Hi Richard, On 15 September 2017 at 19:31, Richard Biener wrote: > On Fri, Sep 15, 2017 at 3:27 AM, Kugan Vivekanandarajah > wrote: >> This patch adds separate params for rtl unroller so that they can be >> tuned accordingly. Default values I have are based on some testing o

Re: [RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-18 Thread Kugan Vivekanandarajah
Hi Richard, On 18 September 2017 at 17:50, Richard Biener wrote: > On Mon, Sep 18, 2017 at 3:36 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 15 September 2017 at 19:31, Richard Biener >> wrote: >>> On Fri, Sep 15, 2017 at 3:27 AM, Kugan

Re: [RFC][SSA] Iterator to visit SSA

2016-09-07 Thread Kugan Vivekanandarajah
Hi Richard, On 7 September 2016 at 19:35, Richard Biener wrote: > On Wed, Sep 7, 2016 at 2:21 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 6 September 2016 at 19:08, Richard Biener >> wrote: >>> On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivek

[RFC] Type promotion pass and elimination of zext/sext

2016-05-15 Thread Kugan Vivekanandarajah
based on the feedback. Please let me know what you think. Thanks, Kugan From 332e0e9f938c6af50e826d8224d07ebf3678a0e0 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 13 May 2016 13:41:01 +1000 Subject: [PATCH 4/4] Add new type promotion pass --- gcc/ChangeLog

[AARCH64] Remove static variable all_extensions from aarch64.c

2016-05-16 Thread Kugan Vivekanandarajah
Hi, static variable all_extensions in aarch64.c is not used and therefore dead. I don’t see any reason why it should be there. Attached patch removes this. Bootstrapped on aarch64-linux-gnu. Regression testing is ongoing. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2016-05-17 Kugan

Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-05-18 Thread Kugan Vivekanandarajah
+ || real_minus_onep (last->op)) Is this still OK? Bootstrap and regression testing on ARM, AARCH64 and x86-64 didn’t show any new regressions. Thanks, Kugan diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c index e69de29..3a5a23a 100644 --- a

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
We could try Martin Liška's approach, We could also move _17 = c_7(D) * 3; at tree-ssa-reassoc.c:3897 satisfy the gcc_assert. We could do this based on the use count of _17. This patch does this. I have no preferences. Any thoughts ? Thanks, Kugan On 19 May 2016 at 18:04, Martin Lišk

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
On 19 May 2016 at 18:55, Richard Biener wrote: > On Thu, May 19, 2016 at 10:26 AM, Kugan > wrote: >> Hi, >> >> >> On 19/05/16 18:21, Richard Biener wrote: >>> On Thu, May 19, 2016 at 10:12 AM, Kugan Vivekanandarajah >>> wrote: >>>> Hi

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
erting the multiplication to > rewrite_expr_tree time. For example by adding a ops->stmt_to_insert > member. > Here is an implementation based on above. Bootstrap on x86-linux-gnu is OK. regression testing is ongoing. Thanks, Kugan gcc/ChangeLog: 2016-05-20 Kugan Vivekanandarajah

[PATCH] Fix PR tree-optimization/71179

2016-05-19 Thread Kugan Vivekanandarajah
tested on x86-64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-20 Kugan Vivekanandarajah * gcc.dg/tree-ssa/pr71179.c: New test. gcc/ChangeLog: 2016-05-20 Kugan Vivekanandarajah * tree-ssa-reassoc.c

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-20 Thread Kugan Vivekanandarajah
On 20 May 2016 at 21:07, Richard Biener wrote: > On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >>> I think it should have the same rank as op or op + 1 which is the current >>> behavior. Sth else doesn't work c

Re: [RFC] Type promotion pass and elimination of zext/sext

2016-05-22 Thread Kugan Vivekanandarajah
Hi Jeff, On 20 May 2016 at 04:17, Jeff Law wrote: > On 05/15/2016 06:45 PM, Kugan Vivekanandarajah wrote: >> >> Hi Richard, >> >> Now that stage1 is open, I would like to get the type promotion passes >> reviewed again. I have tested the patches on aarch64, x86-6

Re: [RFC] Type promotion pass and elimination of zext/sext

2016-05-22 Thread Kugan Vivekanandarajah
(optimized). I will also try to gather test-cases based on testing/benchmarking. Thanks, Kugan

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-23 Thread Kugan Vivekanandarajah
On 23 May 2016 at 21:35, Richard Biener wrote: > On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah > wrote: >> On 20 May 2016 at 21:07, Richard Biener wrote: >>> On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah >>> wrote: >>>> Hi Richard,

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-24 Thread Kugan Vivekanandarajah
On 24 May 2016 at 18:36, Christophe Lyon wrote: > On 24 May 2016 at 05:13, Kugan Vivekanandarajah > wrote: >> On 23 May 2016 at 21:35, Richard Biener wrote: >>> On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah >>> wrote: >>>> On 20 May 2016 at

[PR71252][PATCH] ICE: verify_ssa failed

2016-05-24 Thread Kugan Vivekanandarajah
reducing the test-case is appreciated. Regression testing on x86_64-linux-gnu and bootstrap didn’t find any new issues. Is this OK for trunk? Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-24 Kugan Vivekanandarajah * gfortran.dg/pr71252.f90: New test. gcc/ChangeLog: 2016-05-24 Kugan

[testcase] Fix absfloat16.c testcase

2024-09-29 Thread Kugan Vivekanandarajah
Hi, This patch fixes the absfloat16.c testcase so that dg-add-options float16 is in the correct order. Due to this mixup, this test is failing for some arm variants. Is this OK for trunk? Thanks, Kugan 0001-Fix-absfloat16.c-testcase.patch Description: 0001-Fix-absfloat16.c-testcase.patch

Re: [PR middle-end/114635] Set OMP safelen handling to INT_MAX when the pragma didn’t provide one.

2024-10-07 Thread Kugan Vivekanandarajah
ping? Thanks, Kugan From: Kugan Vivekanandarajah Sent: Tuesday, 20 August 2024 6:18 PM To: Jakub Jelinek Cc: gcc-patches@gcc.gnu.org ; richard.guent...@gmail.com ; richard.sandif...@arm.com Subject: Re: [PR middle-end/114635] Set OMP safelen handling to

Re: [PR middle-end/114635] Set OMP safelen handling to INT_MAX when the pragma didn’t provide one.

2024-10-13 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 8 Oct 2024, at 7:15 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah > wrote: >> >> >> >>>

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-28 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 25 Oct 2024, at 8:53 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Fri, Oct 25, 2024 at 12:22 AM Kugan Vivekanandarajah > wrote: >> &g

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-29 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 28 Oct 2024, at 9:18 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah > wrote: >> >> Hi, >> >> When ifcvt

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-31 Thread Kugan Vivekanandarajah
> On 31 Oct 2024, at 6:18 pm, Jakub Jelinek wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 29, 2024 at 05:01:40AM +, Kugan Vivekanandarajah wrote: >> For param_vect_max_version_for_alias_checks of 15, the average code si

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-11-02 Thread Kugan Vivekanandarajah
> On 31 Oct 2024, at 7:29 pm, Jakub Jelinek wrote: > > External email: Use caution opening links or attachments > > > On Thu, Oct 31, 2024 at 08:21:09AM +, Kugan Vivekanandarajah wrote: >> >> >>> On 31 Oct 2024, at 6:18 pm, Jakub Jelinek wrote: >

[PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-28 Thread Kugan Vivekanandarajah
different from general dont_vectorize) specifically for loops versioned. BB vectorization does not need to honour this and still can vectorize. Bootstrapped and regression tested on aarch64-linux-gnu with no new regressions. Is this OK? Thanks, Kugan 0001-PATCH-Fix-SLP-when-ifcvt-versioned-loop-is

[RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-24 Thread Kugan Vivekanandarajah
f at least 11, whereas the current default is 10. Bootstrapped and regression tested on aarch64-linux-gnu with no new regressions. Thanks, Kugan 0001-RFC-PATCH-Adjust-param_vect_max_version_for_alias_ch.patch Description: 0001-RFC-PATCH-Adjust-param_vect_max_version_for_alias_ch.patch

[PATCH][AARCH64][PR115258]Fix excess moves

2024-10-24 Thread Kugan Vivekanandarajah
h one insn. Hence, when the operands are equal, split after reload. Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for trunk? Thanks, Kugan 0001-PATCH-AARCH64-PR115258-Fix-excess-moves.patch Description: 0001-PATCH-AARCH64-PR115258-Fix-excess-moves.patch

[testsuite] Fix bb-slp-77.c for x86

2024-10-31 Thread Kugan Vivekanandarajah
hen I force the loop to unroll for x86. Thus, to keep it simple, moving the test to gcc.target/aarch64. Regression tested on aarch64-linux-gnu. Is this OK? Thanks, Kugan 0001-testsuite-Fix-bb-slp-77.c.patch Description: 0001-testsuite-Fix-bb-slp-77.c.patch

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-30 Thread Kugan Vivekanandarajah
Hi Richard, > On 29 Oct 2024, at 8:33 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah > wrote: >> >> Hi Richard, >> Thanks for the review. >> &

Re: [PATCH] MATCH: add abs support for half float

2024-09-20 Thread Kugan Vivekanandarajah
Hi Richard, > On 17 Sep 2024, at 7:36 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, Sep 17, 2024 at 10:31 AM Kugan Vivekanandarajah > wrote: >> >> Hi Richard, >> >>> On 10 Sep 2024, at 9:33 

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-11-14 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 2 Nov 2024, at 7:49 pm, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > > On 31 Oct 2024, at 7:29 pm, Jakub Jelinek wrote: > > > > External email: Use caution opening links or a

Re: [PATCH][AARCH64][PR115258]Fix excess moves

2025-02-25 Thread Kugan Vivekanandarajah
Hi Richard, I want to follow up on this and see if you have a fix for this. Thanks, Kugan > On 29 Oct 2024, at 9:41 pm, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> Hi, >

[AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-08 Thread Kugan Vivekanandarajah
annotate profile for GIMPLE_CALL stmt and extract BB counts from edge counts. Regression tested on aarch64-linux-gnu with no new regression. Also successfully done autoprofiledbootstrap with the relevant patch. Is this OK for trunk? Thanks, Kugan 0001-AUTOFDO-Fix-annotated-profile-for-de

[AUTOFDO] Merge profiles of clones before annotating

2025-05-08 Thread Kugan Vivekanandarajah
. Regression tested on aarch64-linux-gnu with no new regression. Also successfully done autoprofiledbootstrap with the relevant patch. Is this OK for trunk? Thanks, Kugan 0002-AUTOFDO-Merge-profiles-of-clones-before-annotating.patch Description: 0002-AUTOFDO-Merge-profiles-of-clones-before

[AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-08 Thread Kugan Vivekanandarajah
new regression. Also successfully done autoprofiledbootstrap with the relevant patch. Is this OK for trunk? Thanks, Kugan 0004-AUTOFDO-AARCH64-Add-support-for-profilebootstrap.patch Description: 0004-AUTOFDO-AARCH64-Add-support-for-profilebootstrap.patch

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-13 Thread Kugan Vivekanandarajah
Adding Eugene and Andi to CC as Sam suggested. > On 13 May 2025, at 12:57 am, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> diff --git a/configure.ac b/configure.ac >> inde

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-20 Thread Kugan Vivekanandarajah
Thanks Richard for the review. > On 20 May 2025, at 2:47 am, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> diff --git a/Makefile.in b/Makefile.in >> index b1ed67d3d4f..b5e3e5

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-19 Thread Kugan Vivekanandarajah
> On 16 May 2025, at 12:10 am, Andi Kleen wrote: > > External email: Use caution opening links or attachments > > > On Wed, May 14, 2025 at 02:46:15AM +, Kugan Vivekanandarajah wrote: >> Adding Eugene and Andi to CC as Sam suggested. >> >>> On 13 M

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-25 Thread Kugan Vivekanandarajah
> On 26 May 2025, at 2:25 pm, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Tue, May 20, 2025 at 3:09 AM Kugan Vivekanandarajah > wrote: >> >> Thanks Richard for the review. >> >>> On 20 May

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-26 Thread Kugan Vivekanandarajah
es and only see if afdo annotations are there. Any thoughts? Thanks, Kugan > > Honza > <0002-AUTOFDO-Merge-profiles-of-clones-before-annotating.patch>

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-26 Thread Kugan Vivekanandarajah
10:18.479228 1692721 symbol_map.cc:477] Adding loadable exec > segment: offset=1000 vaddr=401000 > > Did someone run SPEC recently? I made auto-FDO spec config and tested > -Ofast with ipa-icf, ipa-cp-clone and ipa-sra disabled (to get rid of > the clone merging). I get sort of com

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-27 Thread Kugan Vivekanandarajah
> On 26 May 2025, at 2:47 pm, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > > On 26 May 2025, at 2:25 pm, Andrew Pinski wrote: > > > > External email: Use caution opening links or attachments >

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-21 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 9 May 2025, at 11:54 am, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > This patch add support for merging profiles from multiple clones. > That is, when optimized binaries have clones suc

Re: [AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-21 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 9 May 2025, at 11:51 am, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > This patch fixes wrong annotation of profiles when call statement is > de-duplicated. i.e., when we may have same st

Re: [AUTOFDO] Enable ipa-split for auto-profile

2025-05-21 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 9 May 2025, at 11:55 am, Kugan Vivekanandarajah > wrote: > > ipa-split is not now run for auto-profile. IMO this was an oversight. > This patch enables it similar to PGO runs. > > gcc/ChangeLog: > >* ipa-split.cc pass_feedback_spl

[AutoFDO] Profile merging for clone test

2025-06-04 Thread Kugan Vivekanandarajah
: * auto-profile.cc (autofdo_source_profile::read): Dump message while merging profile. * pass_manager.h (get_pass_auto_profile): New. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/clone-merge-1.c: New test. Is this OK? Thanks, Kugan 0001-AutoFDO_v2-Profile-merging-for-clone

[PATCH] [AUTOFDO] Enable autofdo tests for aarch64

2025-05-28 Thread Kugan Vivekanandarajah
: Enable autofdo tests for aarch64. Is this OK? Thanks, Kugan 0001-AUTOFDO-Enable-autofdo-tests-for-aarch64.patch Description: 0001-AUTOFDO-Enable-autofdo-tests-for-aarch64.patch

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-06-06 Thread Kugan Vivekanandarajah
Hi Honza, > On 6 Jun 2025, at 6:34 pm, Jan Hubicka wrote: > > External email: Use caution opening links or attachments > > >> Kugan Vivekanandarajah writes: >>> Add support for autoprofiledbootstrap in aarch64. >>> This is similar to what is done for

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-06-06 Thread Kugan Vivekanandarajah
one used for gcc/config/). > > It is inconvenient that the toplevel doesn't have access to the logic > used to set that variable though... I changed it to: +# Special case cpu_type for x86_64 as it shares AUTO_PROFILE from i386. +if test "${cpu_type}" = "x86_64" ; then + cpu_type="i386" +fi Is this ok? Tested on x86_64 and aarch64 linux-gnu. Thanks, Kugan > > Richard 0001-AutoFDO-Fix-profile-bootstrap-for-x86_64.patch Description: 0001-AutoFDO-Fix-profile-bootstrap-for-x86_64.patch

Re: [AutoFDO] Profile merging for clone test

2025-06-08 Thread Kugan Vivekanandarajah
> On 9 Jun 2025, at 9:43 am, Kugan Vivekanandarajah > wrote: > > > > > On 7 Jun 2025, at 3:30 pm, Kugan Vivekanandarajah > > wrote: > > > > Hi, > > > > > > > On 6 Jun 2025, at 4:15 pm, Kugan Vivekanandarajah > > > wrote

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-06-08 Thread Kugan Vivekanandarajah
n sticking to the one > that gcc/configure* already uses (i.e. the one used for gcc/config/). > > It is inconvenient that the toplevel doesn't have access to the logic > used to set that variable though... > I changed it to: +# Special case cpu_type for x86_64 as it shares AU

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-28 Thread Kugan Vivekanandarajah
elf. Private (static) functions with the same name also will have the same issue. Dhruv is working on an RFC for this. Thanks, Kugan > > Overwritting the data by the last clone is definitely bad, so the patch > is OK, but we should figure out what happens in the cases above. > >

Re: [PATCH] [AUTOFDO] Enable autofdo tests for aarch64

2025-05-29 Thread Kugan Vivekanandarajah
roll "Peeled loop 2, 1 times” I also noticed that some tests are only enabled for x86. I am also seeing: ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/pr66295.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/split-1.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-10.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-7.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/pr66295.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/split-1.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-10.c ./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-7.c Thanks, Kugan > Honza

Re: [AutoFDO] Profile merging for clone test

2025-06-05 Thread Kugan Vivekanandarajah
zero, preserve quality info. */ - else if (count->nonzero_p () + else if (!count->nonzero_p () + || count->quality () == GUESSED_LOCAL || count->quality () == GUESSED) *count = profile_count::zero ().afdo (); } Thanks, Kugan > > Honza >> >> Thanks, >> Kugan

Re: [AutoFDO] Profile merging for clone test

2025-06-05 Thread Kugan Vivekanandarajah
Hi Andrew, > On 6 Jun 2025, at 8:18 am, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Wed, Jun 4, 2025 at 12:02 AM Kugan Vivekanandarajah > wrote: >> >> This patch introduces a new testcase to verify the mergin

Re: [PATCH 0/1] [RFC][AutoFDO] Propagate inline information to outline definitions if not inlined

2025-06-13 Thread Kugan Vivekanandarajah
ay? >> >> Splitting out inlining as its own phase also means that it can >> eventually be handed off to ipa-inline to handle, thus making >> auto-profile independent of early inline. This will simplify the code a >> fair bit. Is this a good direction to go in? >

Re: [AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-16 Thread Kugan Vivekanandarajah
> On 17 Jun 2025, at 4:18 pm, Dhruv Chawla wrote: > > On 17/06/25 06:10, Kugan Vivekanandarajah wrote: >> External email: Use caution opening links or attachments >> Hi, >> As discussed earlier, get_original_name is used to match profile binary >> names to >

[AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-16 Thread Kugan Vivekanandarajah
running autoprofiledbootstrap and tree-prof check that exercises auto-profile pass. gcc/ChangeLog: * auto-profile.cc (isAsciiDigit): New. (get_original_name): Strip suffixes only for compiler generated names that happen after auto-profile. Thanks, Kugan 0001-AutoFDO-Fix

Re: Improve static and AFDO profile combination

2025-06-17 Thread Kugan Vivekanandarajah
tialized. */ > struct cgraph_edge *new_edge > - = indirect_edge->make_speculative (direct_call, > -profile_count::uninitialized ()); > + = indirect_edge->make_speculative > + (direct_call, > +

Re: [AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-18 Thread Kugan Vivekanandarajah
Hi, > On 17 Jun 2025, at 4:51 pm, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > >> On 17 Jun 2025, at 4:18 pm, Dhruv Chawla wrote: >> >> On 17/06/25 06:10, Kugan Vivekanandarajah wrote: >>>

Re: [AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-18 Thread Kugan Vivekanandarajah
> Given that this is tail-recursive, I feel like recursion is not necessary here > and it would be more efficient to have this be a loop instead. The > implementation looks okay as is, though. IMO doing this in a loop would have to handle all the above cases and
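The loop-versus-recursion point raised here can be illustrated with a small standalone sketch. It is not the get_original_name code from the patch: strip_clone_suffixes and all_digits are hypothetical helpers, and the tag list (constprop, isra, part) is only an illustrative assumption about which ".<tag>.<digits>" suffixes one might strip.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if S is non-empty and consists of ASCII digits.
static bool all_digits (const std::string &s)
{
  if (s.empty ())
    return false;
  for (unsigned char c : s)
    if (!std::isdigit (c))
      return false;
  return true;
}

// Hypothetical sketch: strip trailing ".<tag>.<digits>" suffixes one at a
// time in a loop, stopping at the first suffix that is not recognised.
std::string strip_clone_suffixes (std::string name)
{
  static const char *const tags[] = { "constprop", "isra", "part" };
  for (;;)
    {
      std::size_t last = name.rfind ('.');
      if (last == std::string::npos || last == 0)
        return name;
      if (!all_digits (name.substr (last + 1)))
        return name;
      std::size_t prev = name.rfind ('.', last - 1);
      if (prev == std::string::npos || prev == 0)
        return name;
      std::string tag = name.substr (prev + 1, last - prev - 1);
      bool known = false;
      for (const char *t : tags)
        if (tag == t)
          known = true;
      if (!known)
        return name;
      name.erase (prev);  // Drop ".<tag>.<digits>" and look again.
    }
}
```

Each iteration does what one recursive call would do, so the loop form needs no extra state beyond the shrinking name itself.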

Re: [PATCH 0/1] [RFC][AutoFDO]: Source filename tracking in GCOV

2025-06-18 Thread Kugan Vivekanandarajah
Number of samples to get to the desired percentile. Should we also track the branch probability in GCOV? This should be easy to calculate from the perf profile. This may help disambiguate profile counts. Thanks, Kugan > > seems like useful info to handle autoFDO 0s more correctly, so
