[RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-10-25 Thread Kugan Vivekanandarajah
2018-10-25 Kugan Vivekanandarajah * tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT as expensive when backend does not define it. gcc/testsuite/ChangeLog: 2018-10-25 Kugan Vivekanandarajah * gcc.target/aarch64/popcount4.c: New test.
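For context, the kind of loop that popcount detection is concerned with can be sketched as below. This is only an illustration, assuming the idiom in question is the familiar clear-the-lowest-set-bit loop; the function names are hypothetical and are not taken from the patch or from popcount4.c.

```c
#include <stdint.h>

/* Hypothetical example: counts set bits by clearing the lowest set bit
   on each iteration, so the loop body runs once per set bit.  */
int
count_bits_loop (uint64_t x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;   /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* The same value computed through the GCC builtin.  */
int
count_bits_builtin (uint64_t x)
{
  return __builtin_popcountll (x);
}
```

On a target without a popcount instruction the builtin is typically expanded to a library call, which is presumably why replacing a short loop with it is not always a win there.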

[ABSU_EXPR] Add some of the missing patterns in match.pd

2018-10-25 Thread Kugan Vivekanandarajah
gcc/testsuite/ChangeLog: 2018-10-25 Kugan Vivekanandarajah * gcc.dg/gimplefe-30.c: New test. * gcc.dg/gimplefe-31.c: New test. * gcc.dg/gimplefe-32.c: New test. * gcc.dg/gimplefe-33.c: New test. gcc/ChangeLog: 2018-10-25 Kugan Vivekanandarajah * doc/generic.texi

[PR87469] ICE in record_estimate, at tree-ssa-loop-niter.c

2018-10-27 Thread Kugan Vivekanandarajah
this OK? Thanks, Kugan gcc/testsuite/ChangeLog: 2018-10-26 Kugan Vivekanandarajah PR middle-end/87469 * g++.dg/pr87469.C: New test. gcc/ChangeLog: 2018-10-26 Kugan Vivekanandarajah PR middle-end/87469 * tree-ssa-loop-niter.c (number_of_iterations_popcount): Fix niter max

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-10-28 Thread Kugan Vivekanandarajah
Hi Richard and Jeff, Thanks for your comments. On Fri, 26 Oct 2018 at 19:40, Richard Biener wrote: > > On Fri, Oct 26, 2018 at 4:55 AM Jeff Law wrote: > > > > On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote: > > > Hi, > > > > > > PR87528 sho

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-02 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Tue, 30 Oct 2018 at 01:25, Richard Biener wrote: > > On Mon, Oct 29, 2018 at 2:06 AM Kugan Vivekanandarajah > wrote: > > > > Hi Richard and Jeff, > > > > Thanks for your comments. > > > > On Fri, 26

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-11 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Thu, 8 Nov 2018 at 00:03, Richard Biener wrote: > > On Fri, Nov 2, 2018 at 10:02 AM Kugan Vivekanandarajah > wrote: > > > > Hi Richard, > > Thanks for the review. > > On Tue, 30 Oct 2018 at 01:25, Richard Biener > >

[SVE ACLE] svbic implementation

2019-03-19 Thread Kugan Vivekanandarajah
I have committed the attached patch, which implements svbic, to the aarch64/sve-acle-branch branch. Thanks, Kugan From 182bd15334874844bef5e317f55a6497f77e12ff Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Thu, 24 Jan 2019 20:57:19 +1100 Subject: [PATCH 1/3] svbic Change-Id

[PR89862] Fix ARM lto bootstrap

2019-03-28 Thread Kugan Vivekanandarajah
Hi All, LTO bootstrap for ARM fails with commit 67c18bce7054934528ff5930cca283b4ac967dca * combine.c (record_dead_and_set_regs_1): Record the source unmodified for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target. It fails with an internal compiler error: in operator+=, at pr

Re: [aarch64][RFA][rtl-optimization/87763] Fix insv_1 and insv_2 for aarch64

2019-04-22 Thread Kugan Vivekanandarajah
Hi Jeff, [...] + "#" + "&& 1" + [(const_int 0)] + "{ + /* If we do not have an RMW operand, then copy the input + to the output before this insn. Also modify the existing + insn in-place so we can have make_field_assignment actually + generate a suitable extraction. */ + if (!rtx_eq

[PATCH 0/2] [RFC][PR88834]

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah In PR88834, IVOPT is not selecting the right addressing mode. In order to fix this, we need to add support to add IV uses for IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES. In addition, we also need to add an IV candidate scaled by the element or access size if

[PATCH 1/2] Add support for IVOPT

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah gcc/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834 * tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES. (find_interesting_uses_stmt): Likewise

[PATCH 2/2] aarch64 back-end changes

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah gcc/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834 * config/aarch64/aarch64.c (aarch64_classify_address): Relax allow_reg_index_p. gcc/testsuite/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834

[PATCH 1/2] [PR88836][aarch64] Set CC_REGNUM instead of clobber

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah For the aarch64 SVE while_ult pattern, set CC_REGNUM instead of clobbering it. gcc/ChangeLog: 2019-05-16 Kugan Vivekanandarajah PR target/88834 * config/aarch64/aarch64-sve.md (while_ult): Set CC_REGNUM instead of clobbering. Change-Id

[PATCH 0/2][RFC][PR88836][AARCH64] Fix redundant ptest instruction

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah In order to fix this PR: * We need to change the whilelo pattern in the back end * Change RTL CSE such that: - Add support for VEC_DUPLICATE - When handling PARALLEL rtx in cse_insn, we kill CSE defined by all the parallel rtx at the end. For example, with

[PATCH 2/2] [PR88836][aarch64] Fix CSE to process parallel rtx dest one by one

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah This patch changes cse_insn to process parallel rtx one by one such that any destination rtx in cse list is invalidated before processing the next. gcc/ChangeLog: 2019-05-16 Kugan Vivekanandarajah PR target/88834 * cse.c (safe_hash): Handle

Re: [PATCH 2/2] aarch64 back-end changes

2019-05-15 Thread Kugan Vivekanandarajah
Hi Richard, On Wed, 15 May 2019 at 23:24, Richard Earnshaw (lists) wrote: > > On 15/05/2019 13:48, Richard Earnshaw (lists) wrote: > > On 15/05/2019 03:39, kugan.vivekanandara...@linaro.org wrote: > >> From: Kugan Vivekanandarajah > >> > > > > The subje

Re: [PATCH 1/2] Add support for IVOPT

2019-05-16 Thread Kugan Vivekanandarajah
Hi Richard, On Wed, 15 May 2019 at 16:57, Richard Sandiford wrote: > > Thanks for doing this. > > kugan.vivekanandara...@linaro.org writes: > > From: Kugan Vivekanandarajah > > > > gcc/ChangeLog: > > > > 2019-05-15 Kugan Vivekanandarajah > >

Re: [PATCH 1/2] Add support for IVOPT

2019-05-16 Thread Kugan Vivekanandarajah
Hi Richard, On Thu, 16 May 2019 at 21:14, Richard Biener wrote: > > On Wed, May 15, 2019 at 4:40 AM wrote: > > > > From: Kugan Vivekanandarajah > > > > gcc/ChangeLog: > > > > 2019-05-15 Kugan Vivekanandarajah > > > >

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-16 Thread Kugan Vivekanandarajah
Hi, On Fri, 17 May 2019 at 13:37, wrote: > > From: Kewen Lin > > Hi, > > Previous version link: > https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00654.html > > Comparing with the previous version, I moved the generic > parts of rs6000 target hook to IVOPTs. But I still kept > the target hook as

[SVE ACLE] Implements svabs, svnot, svneg and svsqrt

2019-01-15 Thread Kugan Vivekanandarajah
I committed the following patch, which implements svabs, svnot, svneg and svsqrt, to the aarch64/sve-acle-branch branch. Thanks, Kugan From 2af9609a58cf7efbed93f15413224a2552b9696d Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Wed, 16 Jan 2019 07:45:52 +1100 Subject: [PATCH] [SVE ACLE

[SVE ACLE] Implements svmulh

2019-01-17 Thread Kugan Vivekanandarajah
I committed the following patch, which implements svmulh, to the aarch64/sve-acle-branch branch. Thanks, Kugan From 33b76de8ef5f370dfacba0addef2fe0b1f2a61db Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 18 Jan 2019 07:33:26 +1100 Subject: [PATCH] [SVE ACLE] Implements svmulh Change

[SVE ACLE] Implements svdot

2019-01-17 Thread Kugan Vivekanandarajah
I committed the following patch, which implements svdot, to the aarch64/sve-acle-branch branch. Thanks, Kugan From b75cd8ba8f911c137380677b85882c22a6467bf6 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 18 Jan 2019 09:07:10 +1100 Subject: [PATCH] [SVE ACLE] Implements svdot Change

Re: [RFC][PR61839]Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

2016-08-08 Thread Kugan Vivekanandarajah
to optimize. In the attached test case (in function bar), we end up doing the conversion twice. Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan gcc/testsuite/ChangeLog: 2016-08-09 Kugan Vivekanandarajah
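The rewrite named in the subject line can be illustrated with a small before/after pair. This is a sketch under assumptions: the shift operator, the constant 4 and the function names are chosen only for illustration and do not come from the patch or its test case.

```c
/* Before: a constant combined with a value that is known to be 1 or 0.  */
int
shift_before (int cond)
{
  int t = cond ? 1 : 0;
  return 4 << t;            /* CST BINOP COND_EXPR */
}

/* After: the binary operation is applied to each arm separately, so
   both arms become constants.  */
int
shift_after (int cond)
{
  return cond ? (4 << 1) : (4 << 0);   /* COND_EXPR ? (CST BINOP 1) : (CST BINOP 0) */
}
```

Both functions return the same value for every input; the second form simply exposes the two possible results as constants.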

[TREE-SSA-CCP] Issue warning when folding condition

2016-08-18 Thread Kugan Vivekanandarajah
-ssa-ccp.c. We might also run into some other similar issues in the future. Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2016-08-18 Kugan Vivekanandarajah * tree-ssa-ccp.c (ccp_fold_stmt): If the

Re: [TREE-SSA-CCP] Issue warning when folding condition

2016-08-18 Thread Kugan Vivekanandarajah
On 19 August 2016 at 12:09, Kugan Vivekanandarajah wrote: > The testcase pr33738.C for warning fails with early-vrp patch. The > reason is, with early-vrp ccp2 is folding the comparison that used to > be folded in simplify_stmt_for_jump_threading. Since early-vrp does > not perform ju

Re: [RFC][PR61839]Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

2016-08-19 Thread Kugan Vivekanandarajah
rations we might want to apply this optimization - and then for both >> cases, >> rhs1 or rhs2 being constant. Like x / 5 and 5 / x. >> >> Note that you can rely on int_const_binop returning NULL_TREE for >> "invalid" >> ops like x % 0 or x / 0, so no need

Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-08-19 Thread Kugan Vivekanandarajah
xing this now is, in linearize_expr_tree, I set ops_changed > to true if we change NEGATE_EXPR to (-1) MULT_EXPR (OP). Then when we call > zero_one_operation with ops_changed = true, I replace all the LHS in > zero_one_operation with the new SSA and replace all the uses. I also call

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-08-22 Thread Kugan Vivekanandarajah
Hi, On 19 August 2016 at 21:41, Richard Biener wrote: > On Tue, Aug 16, 2016 at 9:45 AM, kugan > wrote: >> Hi Richard, >> >> On 12/08/16 20:43, Richard Biener wrote: >>> >>> On Wed, Aug 3, 2016 at 3:17 AM, kugan >>> wrote: >> >> >> [SNIP] >> >>> >>> diff --git a/gcc/common.opt b/gcc/common.opt

Re: [RFC][IPA-VRP] Add support for IPA VRP in ipa-cp/ipa-prop

2016-08-29 Thread Kugan Vivekanandarajah
ant here. The analysis phase should > not determine > anything if function is reachable non-locally. Removed it. >> +/* Info about value ranges. */ >> + >> +struct GTY(()) ipa_vr >> +{ >> + /* The data fields below are valid only if known is true. */ >

Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-09-02 Thread Kugan Vivekanandarajah
Hi Richard, On 25 August 2016 at 22:24, Richard Biener wrote: > On Thu, Aug 11, 2016 at 1:09 AM, kugan > wrote: >> Hi, >> >> >> On 10/08/16 20:28, Richard Biener wrote: >>> >>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek wrote: On Wed, Aug 10, 2016 at 08:51:32AM +1000, kugan wrote

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-09-02 Thread Kugan Vivekanandarajah
Ping ? Thanks, Kugan On 23 August 2016 at 12:11, Kugan Vivekanandarajah wrote: > Hi, > > On 19 August 2016 at 21:41, Richard Biener wrote: >> On Tue, Aug 16, 2016 at 9:45 AM, kugan >> wrote: >>> Hi Richard, >>> >>> On 12/08/16 20:43, Richard Bie

[RFC][SSA] Iterator to visit SSA

2016-09-04 Thread Kugan Vivekanandarajah
/ChangeLog: 2016-09-05 Kugan Vivekanandarajah * tree-ssanames.h (ssa_iterator::ssa_iterator): New. (ssa_iterator::get): Likewise. (ssa_iterator::next): Likewise. (FOR_EACH_SSAVAR): Likewise. * cfgexpand.c (update_alias_info_with_stack_vars): Use FOR_EACH_SSAVAR to iterate

Re: [RFC][SSA] Iterator to visit SSA

2016-09-05 Thread Kugan Vivekanandarajah
Hi Richard, On 5 September 2016 at 17:57, Richard Biener wrote: > On Mon, Sep 5, 2016 at 7:26 AM, Kugan Vivekanandarajah > wrote: >> Hi All, >> >> While looking at gcc source, I noticed that we are iterating over SSA >> variable from 0 to num_ssa_names in some case

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:08, Richard Biener wrote: > On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 5 September 2016 at 17:57, Richard Biener >> wrote: >>> On Mon, Sep 5, 2016 at 7:26 AM, Kugan V

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:57, Richard Biener wrote: > On Tue, Sep 6, 2016 at 11:33 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 6 September 2016 at 19:08, Richard Biener >> wrote: >>> On Tue, Sep 6, 2016 at 2:24 AM, Kug

Re: [RFC][SSA] Iterator to visit SSA

2016-09-06 Thread Kugan Vivekanandarajah
Hi Richard, On 6 September 2016 at 19:08, Richard Biener wrote: > On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 5 September 2016 at 17:57, Richard Biener >> wrote: >>> On Mon, Sep 5, 2016 at 7:26 AM, Kugan V

[PR71252][PR71269] Fix trunk errors due to stmt_to_insert

2016-05-25 Thread Kugan Vivekanandarajah
insert. 3. In rewrite_expr_tree_parallel, build_and_add_sum relies on either of the operands being inserted. If that is not the case, we have to insert the stmt_to_insert before calling build_and_add_sum. 4. I also moved all the other stmt_to_insert insertions to after the use stmts are created. Also regression t

Re: [PR71252][PR71269] Fix trunk errors due to stmt_to_insert

2016-05-26 Thread Kugan Vivekanandarajah
Hi Jakub, On 26 May 2016 at 18:18, Jakub Jelinek wrote: > On Thu, May 26, 2016 at 02:17:56PM +1000, Kugan Vivekanandarajah wrote: >> --- a/gcc/tree-ssa-reassoc.c >> +++ b/gcc/tree-ssa-reassoc.c >> @@ -3767,8 +3767,10 @@ swap_ops_for_binary_stmt (vec ops, >>

[PATCH1][PR71252] Fix missing swap to stmt_to_insert

2016-05-27 Thread Kugan Vivekanandarajah
trunk if the testing is fine ? Thanks, Kugan gcc/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * tree-ssa-reassoc.c (swap_ops_for_binary_stmt): Fix swap such that all fields including stmt_to_insert are swapped. gcc/testsuite/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * gcc.dg/tree-ssa

[PATCH2][PR71252] Fix insertion point of stmt_to_insert

2016-05-27 Thread Kugan Vivekanandarajah
PRs Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * gcc.dg/tree-ssa/pr71269.c: New test. gcc/ChangeLog: 2016-05-28 Kugan Vivekanandarajah * tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so that inserted stmt will not dominate

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-10-30 Thread Kugan Vivekanandarajah
Ping ? I see that Jim has clarified the comments from Andrew. Thanks, Kugan On 13 October 2017 at 08:48, Jim Wilson wrote: > On Fri, 2017-09-22 at 14:11 -0700, Andrew Pinski wrote: >> On Fri, Sep 22, 2017 at 11:39 AM, Jim Wilson >> wrote: >> > >> > On Fri, Sep 22, 2017 at 10:58 AM, Andrew Pins

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-10-31 Thread Kugan Vivekanandarajah
Hi Jim, On 1 November 2017 at 03:12, Jim Wilson wrote: > On Tue, 2017-10-31 at 14:35 +1100, Kugan Vivekanandarajah wrote: >> Ping ? >> >> I see that Jim has clarified the comments from Andrew. > > Andrew also suggested that we add a testcase to the testsuite. I >

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-11-01 Thread Kugan Vivekanandarajah
Hi, On 1 November 2017 at 03:12, Jim Wilson wrote: > On Tue, 2017-10-31 at 14:35 +1100, Kugan Vivekanandarajah wrote: >> Ping ? >> >> I see that Jim has clarified the comments from Andrew. > > Andrew also suggested that we add a testcase to the testsuite. I >

[AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-06 Thread Kugan Vivekanandarajah
Hi, Attached patch implements the vld1_*_x2 intrinsics as defined by the neon document. Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is this OK for trunk if no regressions? Thanks, Kugan gcc/ChangeLog: 2017-11-06 Kugan Vivekanandarajah * config/aarch64/aarch64

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-07-21 Thread Kugan Vivekanandarajah
Ping ? Thanks, Kugan On 27 June 2017 at 11:20, Kugan Vivekanandarajah wrote: > https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this > workaround to get the kernel building when TARGET_FIX_ERR_A53_843419 > is enabled. > > This was added to support building

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-10 Thread Kugan Vivekanandarajah
Ping^2? Thanks, Kugan On 21 July 2017 at 20:12, Kugan Vivekanandarajah wrote: > Ping ? > > Thanks, > Kugan > > On 27 June 2017 at 11:20, Kugan Vivekanandarajah > wrote: >> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this >> workaround

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-28 Thread Kugan Vivekanandarajah
ping^3 Thanks, Kugan On 11 August 2017 at 16:09, Kugan Vivekanandarajah wrote: > Ping^2? > > Thanks, > Kugan > > On 21 July 2017 at 20:12, Kugan Vivekanandarajah > wrote: >> Ping ? >> >> Thanks, >> Kugan >> >> On 27 June 2017 at 11:20,

Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2017-08-29 Thread Kugan Vivekanandarajah
Hi James, On 29 August 2017 at 21:31, James Greenhalgh wrote: > On Tue, Jun 27, 2017 at 11:20:02AM +1000, Kugan Vivekanandarajah wrote: >> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html added this >> workaround to get the kernel building when TARGET_FIX_ERR_A53_84341

Re: [PATCH GCC][4/5]Improve loop distribution to handle hmmer

2017-06-04 Thread Kugan Vivekanandarajah
Hi Bin, Thanks for posting the patch. I haven't looked in detail yet but have a couple of quick questions. 1. Shouldn’t the run time alias check for versioning happen only when vectorisation is enabled? You seem to be using the IFN_LOOP_DIST_ALIAS when vectorising but seem to be versioning otherwis

[RFC][PATCH 0/5] Loop unrolling and memory load streams

2017-09-14 Thread Kugan Vivekanandarajah
While loop unrolling helps to keep the pipeline busy in modern processors, it also can increase the memory streams resulting in collisions for the hardware prefetcher that can impact performance. This patch series tries to detect this and limit the loop unrolling. Patch 1 : Add separate parms for

[RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds separate params for the rtl unroller so that they can be tuned accordingly. Default values I have are based on some testing on aarch64. I am happy to leave it as the current value and set them in the back-end. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah

[RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds number of hw prefetchers available to cpu_prefetch_tune so it can be used in loop unrolling decisions. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64-protos.h (struct cpu_prefetch_tune): Add new field hw_prefetchers_avail

[RFC][PATCH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch prevents the tree unroller from completely unrolling inner loops if that results in excessive strided-loads in the outer loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64.c (count_mem_load_streams): New. (aarch64_ok_to_unroll): New

[RFC][PATCH 4/5] Change iv_analyze_result to take const_rtx.

2017-09-14 Thread Kugan Vivekanandarajah
Change iv_analyze_result to take const_rtx. This is just to make the next patch compile. No functional changes. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_result): Change 2nd param from rtx to const_rtx. * df-core.c (df_find_def

[RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_biv): export. * loop-iv.c: Likewise. * config/aarch64/aarch64.c (strided_load_p

Re: [RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-16 Thread Kugan Vivekanandarajah
Hi Andrew, On 15 September 2017 at 13:20, Andrew Pinski wrote: > On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah > wrote: >> This patch adds number of hw prefetchers available to >> cpu_prefetch_tune so it can be used in loop unrolling decisions. > > Can yo

Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-16 Thread Kugan Vivekanandarajah
Hi Ramana On 15 September 2017 at 18:40, Ramana Radhakrishnan wrote: > On Fri, Sep 15, 2017 at 2:33 AM, Kugan Vivekanandarajah > wrote: >> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling >> in rtl based on strided-loads in loop. >> >> Thanks

Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-17 Thread Kugan Vivekanandarajah
Hi Andrew, On 15 September 2017 at 13:36, Andrew Pinski wrote: > On Thu, Sep 14, 2017 at 6:33 PM, Kugan Vivekanandarajah > wrote: >> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling >> in rtl based on strided-loads in loop. > > Can you expand on th

Re: [RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-17 Thread Kugan Vivekanandarajah
Hi Richard, On 15 September 2017 at 19:31, Richard Biener wrote: > On Fri, Sep 15, 2017 at 3:27 AM, Kugan Vivekanandarajah > wrote: >> This patch adds separate params for rtl unroller so that they can be >> tuned accordingly. Default values I have are based on some testing o

Re: [RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-18 Thread Kugan Vivekanandarajah
Hi Richard, On 18 September 2017 at 17:50, Richard Biener wrote: > On Mon, Sep 18, 2017 at 3:36 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 15 September 2017 at 19:31, Richard Biener >> wrote: >>> On Fri, Sep 15, 2017 at 3:27 AM, Kugan

Re: [RFC][SSA] Iterator to visit SSA

2016-09-07 Thread Kugan Vivekanandarajah
Hi Richard, On 7 September 2016 at 19:35, Richard Biener wrote: > On Wed, Sep 7, 2016 at 2:21 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >> On 6 September 2016 at 19:08, Richard Biener >> wrote: >>> On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivek

[RFC] Type promotion pass and elimination of zext/sext

2016-05-15 Thread Kugan Vivekanandarajah
based on the feedback. Please let me know what you think. Thanks, Kugan From 332e0e9f938c6af50e826d8224d07ebf3678a0e0 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 13 May 2016 13:41:01 +1000 Subject: [PATCH 4/4] Add new type promotion pass --- gcc/ChangeLog

[AARCH64] Remove static variable all_extensions from aarch64.c

2016-05-16 Thread Kugan Vivekanandarajah
Hi, The static variable all_extensions in aarch64.c is not used and is therefore dead. I don’t see any reason why it should be there. The attached patch removes it. Bootstrapped on aarch64-linux-gnu. Regression testing is ongoing. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2016-05-17 Kugan

Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-05-18 Thread Kugan Vivekanandarajah
>>> Please move the whole thing under the else { } case of the ops.length >>> == 0, ops.length == 1 test chain >>> as you did for the actual emit of the negate. >>> >> >> I see your point. However, when we remove the (-1) from the ops list, that >> in turn can result in ops.length becoming 1. Theref
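The rewrite named in the subject of this thread can be shown as a before/after pair. This is a minimal sketch, assuming double operands; the function names are hypothetical and the code is not taken from the patch.

```c
/* Before: the product is negated and then added.  */
double
add_negated_product (double x, double y, double z)
{
  return x + (-y * z * z);
}

/* After: the negation is folded into a subtraction.  */
double
subtract_product (double x, double y, double z)
{
  return x - y * z * z;
}
```

For inputs such as x = 10, y = 2, z = 3 both forms evaluate to -8; the second avoids a separate negate operation.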

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
Hi Martin, Thanks for the fix. Just to elaborate (as mentioned in PR) At tree-ssa-reassoc.c:3897, we have: stmt: _15 = _4 + c_7(D); oe->op def_stmt: _17 = c_7(D) * 3; : a1_6 = s_5(D) * 2; _1 = (long int) a1_6; x1_8 = _1 + c_7(D); a2_9 = s_5(D) * 4; _2 = (long int) a2_9; a3_11 = s_5(D) * 6; _3

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
On 19 May 2016 at 18:55, Richard Biener wrote: > On Thu, May 19, 2016 at 10:26 AM, Kugan > wrote: >> Hi, >> >> >> On 19/05/16 18:21, Richard Biener wrote: >>> On Thu, May 19, 2016 at 10:12 AM, Kugan Vivekanandarajah >>> wrote: >>>> Hi

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-19 Thread Kugan Vivekanandarajah
erting the multiplication to > rewrite_expr_tree time. For example by adding a ops->stmt_to_insert > member. > Here is an implementation based on above. Bootstrap on x86-linux-gnu is OK. Regression testing is ongoing. Thanks, Kugan gcc/ChangeLog: 2016-05-20 Kugan Vivekanandarajah

[PATCH] Fix PR tree-optimization/71179

2016-05-19 Thread Kugan Vivekanandarajah
tested on x86-64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-20 Kugan Vivekanandarajah * gcc.dg/tree-ssa/pr71179.c: New test. gcc/ChangeLog: 2016-05-20 Kugan Vivekanandarajah * tree-ssa-reassoc.c

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-20 Thread Kugan Vivekanandarajah
On 20 May 2016 at 21:07, Richard Biener wrote: > On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah > wrote: >> Hi Richard, >> >>> I think it should have the same rank as op or op + 1 which is the current >>> behavior. Sth else doesn't work c

Re: [RFC] Type promotion pass and elimination of zext/sext

2016-05-22 Thread Kugan Vivekanandarajah
Hi Jeff, On 20 May 2016 at 04:17, Jeff Law wrote: > On 05/15/2016 06:45 PM, Kugan Vivekanandarajah wrote: >> >> Hi Richard, >> >> Now that stage1 is open, I would like to get the type promotion passes >> reviewed again. I have tested the patches on aarch64, x86-6

Re: [RFC] Type promotion pass and elimination of zext/sext

2016-05-22 Thread Kugan Vivekanandarajah
Hi Richard, > So what does this mean for this pass? It means that we need to think > about the immediate goal we want to fulfil - which might be to just > promote things that we can fully promote, avoiding the necessity to > prevent passes from undoing our work. That said - we need a set of > te

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-23 Thread Kugan Vivekanandarajah
On 23 May 2016 at 21:35, Richard Biener wrote: > On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah > wrote: >> On 20 May 2016 at 21:07, Richard Biener wrote: >>> On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah >>> wrote: >>>> Hi Richard,

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-24 Thread Kugan Vivekanandarajah
On 24 May 2016 at 18:36, Christophe Lyon wrote: > On 24 May 2016 at 05:13, Kugan Vivekanandarajah > wrote: >> On 23 May 2016 at 21:35, Richard Biener wrote: >>> On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah >>> wrote: >>>> On 20 May 2016 at

[PR71252][PATCH] ICE: verify_ssa failed

2016-05-24 Thread Kugan Vivekanandarajah
reducing the test-case is appreciated. Regression testing on x86_64-linux-gnu and bootstrap didn’t find any new issues. Is this OK for trunk? Thanks, Kugan gcc/testsuite/ChangeLog: 2016-05-24 Kugan Vivekanandarajah * gfortran.dg/pr71252.f90: New test. gcc/ChangeLog: 2016-05-24 Kugan

[testcase] Fix absfloat16.c testcase

2024-09-29 Thread Kugan Vivekanandarajah
Hi, This patch fixes the absfloat16.c testcase so that dg-add-options float16 is in the correct order. Due to this mixup, this test is failing for some arm variants. Is this OK for trunk? Thanks, Kugan 0001-Fix-absfloat16.c-testcase.patch Description: 0001-Fix-absfloat16.c-testcase.patch

Re: [PR middle-end/114635] Set OMP safelen handling to INT_MAX when the pragma didn’t provide one.

2024-10-07 Thread Kugan Vivekanandarajah
ping? Thanks, Kugan From: Kugan Vivekanandarajah Sent: Tuesday, 20 August 2024 6:18 PM To: Jakub Jelinek Cc: gcc-patches@gcc.gnu.org ; richard.guent...@gmail.com ; richard.sandif...@arm.com Subject: Re: [PR middle-end/114635] Set OMP safelen handling to

Re: [PR middle-end/114635] Set OMP safelen handling to INT_MAX when the pragma didn’t provide one.

2024-10-13 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 8 Oct 2024, at 7:15 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah > wrote: >> >> >> >>>

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-28 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 25 Oct 2024, at 8:53 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Fri, Oct 25, 2024 at 12:22 AM Kugan Vivekanandarajah > wrote: >> >

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-29 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. > On 28 Oct 2024, at 9:18 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah > wrote: >> >> Hi, >> >> When ifcvt

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-31 Thread Kugan Vivekanandarajah
> On 31 Oct 2024, at 6:18 pm, Jakub Jelinek wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 29, 2024 at 05:01:40AM +, Kugan Vivekanandarajah wrote: >> For param_vect_max_version_for_alias_checks of 15, the average code si

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-11-02 Thread Kugan Vivekanandarajah
> On 31 Oct 2024, at 7:29 pm, Jakub Jelinek wrote: > > External email: Use caution opening links or attachments > > > On Thu, Oct 31, 2024 at 08:21:09AM +, Kugan Vivekanandarajah wrote: >> >> >>> On 31 Oct 2024, at 6:18 pm, Jakub Jelinek wrote: >

[PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-28 Thread Kugan Vivekanandarajah
Hi, When ifcvt versions a loop, it sets dont_vectorize on the scalar loop. If the vector loop is not vectorized and removed, the scalar loop is still left with dont_vectorize. As a result, BB vectorization will not happen. This patch adds a new attribute called dont_loop_vectorize (that is differe

[RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-24 Thread Kugan Vivekanandarajah
Hi, This patch sets param_vect_max_version_for_alias_checks to 15. The previous limit was causing GCC to miss vectorization opportunities in one internal application, making it slower than LLVM by about 14%. I've tested different param_vect_max_version_for_alias_checks such as 15 and 100 and the SPEC2017 resu

[PATCH][AARCH64][PR115258]Fix excess moves

2024-10-24 Thread Kugan Vivekanandarajah
Hi, The fix for PR115258 causes a performance regression in some of the TSVC kernels by adding additional mov instructions. This patch fixes this, i.e., when operands are equal, it is likely that all of them get the same register similar to: (insn 19 15 20 3 (set (reg:V2x16QI 62 v30 [117])

[testsuite] Fix bb-slp-77.c for x86

2024-10-31 Thread Kugan Vivekanandarajah
The extracted test bb-slp-77.c relies on complete unrolling of the inner loop. However, for x86 in gcc.dg/vect/, the loop is not unrolled and the inner loop is vectorized, thus not triggering the expected BB SLP. I also noticed that the "vectorizing stmts using SLP" count is different when I for

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-30 Thread Kugan Vivekanandarajah
Hi Richard, > On 29 Oct 2024, at 8:33 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah > wrote: >> >> Hi Richard, >> Thanks for the review. >> &

Re: [PATCH] MATCH: add abs support for half float

2024-09-20 Thread Kugan Vivekanandarajah
Hi Richard, > On 17 Sep 2024, at 7:36 pm, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, Sep 17, 2024 at 10:31 AM Kugan Vivekanandarajah > wrote: >> >> Hi Richard, >> >>> On 10 Sep 2024, at 9:33 

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-11-14 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 2 Nov 2024, at 7:49 pm, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > > On 31 Oct 2024, at 7:29 pm, Jakub Jelinek wrote: > > > > External email: Use caution opening links or a

Re: [PATCH][AARCH64][PR115258]Fix excess moves

2025-02-25 Thread Kugan Vivekanandarajah
Hi Richard, I want to follow up on this and see if you have a fix for this. Thanks, Kugan > On 29 Oct 2024, at 9:41 pm, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> Hi, >

[AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-08 Thread Kugan Vivekanandarajah
This patch fixes wrong annotation of profiles when a call statement is de-duplicated, i.e., when we may have the same stmt executing from more than one path (by jumping to the same statement). Thus, the profile we get will be for multiple paths and would make the annotated profile wrong. As a fix, we don't ann

[AUTOFDO] Merge profiles of clones before annotating

2025-05-08 Thread Kugan Vivekanandarajah
This patch adds support for merging profiles from multiple clones. That is, when optimized binaries have clones such as IPA-CP clones or SRA clones, the generated gcov will have profiled them separately. Currently we pick one and ignore the rest. This patch fixes this by merging the profiles. Regression

[AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-08 Thread Kugan Vivekanandarajah
Add support for autoprofiledbootstrap in aarch64. This is similar to what is done for i386. Added gcc/config/aarch64/gcc-auto-profile for aarch64 profile creation. How to run: configure --with-build-config=bootstrap-lto make autoprofiledbootstrap Regression tested on aarch64-linux-gnu with no ne

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-13 Thread Kugan Vivekanandarajah
Adding Eugene and Andi to CC as Sam suggested. > On 13 May 2025, at 12:57 am, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> diff --git a/configure.ac b/configure.ac >> inde

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-20 Thread Kugan Vivekanandarajah
Thanks Richard for the review. > On 20 May 2025, at 2:47 am, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kugan Vivekanandarajah writes: >> diff --git a/Makefile.in b/Makefile.in >> index b1ed67d3d4f..b5e3e5

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-19 Thread Kugan Vivekanandarajah
> On 16 May 2025, at 12:10 am, Andi Kleen wrote: > > External email: Use caution opening links or attachments > > > On Wed, May 14, 2025 at 02:46:15AM +, Kugan Vivekanandarajah wrote: >> Adding Eugene and Andi to CC as Sam suggested. >> >>> On 13 M

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-25 Thread Kugan Vivekanandarajah
> On 26 May 2025, at 2:25 pm, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Tue, May 20, 2025 at 3:09 AM Kugan Vivekanandarajah > wrote: >> >> Thanks Richard for the review. >> >>> On 20 May

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-26 Thread Kugan Vivekanandarajah
> On 26 May 2025, at 5:34 pm, Jan Hubicka wrote: > > External email: Use caution opening links or attachments > > > Hi, > also, please, can you add a testcase? We should have some coverage for > auto-fdo specific issues I was looking for this too. AFAIK we don't do any testing currently.

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-26 Thread Kugan Vivekanandarajah
Hi Honza. > On 26 May 2025, at 7:48 pm, Jan Hubicka wrote: > > External email: Use caution opening links or attachments > > >> >> >>> On 26 May 2025, at 5:34 pm, Jan Hubicka wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> Hi, >>> also, please, can you

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-27 Thread Kugan Vivekanandarajah
> On 26 May 2025, at 2:47 pm, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > > On 26 May 2025, at 2:25 pm, Andrew Pinski wrote: > > > > External email: Use caution opening links or attachments >

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-21 Thread Kugan Vivekanandarajah
Ping? Thanks, Kugan > On 9 May 2025, at 11:54 am, Kugan Vivekanandarajah > wrote: > > External email: Use caution opening links or attachments > > > This patch add support for merging profiles from multiple clones. > That is, when optimized binaries have clones suc
