Hello All:
This patch adds a new pass to replace vector loads (lxv) from contiguous
addresses with the MMA instruction lxvp.
Bootstrapped and regtested on powerpc64-linux-gnu.
Thanks & Regards
Ajit
rs6000: Add new pass for replacement of contiguous lxv with lxvp
New pass to replace contiguous addresses vec
Hello All:
This patch adds a new pass to replace vector loads (lxv) from contiguous
addresses with the MMA instruction lxvp.
Bootstrapped and regtested on powerpc64-linux-gnu.
Thanks & Regards
Ajit
rs6000: Add new pass for replacement of contiguous lxv with lxvp.
New pass to replace contiguous addresses l
Hello All:
This patch adds a new pass to replace vector loads (lxv) from contiguous
addresses with the MMA instruction lxvp.
This patch addresses one regression failure on the ARM architecture.
Bootstrapped and regtested on powerpc64-linux-gnu.
Thanks & Regards
Ajit
rs6000: Add new pass for replacement of con
Hello Segher:
Here is the patch that uses xxlor instead of fmr where possible.
Performance results show that fmr is better on the power9 and
power10 architectures, whereas xxlor is better on the power7 and
power8 architectures. fmr is the only option before p7.
Incorporated review comments.
Bootstrapp
This patch improves the code sinking pass to sink statements before a call to
reduce register pressure.
Review comments are incorporated. Synced with and adapted to the latest trunk
sources.
For example:
void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
int l;
l = a + b + c + d +e
This patch improves the code sinking pass to sink statements before a call to
reduce register pressure.
Review comments are incorporated. Synced with the latest sources and modified
the code changes accordingly.
For example:
void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
int l;
This patch improves the code sinking pass to sink statements before a call to
reduce register pressure.
Review comments are incorporated.
Synced with the latest trunk sources and modified the sinking pass accordingly.
For example:
void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
int
Hello Segher:
Please review.
Thanks & Regards
Ajit
Forwarded Message
Subject: [PATCH v3] rs6000: fmr gets used instead of faster xxlor [PR93571]
Date: Tue, 10 Oct 2023 18:14:00 +0530
From: Ajit Agarwal
To: gcc-patches
CC: Segher Boessenkool, Peter Bergner, Kewen
Hello All:
Please review.
Thanks & Regards
Ajit
Forwarded Message
Subject: [PATCH v2] rs6000: Add new pass for replacement of contiguous
addresses vector load lxv with lxvp
Date: Sun, 8 Oct 2023 00:34:27 +0530
From: Ajit Agarwal
To: gcc-patches
CC: Segher Boessen
0530
From: Ajit Agarwal
To: gcc-patches
CC: Jeff Law, Vineet Gupta, Richard Biener, Segher Boessenkool, Peter Bergner
Hello All:
Version 8 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
Incorporated all
sage
Subject: [PATCH v2 3/4] Improve functionality of ree pass with various
constants with AND operation.
Date: Tue, 19 Sep 2023 14:51:16 +0530
From: Ajit Agarwal
To: gcc-patches
CC: Jeff Law, Vineet Gupta, Richard Biener, Peter Bergner, Segher Boessenkool
Hello Jeff:
This patch elimin
Hello Richard:
On 17/10/23 2:03 pm, Richard Biener wrote:
> On Thu, Oct 12, 2023 at 10:42 AM Ajit Agarwal wrote:
>>
>> This patch improves code sinking pass to sink statements before call to
>> reduce
>> register pressure.
>> Review comments are incorporated
Currently, code sinking sinks code to the use points within a loop of the same
nesting depth. This patch improves code sinking by placing the sunk code in
the immediate dominator with the same loop nest depth.
Review comments are incorporated.
For example:
void bar();
int j;
void foo(int a, in
Hello Richard:
The review comments below are incorporated in version 10 of the patch.
Please review and let me know if it's okay for trunk.
Thanks & Regards
Ajit
On 17/10/23 2:47 pm, Richard Biener wrote:
> On Tue, Oct 17, 2023 at 10:53 AM Ajit Agarwal wrote:
>>
>> Hello Richard
Hello All:
Version 9 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 9) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_extend and sign_extend i
Hello Vineet:
Thanks for your time and valuable comments.
On 21/10/23 5:26 am, Vineet Gupta wrote:
> On 10/19/23 23:50, Ajit Agarwal wrote:
>> Hello All:
>>
>> This version 9 of the patch uses abi interfaces to remove zero and sign
>> extension elimination.
>>
Hello Vineet and Jeff:
Version 10 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 9) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_extend and s
Hello Vineet, Jeff and Bernhard:
Version 11 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 11) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_e
Hello All:
Addressed the review comments below in version 11 of the patch.
Please review and let me know if it's OK for trunk.
Thanks & Regards
Ajit
On 22/10/23 12:56 am, rep.dot@gmail.com wrote:
> On 21 October 2023 01:56:16 CEST, Vineet Gupta wrote:
>> On 10/19/2
Ping ^1.
Forwarded Message
Subject: [PING ^0][PATCH v2] rs6000: Add new pass for replacement of contiguous
addresses vector load lxv with lxvp
Date: Sun, 15 Oct 2023 17:43:24 +0530
From: Ajit Agarwal
To: gcc-patches
CC: Segher Boessenkool, Kewen.Lin, Peter Bergner
Hello Vineet, Jeff and Bernhard:
Version 11 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 11) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_e
Hello Bernhard:
On 23/10/23 7:40 pm, Bernhard Reutner-Fischer wrote:
> On Mon, 23 Oct 2023 12:16:18 +0530
> Ajit Agarwal wrote:
>
>> Hello All:
>>
>> Addressed below review comments in the version 11 of the patch.
>> Please review and please let me know if it
Hello Vineet:
On 24/10/23 12:02 am, Vineet Gupta wrote:
>
>
> On 10/22/23 23:46, Ajit Agarwal wrote:
>> Hello All:
>>
>> Addressed below review comments in the version 11 of the patch.
>> Please review and please let me know if its ok for trunk.
>>
>>
Hello Vineet, Jeff and Bernhard:
Version 13 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 13) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_e
On 24/10/23 1:10 pm, Ajit Agarwal wrote:
> Hello Vineet:
>
> On 24/10/23 12:02 am, Vineet Gupta wrote:
>>
>>
>> On 10/22/23 23:46, Ajit Agarwal wrote:
>>> Hello All:
>>>
>>> Addressed below review comments in the version 11 of the patch.
On 19/09/23 1:57 am, Vineet Gupta wrote:
> Hi Ajit,
>
> On 9/17/23 22:59, Ajit Agarwal wrote:
>> This new version of patch 6 use improve ree pass for rs6000 target using
>> defined ABI interfaces.
>> Bootstrapped and regtested on power64-linux-gnu.
>>
Hello Vineet, Jeff and Bernhard:
Version 14 of the patch uses ABI interfaces for zero and sign extension
elimination.
This fixes aarch64 regression failures with aggressive CSE.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 14) of the patch the following revi
On 25/10/23 2:19 am, Vineet Gupta wrote:
> On 10/24/23 13:36, rep.dot@gmail.com wrote:
>> As said, I don't see why the below was not cleaned up before the V1
>> submission.
>> Iff it breaks when manually CSEing, I'm curious why?
The function below looks identical in v12 of
On 25/10/23 2:06 am, rep.dot@gmail.com wrote:
> On 24 October 2023 09:36:22 CEST, Ajit Agarwal wrote:
>> Hello Bernhard:
>>
>> On 23/10/23 7:40 pm, Bernhard Reutner-Fischer wrote:
>>> On Mon, 23 Oct 2023 12:16:18 +0530
>>> Ajit Agarwal wrote:
>>
On 24/10/23 11:47 pm, Vineet Gupta wrote:
>
>
> On 10/24/23 10:03, Ajit Agarwal wrote:
>> Hello Vineet, Jeff and Bernhard:
>>
>> This version 14 of the patch uses abi interfaces to remove zero and sign
>> extension elimination.
>> This fixes aarch64 r
Hello Vineet, Jeff and Bernhard:
Version 15 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 15) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_e
On 27/10/23 10:46 pm, Bernhard Reutner-Fischer wrote:
> On Wed, 25 Oct 2023 16:41:07 +0530
> Ajit Agarwal wrote:
>
>> On 25/10/23 2:19 am, Vineet Gupta wrote:
>>> On 10/24/23 13:36, rep.dot@gmail.com wrote:
>>>>>>>> As said, I don'
On 28/10/23 4:09 am, Vineet Gupta wrote:
>
>
> On 10/27/23 10:16, Bernhard Reutner-Fischer wrote:
>> On Wed, 25 Oct 2023 16:41:07 +0530
>> Ajit Agarwal wrote:
>>
>>> On 25/10/23 2:19 am, Vineet Gupta wrote:
>>>> On 10/24/23 13:36, rep.dot
Hello Vineet, Jeff and Bernhard:
Version 15 of the patch uses ABI interfaces for zero and sign extension
elimination.
Bootstrapped and regtested on powerpc-linux-gnu.
In this version (version 15) of the patch the following review comments are
incorporated.
a) Removal of hard-coded zero_e
On 28/10/23 3:55 pm, Ajit Agarwal wrote:
>
>
> On 27/10/23 10:46 pm, Bernhard Reutner-Fischer wrote:
>> On Wed, 25 Oct 2023 16:41:07 +0530
>> Ajit Agarwal wrote:
>>
>>> On 25/10/23 2:19 am, Vineet Gupta wrote:
>>>> On 10/24/23 13:36, rep.dot..
On 28/10/23 3:56 pm, Ajit Agarwal wrote:
>
>
> On 28/10/23 4:09 am, Vineet Gupta wrote:
>>
>>
>> On 10/27/23 10:16, Bernhard Reutner-Fischer wrote:
>>> On Wed, 25 Oct 2023 16:41:07 +0530
>>> Ajit Agarwal wrote:
>>>
>>>>
Hello Richard:
Currently, code sinking sinks code to the use points within a loop of the same
nesting depth. This patch improves code sinking by placing the sunk code in
the immediate dominator with the same loop nest depth.
Review comments are incorporated.
For example:
void bar();
int j;
vo
Hello Richard:
On 17/10/23 2:47 pm, Richard Biener wrote:
> On Tue, Oct 17, 2023 at 10:53 AM Ajit Agarwal wrote:
>>
>> Hello Richard:
>>
>> On 17/10/23 2:03 pm, Richard Biener wrote:
>>> On Thu, Oct 12, 2023 at 10:42 AM Ajit Agarwal
>>> wrote:
>>
On 30/10/23 5:51 pm, Ajit Agarwal wrote:
> Hello Richard:
>
> On 17/10/23 2:47 pm, Richard Biener wrote:
>> On Tue, Oct 17, 2023 at 10:53 AM Ajit Agarwal wrote:
>>>
>>> Hello Richard:
>>>
>>> On 17/10/23 2:03 pm, Richard Biener wrote:
Hello All:
Currently code sinking heuristics are based on profile data like
basic block count and sink frequency threshold. We have removed
such heuristics and added register pressure heuristics based on
live-in and live-out of early blocks and immediate dominator of
use blocks of the same loop ne
Hello Richard:
On 03/11/23 12:51 pm, Richard Biener wrote:
> On Thu, Nov 2, 2023 at 9:50 PM Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> Currently code sinking heuristics are based on profile data like
>> basic block count and sink frequency threshold. We have rem
Hello Richard:
On 03/11/23 7:06 pm, Richard Biener wrote:
> On Fri, Nov 3, 2023 at 11:20 AM Ajit Agarwal wrote:
>>
>> Hello Richard:
>>
>> On 03/11/23 12:51 pm, Richard Biener wrote:
>>> On Thu, Nov 2, 2023 at 9:50 PM Ajit Agarwal wrote:
>>>
1 235 16.6 *
554.roms_r 1 268 5.92 *
Est. SPECrate(R)2017_fp_base 8.00
Thanks & Regards
Ajit
On 03/11/23 8:24 pm, Ajit Agarwal wrote:
> Hello Richard:
>
>
> On 03/11/23 7:06 pm, Richard Biener wrote:
>> On Fri, Nov 3,
Hello Segher:
On 01/03/24 3:02 am, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Feb 19, 2024 at 04:24:37PM +0530, Ajit Agarwal wrote:
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -518,7 +518,7 @@ or1k*-*-*)
>> ;;
>> powerpc*-*-*)
>>
Hello All:
Currently, code sinking sinks code to the use points within a loop of the same
nesting depth. This patch improves code sinking by placing the sunk code in
the immediate dominator with the same loop nest depth.
Changes since v11:
Reorganization of the code.
For example:
void bar();
Hello All:
For the rs6000 target we see redundant zero and sign extensions, so this patch
improves the ree pass to eliminate them. It adds support for
zero_extend/sign_extend/AND, and for AND used as an extension with constants
other than 1, such as 0x7/0x7F/0x7.
Changes s
Hello Richard:
Currently, code sinking sinks code to the use points within a loop of the same
nesting depth. This patch improves code sinking by placing the sunk code at
the beginning of the block, after the labels.
For example:
void bar();
int j;
void foo(int a, int b, int c, int d, int e,
Hello All:
Common infrastructure using generic code for load/store fusion on the rs6000
target.
This is split patch 0, which implements and defines generic code that can be
used in target-specific code for the aarch64 and rs6000 targets.
Generic code is implemented in gcc/pair-fusion-bas
Hello All:
Common infrastructure using generic code for load/store fusion on the rs6000
target.
Generic code is implemented and defined so that it can be used in
target-specific code for the aarch64 and rs6000 targets.
Generic code is implemented in gcc/pair-fusion-base.h,
gcc/pair-fusion-common.cc
and gc
Hello All:
Common infrastructure using generic code for load/store fusion on the rs6000
target.
This is split patch 0, which implements and defines generic code that can be
used in target-specific code for the aarch64 and rs6000 targets.
Generic code is implemented in gcc/pair-fusion-b
Hello All:
Common infrastructure using generic code for load/store fusion on the rs6000
target.
Generic code is implemented and defined so that it can be used in
target-specific code for the aarch64 and rs6000 targets.
Generic code is implemented in gcc/pair-fusion-base.h,
gcc/pair-fusion-common.cc
and gc
Hello Richard/Alex:
Ping!
Please reply.
Thanks & Regards
Ajit
On 27/02/24 12:33 pm, Ajit Agarwal wrote:
> Hello Richard/Alex:
>
> This patch has better diff with changed and unchanged code.
> Unchanged code and some of the changed code will be extracted
> into target inde
Hello All:
When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.
Fixes the corruption of caller frame chec
Hello Jakub:
When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.
Fixes the corruption of caller frame che
Hello Jakub:
Addressed the below comments and sent version 1 of the patch
for review.
Thanks & Regards
Ajit
On 22/03/24 1:15 pm, Jakub Jelinek wrote:
> On Fri, Mar 22, 2024 at 01:00:21PM +0530, Ajit Agarwal wrote:
>> When using FlexiBLAS with OpenBLAS we noticed corruption of
>
Hello All:
This is version-2 of the patch with review comments addressed.
When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS
Hello Jakub:
Thanks for the review. Addressed the review comments below and sent
version 2 of the patch for review.
Thanks & Regards
Ajit
On 22/03/24 3:06 pm, Jakub Jelinek wrote:
> On Fri, Mar 22, 2024 at 02:55:43PM +0530, Ajit Agarwal wrote:
>> rs6000: Stackoverflow in optimized code on
Hello Peter:
On 23/03/24 10:07 am, Peter Bergner wrote:
> On 3/22/24 5:15 AM, Ajit Agarwal wrote:
>> When using FlexiBLAS with OpenBLAS we noticed corruption of
>> the parameters passed to OpenBLAS functions. FlexiBLAS
>> basically provides a BLAS interface where each funct
Hello All:
When using FlexiBLAS with OpenBLAS, we noticed corruption of the caller
stack frame when calling OpenBLAS functions. This was caused by the
FlexiBLAS C/C++ caller and OpenBLAS Fortran callee disagreeing on the
number of function parameters in the callee due to hidden Fortran
parameters
Hello Peter:
Sent version-3 of the patch addressing the review comments below.
Thanks & Regards
Ajit
On 23/03/24 3:03 pm, Ajit Agarwal wrote:
> Hello Peter:
>
> On 23/03/24 10:07 am, Peter Bergner wrote:
>> On 3/22/24 5:15 AM, Ajit Agarwal wrote:
>>> When using FlexiB
On 23/03/24 9:33 pm, Peter Bergner wrote:
> On 3/23/24 4:33 AM, Ajit Agarwal wrote:
>>>> - else if (align_words < GP_ARG_NUM_REG)
>>>> + else if (align_words < GP_ARG_NUM_REG
>>>> + || (cum->hidden_string_length
>>>
Hello Alex/Richard:
All review comments are incorporated.
Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.
Target independent code is the generic code with pure virtual functions
to interface between target independent and depen
Hello Alex:
On 03/04/24 8:51 pm, Alex Coplan wrote:
> On 23/02/2024 16:41, Ajit Agarwal wrote:
>> Hello Richard/Alex/Segher:
>
> Hi Ajit,
>
> Sorry for the delay and thanks for working on this.
>
> Generally this looks like the right sort of approach (IMO) but I
Hello Alex/Richard:
All review comments are addressed.
Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.
Target independent code is the generic code with pure virtual functions
to interface between target independent and dependen
On 05/04/24 10:03 pm, Alex Coplan wrote:
> On 05/04/2024 13:53, Ajit Agarwal wrote:
>> Hello Alex/Richard:
>>
>> All review comments are incorporated.
>
> Thanks, I was kind-of expecting you to also send the renaming patch as a
> preparatory patch as we discusse
Hello Alex:
On 09/04/24 7:29 pm, Alex Coplan wrote:
> On 09/04/2024 17:30, Ajit Agarwal wrote:
>>
>>
>> On 05/04/24 10:03 pm, Alex Coplan wrote:
>>> On 05/04/2024 13:53, Ajit Agarwal wrote:
>>>> Hello Alex/Richard:
>>>>
>>>> A
Hello Alex:
On 09/04/24 8:39 pm, Alex Coplan wrote:
> On 09/04/2024 20:01, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 09/04/24 7:29 pm, Alex Coplan wrote:
>>> On 09/04/2024 17:30, Ajit Agarwal wrote:
>>>>
>>>>
>>>> On 05/04/24 10
Hello Alex:
On 10/04/24 1:42 pm, Alex Coplan wrote:
> Hi Ajit,
>
> On 09/04/2024 20:59, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 09/04/24 8:39 pm, Alex Coplan wrote:
>>> On 09/04/2024 20:01, Ajit Agarwal wrote:
>>>> Hello Alex:
>>>>
Hello Alex/Richard:
All comments are addressed in this version-1 of the patch.
Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.
Target independent code is the generic code with pure virtual functions
to interface between target i
Hello Alex:
On 10/04/24 7:52 pm, Alex Coplan wrote:
> Hi Ajit,
>
> On 10/04/2024 15:31, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 10/04/24 1:42 pm, Alex Coplan wrote:
>>> Hi Ajit,
>>>
>>> On 09/04/2024 20:59, Ajit Agarwal wrote:
>>>
Hello Alex:
On 24/01/24 10:13 pm, Alex Coplan wrote:
> Hi Ajit,
>
> On 21/01/2024 19:57, Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> New pass to replace adjacent memory addresses lxv with lxvp.
>> Added common infrastructure for load store fusion for
>>
Hello Richard:
On 14/02/24 4:03 pm, Richard Sandiford wrote:
> Hi,
>
> Thanks for working on this.
>
> You posted a version of this patch on Sunday too. If you need to repost
> to fix bugs or make other improvements, could you describe the changes
> that you've made since the previous version?
On 14/02/24 7:22 pm, Ajit Agarwal wrote:
> Hello Richard:
>
>
> On 14/02/24 4:03 pm, Richard Sandiford wrote:
>> Hi,
>>
>> Thanks for working on this.
>>
>> You posted a version of this patch on Sunday too. If you need to repost
>> to fix bug
Hello Sam:
On 14/02/24 10:50 pm, Sam James wrote:
>
> Ajit Agarwal writes:
>
>> Hello Richard:
>>
>>
>> On 14/02/24 4:03 pm, Richard Sandiford wrote:
>>> Hi,
>>>
>>> Thanks for working on this.
>>>
>>> You posted a
On 14/02/24 10:56 pm, Richard Sandiford wrote:
> Ajit Agarwal writes:
>>>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>>>> index 88ee0dd67fc..a8d0ee7c4db 100644
>>>> --- a/gcc/df-problems.cc
>>>> +++ b/gcc/df-problems.cc
>>>
Hello Richard:
On 14/02/24 10:45 pm, Richard Sandiford wrote:
> Ajit Agarwal writes:
>>>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>>>> index 1856fa4884f..ffc47a6eaa0 100644
>>>> --- a/gcc/emit-rtl.cc
>>>> +++ b/gcc/emit-rtl.cc
>>>
Hello Richard:
On 15/02/24 1:14 am, Richard Sandiford wrote:
> Ajit Agarwal writes:
>> On 14/02/24 10:56 pm, Richard Sandiford wrote:
>>> Ajit Agarwal writes:
>>>>>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>>>>>> index 88ee0dd67
Hello Richard:
On 15/02/24 2:21 am, Richard Sandiford wrote:
> Ajit Agarwal writes:
>> Hello Richard:
>>
>>
>> On 14/02/24 10:45 pm, Richard Sandiford wrote:
>>> Ajit Agarwal writes:
>>>>>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>
Hello Richard:
As per your suggestion I have divided the patch into target independent
and target dependent for aarch64 target. I kept aarch64-ldp-fusion same
and did not change that.
Common infrastructure of load store pair fusion is divided into
target independent and target dependent code for
Hello Alex:
On 15/02/24 10:12 pm, Alex Coplan wrote:
> On 15/02/2024 21:24, Ajit Agarwal wrote:
>> Hello Richard:
>>
>> As per your suggestion I have divided the patch into target independent
>> and target dependent for aarch64 target. I kept aarch64-ldp-fusion same
On 15/02/24 10:43 pm, Alex Coplan wrote:
> So IIUC Richard was suggesting splitting into target-independent and
> target-dependent pieces within aarch64-ldp-fusion.cc as a first step,
> i.e. you introduce the abstractions (virtual functions) needed within
> that file. That should hopefully be a
Hello Alex/Richard:
I have placed target independent and target dependent code in
aarch64-ldp-fusion for load store fusion.
Common infrastructure of load store pair fusion is divided into
target independent and target dependent code.
Target independent code is the Generic code with pure virtual
f
Hello All:
This patch is for load store fusion for rs6000 target using common
infrastructure.
Common infrastructure using generic code for load store fusion of rs6000
target.
Generic code is implemented and defined so that it can be used in
target-specific code for the aarch64 and rs6000 targets.
Gene
Hello All:
Changes in V3 since the V2 patch.
The following changes are made in this patch.
a) Remove commented-out assert code in rtl-ssa/changes.cc
b) Handle such code in rs6000-vecload-fusion.cc.
Same as V2:
Common infrastructure using generic code for load store fusion of rs6000
target.
Generic code
Hello Richard/Alex/Segher:
This patch adds the changed code for target independent and
dependent code for load store fusion.
Common infrastructure of load store pair fusion is
divided into target independent and target dependent
changed code.
Target independent code is the Generic code with
pure
Hello Richard:
On 23/02/24 1:19 am, Richard Sandiford wrote:
> Ajit Agarwal writes:
>> Hello Alex/Richard:
>>
>> I have placed target indpendent and target dependent code in
>> aarch64-ldp-fusion for load store fusion.
>>
>> Common infrastructure of
-fusion
Please review.
Thanks & Regards
Ajit
On 23/02/24 4:41 pm, Ajit Agarwal wrote:
> Hello Richard/Alex/Segher:
>
> This patch adds the changed code for target independent and
> dependent code for load store fusion.
>
> Common infrastructure of load store pair fusion is
Hello All:
This patch adds the vecload pass to replace adjacent lxv memory accesses with
lxvp instructions. This pass is added before the ira pass.
The vecload pass removes one of the defined adjacent lxv loads and replaces it
with lxvp.
Due to removal of one of the defined loads the allocno has only us
Hello All:
The performance gains for SPEC 2017 FP benchmarks are as follows:
554.roms_r 16% gain
544.nab_r 9.98% gain
521.wrf_r 6.89% gain
Thanks & Regards
Ajit
On 14/01/24 8:55 pm, Ajit Agarwal wrote:
> Hello All:
>
> This patch add the vecload pass to replace adjacent memory acce
Hello Richard:
On 15/01/24 3:03 pm, Richard Biener wrote:
> On Sun, Jan 14, 2024 at 4:29 PM Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> This patch add the vecload pass to replace adjacent memory accesses lxv with
>> lxvp
>> instructions. This pass is ad
On 15/01/24 6:14 pm, Ajit Agarwal wrote:
> Hello Richard:
>
> On 15/01/24 3:03 pm, Richard Biener wrote:
>> On Sun, Jan 14, 2024 at 4:29 PM Ajit Agarwal wrote:
>>>
>>> Hello All:
>>>
>>> This patch add the vecload pass to replace adjacent
Hello Richard:
On 15/01/24 6:25 pm, Ajit Agarwal wrote:
>
>
> On 15/01/24 6:14 pm, Ajit Agarwal wrote:
>> Hello Richard:
>>
>> On 15/01/24 3:03 pm, Richard Biener wrote:
>>> On Sun, Jan 14, 2024 at 4:29 PM Ajit Agarwal wrote:
>>>>
>>>
Hello Kewen:
On 17/01/24 12:32 pm, Kewen.Lin wrote:
> on 2024/1/16 06:22, Ajit Agarwal wrote:
>> Hello Richard:
>>
>> On 15/01/24 6:25 pm, Ajit Agarwal wrote:
>>>
>>>
>>> On 15/01/24 6:14 pm, Ajit Agarwal wrote:
>>>> Hello Richard:
>>
Hello Michael:
On 17/01/24 7:58 pm, Michael Matz wrote:
> Hello,
>
> On Wed, 17 Jan 2024, Ajit Agarwal wrote:
>
>>> first is even, since OOmode is only ok for even vsx register and its
>>> size makes it take two consecutive vsx registers.
>>>
>
Hello All:
New pass to replace adjacent memory addresses lxv with lxvp.
Added common infrastructure for load store fusion for
different targets.
Common routines are refactored in fusion-common.h.
AARCH64 load/store fusion pass is not changed with the
common infrastructure.
For AARCH64 archit
Hello All:
New pass to replace adjacent memory addresses lxv with lxvp.
Added common infrastructure for load store fusion for
different targets.
Common routines are refactored in fusion-common.h.
AARCH64 load/store fusion pass is not changed with the
common infrastructure.
For AARCH64 archit
Hello Alex:
Thanks for your valuable review comments.
I am incorporating the comments and will send the patch with the rs6000 and
AARCH64 changes.
Thanks & Regards
Ajit
On 24/01/24 10:13 pm, Alex Coplan wrote:
> Hi Ajit,
>
> On 21/01/2024 19:57, Ajit Agarwal wrote:
>>
>
Hello All:
This pass is registered before the ira rtl pass.
Bootstrapped and regtested on powerpc64-linux-gnu.
No regressions on the SPEC 2017 benchmarks, and improvements for some of the
FP and INT benchmarks.
Vladimir:
I modified the IRA and LRA register allocators. Please review.
Thanks & Regards
Aj
Hello Kewen:
On 24/11/23 3:01 pm, Kewen.Lin wrote:
> Hi Ajit,
>
> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>
> on 2023/10/8 03:04, Ajit Agarwal wrote:
>> Hello All:
>>
>> This patch add new pass to replace contiguous
On 28/11/23 3:14 pm, Kewen.Lin wrote:
> on 2023/11/28 15:05, Michael Meissner wrote:
>> I tried using this patch to compare with the vector size attribute patch I
>> posted. I could not build it as a cross compiler on my x86_64 because the
>> assembler gives the following error:
>>
>> Error: op