Attached is a new version of the patch.
> -Original Message-
> From: Richard Biener
> Sent: Friday, October 6, 2023 5:33 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_widt
Hello and Ping,
Thanks,
Di
> -Original Message-
> From: Di Zhao OS
> Sent: Monday, October 9, 2023 12:40 AM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
>
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, December 13, 2023 5:01 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
> On Wed, Dec 13, 2023 at
Hello Thomas,
> -Original Message-
> From: Thomas Schwinge
> Sent: Friday, December 15, 2023 5:46 PM
> To: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener
> Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
Updated the fix in attachment.
Is it OK for trunk?
Tested on aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu.
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Sunday, December 17, 2023 8:31 PM
> To: Thomas Schwinge ; gcc-patches@gcc.gnu.org
> Cc:
Committed at 6cec7b06b3c8187b36fc05cfd4dd38b42313d727
Thanks,
Di
> -Original Message-
> From: Richard Biener
> Sent: Friday, December 22, 2023 11:40 PM
> To: Di Zhao OS
> Cc: Thomas Schwinge ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/11027
This patch adds a new tuning option 'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA',
to consider fully pipelined FMAs in reassociation. Also, set this option
by default for Ampere CPUs.
Tested on aarch64-unknown-linux-gnu. Is this OK for trunk?
Thanks,
Di Zhao
gcc/ChangeLog:
* config/aarch64/a
> -Original Message-
> From: Richard Sandiford
> Sent: Friday, December 29, 2023 6:24 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: add 'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA'
>
> Di Zhao OS writes:
>
Hello Richard,
Thank you for the review. Fixed the problems and committed to master.
Thanks,
Di
> -Original Message-
> From: Richard Earnshaw
> Sent: Thursday, November 30, 2023 8:21 PM
> To: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Cc: Philipp Tomsich
> Subject: R
Hello Richard,
> -Original Message-
> From: Richard Biener
> Sent: Monday, December 11, 2023 7:01 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
> On Wed, Nov
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, October 31, 2023 9:48 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
> On Sun, Oct 8, 2023 at
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, November 21, 2023 9:01 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
> On Thu, Nov 9, 2023 at
This patch modifies tunings for ampere1/ampere1a/ampere1b, to:
1. Allow reassociation on FP additions.
2. Avoid generating loop-dependent FMA chains. Added a tuning
option for this.
Bootstrapped and tested. Is this ok for trunk?
Thanks,
Di Zhao
gcc/ChangeLog:
* config/aarch64/aarch64-t
Hi,
Shall I push this if no objection?
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Tuesday, June 18, 2024 9:52 AM
> To: Jeff Law
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PING][PATCH] [tree-optimization/110279] fix testcase pr110279-1.c
>
The test case is for targets that support FMA. Previously,
the "target" selector was missing from the dg-final command.
Tested on x86_64-pc-linux-gnu.
Thanks
Di Zhao
gcc/testsuite/ChangeLog:
* gcc.dg/pr110279-1.c: add target selector.
---
gcc/testsuite/gcc.dg/pr110279-1.c | 2 +-
1 file change
> -Original Message-
> From: Jeff Law
> Sent: Wednesday, May 22, 2024 11:14 PM
> To: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c
>
>
>
> On 5/22/24 5:46 AM, Di Zhao OS wrote:
> >
Is this OK for trunk?
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Thursday, May 23, 2024 5:55 PM
> To: Jeff Law
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c
>
> > -Original
This patch tries to fix pr114760 by checking for the
variants explicitly. When recognizing the bit-counting idiom,
also accept the pattern "x * 2" for "x << 1", and "x / 2" for
"x >> 1" (given x is unsigned).
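A minimal sketch of the loop pairs this description refers to, for anyone following along. The function names are hypothetical and are not taken from the patch or its test cases; the only claim is that, for a nonzero unsigned x, each pair below computes the same count.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers for illustration only. All require x != 0. */

/* Most significant bit of an unsigned int. */
#define TOP_BIT (~(UINT_MAX >> 1))

/* Count leading zeros by shifting left until the top bit is set. */
unsigned clz_shift(unsigned x) {
    unsigned n = 0;
    while ((x & TOP_BIT) == 0) { x <<= 1; n++; }
    return n;
}

/* Same loop, written with "x * 2" instead of "x << 1". */
unsigned clz_mul(unsigned x) {
    unsigned n = 0;
    while ((x & TOP_BIT) == 0) { x *= 2; n++; }
    return n;
}

/* Count trailing zeros by shifting right until the low bit is set. */
unsigned ctz_shift(unsigned x) {
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; n++; }
    return n;
}

/* Same loop, written with "x / 2" instead of "x >> 1". */
unsigned ctz_div(unsigned x) {
    unsigned n = 0;
    while ((x & 1u) == 0) { x /= 2; n++; }
    return n;
}

/* The variants agree for any nonzero unsigned input. */
void check_variants(unsigned x) {
    assert(clz_shift(x) == clz_mul(x));
    assert(ctz_shift(x) == ctz_div(x));
}
```

The equivalence relies on x being unsigned: for signed x, "x / 2" rounds toward zero and differs from an arithmetic right shift on negative odd values.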
Bootstrapped and tested on x86_64-linux-gnu.
Thanks,
Di Zhao
---
gcc/ChangeLog:
PR tree-op
Fixed the problems and committed to trunk.
Thanks,
Di Zhao
> -Original Message-
> From: Richard Biener
> Sent: Friday, May 10, 2024 8:56 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] tree-optimization/114760 - check variants of >>
Committed to trunk.
Thanks,
Di Zhao
> -Original Message-
> From: Jeff Law
> Sent: Monday, September 30, 2024 6:28 AM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c
>
>
>
>
Sorry, I missed the recent updates on trunk regarding FMA handling.
I'll measure again if something in this still helps.
Thanks,
Di Zhao
> -Original Message-----
> From: Di Zhao OS
> Sent: Friday, May 26, 2023 3:15 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [RFC][
Hello Lili Cui,
Since I'm also trying to improve this lately, I've tested your patch on
several aarch64 machines we have, including neoverse-n1 and ampere1
architectures. However, I haven't reproduced the 6.00% improvement of
503.bwaves_r single copy run you mentioned. Could you share more inform
Cherry-picked this to gcc-13.
Thanks,
Di Zhao
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, June 26, 2023 10:28 PM
> To: Philipp Tomsich
> Cc: Di Zhao OS via Gcc-patches ; Di Zhao OS
>
> Subject: Re: [PATCH] Change fma_reassoc_width tuning for
Updated the patch in the attachment, so it can apply.
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Sunday, May 29, 2022 11:59 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener
> Subject: [PATCH v5] tree-optimization/101186 - extend FRE with "
Hi,
The previous version of this patch tried to solve two problems
at the same time. For clarity, I'll separate them and
only deal with the "nested" FMA in this version. I plan to
propose another patch to avoid badly shaped FMAs (deferring FMA).
Other changes:
1. Added new testcases for
/ChangeLog:
* gcc.dg/pr110279-1.c: New test.
* gcc.dg/pr110279-2.c: New test.
* gcc.dg/pr110279-3.c: New test.
> -Original Message-
> From: Di Zhao OS
> Sent: Thursday, August 10, 2023 12:53 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener
> Subjec
This patch is to fix the regressions found in SPEC2017 fprate cases
on aarch64.
1. Reused code in pass widening_mul to check for nested FMA chains
(those connected by MULT_EXPRs), since re-writing to parallel
generates worse code.
2. Avoid re-arrange to produce less FMA chains that can be slo
This patch enables reassociation of floating-point additions on ampere1.
This brings about 1% overall benefit on spec2017 fprate cases. (There
are minor regressions in 510.parest_r and 508.namd_r, analyzed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110279 .)
Bootstrapped and tested on aarc
1.3%
508.namd_r 1.58%
overall 0.42%
Thanks,
Di Zhao
> -Original Message-----
> From: Di Zhao OS
> Sent: Friday, June 16, 2023 4:51 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] tree-optimization/110279- Check for nested FMA chains in
> reassoc
Attached is an updated version of the patch.
Based on Philipp's review, some changes:
1. Defined new enum fma_state to describe the state of FMA candidates
for a list of operands. (Since the tests seem simple after the
change, I didn't add predicates on it.)
2. Changed return type of conve
As GCC's reassociation pass does not have knowledge of FMA, when
transforming expression lists to parallel, it reduces the
opportunities to generate FMAs. Currently there's a workaround
on AArch64 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114),
that is, to disable the parallelization with flo
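To make the trade-off concrete, here is a hand-written sketch. It is not output of the reassociation pass, and the function names are hypothetical. The chained form lets every statement become one FMA but serializes them; the parallel form shortens the dependency chain but leaves two multiplications that can no longer be fused with an addition.

```c
#include <assert.h>

/* Hypothetical illustration, not code from the patch. */

/* Serial chain: each statement has the shape a*b + acc, so each one
   can map to a single FMA, but every FMA waits for the previous one. */
double dot4_chain(const double *a, const double *b, double acc) {
    acc = a[0] * b[0] + acc;
    acc = a[1] * b[1] + acc;
    acc = a[2] * b[2] + acc;
    acc = a[3] * b[3] + acc;
    return acc;
}

/* Hand-reassociated with width 2: t0 and t1 are independent, so they
   can execute in parallel. Each needs one plain multiply plus one FMA,
   and combining the partial sums needs two plain additions. */
double dot4_parallel(const double *a, const double *b, double acc) {
    double t0 = a[0] * b[0] + a[1] * b[1];
    double t1 = a[2] * b[2] + a[3] * b[3];
    return (t0 + t1) + acc;
}

/* With small integer inputs both forms are exact and must agree. */
void check_dot4(void) {
    const double a[4] = {1.0, 2.0, 3.0, 4.0};
    const double b[4] = {5.0, 6.0, 7.0, 8.0};
    assert(dot4_chain(a, b, 10.0) == dot4_parallel(a, b, 10.0));
}
```

In general the two forms can round differently, which is why this kind of rewrite is only applied to floating-point code when reassociation is permitted.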
I'm very sorry, there seems to be an encoding issue in the attachment
in my last email. Attached is the new patch.
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Tuesday, November 16, 2021 1:24 AM
> To: 'Richard Biener'
> Cc: gcc-patches@gcc
Sorry for the late update. I've been on a vacation and then I
spent some time updating and verifying the patch.
Attached is a new version of the patch. There are some changes:
1. Store equivalences in a vn_pval chain in vn_ssa_aux, rather than
in the expression hash table. (Following Richard's
Gentle ping again.
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Tuesday, July 12, 2022 2:08 AM
> To: 'gcc-patches@gcc.gnu.org'
> Cc: 'Richard Biener'
> Subject: PING: [PATCH v5] tree-optimization/101186 - extend FRE with
If the first predicate value is different and copied, the comparison will then
be between val->result and the copied one, which seems to be a bug. It can
cause extra vn_pvals to be inserted.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
Regards,
Di Zhao
gcc/ChangeLog:
* tree-ssa-s
Sorry for the long delay in updating this. It took me a long time to work out a
new plan and pass the tests.
The new idea is to use one variable to represent a set of equal variables at
some basic block. This variable is called an "equivalence head" or "equiv-head"
in the code. (There's no longer a
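As a rough illustration of the condition prediction this targets, consider the sketch below. The function is hypothetical and is not one of the patch's test cases; whether a given GCC version folds the inner test is not claimed here. Inside the guarded block a, b and c are all known to be equal, so one of them could stand in for the whole set.

```c
#include <assert.h>

/* Hypothetical example, not from the patch. */
int classify(int a, int b, int c) {
    if (a == b && b == c) {
        /* Here a, b and c form one set of equal variables, so
           "a != c" is implied to be false in this block. */
        if (a != c)
            return -1;
        return 1;
    }
    return 0;
}

/* The -1 result is never observable, whatever the inputs. */
void check_classify(void) {
    assert(classify(2, 2, 2) == 1);
    assert(classify(2, 2, 3) == 0);
    assert(classify(1, 2, 2) == 0);
}
```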
Thanks,
Di
-Original Message-
From: Gcc-patches
On Behalf Of Di
Zhao OS via Gcc-patches
Sent: Friday, September 17, 2021 2:13 AM
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence
map" for condition prediction
Sorry abou
Hi,
Gentle ping on this.
Di Zhao
-Original Message-
From: Di Zhao OS
Sent: Monday, October 25, 2021 3:03 AM
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence
map" for condition prediction
Hi,
Att
Attached is the updated patch. Fixed some errors in testcases.
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, November 10, 2021 5:44 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.org; Andrew MacLeod
> Subject: Re: [PATCH v2] tree-optimization/101186
Hi,
Attached is a new version of the patch, mainly for improving performance
and simplifying the code.
First, regarding the comments:
> -Original Message-
> From: Richard Biener
> Sent: Friday, October 1, 2021 9:00 PM
> To: Di Zhao OS
> Cc: gcc-patches@gcc.gnu.or
I tried to improve the patch following your advice and to catch more
opportunities. Hope it'll be helpful.
On 6/24/21 8:29 AM, Richard Biener wrote:
> On Thu, Jun 24, 2021 at 11:55 AM Di Zhao via Gcc-patches patc...@gcc.gnu.org> wrote:
>
> I have some reservations about extending the ad-hoc "
This patch tries to fix the 2% regression in 510.parest_r on
ampere1 in the tracker. (Previous discussion is here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624893.html)
1. Add testcases for the problem. For an op list in the form of
"acc = a * b + c * d + acc", currently reassociation d
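For reference, the two ways of associating such an op list can be written out by hand as below. This is only a sketch with hypothetical function names; it does not claim which form any GCC version produces. The difference lies in how much work sits on the loop-carried dependency through acc.

```c
#include <assert.h>

/* Hypothetical illustration, not code from the patch. */

/* (a*b + c*d) + acc: the products are combined first, independently
   of acc, so only the final addition depends on the previous
   iteration. */
double acc_short_chain(const double *a, const double *b,
                       const double *c, const double *d, int n) {
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        double t = a[i] * b[i] + c[i] * d[i];
        acc = t + acc;
    }
    return acc;
}

/* a*b + (c*d + acc): both multiply-add steps depend on acc, so two
   dependent operations sit on the loop-carried chain. */
double acc_long_chain(const double *a, const double *b,
                      const double *c, const double *d, int n) {
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc = a[i] * b[i] + (c[i] * d[i] + acc);
    return acc;
}

/* With small integer inputs both forms are exact and must agree. */
void check_acc(void) {
    const double a[2] = {1.0, 2.0}, b[2] = {3.0, 4.0};
    const double c[2] = {5.0, 6.0}, d[2] = {7.0, 8.0};
    assert(acc_short_chain(a, b, c, d, 2) == acc_long_chain(a, b, c, d, 2));
}
```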
This patch tries to improve alias-analysis between an SSA_NAME and
a declaration a little. For a case like:
int array1[10], array2[10];
ptr1 = array1 + x;
ptr2 = ptr1 + y;
, *ptr2 should not alias with array2.
If we can't disambiguate from points-to information, this patc
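A self-contained version of the case above, as a sketch. The function name and the surrounding stores are assumptions added for illustration; whether a particular GCC version actually forwards the stored value is not claimed. Because ptr2 is derived from array1, a store through it cannot legally modify array2, so the final load could be replaced by the constant 1.

```c
#include <assert.h>

int array1[10], array2[10];

/* Hypothetical example. Requires 0 <= x + y < 10 so that ptr2 stays
   inside array1. */
int store_through_ptr2(int x, int y) {
    int *ptr1 = array1 + x;
    int *ptr2 = ptr1 + y;

    array2[0] = 1;
    *ptr2 = 2;          /* based on array1, so it cannot touch array2 */
    return array2[0];   /* always 1 */
}

void check_no_alias(void) {
    assert(store_through_ptr2(1, 2) == 1);
    assert(array1[3] == 2);
}
```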
Hi,
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, August 29, 2023 3:41 PM
> To: Jeff Law ; Martin Jambor
> Cc: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc to
> reduce cross backedge
Hi,
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, August 29, 2023 4:09 PM
> To: Di Zhao OS
> Cc: Jeff Law ; Martin Jambor ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc to
> reduce cross
Hello Richard,
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, August 29, 2023 7:11 PM
> To: Di Zhao OS
> Cc: Jeff Law ; Martin Jambor ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc to
> r
> -Original Message-
> From: Richard Biener
> Sent: Thursday, August 31, 2023 8:23 PM
> To: Di Zhao OS
> Cc: Jeff Law ; Martin Jambor ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc to
> reduce cross backedge
This is a new version of the patch on "nested FMA".
Sorry for updating this after so long; I've been studying and
writing micro cases to sort out the cause of the regression.
First, following previous discussion:
(https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629080.html)
1. From testi
Hi,
I saw that Stage 1 of GCC 13 development has just ended. Is this
being considered, or should I bring it up again when general development
reopens?
Thanks,
Di Zhao
> -Original Message-
> From: Di Zhao OS
> Sent: Tuesday, October 25, 2022 8:18 AM
> To: gcc-patches@gcc.
Hi, attached is a new version of the patch. The changes are:
- Skip using temporary equivalences for floating-point values, because
folding expressions can generate incorrect values. For example,
operations on 0.0 and -0.0 may have different results.
- Avoid inserting duplicated back-refs from val
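The floating-point point above can be seen in a few lines of C. This is a minimal sketch with hypothetical function names, assuming an IEEE 754 target where 1.0 / 0.0 is +inf and 1.0 / -0.0 is -inf: the two zeros compare equal, yet substituting one for the other changes the result of a later operation.

```c
#include <assert.h>

/* Hypothetical illustration, assuming IEEE 754 arithmetic. */

/* 1.0 / x tells +0.0 and -0.0 apart: it yields +inf or -inf. */
double reciprocal(double x) {
    return 1.0 / x;
}

/* Returns 1 when a == b holds but replacing a with b would change
   the value of reciprocal(). */
int equal_but_not_interchangeable(double a, double b) {
    return a == b && reciprocal(a) != reciprocal(b);
}

void check_signed_zero(void) {
    assert(equal_but_not_interchangeable(0.0, -0.0) == 1);
    assert(equal_but_not_interchangeable(1.0, 1.0) == 0);
}
```

So an equivalence learned from "a == b" is safe to use for integers but not, in general, for floating-point values.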
A few minor updates on the patch:
- Simplify function record_equiv_from_prev_phi_1 by removing an argument.
- Fixed two small bugs that could lead to lost optimization opportunities.
Thanks,
Di Zhao
---
Extend FRE with temporary equivalences.
2021-12-13 Di Zhao
gcc/ChangeLog:
PR tree-op
Here's a brief summary on the patch:
v4 (this version):
- In process_bb's condition-prediction code: update equivalence-heads if
value-numbers have changed, otherwise some chances can be lost.
v3 (a few minor updates):
- Simplify function record_equiv_from_prev_phi_1 by removing an argument.
-