Hi Philipp,

> -----Original Message-----
> From: Philipp Tomsich <philipp.toms...@vrull.eu>
> Sent: Monday, April 3, 2023 12:46 PM
> To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> <richard.sandif...@arm.com>; Tamar Christina
> <tamar.christ...@arm.com>; Manolis Tsamis <manolis.tsa...@vrull.eu>
> Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
> 
> Kyrill,
> 
> We reran on GCC12 and GCC11, reproducing the same improvements (e.g.,
> on fotonik3d) that prompted the changes.
> I'll apply the backports later this week, unless you have any further
> concerns…

Ok, thanks for checking.
Kyrill

> 
> Thanks,
> Philipp.
> 
> 
> On Mon, 27 Mar 2023 at 11:24, Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Philipp Tomsich <philipp.toms...@vrull.eu>
> > > Sent: Monday, March 27, 2023 9:50 AM
> > > To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> > > <richard.sandif...@arm.com>; Tamar Christina
> > > <tamar.christ...@arm.com>; Manolis Tsamis
> <manolis.tsa...@vrull.eu>
> > > Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
> > >
> > > On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>
> > > wrote:
> > > >
> > > > Hi Philipp,
> > > >
> > > > > -----Original Message-----
> > > > > From: Gcc-patches <gcc-patches-
> > > > > bounces+kyrylo.tkachov=arm....@gcc.gnu.org> On Behalf Of Philipp
> > > > > Tomsich
> > > > > Sent: Monday, March 27, 2023 8:47 AM
> > > > > To: gcc-patches@gcc.gnu.org
> > > > > Cc: Richard Sandiford <richard.sandif...@arm.com>; Tamar
> Christina
> > > > > <tamar.christ...@arm.com>; Philipp Tomsich
> > > <philipp.toms...@vrull.eu>;
> > > > > Manolis Tsamis <manolis.tsa...@vrull.eu>
> > > > > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> > > > >
> > > > > The original submission of AmpereOne (-mcpu=ampere1) costs
> occurred
> > > > > prior to exhaustive testing of vectorizable workloads against
> > > > > hardware.
> > > > >
> > > > > Adjust the vector costs to achieve the best results and more closely
> > > > > match the underlying hardware.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >       * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> > > > >
> > > > > Co-Authored-By: Manolis Tsamis <manolis.tsa...@vrull.eu>
> > > > >
> > > > > Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
> > > > > ---
> > > > > We would like to get this into GCC 13 to avoid having to backport at
> > > > > the start of the next cycle.
> > > > >
> > > >
> > > > Given this affects only the ampere1 costs that sounds fine to me and
> fairly
> > > low risk, you are being trusted that these costs are actually desirable 
> > > and
> > > properly validated on the hardware involved.
> > > >
> > > > > OK for backports?
> > > >
> > > > This is ok for trunk (GCC 13). Do you also want to backport this to 
> > > > other
> > > branches?
> > >
> > > Ampere1 (with the older vector costs) are in GCC12 and GCC11.
> > > I would like to backport to those as well.
> >
> > Ok then, though you may want to run the benchmarks on the branches as
> well to make sure the costs give the expected benefit there as well.
> > Thanks,
> > Kyrill
> >
> > >
> > > Thanks,
> > > Philipp.
> > >
> > > > Thanks,
> > > > Kyrill
> > > >
> > > > >
> > > > >  gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > > > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/gcc/config/aarch64/aarch64.cc
> > > b/gcc/config/aarch64/aarch64.cc
> > > > > index b27f4354031..661fff65cea 100644
> > > > > --- a/gcc/config/aarch64/aarch64.cc
> > > > > +++ b/gcc/config/aarch64/aarch64.cc
> > > > > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > > > > thunderx3t110_vector_cost =
> > > > >
> > > > >  static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > > >  {
> > > > > -  3, /* int_stmt_cost  */
> > > > > +  1, /* int_stmt_cost  */
> > > > >    3, /* fp_stmt_cost  */
> > > > >    0, /* ld2_st2_permute_cost  */
> > > > >    0, /* ld3_st3_permute_cost  */
> > > > > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > > > > ampere1_advsimd_vector_cost =
> > > > >    8, /* store_elt_extra_cost  */
> > > > >    6, /* vec_to_scalar_cost  */
> > > > >    7, /* scalar_to_vec_cost  */
> > > > > -  5, /* align_load_cost  */
> > > > > -  5, /* unalign_load_cost  */
> > > > > -  2, /* unalign_store_cost  */
> > > > > -  2  /* store_cost  */
> > > > > +  4, /* align_load_cost  */
> > > > > +  4, /* unalign_load_cost  */
> > > > > +  1, /* unalign_store_cost  */
> > > > > +  1  /* store_cost  */
> > > > >  };
> > > > >
> > > > >  /* Ampere-1 costs for vector insn classes.  */
> > > > >  static const struct cpu_vector_cost ampere1_vector_cost =
> > > > >  {
> > > > >    1, /* scalar_int_stmt_cost  */
> > > > > -  1, /* scalar_fp_stmt_cost  */
> > > > > +  3, /* scalar_fp_stmt_cost  */
> > > > >    4, /* scalar_load_cost  */
> > > > >    1, /* scalar_store_cost  */
> > > > >    1, /* cond_taken_branch_cost  */
> > > > > --
> > > > > 2.34.1
> > > >

Reply via email to