Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-15 Thread Richard Sandiford
Wilco Dijkstra writes:
> Hi Richard,
>
>> Sorry to be awkward, but I don't think we should put
>> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT in base.
>> CHEAP_SHIFT_EXTEND is a good base flag because it means we can make full
>> use of a certain group of instructions.  FULLY_PIPELINED_FMA simila

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-14 Thread Wilco Dijkstra
Hi Richard,
> Sorry to be awkward, but I don't think we should put
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT in base.
> CHEAP_SHIFT_EXTEND is a good base flag because it means we can make full
> use of a certain group of instructions.  FULLY_PIPELINED_FMA similarly
> means that FMA chains beh

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-10 Thread Richard Sandiford
Wilco Dijkstra writes:
> Hi Kyrill,
>
>>> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
>>> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
>>> to the baseline tuning since all modern cores use it.  Fix the
>>> neoverse512tvb tuning to be
>>> like Neoverse V1/V2.
>>
>> For neoversev512tvb this me

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-10 Thread Wilco Dijkstra
Hi Kyrill,
>> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
>> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
>> to the baseline tuning since all modern cores use it.  Fix the
>> neoverse512tvb tuning to be
>> like Neoverse V1/V2.
>
> For neoversev512tvb this means adding AARCH64_EXTRA_TUNE_AVOI

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-10 Thread Kyrylo Tkachov
> On 10 Jan 2025, at 15:54, Wilco Dijkstra wrote:
>
> ping
>
> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
> to the baseline tuning since all modern cores use it. Fix the neoverse512tvb
> tuning to be
> like Neoverse V1/V2.
For neovers

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2025-01-10 Thread Wilco Dijkstra
ping
Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
to the baseline tuning since all modern cores use it.  Fix the neoverse512tvb tuning to be
like Neoverse V1/V2.
gcc/ChangeLog:
    * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TU

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-19 Thread Richard Sandiford
Kyrylo Tkachov writes:
>> On 15 Nov 2024, at 12:33, Wilco Dijkstra wrote:
>>
>> Hi Kyrill,
>>
>>> This would make USE_NEW_VECTOR_COSTS effectively the default.
>>> Jennifer has been trying to do that as well and then to remove it (as it
>>> would be always true) but there are some codegen regr

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-15 Thread Kyrylo Tkachov
> On 15 Nov 2024, at 12:33, Wilco Dijkstra wrote:
>
> Hi Kyrill,
>
>> This would make USE_NEW_VECTOR_COSTS effectively the default.
>> Jennifer has been trying to do that as well and then to remove it (as it
>> would be always true) but there are some codegen regressions that still
>> need

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-15 Thread Wilco Dijkstra
Hi Kyrill,
> This would make USE_NEW_VECTOR_COSTS effectively the default.
> Jennifer has been trying to do that as well and then to remove it (as it
> would be always true) but there are some codegen regressions that still
> need to be addressed.
Yes, that's the goal - we should use good tun

Re: [PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-15 Thread Kyrylo Tkachov
Hi Wilco,
> On 14 Nov 2024, at 18:44, Wilco Dijkstra wrote:
>
> Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
> to the baseline tuning since all modern cores use it. Fix the neoverse512tvb
> tuning to be
> like Neoverse V1/V2.
>
This would

[PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-14 Thread Wilco Dijkstra
Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
to the baseline tuning since all modern cores use it. Fix the neoverse512tvb tuning to be
like Neoverse V1/V2.
gcc/ChangeLog:
    * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNE_BASE
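
For readers following the thread, the idea of a "baseline" set of tuning flags can be pictured as a bitmask that per-core tunings OR further flags into. The sketch below is only an illustration, not the GCC source: every SKETCH_ name, the bit positions, and the helper sketch_has_flag are made up here. Only the four flag names they echo come from the messages above. It contrasts the baseline as proposed in the patch with the variant discussed in review, where MATCHED_VECTOR_THROUGHPUT stays out of the base.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only; they are not
   the values GCC uses internally.  */
enum sketch_tune_flag {
  SKETCH_CHEAP_SHIFT_EXTEND        = 1u << 0,
  SKETCH_FULLY_PIPELINED_FMA       = 1u << 1,
  SKETCH_USE_NEW_VECTOR_COSTS      = 1u << 2,
  SKETCH_MATCHED_VECTOR_THROUGHPUT = 1u << 3
};

/* Baseline as described in the patch summary: both vector-cost flags
   are folded into the base mask.  */
#define SKETCH_BASE_PROPOSED                                   \
  (SKETCH_CHEAP_SHIFT_EXTEND | SKETCH_FULLY_PIPELINED_FMA      \
   | SKETCH_USE_NEW_VECTOR_COSTS | SKETCH_MATCHED_VECTOR_THROUGHPUT)

/* Variant raised in review: MATCHED_VECTOR_THROUGHPUT is left out of
   the base, so a core that wants it must opt in.  */
#define SKETCH_BASE_WITHOUT_MATCHED                            \
  (SKETCH_CHEAP_SHIFT_EXTEND | SKETCH_FULLY_PIPELINED_FMA      \
   | SKETCH_USE_NEW_VECTOR_COSTS)

/* Return true if every bit of FLAG is set in MASK.  */
bool sketch_has_flag(uint32_t mask, uint32_t flag)
{
  return (mask & flag) == flag;
}
```

Under the second variant, a core's tuning would be written as the base mask OR'ed with SKETCH_MATCHED_VECTOR_THROUGHPUT, which keeps the opt-in visible at each core's definition.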