Re: [PATCH] RISC-V: Allow redirecting function calls to the same FMV clone

2025-03-23 Thread Andrew Carlotti
Two brief comments, since I'm on holiday until 31st but happened to notice this patch anyway. On Mon, Mar 24, 2025 at 02:19:21AM +0800, Yangyu Chen wrote: > This behavior does not ensure that if any higher priority callee version > were selected at runtime, then a higher priority caller version wo

Re: [PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-15 Thread Andrew Carlotti
On Thu, Mar 13, 2025 at 05:10:07PM +, Andre Vieira (lists) wrote: > Apologies for the delay, had been waiting on some other relevant patches to > go in to make sure we didn't break any valid existing behaviours. It should > all be working properly now. I think I've also addressed all your comme

[committed v4] aarch64: Ignore target pragmas while defining intrinsics

2025-03-03 Thread Andrew Carlotti
Compared to v3, this version: - moves the sve_alignment_switcher in handle_arm_sve_h to function scope (and fixes an inaccurate changelog message); - updates affected Makefile dependencies. The patch was preapproved by Richard with the first change, and the second change is obvious, so I've comm

[PATCH v3] aarch64: Ignore target pragmas while defining intrinsics

2025-02-27 Thread Andrew Carlotti
Compared to v2, this splits out the alignment switching into a new class and merges the rest of the switching functionality into aarch64_target_switcher, as agreed with Richard in the previous review discussion. Bootstrapped and regression tested on aarch64. Is this ok for master? --- Refactor t

Re: [PATCH v2] aarch64: Ignore target pragmas while defining intrinsics

2025-02-21 Thread Andrew Carlotti
On Fri, Feb 21, 2025 at 11:28:14AM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > On Wed, Feb 19, 2025 at 12:17:55PM +, Richard Sandiford wrote: > >> Andrew Carlotti writes: > >> > /* Print a list of CANDIDATES for an argument, and try

Re: [PATCH v2] aarch64: Ignore target pragmas while defining intrinsics

2025-02-20 Thread Andrew Carlotti
On Wed, Feb 19, 2025 at 12:17:55PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > [...] > > @@ -204,6 +207,18 @@ static constexpr aarch64_processor_info all_cores[] = > >{NULL, aarch64_no_cpu, aarch64_no_arch, 0} > > }; > > > > +/* R

[PATCH v2] aarch64: Ignore target pragmas while defining intrinsics

2025-02-18 Thread Andrew Carlotti
Compared to v1, I've added a new function aarch64_get_required_features to avoid having to pass a long list of explicit features. I also changed aarch64_target_switcher to only disable TARGET_GENERAL_REGS_ONLY if the requested flags include FP, to address Richard's comment. Bootstrapped and regre

[PATCH] aarch64: Ignore target pragmas while defining intrinsics

2025-02-18 Thread Andrew Carlotti
When initialising intrinsics with `#pragma GCC aarch64 "arm_*.h"`, we often set an explicit target, but currently leave current_target_pragma unchanged. This results in the target pragma being applied to each simulated intrinsic on top of our explicit target, which is clearly undesirable. As far

Re: [PATCH v1 04/16] Remove unecessary `record` argument from maybe_version_functions.

2025-02-07 Thread Andrew Carlotti
On Mon, Feb 03, 2025 at 01:04:08PM +, Alfie Richards wrote: > > The `record` argument in maybe_version_function was intended to allow > controlling recording the relationship of versions. However, it only > exercised this if both input functions were already marked as versioned, > and this sam

[PATCH] aarch64: Update fp8 dependencies

2025-02-07 Thread Andrew Carlotti
We agreed with LLVM developers to not enforce the architectural dependencies between fp8 multiplication features, and they have already been removed from LLVM and Binutils. Remove them from GCC as well. I have bootstrapped and regression tested this. There are no test result changes between GCC

[PATCH] testsuite: Enable reduced parallel batch sizes

2025-02-05 Thread Andrew Carlotti
Various aarch64 tests attempt to reduce the batch size for parallel test execution to a single test per batch, but it looks like the necessary changes to gcc_parallel_test_run_p were accidentally omitted when the aarch64-*-acle-asm.exp files were merged. This patch corrects that omission. This do

[PATCH v2 3/3] aarch64: Add +cpa feature flag

2025-01-24 Thread Andrew Carlotti
This doesn't enable anything within the compiler, but this allows the flag to be passed to the assembler. There also doesn't appear to be a kernel cpuinfo name yet. Ok for master? gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V9_5A): Add CPA. * config/aarch64/aarch64-option-

[PATCH v2 11/11] aarch64: Make AARCH64_FL_CRYPTO always unset

2025-01-24 Thread Andrew Carlotti
This feature flag bit only exists to support the +crypto alias. Outside of option processing this bit needs to be set or unset consistently. This patch goes with the latter option. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc: Assert that CRYPTO bit is not set.
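A minimal C++ sketch of the "always unset" choice described above. The bit values and the names `FL_AES`, `FL_SHA2`, `FL_CRYPTO`, `expand_crypto_alias` and `has_aes` are hypothetical illustrations, not GCC's actual definitions. The idea is that the alias bit is expanded into the features it stands for during option processing and then cleared, so code elsewhere can assert that it never appears.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits, for illustration only (not GCC's real values).
using feature_mask = std::uint64_t;
constexpr feature_mask FL_AES    = 1u << 0;
constexpr feature_mask FL_SHA2   = 1u << 1;
constexpr feature_mask FL_CRYPTO = 1u << 2;  // alias bit, assumed to mean AES + SHA2

// Run at the end of option processing: replace the alias bit with the
// features it stands for, then clear the alias bit itself.
feature_mask expand_crypto_alias(feature_mask flags)
{
  if (flags & FL_CRYPTO)
    flags |= FL_AES | FL_SHA2;
  return flags & ~FL_CRYPTO;
}

// Consumers outside option processing can rely on the invariant.
bool has_aes(feature_mask flags)
{
  assert((flags & FL_CRYPTO) == 0 && "CRYPTO alias bit must be unset");
  return (flags & FL_AES) != 0;
}
```

Keeping the bit always unset means two masks that enable the same real features compare equal, whether or not the user spelled them via the alias.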

[PATCH v2 10/11] aarch64: Refactor aarch64_rewrite_mcpu

2025-01-24 Thread Andrew Carlotti
Use aarch64_validate_cpu instead of the existing duplicate (and worse) version of the -mcpu parsing code. The original code used fatal_error; I'm guessing that using error instead should be ok. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_rewrite_selected_cpu

[PATCH v2 01/11] aarch64: Improve mcpu/march conflict check

2025-01-24 Thread Andrew Carlotti
Features from a cpu or base architecture that were explicitly disabled by a +nofeat option were being incorrectly added back in before checking for conflicts between -mcpu and -march options. This patch instead compares the returned feature masks directly. gcc/ChangeLog: * config/aarch64
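The difference between the two checks can be sketched in C++ as follows. This is an illustration under assumed names (`parsed_target`, `conflicts_old_style`, `conflicts_direct` and the `FL_*` bits are all hypothetical), not the code from the patch. It models a check that re-adds the base features before comparing, next to one that compares the masks exactly as parsing returned them.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only.
using feature_mask = std::uint64_t;
constexpr feature_mask FL_FP  = 1u << 0;
constexpr feature_mask FL_SVE = 1u << 1;

// Assumed result of parsing a string such as "name+nosve".
struct parsed_target {
  feature_mask base;      // features implied by the cpu or architecture name
  feature_mask disabled;  // features removed by +nofeat modifiers
  feature_mask final_mask() const { return base & ~disabled; }
};

// Old-style check: OR-ing the base features back in undoes any +nofeat,
// so a real difference between the two options can go unnoticed.
bool conflicts_old_style(const parsed_target &cpu, const parsed_target &arch)
{
  return (cpu.final_mask() | cpu.base) != (arch.final_mask() | arch.base);
}

// Direct check: compare the feature masks as returned by parsing.
bool conflicts_direct(const parsed_target &cpu, const parsed_target &arch)
{
  return cpu.final_mask() != arch.final_mask();
}
```

With a cpu string that disables SVE and an architecture string that keeps it, only the direct comparison reports the mismatch.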

[PATCH v2 06/11] aarch64: Inline aarch64_print_hint_for_core_or_arch

2025-01-24 Thread Andrew Carlotti
It seems odd that we add "native" to the list for -march but not for -mcpu. This is probably a bug, but for now we'll preserve the existing behaviour. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_print_hint_candidates): New helper function. (aarch64_print_hint_for_

[PATCH v2 09/11] aarch64: Rewrite architecture strings for assembler

2025-01-24 Thread Andrew Carlotti
Add infrastructure to allow rewriting the architecture strings passed to the assembler (either as -march options or .arch directives). There was already canonicalisation everywhere except for an -march driver option passed directly to the compiler; this patch applies the same canonicalisation ther

[PATCH v2 2/3] docs: Add +wfxt and +xs to armv9.2-a

2025-01-24 Thread Andrew Carlotti
I missed that the documentation doesn't include armv8.7-a within armv9.2-a. I'll commit this as obvious. gcc/ChangeLog: * doc/invoke.texi: Add +wfxt and +xs to armv9.2-a diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 819a684393423e53bcbb1462e5a323e3a33217b9..c8721064f91

[PATCH v2 1/3] aarch64: Add command line support for armv9.5-a

2025-01-24 Thread Andrew Carlotti
--- I've added a test that I accidentally omitted from the v1 patch. I've also made the changes Richard requested for the whole patch series. I'll push this once my regression test finishes if there aren't any issues raised before then. gcc/ChangeLog: * config/aarch64/aarch64-arches.d

[PATCH v2 05/11] aarch64: Adjust option parsing parameter types.

2025-01-24 Thread Andrew Carlotti
Replace `const struct processor *` in output parameters with `aarch64_arch` or `aarch64_cpu`. Replace `std::string` parameter in aarch64_print_hint_for_extensions with `char *`. Also name the return parameters more clearly and consistently. gcc/ChangeLog: * config/aarch64/aarch64.cc

[PATCH v2 04/11] aarch64: Rename info structs in aarch64-common.cc

2025-01-24 Thread Andrew Carlotti
Also add a (currently unused) processor field to aarch64_processor_info, and change name from "" to NULL for the terminating array entries. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (struct aarch64_option_extension): Rename to.. (struct aarch64_extension_inf

[PATCH v2 07/11] aarch64: Move arch/cpu parsing to aarch64-common.cc

2025-01-24 Thread Andrew Carlotti
Aside from moving the functions, the only changes are to make them non-static, and to use the existing info arrays within aarch64-common.cc instead of the info arrays remaining in aarch64.cc. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_get_all_extension_candi

[PATCH v2 08/11] aarch64: Inline aarch64_get_all_extension_candidates

2025-01-24 Thread Andrew Carlotti
gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_get_all_extension_candidates): Inline into... (aarch64_print_hint_for_extensions): ...this. diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc index c3e

[PATCH v2 02/11] aarch64: Replace duplicate cpu enums

2025-01-24 Thread Andrew Carlotti
Replace `enum aarch64_processor` and `enum target_cpus` with `enum aarch64_cpu`, and prefix the entries with `AARCH64_CPU_`. Also rename aarch64_none to aarch64_no_cpu. gcc/ChangeLog: * config/aarch64/aarch64-opts.h (enum aarch64_processor): Rename to... (enum aarch64_cpu)

[PATCH v2 03/11] aarch64: Remove redundant generic cpu entry

2025-01-24 Thread Andrew Carlotti
The list of cores in aarch64-common.cc included an explicit "generic" entry, despite this entry also being present in aarch64-cores.def. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (all_cores): Remove explicit generic entry. diff --git a/gcc/common/config/aarch64/aa

[PATCH v2 00/11] aarch64: Refactor target parsing

2025-01-24 Thread Andrew Carlotti
This series also fixes a couple of minor bugs. I've made all the changes Richard requested against v1, and will push this if it passes testing. Each commit has again been built and tested with: RUNTESTFLAGS="aarch64.exp=feature-*\ spellcheck*\ mv*\ arch-diag*\ cpu-diag*\ target_attr_crypto_ice*

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-24 Thread Andrew Carlotti
On Tue, Jan 21, 2025 at 08:21:45PM +, Iain Sandoe wrote: > > > > On 20 Jan 2025, at 18:33, Andrew Carlotti wrote: > > > > On Mon, Jan 20, 2025 at 06:29:12PM +, Tamar Christina wrote: > >>> -Original Message- > >>> From: Iain Sand

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-20 Thread Andrew Carlotti
On Mon, Jan 20, 2025 at 06:29:12PM +, Tamar Christina wrote: > > -Original Message- > > From: Iain Sandoe > > Sent: Monday, January 20, 2025 6:15 PM > > To: Andrew Carlotti > > Cc: Kyrylo Tkachov ; GCC Patches > patc...@gcc.gnu.org>; Tamar Chr

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-20 Thread Andrew Carlotti
On Mon, Jan 20, 2025 at 06:14:57PM +, Iain Sandoe wrote: > > > > On 20 Jan 2025, at 17:38, Andrew Carlotti wrote: > > > > On Sun, Jan 19, 2025 at 09:14:17PM +, Iain Sandoe wrote: > >> > > >> Please note that the Darwin assembler is Apple’s

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-20 Thread Andrew Carlotti
On Sun, Jan 19, 2025 at 09:14:17PM +, Iain Sandoe wrote: > All: > > Thank you all for looking at this - there are a large number of moving parts > and I could > easily be making incorrect assumptions. FWIW the highest weighting in the > inputs I have > are given to DDI0487L_a_a-profile and

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-13 Thread Andrew Carlotti
On Sat, Jan 11, 2025 at 01:21:13PM +, Iain Sandoe wrote: > Hi, > > I originally made this patch for the Darwin Arm64 development branch, > however in discussions on IRC, it seems that it is also relevant to > Linux - since there are implementations running on Apple hardware with > the M1..3 CP

[PATCH 3/3] aarch64: Add +cpa feature flag

2025-01-10 Thread Andrew Carlotti
This doesn't enable anything within the compiler, but this allows the flag to be passed to the assembler. There also doesn't appear to be a kernel cpuinfo name yet. Ok for master? gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V9_5A): Add CPA. * config/aarch64/aarch64-option-

[PATCH 2/3] docs: Add +wfxt and +xs to armv9.2-a

2025-01-10 Thread Andrew Carlotti
I missed that the documentation doesn't include armv8.7-a within armv9.2-a. I'll commit this as obvious. gcc/ChangeLog: * doc/invoke.texi: Add +wfxt and +xs to armv9.2-a diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 0be372079e9947e22cb43f723b51e1a5a8dd4ef7..07c1b982d32

[PATCH 1/3] aarch64: Add command line support for armv9.5-a

2025-01-10 Thread Andrew Carlotti
Ok for master? gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V9_5A): New. * doc/invoke.texi: Document armv9.5-a option. diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def index fd4881a8ebfbd3446e2995b9dcf1133918665be6..dacb7b6f37a3

[PATCH 06/11] aarch64: Inline aarch64_print_hint_for_core_or_arch

2025-01-10 Thread Andrew Carlotti
It seems odd that we add "native" to the list for -march but not for -mcpu. This is probably a bug, but for now we'll preserve the existing behaviour. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_print_hint_for_core_or_arch): Inline into... (aarch64_print_hint_for_

[PATCH 11/11] aarch64: Make AARCH64_FL_CRYPTO always unset

2025-01-10 Thread Andrew Carlotti
This feature flag bit only exists to support the +crypto alias. Outside of option processing this bit needs to be set or unset consistently. This patch goes with the latter option. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc: Assert that CRYPTO bit is not set.

[PATCH 03/11] aarch64: Remove redundant generic cpu entry

2025-01-10 Thread Andrew Carlotti
The list of cores in aarch64-common.cc included an explicit "generic" entry, despite this entry also being present in aarch64-cores.def. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (all_cores): Remove explicit generic entry. diff --git a/gcc/common/config/aarch64/aa

[PATCH 09/11] aarch64: Rewrite architecture strings for assembler

2025-01-10 Thread Andrew Carlotti
Add infrastructure to allow rewriting the architecture strings passed to the assembler (either as -march options or .arch directives). There was already canonicalisation everywhere except for an -march driver option passed directly to the compiler; this patch applies the same canonicalisation ther

[PATCH 10/11] aarch64: Refactor aarch64_rewrite_mcpu

2025-01-10 Thread Andrew Carlotti
Use aarch64_validate_cpu instead of the existing duplicate (and worse) version of the -mcpu parsing code. The original code used fatal_error; I'm guessing that using error instead should be ok. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_rewrite_selected_cpu

[PATCH 07/11] aarch64: Move arch/cpu parsing to aarch64-common.cc

2025-01-10 Thread Andrew Carlotti
Aside from moving the functions, the only changes are to make them non-static, and to use the existing info arrays within aarch64-common.cc instead of the info arrays remaining in aarch64.cc. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_get_all_extension_candi

[PATCH 08/11] aarch64: Inline aarch64_get_all_extension_candidates

2025-01-10 Thread Andrew Carlotti
gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_get_all_extension_candidates): Inline into... (aarch64_print_hint_for_extensions): ...this. diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc index 5cc

[PATCH 01/11] aarch64: Improve mcpu/march conflict check

2025-01-10 Thread Andrew Carlotti
Features from a cpu or base architecture that were explicitly disabled by a +nofeat option were being incorrectly added back in before checking for conflicts between -mcpu and -march options. This patch instead compares the returned feature masks directly. gcc/ChangeLog: * config/aarch64

[PATCH 05/11] aarch64: Adjust option parsing parameter types.

2025-01-10 Thread Andrew Carlotti
Replace `const struct processor *` in output parameters with `aarch64_arch` or `aarch64_cpu`. Replace `std::string` parameter in aarch64_print_hint_for_extensions with `char *`. Also name the return parameters more clearly and consistently. gcc/ChangeLog: * config/aarch64/aarch64.cc

[PATCH 04/11] aarch64: Rename info structs in aarch64-common.cc

2025-01-10 Thread Andrew Carlotti
Also add a (currently unused) processor field to processor_info, and change name from "" to NULL for the terminating array entries. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (struct aarch64_option_extension): Rename to.. (struct extension_info): ...this.

[PATCH 02/11] aarch64: Replace duplicate cpu enums

2025-01-10 Thread Andrew Carlotti
Replace `enum aarch64_processor` and `enum target_cpus` with `enum aarch64_cpu`, and prefix the entries with `AARCH64_CPU_`. Also rename aarch64_none to aarch64_no_cpu. gcc/ChangeLog: * config/aarch64/aarch64-opts.h (enum aarch64_processor): Rename to... (enum aarch64_cpu)

[PATCH 00/11] aarch64: Refactor target parsing

2025-01-10 Thread Andrew Carlotti
This series also fixes a couple of minor bugs (see 01/11 and 11/11). The aim of the refactor is to get all of the target parsing logic (whether via command line options, or attributes) into a single location with as much code regularity and reuse as is reasonably possible. This includes reorderin

Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler

2025-01-10 Thread Andrew Carlotti
On Thu, Jan 09, 2025 at 06:00:34PM +, Richard Sandiford wrote: > Richard Sandiford writes: > > Andrew Carlotti writes: > >> On Mon, Nov 25, 2024 at 11:26:39PM +, Richard Sandiford wrote: > >>> Sorry for the slow review. > >>> > >>> A

Re: [PATCH] docs: Document new hardreg PRE pass

2025-01-10 Thread Andrew Carlotti
On Tue, Jan 07, 2025 at 06:17:02PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > I forgot to include this in the earlier patch; is this ok for master (once > > the > > pass is merged, of course)? > > > > gcc/ChangeLog: > > > >

Re: [PATCH v3] AArch64: Add LUTI ACLE for SVE2

2025-01-08 Thread Andrew Carlotti
On Wed, Jan 08, 2025 at 11:13:41AM +, Richard Sandiford wrote: > writes: > > This patch introduces support for LUTI2/LUTI4 ACLE for SVE2. > > > > LUTI instructions are used for efficient table lookups with 2-bit > > or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from > > the lo

[PATCH] Disable a broken multiversioning optimisation

2025-01-08 Thread Andrew Carlotti
This patch skips redirect_to_specific clone for aarch64 and riscv, because the optimisation has two flaws: 1. It checks the value of the "target" attribute, even on targets that don't use this attribute for multiversioning. 2. The algorithm used is too aggressive, and will eliminate the indirecti

[PATCH] docs: Document new hardreg PRE pass.

2025-01-07 Thread Andrew Carlotti
I forgot to include this in the earlier patch; is this ok for master (once the pass is merged, of course)? gcc/ChangeLog: * doc/passes.texi: Document hardreg PRE pass. diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi index 639f6b325c8be47bffd64269340c4dd8ea0f321c..5c2a174a7495404

Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler

2025-01-07 Thread Andrew Carlotti
On Mon, Nov 25, 2024 at 11:26:39PM +, Richard Sandiford wrote: > Sorry for the slow review. > > Andrew Carlotti writes: > > These new flags (+fcma, +jscvt, +rcpc2, +jscvt, +frintts, +wfxt and +xs) > > were only recently added to the assembler. To improve compati

[PING][PATCH v2] Add new hardreg PRE pass

2025-01-06 Thread Andrew Carlotti
On Tue, Dec 17, 2024 at 11:53:24AM +, Andrew Carlotti wrote: > This pass is used to optimise assignments to the FPMR register in > aarch64. I chose to implement this as a middle-end pass because it > mostly reuses the existing RTL PRE code within gcse.cc. > > Compared to R

Re: [PATCH] Add new hardreg PRE pass

2024-12-30 Thread Andrew Carlotti
On Sun, Dec 29, 2024 at 10:54:03AM -0700, Jeff Law wrote: > > > On 12/5/24 8:45 AM, Andrew Carlotti wrote: > > > > So at a 30k foot level, one thing to be very leery of is extending the > > > lifetime of any hard register. It's probably not a big deal on aarc

Re: [PATCH] Fix comment typos in tree-assume.cc

2024-12-19 Thread Andrew Carlotti
On Thu, Dec 19, 2024 at 09:22:02AM -0500, Andrew MacLeod wrote: > I have no issues. ok by me. I clearly need a proofreader :-) > > Andrew Thanks! It applies cleanly to your gcc-14 backport, so I've pushed to that branch as well. > On 12/18/24 11:22, Andrew Carlotti wrote:

[PATCH] Fix comment typos in tree-assume.cc

2024-12-18 Thread Andrew Carlotti
I think this counts as obvious, but I'll leave it a few days before committing in case Andrew (or anyone else) disagrees. gcc/ChangeLog: * tree-assume.cc: Fix comment typos. diff --git a/gcc/tree-assume.cc b/gcc/tree-assume.cc index 883338bcef1e41e15a67fd015834d74319ca11af..9a934f21dc0

[PATCH v2] Add new hardreg PRE pass

2024-12-17 Thread Andrew Carlotti
This pass is used to optimise assignments to the FPMR register in aarch64. I chose to implement this as a middle-end pass because it mostly reuses the existing RTL PRE code within gcse.cc. Compared to RTL PRE, the key difference in this new pass is that we insert new writes directly to the destin

Re: [PATCH] Add new hardreg PRE pass

2024-12-05 Thread Andrew Carlotti
On Thu, Dec 05, 2024 at 04:16:22PM +, Andrew Carlotti wrote: > On Sun, Dec 01, 2024 at 03:54:25PM -0700, Jeff Law wrote: > > > > > > On 11/13/24 12:03 PM, Richard Sandiford wrote: > > > Andrew Carlotti writes: > > > > > > > > > &

Re: [PATCH] Add new hardreg PRE pass

2024-12-05 Thread Andrew Carlotti
On Sun, Dec 01, 2024 at 03:54:25PM -0700, Jeff Law wrote: > > > On 11/13/24 12:03 PM, Richard Sandiford wrote: > > Andrew Carlotti writes: > > > > > > > > > I think this is mostly my ignorance of the code, and would be obvious > > > > if

Re: [PATCH] Add new hardreg PRE pass

2024-12-05 Thread Andrew Carlotti
On Mon, Dec 02, 2024 at 08:59:20AM -0700, Jeff Law wrote: > > > On 10/31/24 12:29 PM, Andrew Carlotti wrote: > > This pass is used to optimise assignments to the FPMR register in > > aarch64. I chose to implement this as a middle-end pass because it > > mostly reuse

Re: [PATCH v2] Fix MV clones can not redirect to specific target on some targets

2024-11-20 Thread Andrew Carlotti
On Sun, Oct 27, 2024 at 04:00:43PM +, Yangyu Chen wrote: > Following the implementation of commit b8ce8129a5 ("Redirect call > within specific target attribute among MV clones (PR ipa/82625)"), > we can now optimize calls by invoking a versioned function callee > from a caller that shares the s

Re: [PATCH] Add new hardreg PRE pass

2024-11-15 Thread Andrew Carlotti
On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote: > Sorry for the slow review. I think Jeff's much better placed to comment > on this than I am, but here's a stab. Mostly it looks really good to me > FWIW. > > Andrew Carlotti writes: > >

Re: [PATCH] Add new hardreg PRE pass

2024-11-13 Thread Andrew Carlotti
On Wed, Nov 13, 2024 at 07:03:44PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote: > >> Sorry for the slow review. I think Jeff's much better placed to comment > >> on this than I a

Re: [PATCH] Add new hardreg PRE pass

2024-11-13 Thread Andrew Carlotti
On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote: > Sorry for the slow review. I think Jeff's much better placed to comment > on this than I am, but here's a stab. Mostly it looks really good to me > FWIW. > > Andrew Carlotti writes: > >

Re: [PATCH 0/10] aarch64: Add new flags for existing features

2024-11-12 Thread Andrew Carlotti
On Fri, Oct 04, 2024 at 06:50:36PM +0100, Andrew Carlotti wrote: > This patch series adds 7 new flags for features that were previously available > in GCC only as part of an architecture version. It also fixes one other > instance where an architecture version was used in a check ins

[PATCH 10/10] aarch64: Try to avoid passing new flags to assembler

2024-11-12 Thread Andrew Carlotti
These new flags (+fcma, +jscvt, +rcpc2, +jscvt, +frintts, +wfxt and +xs) were only recently added to the assembler. To improve compatibility with older assemblers, we try to avoid passing these new flags to the assembler if we can express the targeted architecture without them. We do so by using

[PATCH 9/10] docs: Add new AArch64 flags

2024-11-12 Thread Andrew Carlotti
gcc/ChangeLog: * doc/invoke.texi: Add new AArch64 flags. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7146163d66d068522f5aa19f59badc1b05d05114..56186e98ca6a4d28d1c315746ade89cdc835219e 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21439,11 +21439,11 @@

Re: [PATCH] testsuite: Adjust jump threading test expectation

2024-11-12 Thread Andrew Carlotti
On Wed, Nov 06, 2024 at 02:28:39PM -0800, Andrew Pinski wrote: > On Tue, Nov 5, 2024 at 4:53 AM Andrew Carlotti > wrote: > > > > This test started failing on aarch64 after 0cfc9c95 in 2023 ("Phi > > analyzer - Initialize with range instead of a tree."). >

[PATCH] testsuite: Adjust jump threading test expectation

2024-11-05 Thread Andrew Carlotti
This test started failing on aarch64 after 0cfc9c95 in 2023 ("Phi analyzer - Initialize with range instead of a tree."). The only change visible in the pass dumps prior to thread2 is the upper bounds of some ranges are reduced from +INF to 7, consistent with the bitmask information. After thread2

Re: [PATCH] testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests

2024-11-04 Thread Andrew Carlotti
On Tue, Oct 22, 2024 at 07:18:55PM +0200, Torbjorn SVENSSON wrote: > > > On 2024-10-22 13:36, Richard Earnshaw (lists) wrote: > > On 20/10/2024 16:48, Torbjörn SVENSSON wrote: > > > Ok for trunk and releases/gcc-14? > > > > > > -- > > > > > > Converted the tests to use check-function-bodies

[PATCH] Add new hardreg PRE pass

2024-10-31 Thread Andrew Carlotti
This pass is used to optimise assignments to the FPMR register in aarch64. I chose to implement this as a middle-end pass because it mostly reuses the existing RTL PRE code within gcse.cc. Compared to RTL PRE, the key difference in this new pass is that we insert new writes directly to the destin

[PATCH v2 2/2] aarch64: Add mfloat vreinterpret intrinsics

2024-10-23 Thread Andrew Carlotti
This patch splits out some of the qualifier handling from the v1 patch, and adjusts the VREINTERPRET* macros to include support for mf8 intrinsics. Bootstrapped and regression tested on aarch64; ok for master? gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (MODE_d_mf8): New.

[PATCH v2 1/2] aarch64: Add support for mfloat8x{8|16}_t types

2024-10-23 Thread Andrew Carlotti
Compared to v1, I've split changes that aren't used for the type definitions into a separate patch. I've also added some tests, mostly along the lines suggested by Richard S. Bootstrapped and regression tested on aarch64; ok for master? gcc/ChangeLog: * config/aarch64/aarch64-builtins.c

Re: [PATCH] Introduce TARGET_FMV_ATTR_SEPARATOR

2024-10-15 Thread Andrew Carlotti
On Tue, Oct 15, 2024 at 02:18:43PM +0800, Yangyu Chen wrote: > Some architectures may use ',' in the attribute string, but it is not > used as the separator for different targets. To avoid conflict, we > introduce a new macro TARGET_FMV_ATTR_SEPARATOR to separate different > clones. This is only f

Re: [PATCH 1/2] aarch64: Split FCMA feature bit from Armv8.3-A

2024-10-04 Thread Andrew Carlotti
On Wed, Oct 02, 2024 at 06:13:38PM +0100, Andre Vieira wrote: > > This patch splits out FCMA as a feature from Armv8.3-A and adds it as a > separate > feature bit which now controls 'TARGET_COMPLEX'. > > gcc/ChangeLog: > > * config/aarch64/aarch64-arches.def (FCMA): New feature bit, can n

[PATCH 8/8] aarch64: Add new +xs flag

2024-10-04 Thread Andrew Carlotti
GCC does not emit tlbi instructions, so this only affects the flags passed through to the assembler. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_7A): Add XS. * config/aarch64/aarch64-option-extensions.def (XS): New flag. diff --git a/gcc/config/aarch64/aarch64-arches.

[PATCH 6/8] aarch64: Add new +rcpc2 flag

2024-10-04 Thread Andrew Carlotti
gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_4A): Add RCPC2. * config/aarch64/aarch64-option-extensions.def (RCPC2): New flag. (RCPC3): Add RCPC2 dependency. * config/aarch64/aarch64.h (TARGET_RCPC2): Use new flag. gcc/testsuite/ChangeLog:

[PATCH 7/8] aarch64: Add new +wfxt flag

2024-10-04 Thread Andrew Carlotti
GCC does not currently emit the wfet or wfit instructions, so this primarily affects the flags passed through to the assembler. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_7A): Add WFXT. * config/aarch64/aarch64-option-extensions.def (WFXT): New flag. diff --git a/gcc

[PATCH 5/8] aarch64: Add new +flagm2 flag

2024-10-04 Thread Andrew Carlotti
GCC does not currently emit the axflag or xaflag instructions, so this primarily affects the flags passed through to the assembler. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_5A): Add FLAGM2. * config/aarch64/aarch64-option-extensions.def (FLAGM2): New flag. gcc/tests

[PATCH 4/8] aarch64: Add new +frintts flag

2024-10-04 Thread Andrew Carlotti
gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_5A): Add FRINTTS * config/aarch64/aarch64-option-extensions.def (FRINTTS): New flag. * config/aarch64/aarch64.h (TARGET_FRINT): Use new flag. * config/aarch64/arm_acle.h: Use new flag for frintts intrinsics.

[PATCH 3/8] aarch64: Add new +jscvt flag

2024-10-04 Thread Andrew Carlotti
gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_3A): Add JSCVT. * config/aarch64/aarch64-option-extensions.def (JSCVT): New flag. * config/aarch64/aarch64.h (TARGET_JSCVT): Use new flag. * config/aarch64/arm_acle.h: Use new flag for jscvt intrinsics. gcc/tes

[PATCH 2/8] aarch64: Add new +fcma flag

2024-10-04 Thread Andrew Carlotti
This includes +fcma as a dependency of +sve, and means that we can finally support fcma intrinsics on a64fx. Also add fcma to the Features list in several cpunative testcases that incorrectly included sve without fcma. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
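Treating one feature as a dependency of another can be sketched in C++ as a small closure computation. The table and names below (`feature_info`, `enable_with_deps`, the `FL_*` bits) are assumptions for illustration. Only the SVE-to-FCMA edge comes from the message above, and the other edges are invented to show transitivity.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only.
using feature_mask = std::uint64_t;
constexpr feature_mask FL_SIMD = 1u << 0;
constexpr feature_mask FL_FCMA = 1u << 1;
constexpr feature_mask FL_SVE  = 1u << 2;

struct feature_info {
  feature_mask bit;   // the feature itself
  feature_mask deps;  // features it directly depends on
};

// Illustrative dependency table: SVE depends on FCMA (as in the patch
// description); the SIMD edges are assumptions for this sketch.
constexpr feature_info features[] = {
  {FL_SIMD, 0},
  {FL_FCMA, FL_SIMD},
  {FL_SVE,  FL_SIMD | FL_FCMA},
};

// Enabling a feature also enables everything it depends on, repeated
// until no new bits are added (a fixed point).
feature_mask enable_with_deps(feature_mask flags)
{
  feature_mask prev;
  do {
    prev = flags;
    for (const feature_info &f : features)
      if (flags & f.bit)
        flags |= f.deps;
  } while (flags != prev);
  return flags;
}
```

Under this model, a target described as having SVE automatically reports FCMA as available, which is the property that lets FCMA-gated intrinsics be used on such a core.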

[PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places

2024-10-04 Thread Andrew Carlotti
gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_expand_epilogue): Use TARGET_PAUTH. * config/aarch64/aarch64.md: Update comment. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index e7bb3278a27eca44c46afd26069d608218198a54..cf1107127fd5d9e

[PATCH 0/8] aarch64: Add new flags for existing features

2024-10-04 Thread Andrew Carlotti
This patch series adds 7 new flags for features that were previously available in GCC only as part of an architecture version. It also fixes one other instance where an architecture version was used in a check instead of a feature flag. Bootstrapped and regression tested as a whole on aarch64. I

[PATCH] aarch64: Fix costing of move to/from MOVEABLE_SYSREGS

2024-10-01 Thread Andrew Carlotti
This is necessary to prevent reload assuming that a direct FP->FPMR move is valid. Bootstrapped and regression tested; ok for master? gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_register_move_cost): Increase costs involving MOVEABLE_SYSREGS. diff --git a/gcc/config/aarc

[PATCH] aarch64: Add support for mfloat8x{8|16}_t types

2024-10-01 Thread Andrew Carlotti
I've tested this with a hacked in FP8 intrinsic. Is this patch ok for master, or should it wait until we've implemented the intrinsics? gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (MODE_d_mf8): New. (MODE_q_mf8): New. (QUAL_mf8): New. (aarch64_lookup_simd_

Re: [RFC PATCH] Allow functions with target_clones attribute to be inlined

2024-09-18 Thread Andrew Carlotti
On Thu, Sep 19, 2024 at 01:01:39AM +0800, Yangyu Chen wrote: > > > > On Sep 18, 2024, at 23:36, Andrew Carlotti wrote: > > > > On Wed, Sep 18, 2024 at 09:46:15AM +0100, Richard Sandiford wrote: > >> Yangyu Chen writes: > >>> I recently found th

Re: [RFC PATCH] Allow functions with target_clones attribute to be inlined

2024-09-18 Thread Andrew Carlotti
On Wed, Sep 18, 2024 at 09:46:15AM +0100, Richard Sandiford wrote: > Yangyu Chen writes: > > I recently found that target_clones functions cannot inline even when > > the caller has exactly the same target. However, if we only use target > > attributes in C++ and let the compiler generate IFUNC fo

Re: [PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-09-04 Thread Andrew Carlotti
On Mon, Aug 19, 2024 at 03:52:58PM +0100, Andrew Carlotti wrote: > On Fri, Aug 16, 2024 at 07:17:24AM +, Kyrylo Tkachov wrote: > > > > > > > On 15 Aug 2024, at 18:48, Andrew Carlotti wrote: > > > > > > External email: Use caution opening links or a

Re: [PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-08-19 Thread Andrew Carlotti
On Fri, Aug 16, 2024 at 07:17:24AM +, Kyrylo Tkachov wrote: > > > > On 15 Aug 2024, at 18:48, Andrew Carlotti wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Thu, Aug 15, 2024 at 05:15:03PM +0100, Ric

Re: [PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-08-15 Thread Andrew Carlotti
On Thu, Aug 15, 2024 at 05:15:03PM +0100, Richard Sandiford wrote: > Andrew Carlotti writes: > > This series of patches fixes issues with some intrinsics being incorrectly > > gated by global target options, instead of just using function-specific > > target > > optio

[PATCH v3 5/5] aarch64: Fix ls64 intrinsic availability

2024-08-15 Thread Andrew Carlotti
The availability of ls64 intrinsics and data types was determined solely by the globally specified architecture features, which did not reflect any changes specified in target pragmas or attributes. This patch removes the initialisation-time guards for the intrinsics, and replaces them with check

[PATCH v3 4/5] aarch64: Fix memtag intrinsic availability

2024-08-15 Thread Andrew Carlotti
The availability of memtag intrinsics and data types was determined solely by the globally specified architecture features, which did not reflect any changes specified in target pragmas or attributes. This patch removes the initialisation-time guards for the intrinsics, and replaces them with che

[PATCH v3 2/5] aarch64: Move check_required_extensions

2024-08-15 Thread Andrew Carlotti
Move SVE extension checking functionality to aarch64-builtins.cc, so that it can be shared by non-SVE intrinsics. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc (check_builtin_call) (expand_builtin): Update calls to the below. (report_missing_extension, report_mis

[PATCH v3 3/5] aarch64: Fix tme intrinsic availability

2024-08-15 Thread Andrew Carlotti
The availability of tme intrinsics was previously gated at both initialisation time (using global target options) and usage time (accounting for function-specific target options). This patch removes the check at initialisation time, and also moves the intrinsics out of the header file to allow for

[PATCH v3 1/5] aarch64: Refactor check_required_extensions

2024-08-15 Thread Andrew Carlotti
Replace TARGET_GENERAL_REGS_ONLY check with an explicit check that aarch64_isa_flags enables all required extensions. This will be more flexible when repurposing this function for non-SVE intrinsics. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc (check_required_register

[PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-08-15 Thread Andrew Carlotti
This series of patches fixes issues with some intrinsics being incorrectly gated by global target options, instead of just using function-specific target options. These issues have been present since the +tme, +memtag and +ls64 intrinsics were introduced. Compared to the previous version, this ser

[PING^4][PATCH v2] docs: Update function multiversioning documentation

2024-08-13 Thread Andrew Carlotti
I'm still waiting for review for this patch. I've asked Richard Sandiford about it, and he'd like a docs maintainer to review the patch (so I've cc'd the rest of them now as well). On Wed, Jul 10, 2024 at 01:09:41PM +0100, Andrew Carlotti wrote: > > On Mon, J

Re: [PATCH v2 2/4] aarch64: Fix tme intrinsic availability

2024-08-09 Thread Andrew Carlotti
On Thu, Aug 08, 2024 at 04:48:38PM +0100, Richard Sandiford wrote: > Andrew Carlotti writes: > > The availability of tme intrinsics was previously gated at both > > initialisation time (using global target options) and usage time > > (accounting for function-specific target

[PATCH v2 3/4] aarch64: Fix memtag intrinsic availability

2024-08-08 Thread Andrew Carlotti
The availability of memtag intrinsics and data types was determined solely by the globally specified architecture features, which did not reflect any changes specified in target pragmas or attributes. This patch removes the initialisation-time guards for the intrinsics, and replaces them with che
