Re: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores [PR113257].

Kyrylo Tkachov Mon, 31 Mar 2025 05:44:11 -0700

Hi Iain,

> On 22 Mar 2025, at 15:31, Iain Sandoe <iains....@gmail.com> wrote:
> 
> 0. Sorry this has taken some time to close off; partly because of waiting
>   for input, but mostly that I've been stretched with other work.
> 1. As per the commit message, the apparent non-conformance with 8.5/6
>   because FEAT_SPECRES returns 0, is a result of the query operating
>   at user priv.  The cores are confirmed to support this for priv.
>   code.
> 2. I added entries for the apple-m1,2,3 cores in invoke.texi.
> 3. Following Andrew's suggestion and with some measurements by Tamar
>   and me, figured out the LITTLE.big chip ids (at least for a sub-
>   set).
> 
> This has been in use for a while on aarch64-darwin branches and I've
> checked manually that it gives the right .arch lines on cfarm185.
> 
> OK for trunk? (if so, when?)
> thanks
> Iain
> 
> --- 8< ---
> 
> After discussion with the open source support team at Apple, we have
> established that the cores conform to the 8.5 and 8.6 requirements.
> One of the mandatory features (FEAT_SPECRES) is not exposed (or
> available) in user-space code but is supported for privileged code.
> 
> The values for chip IDs and the LITTLE.big variants have been taken
> from lists in the XNU and LLVM sources.
> 
> PR target/113257
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Apple-a12,
> Apple-M1, Apple-M2, Apple-M3 with expanded names to allow for the
> LITTLE.big versions.
> * config/aarch64/aarch64-tune.md: Regenerate.
> * doc/invoke.texi: Add apple-m1,2 and 3 cores to the ones listed
> for arch and tune selections.
> 
> Signed-off-by: Iain Sandoe <i...@sandoe.co.uk>
> ---
> gcc/config/aarch64/aarch64-cores.def | 16 ++++++++++++++++
> gcc/config/aarch64/aarch64-tune.md   |  2 +-
> gcc/doc/invoke.texi                  |  5 +++--
> 3 files changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 0e22d72976e..7f204fd0ac9 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -173,6 +173,22 @@ AARCH64_CORE("cortex-a76.cortex-a55",  
> cortexa76cortexa55, cortexa53, V8_2A,  (F
> AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 
> 0xd15, -1)
> AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 
> 0x41, 0xd14, -1)
> 
> +/* Apple (A12 and M) cores.
> +   Known part numbers as listed in other public sources.
> +   Placeholders for schedulers, generic_armv8_a for costs.
> +   A12 seems mostly 8.3, M1 is 8.5 without BTI, M2 and M3 are 8.6
> +   From measurements made so far the odd-number core IDs are performance.  */
> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 0x61, 0x12, -1)
> +AARCH64_CORE("apple-m1", applem1_0, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x21, 0x20), -1)
> +AARCH64_CORE("apple-m1", applem1_1, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x23, 0x22), -1)
> +AARCH64_CORE("apple-m1", applem1_2, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x25, 0x24), -1)
> +AARCH64_CORE("apple-m1", applem1_3, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x29, 0x28), -1)
> +AARCH64_CORE("apple-m2", applem2_0, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x31, 0x30), -1)
> +AARCH64_CORE("apple-m2", applem2_1, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x33, 0x32), -1)
> +AARCH64_CORE("apple-m2", applem2_2, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x35, 0x34), -1)
> +AARCH64_CORE("apple-m2", applem2_3, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x39, 0x38), -1)
> +AARCH64_CORE("apple-m3", applem3_0, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x49, 0x48), -1)


I don’t think we have a precedent for different MIDR part numbers resolving to
the same -mcpu string, but I think it should all work as expected.
As long as you and Tamar are happy with the feature set here, I have no
objections.
This looks OK to me for GCC 15, with a documentation comment below…
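
To illustrate the point about several MIDR part numbers sharing one -mcpu
string, here is a minimal sketch in C. It is not GCC's driver code: the
struct, the table and the function names are hypothetical, and for simplicity
it matches a single core's MIDR against either half of a pair, whereas real
native detection may look at all cores in the system. The implementer value
(0x61) and the part-number pairs are copied from the AARCH64_CORE entries in
the patch; the MIDR_EL1 field positions (implementer in bits [31:24], part
number in bits [15:4]) are the architectural ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry; not GCC's actual data structure.  */
struct core_entry
{
  const char *name;      /* The -mcpu string the entry resolves to.  */
  unsigned implementer;  /* MIDR_EL1 implementer field.  */
  unsigned big_part;     /* Part number of the performance core.  */
  unsigned little_part;  /* Part number of the efficiency core.  */
};

/* Part-number pairs as listed in the patch: several rows share a name.  */
const struct core_entry apple_cores[] = {
  { "apple-m1", 0x61, 0x21, 0x20 },
  { "apple-m1", 0x61, 0x23, 0x22 },
  { "apple-m1", 0x61, 0x25, 0x24 },
  { "apple-m1", 0x61, 0x29, 0x28 },
  { "apple-m2", 0x61, 0x31, 0x30 },
  { "apple-m2", 0x61, 0x33, 0x32 },
  { "apple-m2", 0x61, 0x35, 0x34 },
  { "apple-m2", 0x61, 0x39, 0x38 },
  { "apple-m3", 0x61, 0x49, 0x48 },
};

/* Extract the implementer field, bits [31:24] of MIDR_EL1.  */
unsigned
midr_implementer (uint64_t midr)
{
  return (unsigned) ((midr >> 24) & 0xff);
}

/* Extract the part number field, bits [15:4] of MIDR_EL1.  */
unsigned
midr_part (uint64_t midr)
{
  return (unsigned) ((midr >> 4) & 0xfff);
}

/* Return the -mcpu name for MIDR, or NULL if no table row matches.
   Either the big or the little part number of a row is accepted.  */
const char *
lookup_cpu_name (uint64_t midr)
{
  size_t n = sizeof (apple_cores) / sizeof (apple_cores[0]);
  for (size_t i = 0; i < n; i++)
    {
      const struct core_entry *e = &apple_cores[i];
      if (e->implementer == midr_implementer (midr)
          && (e->big_part == midr_part (midr)
              || e->little_part == midr_part (midr)))
        return e->name;
    }
  return NULL;
}
```

With this table, a MIDR carrying part number 0x21, 0x23, 0x25 or 0x29 (or the
matching even-numbered efficiency part) yields "apple-m1" in each case, which
is the many-to-one mapping discussed above.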

> +
> /* Armv9.0-A Architecture Processors.  */
> 
> /* Arm ('A') cores. */
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index 56a914f12b9..982074c2c21 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
> ;; -*- buffer-read-only: t -*-
> ;; Generated automatically by gentune.sh from aarch64-cores.def
> (define_attr "tune"
> - 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,fujitsu_monaka,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexr82ae,cortexa510,cortexa520,cortexa520ae,cortexa710,cortexa715,cortexa720,cortexa720ae,cortexa725,cortexx2,cortexx3,cortexx4,cortexx925,neoversen2,cobalt100,neoversen3,neoversev2,grace,neoversev3,neoversev3ae,demeter,olympus,generic,generic_armv8_a,generic_armv9_a"
> + 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,fujitsu_monaka,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexr82ae,applea12,applem1_0,applem1_1,applem1_2,applem1_3,applem2_0,applem2_1,applem2_2,applem2_3,applem3_0,cortexa510,cortexa520,cortexa520ae,cortexa710,cortexa715,cortexa720,cortexa720ae,cortexa725,cortexx2,cortexx3,cortexx4,cortexx925,neoversen2,cobalt100,neoversen3,neoversev2,grace,neoversev3,neoversev3ae,demeter,olympus,generic,generic_armv8_a,generic_armv9_a"
> (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 515d91ac2e3..f8f712d1877 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -21763,7 +21763,8 @@ performance of the code.  Permissible values for this 
> option are:
> @samp{cortex-x2}, @samp{cortex-x3}, @samp{cortex-x4}, @samp{cortex-a510},
> @samp{cortex-a520}, @samp{cortex-a520ae}, @samp{cortex-a710}, 
> @samp{cortex-a715},
> @samp{cortex-a720}, @samp{cortex-a720ae}, @samp{ampere1}, @samp{ampere1a},
> -@samp{ampere1b}, @samp{cobalt-100} and @samp{native}.
> +@samp{ampere1b}, @samp{cobalt-100}, @samp{apple-m1}, @samp{apple-m2},
> +@samp{apple-m3} and @samp{native}.
> 
> The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
> @samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
> @@ -23842,7 +23843,7 @@ Permissible names are: @samp{arm7tdmi}, 
> @samp{arm7tdmi-s}, @samp{arm710t},
> @samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{xscale},
> @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}, @samp{fa526}, @samp{fa626},
> @samp{fa606te}, @samp{fa626te}, @samp{fmp626}, @samp{fa726te}, 
> @samp{star-mc1},
> -@samp{xgene1}.
> +@samp{xgene1}, @samp{apple-m1}, @samp{apple-m2}, @samp{apple-m3}.

This looks like the section for (32-bit) arm rather than aarch64.


> 
> Additionally, this option can specify that GCC should tune the performance
> of the code for a big.LITTLE system.  Permissible names are:

The aarch64 section has a similar paragraph about big.LITTLE; the b.L options
you add in the patch should be listed there.

> -- 
> 2.39.2 (Apple Git-143)
>
