Hi Iain,

> On 11 Jan 2025, at 14:21, Iain Sandoe <iains....@gmail.com> wrote:
>
> Hi,
>
> I originally made this patch for the Darwin Arm64 development branch,
> however in discussions on IRC, it seems that it is also relevant to
> Linux - since there are implementations running on Apple hardware with
> the M1..3 CPUs. It might also be helpful to the resolution of
> PR113257 - although it is not a solution on its own.
>
> Bootstrapped and tested manually (that it gives the expected .arch lines)
> on aarch64-linux.
>
> OK for trunk?
> thanks
> Iain
>
> --- 8< ---
>
> This covers the M1-M3 cores used in Apple desktop hardware that is also
> sometimes used with Linux as the OS.
>
> It does not cover the wider range that might be used in iOS and other
> embedded platform versions.
>
> Some of the content is estimates/best guesses - based on the following
> public sources of information:
> * XNU (only for the Apple Implementer ID)
> * sysctl -a | grep hw on various M1, M2 and machines
> * AArch64.td from the Apple Open Source repo for LLVM.
> * What XCode-14 clang passes to cc1.
>

How about the llvm/lib/TargetParser/Host.cpp in upstream LLVM for the part numbers?
I see it has different values for the M1, M2 and M3 ones than you have in your patch.

> Unfortunately, these sources are in conflict; in particular the clang-claimed
> feature set disagrees with the output of sysctl -a, and the base Arm revs.
> claimed in some cases miss features that ARM DDI 0487J.a lists as mandatory
> for the rev.
>
> This latter point might not be actually significant - but for the sake of
> caution I've made the spec use the lower arch rev + the additional features
> that are consistently claimed by both sysctl and clang.
>

I think going for the lowest common denominator of features you can deduce is fine.

> GCC does not seem to have a scheduler that is similar to the "Cyclone" one
> in LLVM - so I've guessed to use cortexa57 (but, maybe we miss 8-issue, it's
> not clear - and my experience with the scheduler is ≈ 0).
>

Yes, that’s probably good enough. We haven’t had a new “big core” scheduling model in a while and cortexa57 tends to be good enough as a fallback.
I’d like us to have something new for SVE-enabled cores but that’s not relevant in this case.

> Likewise we do not (yet) have specific cost models, so choose the generic
> Armv8 one.
>
> Thus, the choices here are intended to be conservative.
>
> * Currently, we do not seem to have any way to specify that M2/M3 has support
>   for FEAT_BTI, but because of missing features is not compliant with the Arm
>   base rev that implies this.
> * Proper version numbers are not readily available.
> * Since we have FIRESTORM/ICESTORM and similar pairs for the performance and
>   efficiency cores on various machines, perhaps we should be using a big.LITTLE
>   configuration; OTOH currently, I have no idea if that is usable in any way
>   with the hardware as configured.

Modulo Tamar’s comments in the PR about unknown CPUs it should work fine under Linux as long as /proc/cpuinfo contains entries for both types of cores.
You can use a mock cpuinfo for testing through the GCC_CPUINFO environment variable, like the tests in gcc.target/aarch64/cpunative.
Once that is detected you can specify its other tuning and arch parameters like any other -mcpu option.

>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add apple-a12,
> apple-m1, apple-m2, apple-m3.
> * config/aarch64/aarch64-tune.md: Regenerate.

These need entries in the documentation too.
Thanks,
Kyrill

>
> Signed-off-by: Iain Sandoe <i...@sandoe.co.uk>
> ---
> gcc/config/aarch64/aarch64-cores.def | 12 ++++++++++++
> gcc/config/aarch64/aarch64-tune.md | 2 +-
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
> index caf61437d18..0bd3e80cf7f 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -173,6 +173,18 @@ AARCH64_CORE("cortex-a76.cortex-a55", cortexa76cortexa55, cortexa53, V8_2A, (F
> AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 0xd15, -1)
> AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 0x41, 0xd14, -1)
>
> +/* Apple (A12 and M) cores based on Armv8.
> +   Apple implementer ID from xnu,
> +   Guesses for part # and suitable scheduler ident, generic_armv8_a for costs.
> +   A12 seems mostly 8.3,
> +   M1 seems to be 8.4 + extras (see comments in option-extensions about f16fml),
> +   M2 mostly 8.5 but with missing mandatory features.
> +   M3 is essentially the same as M2 for the features declared here. */
> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A, (), generic_armv8_a, 0x61, 0x12, -1)
> +AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A, (F16, SB, SSBS), generic_armv8_a, 0x61, 0x23, -1)
> +AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A, (I8MM, BF16, F16, SB, SSBS), generic_armv8_a, 0x61, 0x23, -1)
> +AARCH64_CORE("apple-m3", applem3, cortexa57, V8_4A, (I8MM, BF16, F16, SB, SSBS), generic_armv8_a, 0x61, 0x23, -1)
> +
> /* Armv9.0-A Architecture Processors. */
>
> /* Arm ('A') cores. */
>
> --
> 2.39.2 (Apple Git-143)
>
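
To illustrate the point about /proc/cpuinfo needing entries for both types of cores, here is a rough sketch in C of collecting the distinct (implementer, part) pairs from cpuinfo-style text. It is not the code in GCC's driver; distinct_cores, struct core_id and mock_cpuinfo are names made up for this example. The implementer ID 0x61 is the one from the patch, while the part numbers 0xaaa and 0xbbb are deliberate placeholders, since the real values are still under discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CORE_KINDS 8

/* One distinct (implementer, part) pair seen in the cpuinfo text.  */
struct core_id
{
  unsigned implementer;
  unsigned part;
};

/* Mock cpuinfo text.  0x61 is the implementer ID used in the patch;
   the part numbers are placeholders, NOT real Apple part numbers.  */
static const char mock_cpuinfo[] =
  "processor\t: 0\n"
  "CPU implementer\t: 0x61\n"
  "CPU part\t: 0xaaa\n"
  "\n"
  "processor\t: 1\n"
  "CPU implementer\t: 0x61\n"
  "CPU part\t: 0xbbb\n"
  "\n"
  "processor\t: 2\n"
  "CPU implementer\t: 0x61\n"
  "CPU part\t: 0xbbb\n";

/* Scan TEXT (in /proc/cpuinfo format) and store each distinct
   (implementer, part) pair in OUT, in order of first appearance.
   Return the number of pairs stored, capped at MAX_CORE_KINDS.  */
static size_t
distinct_cores (const char *text, struct core_id *out)
{
  size_t n = 0;
  unsigned implementer = 0;
  const char *line = text;

  assert (text && out);

  while (line && *line)
    {
      const char *colon = strchr (line, ':');
      const char *eol = strchr (line, '\n');

      /* Only look at lines that have a "key : value" shape.  */
      if (colon && (!eol || colon < eol))
        {
          if (strncmp (line, "CPU implementer", 15) == 0)
            implementer = (unsigned) strtoul (colon + 1, NULL, 0);
          else if (strncmp (line, "CPU part", 8) == 0)
            {
              unsigned part = (unsigned) strtoul (colon + 1, NULL, 0);
              size_t i;

              /* Skip pairs that have already been recorded.  */
              for (i = 0; i < n; i++)
                if (out[i].implementer == implementer
                    && out[i].part == part)
                  break;

              if (i == n && n < MAX_CORE_KINDS)
                {
                  out[n].implementer = implementer;
                  out[n].part = part;
                  n++;
                }
            }
        }
      line = eol ? eol + 1 : NULL;
    }
  return n;
}
```

With the mock text above, two distinct pairs are found, which is the situation a big.LITTLE entry would need to match. Text in the same shape, saved to a file and named in GCC_CPUINFO, is presumably what a cpunative-style test for these cores would use.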