> On 20 Jan 2025, at 18:33, Andrew Carlotti <andrew.carlo...@arm.com> wrote:
>
> On Mon, Jan 20, 2025 at 06:29:12PM +0000, Tamar Christina wrote:
>>> -----Original Message-----
>>> From: Iain Sandoe <i...@sandoe.co.uk>
>>> Sent: Monday, January 20, 2025 6:15 PM
>>> To: Andrew Carlotti <andrew.carlo...@arm.com>
>>> Cc: Kyrylo Tkachov <ktkac...@nvidia.com>; GCC Patches <gcc-
>>> patc...@gcc.gnu.org>; Tamar Christina <tamar.christ...@arm.com>; Richard
>>> Sandiford <richard.sandif...@arm.com>; Sam James <s...@gentoo.org>
>>> Subject: Re: [PATCH] aarch64: Provide initial specifications for Apple CPU
>>> cores.
>>>
>>>
>>>
>>>> On 20 Jan 2025, at 17:38, Andrew Carlotti <andrew.carlo...@arm.com> wrote:
>>>>
>>>> On Sun, Jan 19, 2025 at 09:14:17PM +0000, Iain Sandoe wrote:
>>
>> I would say if we can find both core IDs we should use them, otherwise this
>> is already
>> an improvement on the situation.
>
> There are some part numbers listed at:
> https://github.com/AsahiLinux/docs/wiki/HW:ARM-System-Registers
>
> That only seems to cover apple-a12 and apple-m1.
The latest LLVM Host.cpp has the opposite problem:
apple-m1 is mapped to: 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x28, 0x29
apple-m2 is mapped to: 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x38, 0x39
apple-m3 is mapped to: 0x48, 0x49
(If I had to speculate, I might say that the odd and even numbers relate to
big/little cores, but the number of options ….)
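
As a rough sketch of what matching several part numbers per core could look
like (this is not GCC's or LLVM's actual detection code; the enum, table and
function names are made up for illustration, and the part numbers are simply
the ones listed above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical core identifiers, for illustration only.  */
enum apple_core { APPLE_UNKNOWN, APPLE_M1, APPLE_M2, APPLE_M3 };

struct part_map { uint16_t part; enum apple_core core; };

/* Part numbers as listed above from LLVM's Host.cpp.  */
static const struct part_map apple_parts[] = {
  { 0x020, APPLE_M1 }, { 0x021, APPLE_M1 }, { 0x022, APPLE_M1 },
  { 0x023, APPLE_M1 }, { 0x024, APPLE_M1 }, { 0x025, APPLE_M1 },
  { 0x028, APPLE_M1 }, { 0x029, APPLE_M1 },
  { 0x030, APPLE_M2 }, { 0x031, APPLE_M2 }, { 0x032, APPLE_M2 },
  { 0x033, APPLE_M2 }, { 0x034, APPLE_M2 }, { 0x035, APPLE_M2 },
  { 0x038, APPLE_M2 }, { 0x039, APPLE_M2 },
  { 0x048, APPLE_M3 }, { 0x049, APPLE_M3 },
};

#define APPLE_IMPLEMENTER 0x61u

/* MIDR_EL1 layout: implementer in bits [31:24], part number in bits [15:4].  */
static enum apple_core
apple_core_from_midr (uint32_t midr)
{
  uint32_t implementer = (midr >> 24) & 0xffu;
  uint32_t part = (midr >> 4) & 0xfffu;

  if (implementer != APPLE_IMPLEMENTER)
    return APPLE_UNKNOWN;

  for (size_t i = 0; i < sizeof apple_parts / sizeof apple_parts[0]; i++)
    if (apple_parts[i].part == part)
      return apple_parts[i].core;

  return APPLE_UNKNOWN;
}

/* Self-check with MIDR values constructed by hand from the fields above.  */
void
apple_midr_selfcheck (void)
{
  assert (apple_core_from_midr (0x610f0230u) == APPLE_M1);
  assert (apple_core_from_midr (0x610f0480u) == APPLE_M3);
  assert (apple_core_from_midr (0x410f0230u) == APPLE_UNKNOWN);
}
```

A single (implementer, part) pair per AARCH64_CORE entry can only describe one
row of such a table, which is the limitation being discussed here.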
>>>>>> Comparing to LLVM's AArch64Processors.td, this seems to be missing a few
>>> things:
>>>>>> - Crypto extensions (SHA2 and AES, and SHA3 from apple-m1);
>>>>>
>>>>> I do not see FEAT_SHA2 listed in either the Arm doc, or the output from
>>>>> the
>>> sysctl.
>>>>> FEAT_AES: 1
>>>>> FEAT_SHA3: 1
>>>>> So I’ve added those to the three entries.
>>>>
>>>> There are some architecture feature names that are effectively aliases in the
>>>> spec,
>>>> although identifying this requires reading the restrictions of the id
>>>> register
>>>> fields (and at least one version of the spec accidentally omitted one of
>>>> the
>>>> dependencies). In summary:
>>>> - +sha2 = FEAT_SHA1 and FEAT_SHA256
>>>> - +aes = FEAT_AES and FEAT_PMULL
>>>> - +sha3 = FEAT_SHA512 and FEAT_SHA3
>>>
>>> thanks - that was not obvious.
>>>
>>> However, if I add any of these to the 8.4 spec, LLVM’s back end (at least
>>> the ones
>>> via xcode) drops the arch rev down and we fail to build libgcc because of
>>> missing
>>> support for fp16.
>>>
>>> This is likely a bug - but I don’t really know how to describe it at the
>>> moment - and
>>> it won’t make any difference to the assemblers already in the wild - so I
>>> will leave
>>> these out of the list for now.
>>>
>>>>>> - New flags I just added (FRINTTS and FLAGM2 from apple-m1);
>>>>> FEAT_FRINTTS: 1
>>>>> FEAT_FlagM2: 1
>>>>> So I’ve added those.
>>>
>>> The build with these added succeeded with no change in test results.
So I have found a form that LLVM’s backend is happy with (I need to test across
more Xcode versions, but it’s a start):
AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A,
  (AES, SHA2, SHA3, F16FML, SB, SSBS, FRINTTS, FLAGM2),
  generic_armv8_a, 0x61, 0x023, -1)
AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A,
  (I8MM, BF16, AES, SHA2, SHA3, F16FML, SB, SSBS, FRINTTS, FLAGM2),
  generic_armv8_a, 0x61, 0x033, -1)
AARCH64_CORE("apple-m3", applem3, cortexa57, V8_4A,
  (I8MM, BF16, AES, SHA2, SHA3, F16FML, SB, SSBS, FRINTTS, FLAGM2),
  generic_armv8_a, 0x61, 0x048, -1)
So, although F16FML is implicit in 8.4 and F16 is not, it seems that I cannot
specify the missing F16 without causing the other part to get switched off.
Specifying F16FML is OK because that switches on F16 and is part of 8.4 anyway.
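
To illustrate the "F16FML switches on F16" behaviour I am relying on, here is a
minimal sketch. The feature bits and the enable_feature helper are
hypothetical, and the dependency is an assumption taken from the observation
above, not from GCC's or LLVM's real option handling:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits; not a real toolchain encoding.  */
#define FEAT_F16     (1u << 0)
#define FEAT_F16FML  (1u << 1)

/* Return SET with FEAT added, plus anything FEAT is assumed to depend on.  */
uint32_t
enable_feature (uint32_t set, uint32_t feat)
{
  set |= feat;
  if (set & FEAT_F16FML)
    set |= FEAT_F16;	/* Assumed dependency: F16FML requires F16.  */
  return set;
}

/* Self-check of the assumed behaviour.  */
void
feature_selfcheck (void)
{
  /* Asking for F16FML alone yields both features.  */
  assert (enable_feature (0, FEAT_F16FML) == (FEAT_F16 | FEAT_F16FML));
  /* Asking for F16 alone does not pull in F16FML.  */
  assert (enable_feature (0, FEAT_F16) == FEAT_F16);
}
```

This only models the enabling direction; it says nothing about why adding F16
explicitly makes the assembler drop the architecture revision.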
>>>>>> - PREDRES (from apple-m1)
>>>>>
>>>>> I cannot find FEAT_PREDRES …
>>>>> … however we do have
>>>>> FEAT_SPECRES: 0
>>>>
>>>> FEAT_SPECRES in the architecture spec is the same as the +predres toolchain
>>>> flag. LLVM seems to think this is supported from apple-m1.
So what do we do about this?
Do we assume that this is a bug in the sysctl reporting?
Is there anyone on the LLVM toolchain team or within Apple you folks could
query?
I can try via the Apple Open Source folks - but not sure how long that will
take.
If we could go with 8.5 and 8.6, that would simplify things.
Also, I want to add apple-m4 soon, and that also reports FEAT_SPECRES = 0.
thanks,
Iain