Richard Sandiford <[email protected]> writes:
> Victor Do Nascimento <[email protected]> writes:
>> At present, Evaluation of both `has_lse2(hwcap)' and
>> `has_lse128(hwcap)' may require issuing an `mrs' instruction to query
>> a system register. This instruction, when issued from user-space
>> results in a trap by the kernel which then returns the value read in
>> by the system register. Given the undesirable nature of the
>> computational expense associated with the context switch, it is
>> important to implement mechanisms to, wherever possible, forgo the
>> operation.
>>
>> In light of this, given how other architectural requirements serving
>> as prerequisites have long been assigned HWCAP bits by the kernel, we
>> can inexpensively query for their availability before attempting to
>> read any system registers. Where one of these early tests fail, we
>> can assert that the main feature of interest (be it LSE2 or LSE128)
>> cannot be present, allowing us to return from the function early and
>> skip the unnecessary expensive kernel-mediated access to system
>> registers.
>>
>> libatomic/ChangeLog:
>>
>> * config/linux/aarch64/host-config.h (has_lse2): Add test for LSE.
>> (has_lse128): Add test for LSE2.
>> ---
>> libatomic/config/linux/aarch64/host-config.h | 13 ++++++++++---
>> 1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/libatomic/config/linux/aarch64/host-config.h
>> b/libatomic/config/linux/aarch64/host-config.h
>> index c5485d63855..3be4db6e5f8 100644
>> --- a/libatomic/config/linux/aarch64/host-config.h
>> +++ b/libatomic/config/linux/aarch64/host-config.h
>> @@ -53,8 +53,13 @@
>> static inline bool
>> has_lse2 (unsigned long hwcap)
>> {
>> + /* Check for LSE2. */
>> if (hwcap & HWCAP_USCAT)
>> return true;
>> + /* No point checking further for atomic 128-bit load/store if LSE
>> + prerequisite not met. */
>> + if (!(hwcap & HWCAP_ATOMICS))
>> + return false;
>
> This part is OK.
>
>> if (!(hwcap & HWCAP_CPUID))
>> return false;
>>
>> @@ -76,12 +81,14 @@ has_lse2 (unsigned long hwcap)
>> static inline bool
>> has_lse128 (unsigned long hwcap)
>> {
>> - if (!(hwcap & HWCAP_CPUID))
>> - return false;
>> + /* In the absence of HWCAP_CPUID, we are unable to check for LSE128,
>> return.
>> + If feature check available, check LSE2 prerequisite before proceeding.
>> */
>> + if (!(hwcap & HWCAP_CPUID) || !(hwcap & HWCAP_USCAT))
>> + return false;
>
> The inconsistency feels wrong here. If we're saying that HWCAP_USCAT
> is now so old that we don't need to fall back on CPUID, then it feels
> like we should have the courage of our convictions and do the same for
> has_lse2. If instead we still want to support libcs that predate
> HWCAP_USCAT, we should do the same here too.
Sorry, scratch that, I'd misread has_lse2. The CPUID fallback there is
only for Neoverse N1, which we know doesn't support LSE128.
So the patch is OK with the formatting fixed: the returns should be
indented by their original amount.
Thanks,
Richard
>> unsigned long isar0;
>> asm volatile ("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
>> if (AT_FEAT_FIELD (isar0) >= 3)
>> - return true;
>> + return true;
>
> The original formatting was correct.
>
> Thanks,
> Richard
>
>> return false;
>> }