On Wed, Jan 21, 2026 at 02:46:15PM +0000, Wilco Dijkstra wrote:
> Hi Alice,
> 
> 
> +ENTRY (__arm_get_current_vg)
> +        /* Check if SVE is available.  */
> +        adrp    x16, __aarch64_cpu_features
> +        ldr     x16, [x16, :lo12:__aarch64_cpu_features]
> +        tbnz    x16, #30, L(end_cntd)
> +
> +        /* Check if SME is available.  */
> +        adrp    x16, __aarch64_have_sme
> +        ldrb    w16, [x16, :lo12:__aarch64_have_sme]
> +        cbz     w16, L(end_zero)
> +
> +        /* Check if we're in streaming mode  */
> +        .inst   0xd53b4250  /* mrs     x16, svcr  */
> +        tbz     x16, #0, L(end_zero)
> +L(end_cntd):
> +        .inst   0x04e0e3e0  /* cntd    x0  */
> +        ret
> +L(end_zero):
> +        mov     x0, 0
> +        ret
> 
> So this uses 2 different globals to access HWCAPs - and they are initialized
> in different ways at different times during startup. I.e. we can get
> inconsistent results when one is initialized but the other is not.

If these results can be inconsistent, then we might already have issues
elsewhere.

> 
> For this patch I would test SME from __aarch64_cpu_features

Seems sensible - we can save two instructions by reusing the already-loaded
value (and it makes the code look more consistent).
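
As a rough sketch of the control flow this would give, here is a C model
rather than the actual assembly. Bit 30 for SVE is taken from the patch above;
the SME bit position (42) is an assumption not stated in this thread, and the
function name and parameters are hypothetical stand-ins for the single load of
__aarch64_cpu_features, SVCR.SM, and the result of cntd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit tested by the patch above (tbnz x16, #30).  */
#define SVE_BIT 30
/* ASSUMPTION: position of the SME bit in the same word; not given in
   this thread.  */
#define SME_BIT 42

/* Hypothetical model of the revised routine.  FEATURES stands in for the
   one load of __aarch64_cpu_features, STREAMING for SVCR.SM, and CNTD for
   the value the cntd instruction would produce.  */
static uint64_t
model_get_current_vg (uint64_t features, bool streaming, uint64_t cntd)
{
  /* SVE available: cntd is valid in any mode.  */
  if (features & (UINT64_C (1) << SVE_BIT))
    return cntd;

  /* Neither SVE nor SME: report zero.  */
  if (!(features & (UINT64_C (1) << SME_BIT)))
    return 0;

  /* SME only: cntd is valid only in streaming mode.  */
  return streaming ? cntd : 0;
}
```

Both feature tests read the same word, so the second adrp/ldrb pair goes away
and the SME check reduces to a single bit test on the value already loaded.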

> - but it's not clear
> why there is a separate global when we can use __aarch64_cpu_features.

Historical reasons: __aarch64_cpu_features was pushed upstream exactly one week
later than __aarch64_have_sme.

Alice

> 
> Cheers,
> Wilco
