Hi Alice,
+ENTRY (__arm_get_current_vg)
+	/* Check if SVE is available.  */
+	adrp	x16, __aarch64_cpu_features
+	ldr	x16, [x16, :lo12:__aarch64_cpu_features]
+	tbnz	x16, #30, L(end_cntd)
+
+	/* Check if SME is available.  */
+	adrp	x16, __aarch64_have_sme
+	ldrb	w16, [x16, :lo12:__aarch64_have_sme]
+	cbz	w16, L(end_zero)
+
+	/* Check if we're in streaming mode  */
+	.inst	0xd53b4250  /* mrs x16, svcr  */
+	tbz	x16, #0, L(end_zero)
+L(end_cntd):
+	.inst	0x04e0e3e0  /* cntd x0  */
+	ret
+L(end_zero):
+	mov	x0, 0
+	ret

So this uses two different globals to access HWCAPs, and they are
initialized in different ways at different times during startup. I.e. we
can get inconsistent results when one is initialized but the other is not.

For this patch I would test SME from __aarch64_cpu_features, but it's not
clear why there is a separate global when we can use
__aarch64_cpu_features.

Cheers,
Wilco
