On Wed, Jan 21, 2026 at 02:46:15PM +0000, Wilco Dijkstra wrote:
> Hi Alice,
>
>
> +ENTRY (__arm_get_current_vg)
> +       /* Check if SVE is available.  */
> +       adrp    x16, __aarch64_cpu_features
> +       ldr     x16, [x16, :lo12:__aarch64_cpu_features]
> +       tbnz    x16, #30, L(end_cntd)
> +
> +       /* Check if SME is available.  */
> +       adrp    x16, __aarch64_have_sme
> +       ldrb    w16, [x16, :lo12:__aarch64_have_sme]
> +       cbz     w16, L(end_zero)
> +
> +       /* Check if we're in streaming mode  */
> +       .inst   0xd53b4250  /* mrs     x16, svcr  */
> +       tbz     x16, #0, L(end_zero)
> +L(end_cntd):
> +       .inst   0x04e0e3e0  /* cntd    x0  */
> +       ret
> +L(end_zero):
> +       mov     x0, 0
> +       ret
>
> So this uses 2 different globals to access HWCAPs - and they are initialized
> in different ways at different times during startup. Ie. we can get inconsistent
> results when one is initialized but the other is not.

If these results can be inconsistent, then we might already have issues
elsewhere.

>
> For this patch I would test SME from __aarch64_cpu_features

Seems sensible - we can save two instructions by reusing the already-loaded
value (and it makes the code look more consistent).

> - but it's not clear
> why there is a separate global when we can use __aarch64_cpu_features.

Historical reasons: __aarch64_cpu_features was pushed upstream exactly one week
later than __aarch64_have_sme.

Alice

>
> Cheers,
> Wilco
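
For illustration only, the decision logic after the suggested change (testing
SME from the already-loaded __aarch64_cpu_features value instead of loading
__aarch64_have_sme) can be modelled in C roughly as below. This is a sketch,
not the patch itself: the function name and the MODEL_* macros are
hypothetical, bit 30 for SVE is taken from the quoted tbnz, and the SME bit
position is an assumption that would need to be checked against the real
feature enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this model only.  */
#define MODEL_SVE_BIT      30  /* From the quoted "tbnz x16, #30".  */
#define MODEL_SME_BIT      42  /* ASSUMPTION: verify against the feature enum.  */
#define MODEL_SVCR_SM_BIT   0  /* From the quoted "tbz x16, #0" on SVCR.  */

/* Returns true when the routine would reach the cntd path and false when
   it would return 0.  FEATURES stands in for __aarch64_cpu_features and
   SVCR for the value read by "mrs x16, svcr".  */
static bool
model_uses_cntd (uint64_t features, uint64_t svcr)
{
  /* SVE available: cntd can be used directly.  */
  if (features & (UINT64_C (1) << MODEL_SVE_BIT))
    return true;

  /* No SME either: return 0.  This test reuses the same loaded word,
     which is where the saved instructions come from.  */
  if (!(features & (UINT64_C (1) << MODEL_SME_BIT)))
    return false;

  /* SME only: cntd is usable only in streaming mode.  */
  return (svcr >> MODEL_SVCR_SM_BIT) & 1;
}
```

In the assembly this would correspond to replacing the adrp/ldrb/cbz sequence
with a single tbz on x16, assuming the SME bit sits in the same 64-bit word.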
