On 17/11/17 08:42, Andrew Pinski wrote:
> On Fri, Nov 17, 2017 at 12:21 AM, Alan Hayward <alan.hayw...@arm.com> wrote:
>>
>>> On 16 Nov 2017, at 19:32, Andrew Pinski <pins...@gmail.com> wrote:
>>>
>>> On Thu, Nov 16, 2017 at 4:35 AM, Alan Hayward <alan.hayw...@arm.com> wrote:
>>>> This final patch adds the clobber high expressions to tls_desc for aarch64.
>>>> It also adds three tests.
>>>>
>>>> In addition, I also tested by taking the GCC torture test suite and making
>>>> all global variables __thread. I then amended the suite to compile with
>>>> -fpic, save the .s file, and run only at one given -O level.
>>>> I ran this before and after the patch and compared the resulting .s files,
>>>> ensuring that there were no ASM changes.
>>>> I discarded the 10% of tests that failed to compile (due to the code in
>>>> the test now being invalid C).
>>>> I did this for -O0, -O2 and -O3 on both x86 and aarch64 and observed no
>>>> difference between the ASM files before and after the patch.
>>>
>>> Isn't the ABI defined as non-clobbering the lower 64 bits for normal
>>> function calls? Or is the TLS function "special" in that it
>>> saves/restores the 128-bit registers; is that documented anywhere? The
>>> main reason why I am asking is because glibc is not the only libc out
>>> there, and someone could have a slightly different ABI here.
>>>
>>
>> In NEON, all the SIMD registers are preserved around TLS calls - all
>> 128 bits of each register. That's standard ABI behaviour for NEON.
>>
>> SVE doesn't have any explicit preserving of its SIMD registers.
>>
>> However, the NEON and SVE registers share the same silicon - the lower
>> 128 bits of each SVE register are the same as the corresponding NEON
>> register. The side effect of this is that the lower 128 bits of the SVE
>> registers are getting backed up.
>>
>> Neither glibc nor any other libraries need updating to support this.
>> But compilers do need to be aware of this.
>
> I had a different question.
> I asked whether this specification of the TLS calls - that they must not
> clobber the lower 128 bits of the SIMD registers - is documented anywhere.
> As I was trying to say, I am in the middle of writing a libc and did not
> know of this requirement until I saw this thread.
>
nothing is clobbered, just like on x86 and arm:

http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt
http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt

there is no equivalent spec for aarch64 (yet), but the behaviour is consistent
with the other tlsdesc abis. you could argue that it's suboptimal that the
libc has to preserve everything, but it only affects the slow path of dynamic
tlsdesc, which does __tls_get_addr and thus may clobber call-clobbered
registers there.