On Wed, Jul 24, 2013 at 08:25:14AM -1000, Richard Henderson wrote:
> On 07/24/2013 05:23 AM, Richard Biener wrote:
> > "H.J. Lu" <hjl.to...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Here is a patch to extend x86-64 psABI to support AVX-512:
> >
> > Afaik avx 512 doubles the amount of xmm registers.  Can we get them callee
> > saved please?
>
> Having them callee saved pre-supposes that one knows the width of the
> register.

The whole architecture of SSE/AVX is based on zeroing-upper behaviour.
For reference, take a look at the definition of VLMAX in the Spec.

E.g. for AVX2 we had:

    vaddps %ymm1, %ymm2, %ymm3

Intuition says (at least to me) that after compilation this code
shouldn't have any idea of an `upper' 256-bit half. But with AVX-512
we have (again, see the Spec, operation section of vaddps, VEX.256
encoded):

    DEST[31:0] = SRC1[31:0] + SRC2[31:0]
    ...
    DEST[255:224] = SRC1[255:224] + SRC2[255:224].
    DEST[MAX_VL-1:256] = 0

So, legacy code *will* change the upper 256 bits of the vector register.

The roots of this can be found in the 64-bit GPR insns. We get
different behaviour on 64-bit and 32-bit targets for the following
sequence:

    push %eax
    ;; play with eax
    pop %eax

On a 64-bit machine the upper 32 bits of %rax will be zeroed, and if
we then try to use the old value of %rax: fail!

So, following this philosophy, we cannot make whole vector registers
callee-saved.

BUT. What if we make a couple of the new registers callee-saved in the
sense of a *scalar* type? So, what we can do:

  1. make only bits [0..XXX) of the vector register callee-saved.
  2. make bits [XXX..VLMAX) of the same register call-clobbered.

XXX is the number of bits to be callee-saved: 64, 80, 128 or even 512.

The advantage is that when we are doing scalar FP code, we don't have
to bother with saving/restoring the callee-saved part around a call:

    vaddss %xmm17, %xmm17, %xmm17
    call foo
    vaddss %xmm17, %xmm17, %xmm17

We don't care whether `foo':
  - is legacy in the AVX-512 sense: it simply never sees xmm17;
  - is legacy in the sense of a future ISA: if this code uses 1024-bit
    wide registers and `foo' is AVX-512, `foo' will still save XXX
    bits, allowing us to continue scalar calculations without
    saving/restoring.

--
Thanks, K
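
The two cases above can be sketched with a toy Python model that treats a vector register as a plain integer. This is only an illustration under stated assumptions, not ABI or compiler code: the helper names (`mask`, `vex_write`, `legacy_callee_save_restore`, `call_partial_callee_saved`) are hypothetical, MAX_VL is fixed at 512, and SAVED_BITS = 128 is an arbitrary pick for XXX.

```python
# Toy model: a vector register is a Python int; all names are hypothetical.
MAX_VL = 512      # modelled full register width, in bits
SAVED_BITS = 128  # the "XXX" of the proposal; an arbitrary choice here


def mask(bits):
    """Return an integer with the low `bits` bits set."""
    return (1 << bits) - 1


def vex_write(value, vl, max_vl=MAX_VL):
    """VEX-style write: DEST[vl-1:0] = value, DEST[max_vl-1:vl] = 0."""
    assert 0 < vl <= max_vl
    return value & mask(vl)


def legacy_callee_save_restore(reg, known_width=256):
    """A callee that knows only `known_width` bits spills and reloads reg."""
    spilled = reg & mask(known_width)       # store of the part it knows
    return vex_write(spilled, known_width)  # reload zeroes everything above


def call_partial_callee_saved(reg, clobber, saved_bits=SAVED_BITS):
    """Proposed scheme: bits [0, saved_bits) survive, the rest is clobbered."""
    low = reg & mask(saved_bits)                       # preserved by callee
    high = clobber & mask(MAX_VL) & ~mask(saved_bits)  # whatever callee left
    return high | low


# Whole-register callee-saved fails: the upper 256 bits do not survive.
full = mask(MAX_VL)
assert legacy_callee_save_restore(full) == mask(256)

# Scalar-sized callee-saved works: the low bits survive any clobbering.
scalar = 0x3F800000  # bit pattern of 1.0f in the low 32 bits
after = call_partial_callee_saved(scalar, clobber=mask(MAX_VL))
assert (after & mask(SAVED_BITS)) == scalar
```

The model only captures which bits survive a call. It says nothing about how a real callee would spill or reload the low part.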