Hi Steve,

> This patch checks for SIMD functions and saves the extra registers when
> needed.  It does not change the caller behaviour, so with just this patch
> there may be values saved by both the caller and callee.  This is not
> efficient, but it is correct code.

I tried a few simple test cases. It seems calls to non-vector functions
don't mark the callee-saves as needing to be saved/restored:

void g(void);

void __attribute__ ((aarch64_vector_pcs))
f1 (void)
{ 
  g();
  g();
}

f1:
        str     x30, [sp, -16]!
        bl      g
        ldr     x30, [sp], 16
        b       g

Here I would expect q8-q23 to be preserved and no tailcall to g(), since it
is not a vector function. This is important for correctness: f1 must
preserve q8-q23, but g() follows the base PCS, which only preserves the low
64 bits of v8-v15, so it may clobber them.
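
For contrast, here is a minimal sketch of a case where I would expect the
tailcall to remain valid: the callee is itself a vector PCS function, so it
preserves q8-q23 on behalf of its caller. The names g_vec, f3, VPCS and
calls are hypothetical and only for illustration; the macro guard is there
so the example also builds on non-AArch64 targets, where the attribute is
not available.

```c
/* Hypothetical contrast case, not part of the original test cases.  */
#if defined(__aarch64__)
#define VPCS __attribute__ ((aarch64_vector_pcs))
#else
#define VPCS /* attribute is AArch64-only; expands to nothing elsewhere */
#endif

static int calls; /* counts invocations so the behaviour can be checked */

/* A callee that uses the vector PCS, unlike g() above.  */
void VPCS
g_vec (void)
{
  calls++;
}

/* Both caller and callee use the vector PCS, so a sibling call for the
   second g_vec() should not break f3's obligation to preserve q8-q23.  */
void VPCS
f3 (void)
{
  g_vec ();
  g_vec ();
}
```

Whether the compiler actually emits a tailcall here depends on the
implementation; the point is only that it would be safe, unlike in f1.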


// compile with -O2 -ffixed-d1 -ffixed-d2 -ffixed-d3 -ffixed-d4 -ffixed-d5
// -ffixed-d6 -ffixed-d7
float __attribute__ ((aarch64_vector_pcs))
f2 (float *p)
{
  float t0 = p[1];
  float t1 = p[3];
  float t2 = p[5]; 
  return t0 - t1 * (t1 + t0) + (t2 * t0);
}

f2:
        stp     d16, d17, [sp, -48]!
        ldr     s17, [x0, 4]
        ldr     s18, [x0, 12]
        ldr     s0, [x0, 20]
        fadd    s16, s17, s18
        fmsub   s16, s16, s18, s17
        fmadd   s0, s17, s0, s16
        ldp     d16, d17, [sp], 48
        ret

This uses s16-s18 when it should prefer to use s24-s31 first. Also it needs
to save q16-q18, not only d16 and d17.

Btw, the -ffixed-d* options are useful to block the register allocator from
using certain registers.

Wilco
