https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88717

--- Comment #1 from 刘袋鼠 <crazylht at gmail dot com> ---
The pass_insert_vzeroupper pass uses mode switching to insert `vzeroupper`.

In the function entry and function body, 256-bit/512-bit registers are used,
so the mode is set to `AVX_U128_DIRTY`. But at function exit no
256-bit/512-bit register is returned, so `AVX_U128_CLEAN` is set.

Mode switching then hits `case AVX_U128_CLEAN`. Maybe we should handle this
in ix86_avx_u128_mode_exit.
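
The rule described above can be modeled in a few lines of C. This is only a
sketch for illustration, not GCC's code: the enum values borrow the mode names
used above, `exit_mode` and `needs_vzeroupper` are hypothetical helpers, and
it assumes `vzeroupper` is emitted only on a DIRTY-to-CLEAN switch.

```c
#include <stdbool.h>

/* Upper-128-bit state of the AVX registers, named after the modes above. */
enum avx_u128_state { AVX_U128_CLEAN, AVX_U128_DIRTY };

/* Hypothetical stand-in for the exit-mode decision: the exit is DIRTY only
   when a 256-bit/512-bit register carries the return value.  */
static enum avx_u128_state
exit_mode (bool returns_wide_vector)
{
  return returns_wide_vector ? AVX_U128_DIRTY : AVX_U128_CLEAN;
}

/* Assumption: vzeroupper is emitted only when switching DIRTY -> CLEAN.  */
static bool
needs_vzeroupper (enum avx_u128_state from, enum avx_u128_state to)
{
  return from == AVX_U128_DIRTY && to == AVX_U128_CLEAN;
}
```

Under this model, `needs_vzeroupper (AVX_U128_DIRTY, exit_mode (true))` is
false, so a function that returns a 512-bit value gets no `vzeroupper` at
exit.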

A simple case shows that `vzeroupper` disappears when a 512-bit register is
returned:

```
test.i:
----------------------------
typedef float __v16sf __attribute__ ((__vector_size__ (64)));
typedef float __m512 __attribute__ ((__vector_size__ (64), __may_alias__));

__m512
foo (float *p, __m512 x)
{
  *p = ((__v16sf)x)[0];
  return x;
}


test.s
----------------------------------
foo:
.LFB0:
        .cfi_startproc
        vmovss  %xmm0, (%rdi)
        ret
        .cfi_endproc
.LFE0:
```
