https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89582
--- Comment #6 from Yichao Yu <yyc1992 at gmail dot com> ---
For the vfloat test case, isn't the optimum code just
```
addps %xmm2, %xmm0
addps %xmm3, %xmm1
retq
```
It's not making full use of the vector but I assume not having to spill is a
win? This is what clang produces.
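For reference, a minimal C sketch of the kind of function being discussed. The type `vfloat4` and the function `add4` are hypothetical names used only for illustration, and the actual vfloat test case in this report may be written differently. Under the x86-64 System V calling convention, a 16-byte struct of four floats is split into two SSE eightbytes, so `x` arrives in `%xmm0`/`%xmm1`, `y` in `%xmm2`/`%xmm3`, and the result is returned in `%xmm0`/`%xmm1`, which is why two `addps` and a `retq` would be enough.

```c
/* Hypothetical stand-in for the vfloat test case: four floats passed
 * and returned by value. */
typedef struct { float a[4]; } vfloat4;

/* Element-wise sum. Each argument occupies two SSE registers, with two
 * floats in the low 64 bits of each register. */
vfloat4 add4(vfloat4 x, vfloat4 y)
{
    vfloat4 r;
    for (int i = 0; i < 4; i++)
        r.a[i] = x.a[i] + y.a[i];
    return r;
}
```

Compiling something like this with `-O2` and comparing the assembly from GCC and clang should show whether the arguments get spilled to the stack.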
And for the LLVM early lowering of the calling convention, a less awkward way
is:
```
define { <2 x float>, <2 x float> } @f2({ <2 x float>, <2 x float> }, { <2 x float>, <2 x float> }) {
  %v0 = extractvalue { <2 x float>, <2 x float> } %0, 0
  %v1 = extractvalue { <2 x float>, <2 x float> } %0, 1
  %v2 = extractvalue { <2 x float>, <2 x float> } %1, 0
  %v3 = extractvalue { <2 x float>, <2 x float> } %1, 1
  %v5 = fadd <2 x float> %v0, %v2
  %v6 = fadd <2 x float> %v1, %v3
  %v7 = insertvalue { <2 x float>, <2 x float> } undef, <2 x float> %v5, 0
  %v8 = insertvalue { <2 x float>, <2 x float> } %v7, <2 x float> %v6, 1
  ret { <2 x float>, <2 x float> } %v8
}
```