https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88705
Devin Hussey <husseydevin at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---
--- Comment #3 from Devin Hussey <husseydevin at gmail dot com> ---
Well, it is still not as efficient as it should be.
This would be the code that only uses VFP:
fmul:
        vadd.f32 s0, s0, s4
        vadd.f32 s1, s1, s5
        vadd.f32 s2, s2, s6
        vadd.f32 s3, s3, s7
        bx lr
dmul:
        vadd.f64 d0, d0, d2
        vadd.f64 d1, d1, d3
        bx lr
There is no need to keep swapping in and out of NEON registers.
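
For reference, here is a minimal sketch of the kind of source that the listing above would correspond to. It is an assumption, not the testcase from the report: the typedef names (f32x4, f64x2) and function names (fadd4, dadd2) are hypothetical, and the bodies use addition only because the listing uses vadd even though its labels say "mul". Under the hard-float AAPCS, the first 16-byte vector argument arrives in q0 (s0-s3, or d0-d1) and the second in q1 (s4-s7, or d2-d3), which is consistent with the registers in the listing.

```c
/* Hypothetical reconstruction using GCC's generic vector extensions.
 * All names here are assumptions made for illustration. */
typedef float  f32x4 __attribute__((vector_size(16)));  /* 4 x float  */
typedef double f64x2 __attribute__((vector_size(16)));  /* 2 x double */

/* Element-wise add of four floats; the VFP-only form would be
 * four scalar vadd.f32 instructions on s0-s3 and s4-s7. */
f32x4 fadd4(f32x4 a, f32x4 b)
{
    return a + b;
}

/* Element-wise add of two doubles; the VFP-only form would be
 * two scalar vadd.f64 instructions on d0-d1 and d2-d3. */
f64x2 dadd2(f64x2 a, f64x2 b)
{
    return a + b;
}
```

Something like `arm-none-eabi-gcc -O2 -mfloat-abi=hard -mfpu=neon -S` should show what the compiler emits for these; the exact target triplet and flags used in the original report are not stated in this comment.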