https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026
--- Comment #2 from d_vampile <d_vampile at 163 dot com> ---
(In reply to Jakub Jelinek from comment #1)
> Note, any benchmarking for speed with -O rather than -O2/-O3 is
> intentionally missing various optimizations which can greatly improve
> performance.

-O0 does miss a lot of optimizations. However, in the case I reported, the code used GPRs before the modification and uses FP registers after it. When vectorization is not applicable, using the X0 register is faster than using the D0 register. Would it be appropriate to change this behavior?