https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263
--- Comment #6 from Jim Wilson <wilson at gcc dot gnu.org> ---
Looking at some other targets: ARM has movcc but not 128-bit long double. AArch64 has movcc and 128-bit long double, but it also has 128-bit load/store, so this is only 4 instructions. mips64, powerpc64, and sparc64 have movcc and 128-bit long double, but emit the memcpy inline as 8 instructions. riscv64, meanwhile, wants the libcall with -Os, as that is 4 instructions instead of 8. For rv32 the inline copy would be 16 instructions. I'm not sure offhand whether the other targets support 32-bit code with 128-bit long double.

Anyway, I tracked the use of BLOCK_OP_NO_LIBCALL in emit_move_complex back to bugzilla 15289, fixed by a patch from Richard Henderson on Dec 1, 2004. I think it is just an oversight that -Os wasn't considered there. I think the correct fix is to force BLOCK_OP_NO_LIBCALL only when optimizing for speed. With this change, I get the 8-instruction inline sequence with -O2 and the 4-instruction libcall sequence with -Os, which is what the RISC-V backend wants, and this lets the testcase work.
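To make the proposed policy concrete, here is a minimal standalone sketch in C. It is not the GCC patch: the enum and both functions are hypothetical stand-ins that borrow only the two flag names mentioned above, and the instruction counts are the riscv64 figures quoted in this comment, hard-coded as an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two block-move methods discussed above.
   These are NOT GCC's real declarations, only a model of the choice. */
enum block_op_method {
    BLOCK_OP_NORMAL,      /* expander is allowed to fall back to a libcall */
    BLOCK_OP_NO_LIBCALL   /* expander must emit the copy inline */
};

/* Proposed policy: forbid the libcall only when optimizing for speed. */
enum block_op_method complex_move_method(bool optimizing_for_speed)
{
    return optimizing_for_speed ? BLOCK_OP_NO_LIBCALL : BLOCK_OP_NORMAL;
}

/* Simplified riscv64 model using the counts quoted in the comment:
   8 instructions for the inline copy, 4 for the libcall sequence.
   Assumes the backend takes the libcall whenever it is permitted,
   which under the policy above only happens when optimizing for size. */
int rv64_sequence_length(enum block_op_method method)
{
    return method == BLOCK_OP_NO_LIBCALL ? 8 : 4;
}
```

Under this model, calling complex_move_method(true) corresponds to -O2 and selects the inline copy, while complex_move_method(false) corresponds to -Os and leaves the libcall available to the backend.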