With some recent optimizations, -O1/-O2/-O3 can achieve almost the same
performance/size with plain stack loads/stores.  As a result, lwm/swm
now save/restore fewer callee-saved registers.  In fact, only $16 is
saved with swm.

To make sure that this optimization still exists, let's add 2 more
function calls, so that lwm/swm become much more profitable.
If we add only one more call, -O1 will still use stack loads/stores.

gcc/testsuite
	* gcc.target/mips/umips-save-restore-1.c: Be sure lwm/swm are
	used for more callee-saved registers by adding 2 more function
	calls.
---
 gcc/testsuite/gcc.target/mips/umips-save-restore-1.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c b/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
index ff1ea4b339a..0e2c4dcc844 100644
--- a/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
+++ b/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
@@ -7,12 +7,14 @@ int bar (int, int, int, int, int);
 MICROMIPS int
 foo (int n, int a, int b, int c, int d)
 {
-  int i, j;
+  int i, j, k, l;
 
   i = bar (n, a, b, c, d);
   j = bar (n, a, b, c, d);
-  return i + j;
+  k = bar (n, a, b, c, d);
+  l = bar (n, a, b, c, d);
+  return i + j + k + l;
 }
 
-/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(0|1),\\\$31" } } */
-/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(0|1),\\\$31" } } */
+/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(2|3),\\\$31" } } */
+/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(2|3),\\\$31" } } */
-- 
2.39.3 (Apple Git-146)