https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101995
Bug ID: 101995
Summary: regression built-in memset missed-optimization arm -Os
Product: gcc
Version: 10.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dumoulin.thibaut at gmail dot com
Target Milestone: ---
For cortex-m4 at -Os, GCC 10 generates larger code than GCC 7 when memset is
called.
The following C example triggers the regression:
```C
#include <stdio.h>
#include <string.h>
struct foo_t {
    int a;
    int b;
    int c;
    int d;
};

/* Random function modifying foo with another value than 0 */
void doStuff(struct foo_t *foo) {
    foo->b = foo->a + foo->c;
}

void twoLinesFunction(struct foo_t *foo) {
    /* R0 is saved in GCC10 but not in GCC7 */
    memset(foo, 0x00, sizeof(struct foo_t));
    doStuff(foo);
}

int main(void) {
    struct foo_t foo;
    twoLinesFunction(&foo);
    return 0;
}
```
compile command: `gcc -Os -mcpu=cortex-m4`
GCC7.3.1 produces:
```asm
<twoLinesFunction>:
    push    {r3, lr}
    movs    r2, #16
    movs    r1, #0
    bl      8168 <memset>
    ldmia.w sp!, {r3, lr}
    b.w     8104 <doStuff>
```
While GCC10.3.0 produces:
```asm
<twoLinesFunction>:
    push    {r4, lr}
    movs    r2, #16
    mov     r4, r0          @ back up r0
    movs    r1, #0
    bl      8174 <memset>
    mov     r0, r4          @ restore r0
    ldmia.w sp!, {r4, lr}
    b.w     810c <doStuff>
```
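The code generated for main is the same with both versions.
memset returns its first argument, so on return R0 still holds the pointer to
foo and there is no need to save it before the call and restore it afterwards.
GCC 7 takes advantage of this and is more efficient.
GCC 10 should not back up R0 for this builtin function in this case: the two
extra mov instructions make the code larger and slower.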
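For reference, here is a minimal sketch of the same sequence written so that the return value of memset is passed on explicitly. It relies only on the C standard guarantee that memset returns its destination pointer. `twoLinesFunctionChained` is a hypothetical name that is not part of the original test case, and I have not checked whether GCC 10 avoids the extra moves for this variant.

```C
#include <string.h>

struct foo_t {
    int a;
    int b;
    int c;
    int d;
};

/* Same helper as in the test case above. */
void doStuff(struct foo_t *foo) {
    foo->b = foo->a + foo->c;
}

/* Hypothetical variant of twoLinesFunction: memset returns its first
 * argument, so its result can be handed straight to doStuff instead of
 * reusing the saved parameter. */
void twoLinesFunctionChained(struct foo_t *foo) {
    doStuff(memset(foo, 0x00, sizeof(struct foo_t)));
}
```

Both forms are equivalent at the source level, so ideally the compiler would emit the GCC 7 sequence for either of them.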
PR https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61241 describes the same
behavior and includes a patch implementing the optimization, but I'm not sure
when the optimization was lost.