https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93334
Bug ID: 93334
Summary: -O3 generates useless code checking for overlapping memset?
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: nathanael.schaeffer at gmail dot com
Target Milestone: ---
It seems that zeroing out two arrays in the same loop results in poor
code being generated at -O3.
If I understand correctly, the generated code checks at run time whether the
arrays overlap. If they do, the code falls back to simple loops
instead of calls to memset.
I wonder why overlapping memset is an issue?
Is this some behaviour inherited from dealing with memcpy?
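To make the behaviour described above concrete, here is a hand-written C sketch of what such a versioned loop looks like for the two-array case. It is only an approximation of the idea, not GCC's actual output: the names ranges_overlap and zero_two_versioned are hypothetical, and the memset path assumes that an all-zero-bytes double equals 0.0 (true for IEEE-754).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [a, a+n) and [b, b+n) share any element.
   Addresses are compared as integers to avoid comparing unrelated pointers. */
static int ranges_overlap(const double *a, const double *b, long n)
{
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)n * sizeof(double);
    return ua < ub + bytes && ub < ua + bytes;
}

/* Hand-written approximation of the versioned loop; NOT GCC's real output. */
void zero_two_versioned(long l, double *mem, long ofs2)
{
    if (l <= 0)
        return;
    if (!ranges_overlap(mem, mem + ofs2, l)) {
        /* Fast path: the two regions are disjoint, so use memset.
           Assumes all-bits-zero represents 0.0 (IEEE-754). */
        memset(mem, 0, (size_t)l * sizeof(double));
        memset(mem + ofs2, 0, (size_t)l * sizeof(double));
    } else {
        /* Fallback: the original fused loop. */
        for (long k = 0; k < l; k++) {
            mem[k] = 0.0;
            mem[ofs2 + k] = 0.0;
        }
    }
}
```

Both branches leave the same zeros in memory, which is why the run-time check looks redundant here.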
When 4 arrays are zeroed together, about 40 instructions are generated to
check for mutual overlap. This does not seem to be necessary.
Other compilers (clang, icc) don't do that.
See the issue here, with the generated assembly:
https://godbolt.org/z/SSWVhm
And I copy the code below for reference too:
void test_simple_code(long l, double* mem, long ofs2) {
    for (long k=0; k<l; k++) {
        mem[k] = 0.0;
        mem[ofs2 +k] = 0.0;
    }
}

void test_crazy_code(long l, double* mem, long ofs2, long ofs3, long ofs4) {
    for (long k=0; k<l; k++) {
        mem[k] = 0.0;
        mem[ofs2 +k] = 0.0;
        mem[ofs3 +k] = 0.0;
        mem[ofs4 +k] = 0.0;
    }
}

void test_ok_code(long l, double* mem, long ofs2, long ofs3, long ofs4) {
    for (long k=0; k<l; k++)
        mem[k] = 0.0;
    for (long k=0; k<l; k++)
        mem[ofs2 +k] = 0.0;
    for (long k=0; k<l; k++)
        mem[ofs3 +k] = 0.0;
    for (long k=0; k<l; k++)
        mem[ofs4 +k] = 0.0;
}
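
A small check, sketched here as an illustration, compares the fused loop with two unconditional memset calls on overlapping and non-overlapping layouts. The names zero_fused, zero_memset and same_result are hypothetical, the 64-element buffer is an arbitrary bound chosen for the demo, and the byte-wise comparison assumes all-bits-zero is 0.0 (IEEE-754).

```c
#include <stddef.h>
#include <string.h>

/* Fused loop, same shape as test_simple_code above. */
static void zero_fused(long l, double *mem, long ofs2)
{
    for (long k = 0; k < l; k++) {
        mem[k] = 0.0;
        mem[ofs2 + k] = 0.0;
    }
}

/* Two memset calls with no overlap check at all. */
static void zero_memset(long l, double *mem, long ofs2)
{
    if (l <= 0)
        return;
    memset(mem, 0, (size_t)l * sizeof(double));
    memset(mem + ofs2, 0, (size_t)l * sizeof(double));
}

/* Returns 1 if both variants leave identical buffers for this layout.
   Layouts that do not fit the demo buffer are rejected with 0. */
int same_result(long l, long ofs2)
{
    enum { N = 64 };            /* arbitrary demo buffer size */
    double a[N], b[N];

    if (l < 0 || ofs2 < 0 || ofs2 + l > N)
        return 0;
    for (int i = 0; i < N; i++) {
        a[i] = 1.0;             /* sentinel value to detect stray writes */
        b[i] = 1.0;
    }
    zero_fused(l, a, ofs2);
    zero_memset(l, b, ofs2);
    return memcmp(a, b, sizeof a) == 0;
}
```

For example, same_result(8, 32) covers disjoint regions and same_result(8, 3) covers overlapping ones. Since every store writes the same value, the order of the stores should not affect the final contents.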