https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825
Bug ID: 115825
Summary: Loop unrolling increases code size with -Os
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---
Created attachment 58606
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58606&action=edit
C99 test case
In the following code, the loop in for16() is completely unrolled even though
-Os is specified, which increases code size considerably. Reduced test case:

char volatile v;

// Will be unrolled to 24 words.
void for16 (void)
{
    for (char i = 16; i > 0; i -= 2)
        v = i;
}

// Reference for good code, consumes 5 words.
void for18 (void)
{
    for (char i = 18; i > 0; i -= 2)
        v = i;
}

$ avr-gcc unroll-Os.c -S -Os -dp

The loop in for16 is unrolled and consumes 24 words of code (8 * 3 words):
for16:
    ldi r24,lo8(16) ; 25 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 26 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(14) ; 27 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 28 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(12) ; 29 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 30 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(10) ; 31 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 32 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(8) ; 33 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 34 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(6) ; 35 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 36 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(4) ; 37 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 38 [c=4 l=2] movqi_insn/2
    ldi r24,lo8(2) ; 39 [c=4 l=1] movqi_insn/1
    sts v,r24 ; 40 [c=4 l=2] movqi_insn/2
/* epilogue start */
The similar loop in for18 is not unrolled and consumes only 5 words:
for18:
    ldi r24,lo8(18) ; 25 [c=4 l=1] movqi_insn/1
.L3:
    sts v,r24 ; 21 [c=4 l=2] movqi_insn/2
    subi r24, 2 ; 31 [c=4 l=1] *add.for.eqne.qi/0
    brne .L3 ; 32 [c=4 l=1] branch
/* epilogue start */
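The size difference between the two forms can be estimated with a rough model. The sketch below is only an illustration, assuming the per-instruction lengths given by the l= annotations in the listings above (LDI, SUBI and BRNE 1 word each, STS 2 words). The function names are hypothetical and are not part of the attached test case.

```c
/* Hypothetical helper: number of stores performed by
       for (char i = start; i > 0; i -= 2) v = i;
   Modelled with int, which is enough for the small start values used here.  */
int iterations (int start)
{
    int n = 0;
    for (int i = start; i > 0; i -= 2)
        n++;
    return n;
}

/* Fully unrolled form: one LDI (l=1) plus one STS (l=2) per iteration.  */
int unrolled_words (int start)
{
    return iterations (start) * (1 + 2);
}

/* Rolled form as in the for18 listing: LDI + STS + SUBI + BRNE.
   The size does not depend on the trip count.  */
int rolled_words (void)
{
    return 1 + 2 + 1 + 1;
}
```

Under this model the unrolled form grows by 3 words per iteration, while the rolled form stays at 5 words regardless of the trip count, so unrolling only pays off in size for a loop with a single iteration.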
COLLECT_GCC=avr-gcc
Target: avr
Configured with: ../../source/gcc-14/configure --target=avr --disable-nls
--enable-languages=c,c++ --with-gnu-as --with-gnu-ld --disable-shared
--disable-libssp
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.1.1 20240509 (GCC)
The same happens with master (future v15).