https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825
Bug ID: 115825
Summary: Loop unrolling increases code size with -Os
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---

Created attachment 58606
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58606&action=edit
C99 test case

In the following code, the loop in for16() is completely unrolled even though -Os is specified, which considerably increases code size.

Reduced test case:

    char volatile v;

    // Will be unrolled to 24 words.
    void for16 (void)
    {
        for (char i = 16; i > 0; i -= 2)
            v = i;
    }

    // Reference for good code, consumes 5 words.
    void for18 (void)
    {
        for (char i = 18; i > 0; i -= 2)
            v = i;
    }

$ avr-gcc unroll-Os.c -S -Os -dp

The loop in for16 is unrolled and consumes 24 words of code size (8 iterations * 3 words each, per the l= lengths in the listing):

    for16:
        ldi r24,lo8(16)  ;  25 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  26 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(14)  ;  27 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  28 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(12)  ;  29 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  30 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(10)  ;  31 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  32 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(8)   ;  33 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  34 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(6)   ;  35 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  36 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(4)   ;  37 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  38 [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(2)   ;  39 [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  40 [c=4 l=2]  movqi_insn/2
    /* epilogue start */

The loop in the similar function for18 consumes only 5 words:

    for18:
        ldi r24,lo8(18)  ;  25 [c=4 l=1]  movqi_insn/1
    .L3:
        sts v,r24        ;  21 [c=4 l=2]  movqi_insn/2
        subi r24, 2      ;  31 [c=4 l=1]  *add.for.eqne.qi/0
        brne .L3         ;  32 [c=4 l=1]  branch
    /* epilogue start */

COLLECT_GCC=avr-gcc
Target: avr
Configured with: ../../source/gcc-14/configure --target=avr --disable-nls --enable-languages=c,c++ --with-gnu-as --with-gnu-ld --disable-shared --disable-libssp
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.1.1 20240509 (GCC)

The same happens with master (future v15).