[Bug tree-optimization/115825] New: Loop unrolling increases code size with -Os

gjl at gcc dot gnu.org via Gcc-bugs Mon, 08 Jul 2024 08:10:50 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825


            Bug ID: 115825
           Summary: Loop unrolling increases code size with -Os
           Product: gcc
           Version: 14.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58606
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58606&action=edit
C99 test case

In the following code, the loop in for16() is completely unrolled even though
-Os is in effect, which noticeably increases code size.  Reduced test case:

char volatile v;

// Will be unrolled to 24 words.
void for16 (void)
{
  for (char i = 16; i > 0; i -= 2)
    v = i;
}

// Reference for good code, consumes 5 words.
void for18 (void)
{
  for (char i = 18; i > 0; i -= 2)
    v = i;
}

$ avr-gcc unroll-Os.c -S -Os -dp

The loop in for16 is unrolled and consumes 24 words of code size (8 * 3 words):

for16:
        ldi r24,lo8(16)  ;  25  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  26  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(14)  ;  27  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  28  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(12)  ;  29  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  30  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(10)  ;  31  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  32  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(8)   ;  33  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  34  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(6)   ;  35  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  36  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(4)   ;  37  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  38  [c=4 l=2]  movqi_insn/2
        ldi r24,lo8(2)   ;  39  [c=4 l=1]  movqi_insn/1
        sts v,r24        ;  40  [c=4 l=2]  movqi_insn/2
/* epilogue start */

The loop in the similar function for18 is kept as a loop and consumes only 5 words:

for18:
        ldi r24,lo8(18)  ;  25  [c=4 l=1]  movqi_insn/1
.L3:
        sts v,r24        ;  21  [c=4 l=2]  movqi_insn/2
        subi r24, 2      ;  31  [c=4 l=1]  *add.for.eqne.qi/0
        brne .L3         ;  32  [c=4 l=1]  branch
/* epilogue start */
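
The size difference can be checked by adding up the l= fields in the two
listings above. The following is a minimal sketch of that arithmetic, not
part of the original report: the helper names (trip_count, unrolled_words,
rolled_words) are hypothetical, and the per-instruction word counts are
copied from the l= annotations in the -dp output.

```c
/* Instruction lengths in words, taken from the l= fields in the listings. */
enum { LDI_WORDS = 1, STS_WORDS = 2, SUBI_WORDS = 1, BRNE_WORDS = 1 };

/* Iterations of: for (char i = start; i > 0; i -= step).
   Assumes step > 0, otherwise the loop would not terminate. */
int trip_count(int start, int step)
{
    int trips = 0;
    for (int i = start; i > 0; i -= step)
        trips++;
    return trips;
}

/* Fully unrolled form: one ldi/sts pair per iteration. */
int unrolled_words(int trips)
{
    return trips * (LDI_WORDS + STS_WORDS);
}

/* Rolled form as in for18: ldi once, then sts/subi/brne as the loop body. */
int rolled_words(void)
{
    return LDI_WORDS + STS_WORDS + SUBI_WORDS + BRNE_WORDS;
}
```

With these counts, for16 runs 8 iterations and its unrolled form takes
8 * 3 = 24 words, while the rolled form used for for18 takes 5 words
regardless of the trip count.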

COLLECT_GCC=avr-gcc
Target: avr
Configured with: ../../source/gcc-14/configure --target=avr --disable-nls
--enable-languages=c,c++ --with-gnu-as --with-gnu-ld --disable-shared
--disable-libssp
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.1.1 20240509 (GCC) 

The same happens with master (future v15).
