https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120562

            Bug ID: 120562
           Summary: powerpc: Complex and uncomplete loop unroll
           Product: gcc
           Version: 15.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christophe.leroy at csgroup dot eu
  Target Milestone: ---

The function below leads to unnecessarily complex yet incomplete loop unrolling
(built with -O2 -m32):

void f(unsigned int *p, unsigned int val)
{
        int i;

        for (i = 0; i < 10; i++)
                p[i] = val;
}

GCC 11 through GCC 15 all generate the following:


toto.o:     file format elf32-powerpc


Disassembly of section .text:

00000000 <f>:
   0:   38 63 ff fc     addi    r3,r3,-4
   4:   39 20 00 0a     li      r9,10
   8:   35 29 ff fb     addic.  r9,r9,-5
   c:   90 83 00 04     stw     r4,4(r3)
  10:   90 83 00 08     stw     r4,8(r3)
  14:   90 83 00 0c     stw     r4,12(r3)
  18:   90 83 00 10     stw     r4,16(r3)
  1c:   94 83 00 14     stwu    r4,20(r3)
  20:   4d 82 00 20     beqlr
  24:   35 29 ff fb     addic.  r9,r9,-5
  28:   90 83 00 04     stw     r4,4(r3)
  2c:   90 83 00 08     stw     r4,8(r3)
  30:   90 83 00 0c     stw     r4,12(r3)
  34:   90 83 00 10     stw     r4,16(r3)
  38:   94 83 00 14     stwu    r4,20(r3)
  3c:   40 82 ff cc     bne     8 <f+0x8>
  40:   4e 80 00 20     blr

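Since the iteration count is the constant 10, GCC should know that the early
exit in the middle (the beqlr) is never taken and that the backward branch at
the end (the bne) never loops: r9 goes 10 -> 5 -> 0, so the first addic. never
yields zero and the second always does.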
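
To make the argument concrete, here is a small C model of the control flow in the disassembly above. It is only a sketch: the struct and function names are made up for illustration, and the model tracks nothing but the r9 counter and the number of stores.

```c
/* Hypothetical model of the generated code's control flow (not GCC output).
 * It counts stores and records whether either branch is ever taken. */
struct trace {
    int stores;      /* number of stw/stwu executed */
    int early_exit;  /* 1 if the beqlr at 0x20 was taken */
    int back_edges;  /* number of times the bne at 0x3c was taken */
};

static struct trace model_generated_loop(void)
{
    struct trace t = {0, 0, 0};
    int ctr = 10;               /* li r9,10 */

    for (;;) {
        ctr -= 5;               /* addic. r9,r9,-5 at 0x08 */
        t.stores += 5;          /* four stw plus one stwu */
        if (ctr == 0) {         /* beqlr at 0x20 */
            t.early_exit = 1;
            return t;
        }
        ctr -= 5;               /* addic. r9,r9,-5 at 0x24 */
        t.stores += 5;          /* four stw plus one stwu */
        if (ctr != 0) {         /* bne at 0x3c */
            t.back_edges++;
            continue;
        }
        return t;               /* blr at 0x40 */
    }
}
```

With the initial count of 10, the model performs 10 stores, never takes the early exit and never takes the back edge, so both conditional branches are dead code.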

So the function should simply have been:

  stw     r4,0(r3)
  stw     r4,4(r3)
  stw     r4,8(r3)
  stw     r4,12(r3)
  stw     r4,16(r3)
  stw     r4,20(r3)
  stw     r4,24(r3)
  stw     r4,28(r3)
  stw     r4,32(r3)
  stw     r4,36(r3)
  blr
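
For reference, the expected code corresponds to the hand-unrolled C below. This is only a sketch to show the intended equivalence; f_loop, f_unrolled and same_result are names invented here, and nothing is claimed about what GCC emits for the unrolled variant.

```c
#include <string.h>

enum { N = 10 };

/* The loop from the report, unchanged apart from the name. */
static void f_loop(unsigned int *p, unsigned int val)
{
    int i;

    for (i = 0; i < N; i++)
        p[i] = val;
}

/* Hand-unrolled equivalent: ten plain stores, no counter, no branches. */
static void f_unrolled(unsigned int *p, unsigned int val)
{
    p[0] = val;
    p[1] = val;
    p[2] = val;
    p[3] = val;
    p[4] = val;
    p[5] = val;
    p[6] = val;
    p[7] = val;
    p[8] = val;
    p[9] = val;
}

/* Returns 1 when both variants leave identical buffers behind.
 * One extra guard element checks that neither writes past index 9. */
static int same_result(unsigned int val)
{
    unsigned int a[N + 1], b[N + 1];

    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    f_loop(a, val);
    f_unrolled(b, val);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling same_result() with any value should return 1, since both functions store val to p[0] through p[9] and touch nothing else.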