https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89332
            Bug ID: 89332
           Summary: Missed detection of dead stores to array in a loop
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

#define ARR_MAX 6

__attribute__((const)) int f(int);

int foo()
{
  int arr[ARR_MAX];

  for (int i = 0; i < ARR_MAX; i++)
    arr[i] = f(i);

  return arr[0];
}

With -O3, gcc generates a call to f(i) and a store to arr[i] on every
iteration, while clang detects that the stores to arr are dead (except for
arr[0]), removes the loop, and emits a tail-call to f(0).

aarch64-linux-gnu-gcc -O3:

foo:
.LFB0:
        .cfi_startproc
        stp     x29, x30, [sp, -64]!
        .cfi_def_cfa_offset 64
        .cfi_offset 29, -64
        .cfi_offset 30, -56
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        .cfi_offset 19, -48
        .cfi_offset 20, -40
        add     x20, sp, 40
        mov     w19, 0
        .p2align 3,,7
.L2:
        mov     w0, w19
        bl      f
        str     w0, [x20], 4
        add     w19, w19, 1
        cmp     w19, 6
        bne     .L2
        ldr     w0, [sp, 40]
        ldp     x19, x20, [sp, 16]
        ldp     x29, x30, [sp], 64
        .cfi_restore 30
        .cfi_restore 29
        .cfi_restore 19
        .cfi_restore 20
        .cfi_def_cfa_offset 0
        ret

clang -O3 --target=aarch64-linux-gnu:

foo:                                    // @foo
// %bb.0:
        mov     w0, wzr
        b       f

It seems clang takes advantage of loop unrolling for the above case, while
gcc doesn't seem to. After increasing ARR_MAX from 6 to 512, clang generates
the same or similar code as gcc.

I doubt, though, whether such code is written in practice or can result from
abstraction lowering. It was just a contrived test-case I made up.

Thanks,
Prathamesh
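
For reference, the transformation described above can be sketched in C. This is only an illustration of the reasoning (full unrolling, then removal of the dead stores to arr[1..5], then removal of the now-unused const calls), not a description of what either compiler does internally. The body of f below is a hypothetical stand-in, since the report only declares f; the names foo_loop, foo_reduced and check_forms_agree are likewise made up for this sketch.

```c
#include <assert.h>

#define ARR_MAX 6

/* Hypothetical stand-in for f: the report only declares it. Any function
   whose result depends solely on its argument is compatible with the
   const attribute. */
__attribute__((const)) int f(int x)
{
    return x * x + 1;
}

/* The loop as written in the report. */
int foo_loop(void)
{
    int arr[ARR_MAX];

    for (int i = 0; i < ARR_MAX; i++)
        arr[i] = f(i);

    return arr[0];
}

/* What is left once the loop is fully unrolled, the stores to arr[1..5]
   are dropped as dead, and the calls f(1)..f(5), whose results are no
   longer used, are dropped as well (allowed because f is const). */
int foo_reduced(void)
{
    return f(0);
}

/* Sanity check that both forms agree for the stand-in f. */
void check_forms_agree(void)
{
    assert(foo_loop() == foo_reduced());
}
```

Only the element that is read back, arr[0], keeps its store alive; the const attribute is what lets the remaining calls be deleted rather than merely having their stores removed.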