https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-12-16
                 CC|                            |rguenth at gcc dot gnu.org
          Component|c                           |tree-optimization
            Version|unknown                     |7.0
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  We have several passes doing the necessary analyses but none with
the goal of eliminating this case.  The closest match is probably the bswap
pass which has related missed optimization bugs for bswap via memory.  The
store-merging pass OTOH has the "sink" analysis part -- identifying adjacent
stores.  Together the passes' analyses could handle this case (and bswap via
memory).

Another case that would probably benefit from moving load/store analysis and
dataflow to some common code.

Note that SLP vectorization on 32bit with SSE disabled would handle the case
in this bug (another pass with some of the required analysis).  It's not done
at the moment because of a bug in alignment analysis (thought I fixed that
...).  Really fixing that yields

fct:
.LFB0:
        .cfi_startproc
        movl    v, %eax
        movl    %eax, u
        ret
        .cfi_endproc
.LFE0:
        .size   fct, .-fct
        .p2align 4,,15
        .globl  fct2
        .type   fct2, @function
fct2:
.LFB1:
        .cfi_startproc
        movl    v, %eax
        movl    %eax, u
        ret

with -O3 -m32 -mno-sse (yeah, neither the BB vectorizer nor the backend
is very clever in the "vector" sizes it tries/allows).
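
For readers without the original testcase at hand, the following is a minimal sketch of the kind of source that could lead to the single movl pair shown above. It is an illustration only and not the testcase attached to this bug: the type struct pair16 and the functions copy_fields and copy_whole are hypothetical names, and the two-halfword layout is an assumption. Only the globals u and v are taken from the symbol names in the assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit aggregate; the layout is an assumption,
   not the structure used in the bug report. */
struct pair16 {
    uint16_t lo;
    uint16_t hi;
};

/* Named after the symbols that appear in the assembly output. */
struct pair16 u, v;

/* Field-by-field copy: two adjacent 16-bit loads followed by two
   adjacent 16-bit stores.  Merging them would give one 32-bit load
   from v and one 32-bit store to u. */
void copy_fields(void)
{
    u.lo = v.lo;
    u.hi = v.hi;
}

/* Whole-object copy: the result the merged form has to reproduce. */
void copy_whole(void)
{
    memcpy(&u, &v, sizeof u);
}
```

Both functions leave u with the same contents, so a compiler that can prove the two stores are adjacent and together cover the whole object is free to emit the same code for both. Whether a particular GCC version does so depends on the passes discussed above.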