http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57830
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The folding only folds the memcpy into
  MEM[(char * {ref-all})lp] = MEM[(char * {ref-all})&l];
which is certainly desirable, as it improves optimizations, and at any point
it can be expanded back to memcpy if that is desirable.

It is early SRA that turns that
  l[1] = _12;
  _14 = strlen (q_8);
  l[2] = _14;
  _16 = strlen (r_10);
  ...
  MEM[(char * {ref-all})lp_32(D)] = MEM[(char * {ref-all})&l];
into the scalar stores and in the end actually increases the register pressure
when you don't have enough call clobbered registers.

Note the testcase is highly artificial, and predicting whether the SRA is a win
or not is very hard. If you look at how it is optimized with the strlen pass
actually run (-Os -foptimize-strlen; for -O2 it is enabled by default), you'll
see that it is actually a win: instead of storing constants into memory and
then memcpying them afterwards, you store the constants into memory at the end
(with 4 exceptions, I think).
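For readers without the attachment, the shape of code discussed above can be
sketched roughly as follows. This is NOT the testcase from the bug: the
function name fill_lengths, its parameters, and the array size are hypothetical
and chosen only to mirror the l[...] = strlen (...) stores followed by a
fixed-size memcpy to lp seen in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduction; names and sizes are illustrative only and are
   not taken from the testcase attached to the bug. */
void fill_lengths(size_t *lp, const char *p, const char *q, const char *r)
{
    size_t l[4];

    assert(lp != NULL);

    l[0] = 0;            /* constant store into the local aggregate */
    l[1] = strlen(p);    /* values computed across calls */
    l[2] = strlen(q);
    l[3] = strlen(r);

    /* Fixed-size copy of the whole local array. GCC may fold this into a
       single aggregate assignment of the form MEM[lp] = MEM[&l], which
       early SRA may then split into per-element scalar stores. */
    memcpy(lp, l, sizeof l);
}
```

Whether the folding and the SRA split actually happen for this sketch depends
on the GCC version, target, and options. Compiling with something like
-Os -foptimize-strlen -fdump-tree-all and reading the tree dumps is one way to
check.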