http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57830
Bug ID: 57830
Summary: fold_builtin_memory_op expands memcpy without regard to -Os
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: amylaar at gcc dot gnu.org

I see that the memcpy call at the end of gcc.dg/strlenopt-10.c:fn2.c is expanded for the avr target (which has "#define BIGGEST_ALIGNMENT 8", i.e. the "dest_align < TYPE_ALIGN (desttype)" test at builtins.c:8923 succeeds) irrespective of -Os or the size of the copied object.

So this generates 20 loads, 20 stores, ancillary address arithmetic, and sky-high register pressure with 18 call-saved registers saved in the prologue and restored in the epilogue. Just leaving the call to memcpy alone would generate shorter code.
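
For readers who want to look at the kind of code involved, here is a minimal sketch of a fixed-size memcpy. It is not the code from strlenopt-10.c: the struct name `blob`, the function name `copy_blob`, and the 20-byte size (picked to match the 20 loads and 20 stores mentioned above) are assumptions made for illustration, and whether this exact function goes through the same folding path on avr has not been verified.

```c
#include <string.h>

/* Hypothetical 20-byte object; the size is an assumption chosen to
   match the 20 loads/stores described in the report. */
struct blob {
    char bytes[20];
};

/* A memcpy with a compile-time-constant size.  A compiler is free to
   either keep the library call or expand the copy inline; the report
   is about that choice being made without regard to -Os. */
void copy_blob(struct blob *dst, const struct blob *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Compiling such a file with something like `avr-gcc -Os -S` and inspecting the assembly would show whether the call to memcpy survives or is replaced by inline loads and stores.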