https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86073
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-06-07
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Probably arm misses the setup to handle this via can_store_by_pieces or
set_storage_via_setmem.  What arm subarchitecture are you creating code for?

On x86 with -m32 the 'size' loop isn't recognized as memset at the GIMPLE
level (it isn't a memset after all) and we expand the 'pro' loop as memset
call but not the 'rem' loop.  Note that pro seems to be MAX(size, <st> & 3)
so it isn't really <= 3.

Thus, x86 is working fine but arm isn't.
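
To illustrate the remark that pro is not bounded by 3, here is a minimal sketch of the expression as it is read in the comment above. The helper name prologue_count and the parameters st (the store address) and size are hypothetical and are not taken from the reporter's testcase; the sketch only evaluates MAX(size, st & 3).

```c
#include <stddef.h>
#include <stdint.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper, not from the testcase: the prologue byte count
   as described in the comment, MAX(size, st & 3).  */
static size_t prologue_count(uintptr_t st, size_t size)
{
    /* st & 3 is the misalignment of the address within a 4-byte word,
       so it lies in 0..3; size is unbounded.  */
    return MAX(size, (size_t)(st & 3));
}
```

Because the result is the larger of the two operands, any size above 3 makes the count exceed 3, so a compiler cannot assume the 'pro' loop runs at most three times.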