https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98335
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So I think we want to improve that
+      /* If more than a word remains, then make sure to keep the
+         starting point at least word aligned.  */
+      if (last_live - first_live > UNITS_PER_WORD)
+        *trim_head &= (UNITS_PER_WORD - 1);
Note, last_live is the start of the last live byte (so last_live + 1 is the
end of that).
For the small sizes, I'd say we should consider both alignment and exact
head/tail trim values.  A whole word store is definitely more efficient than
a 7 byte store at offset 1, ditto for head trim 2 and 3; storing just the
second half is ok.
So shall we e.g. call by_pieces_ninsns for the store before and after the
expected trimming and only trim if it doesn't increase the number of by-pieces
store insns?  It could also iterate on those.
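
To make the "only trim if it doesn't increase the number of stores" idea concrete, here is a rough sketch in plain C. It does not use GCC's by_pieces_ninsns; toy_store_count is a hypothetical stand-in that counts naturally aligned power-of-two stores, WORD is an assumed 8 byte word standing in for UNITS_PER_WORD, and toy_choose_head_trim is an invented helper that iterates over smaller head trims until one is found that does not cost more stores than the untrimmed store.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed word size in bytes; a stand-in for UNITS_PER_WORD.  */
#define WORD 8

/* Toy cost model (NOT GCC's by_pieces_ninsns): count the naturally
   aligned power-of-two stores needed to cover [START, START + LEN).  */
static unsigned
toy_store_count (size_t start, size_t len)
{
  unsigned n = 0;
  while (len > 0)
    {
      size_t size = WORD;
      /* Shrink until the store is aligned at START and fits in LEN.  */
      while (size > 1 && (start % size != 0 || size > len))
        size /= 2;
      start += size;
      len -= size;
      n++;
    }
  return n;
}

/* Hypothetical helper: return the largest head trim <= WANTED that does
   not need more stores than the untrimmed store, or 0 if none exists.  */
static size_t
toy_choose_head_trim (size_t start, size_t len, size_t wanted)
{
  /* The caller must leave at least one live byte.  */
  assert (wanted < len);
  unsigned base = toy_store_count (start, len);
  for (size_t trim = wanted; trim > 0; trim--)
    if (toy_store_count (start + trim, len - trim) <= base)
      return trim;
  return 0;
}
```

Under this toy model, an 8 byte store at offset 0 with a wanted head trim of 1, 2 or 3 is left untrimmed, while a wanted trim of 4 is accepted (only the second half is stored) and a wanted trim of 5 falls back to 4.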