https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016
>
> --- Comment #8 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> (In reply to rguent...@suse.de from comment #7)
> > On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote:
> >
> > > Because of the scalar code doing DI mode loads, and the misalignment being
> > > HImode, I don't think the alignment can be reached by peeling.
> > >
> > > i.e. I don't think you can reach target alignment, and unless I'm missing
> > > something, get_misalign_in_elems does not have a failure mode for when
> > > target alignment can't be reached?
> >
> > get_misalign_in_elems of course returns garbage when the access isn't
> > element aligned - we're not supposed to compute it in this case.
>
> Indeed, so it looks like we need an additional check in the pre-header that
> checks if the address's misalignment can be corrected to target alignment.

No, we should instead not attempt to peel such a DR - IIRC we check
alignment_reachable for this, so I'm not sure why it doesn't work.

> Probably can extend the check for if you have enough elements to peel to
> prolog_loop_niters?
>
> A smaller example:
>
> --
>
> #include <stddef.h>
> #include <stdint.h>
> #include <stdlib.h>
>
> __attribute__((noipa))
> char *foo(char *buf, size_t len)
> {
>   for (; len > sizeof(uintptr_t)
>        ; buf += sizeof(uintptr_t), len -= sizeof(uintptr_t))
>     {
>       uintptr_t chunk = *(const uintptr_t *)buf;
>
>       if ((chunk & 0x8080808080808080) != 0x8080808080808080)
>         break;
>     }
>   return buf;
> }
>
> int main ()
> {
>   char p[] = {
>     0x2f, 0x2a, 0xa, 0x20, 0x20, 0x20, 0x62, 0x75, 0x67, 0x33, 0x33, 0x37,
>     0x39, 0x37, 0x32, 0x33, 0x2e, 0x63, 0xa, 0x2a, 0x2f, 0xa, 0xa, 0x23,
>     0x69, 0x6e, 0x63,
>   };
>
>   if (foo (&p[2], sizeof(p)-2))
>     return 0;
>
>   return 1;
> }
>
> Alex do you want this one or shall I take it?
>
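
For illustration only, the "can peeling reach target alignment" question above reduces to modular arithmetic. The sketch below is not GCC code: `peel_iters_to_align` is a hypothetical helper, and the 16-byte target alignment used in the comments is an assumption (the thread does not state it). It shows that an access that is 2 bytes off (HImode misalignment) and advances 8 bytes per scalar iteration (DImode loads) never lands on an aligned address, however many iterations are peeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.
   MISALIGN:     byte offset of the first access past a TARGET_ALIGN boundary.
   STEP:         bytes the address advances per scalar iteration.
   TARGET_ALIGN: alignment the vector loop would like (assumed, e.g. 16).
   Returns the number of scalar iterations to peel so the next access is
   aligned, or -1 if no peel count can achieve that.  */
static long
peel_iters_to_align (size_t misalign, size_t step, size_t target_align)
{
  assert (step > 0 && target_align > 0);

  /* The address modulo TARGET_ALIGN is periodic in the iteration count
     with period at most TARGET_ALIGN, so trying that many is enough.  */
  for (size_t n = 0; n < target_align; n++)
    if ((misalign + n * step) % target_align == 0)
      return (long) n;

  return -1;
}
```

With these assumptions, `peel_iters_to_align (2, 8, 16)` returns -1 because the offsets modulo 16 only alternate between 2 and 10, whereas an element-aligned start such as `peel_iters_to_align (8, 8, 16)` returns 1. That matches the point made in the thread: when the access is not element aligned, there is no meaningful peel count to compute in the first place.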