https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tnfchris at gcc dot gnu.org

--- Comment #23 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> > The 'pb' pointer is the 'cur' pointer but moved back by 'delta'.
> > Presumably that means that all memory between 'pb' and 'delta' and could be
> > read in as wide a load as possible?
>
> A C language lawyer would agree with that. But does it really help?
> The loop also accesses [cur + len, cur + len_limit].

Could we not emit a runtime check for this? Check if len <= delta and
len + len_limit <= delta, and if so emit a vectorized version, and if not fall
back to the scalar unrolled code?

Though if the iteration counts are as small as Wilco suggests, then maybe it's
really not worth doing so.
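
As a rough illustration of the runtime check suggested above, here is a hand-written sketch of what such loop versioning could look like at the source level. It is not what GCC emits and not the code from the bug report. The function names (match_len, match_len_wide, match_len_scalar) and the 8-byte chunk size are made up for the example, and the loop shape is an assumption based on the names used in this thread. The sketch also assumes, as a contract on the caller, that both pb and cur are readable for every index below len_limit. The guard by itself does not establish that for the cur side, which is the objection raised in the quoted text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar fallback: compare one byte at a time, stopping at the first
   mismatch or at len_limit.  Returns the index where it stopped. */
static size_t match_len_scalar(const uint8_t *pb, const uint8_t *cur,
                               size_t len, size_t len_limit)
{
    while (len < len_limit && pb[len] == cur[len])
        ++len;
    return len;
}

/* "Wide" version: compare 8 bytes per step.  It never reads at or beyond
   len_limit, so it relies only on the assumed contract that both ranges
   are readable below len_limit.  A real vectorizer would use wider loads. */
static size_t match_len_wide(const uint8_t *pb, const uint8_t *cur,
                             size_t len, size_t len_limit)
{
    while (len < len_limit && len_limit - len >= 8) {
        uint64_t a, b;
        memcpy(&a, pb + len, sizeof a);   /* unaligned-safe 8-byte load */
        memcpy(&b, cur + len, sizeof b);
        if (a != b)
            break;                        /* mismatch is inside this chunk */
        len += 8;
    }
    /* Locate the exact mismatch, or finish the remaining tail bytes. */
    return match_len_scalar(pb, cur, len, len_limit);
}

/* Versioned entry point: pb is cur moved back by delta.  The guard is the
   condition as written in the comment above; whether it is the right
   condition for the real loop is not settled in this thread. */
size_t match_len(const uint8_t *cur, size_t delta,
                 size_t len, size_t len_limit)
{
    const uint8_t *pb = cur - delta;

    if (len <= delta && len_limit <= delta - len)   /* len + len_limit <= delta */
        return match_len_wide(pb, cur, len, len_limit);

    return match_len_scalar(pb, cur, len, len_limit);
}
```

Both paths return the same result for the same input; the guard only selects which one runs. With trip counts of only a few bytes, the guard plus the scalar tail can cost more than the wide path saves, which matches the caveat about small iteration counts.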