https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
--- Comment #21 from ktkachov at gcc dot gnu.org ---
So the actual hot loop in xz_r does:

typedef unsigned char __uint8_t;
typedef unsigned int __uint32_t;
typedef unsigned long long __uint64_t;

int
foo (const __uint64_t len_limit, const __uint8_t *cur,
     __uint32_t delta, int len)
{
  const __uint8_t *pb = cur - delta;

  while (++len != len_limit)
    {
      if (pb[len] != cur[len])
        break;
    }
  return len;
}

The 'pb' pointer is the 'cur' pointer but moved back by 'delta'. Presumably that means that all memory between 'pb' and 'cur' could be read in as wide a load as possible?
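To make the "wide load" idea concrete, here is a rough hand-written sketch of what such a transformation might look like. It is not what GCC emits and not the xz source. The names foo_bytes and foo_wide are made up for illustration, and the sketch assumes that every byte at indices below len_limit is readable through both 'pb' and 'cur', which is exactly the property the compiler cannot prove from the loop alone.

```c
#include <stdint.h>
#include <string.h>

/* Reference: the byte-at-a-time loop from the comment, unchanged in behaviour. */
static int foo_bytes(uint64_t len_limit, const uint8_t *cur,
                     uint32_t delta, int len)
{
    const uint8_t *pb = cur - delta;
    while (++len != len_limit) {
        if (pb[len] != cur[len])
            break;
    }
    return len;
}

/* Hypothetical wide variant: compares 8 bytes per iteration.
   ASSUMPTION: all indices in [len + 1, len_limit) are readable via both
   pointers, so reading past the first mismatch (but below len_limit) is safe. */
static int foo_wide(uint64_t len_limit, const uint8_t *cur,
                    uint32_t delta, int len)
{
    const uint8_t *pb = cur - delta;
    uint64_t i = (uint64_t)len + 1;   /* first index the original loop inspects */

    /* Wide phase: stop at the first 8-byte chunk that differs. */
    while (i + 8 <= len_limit) {
        uint64_t a, b;
        memcpy(&a, pb + i, sizeof a);   /* unaligned-safe 8-byte load */
        memcpy(&b, cur + i, sizeof b);
        if (a != b)
            break;                      /* mismatch is somewhere in this chunk */
        i += 8;
    }

    /* Byte phase: locate the exact mismatch, or finish the tail. */
    while (i != len_limit && pb[i] == cur[i])
        i++;

    return (int)i;
}
```

Leaving the mismatch position to the byte loop keeps the sketch independent of endianness; a tuned version would more likely XOR the two words and use a count-trailing-zeros or count-leading-zeros instruction instead.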