https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121393
--- Comment #2 from Andrew Stubbs <ams at gcc dot gnu.org> ---
Here's preprocessed code:
#pragma omp for collapse(3)
for (v1 = 0; v1 < 20; v1 += 2)
for (v2 = 0x7fffffffffffffffLL + 11ULL;
v2 != 0x7fffffffffffffffLL - 4ULL; -- v2)
for (v3 = 10; v3 != 0; v3--)
b[v1 >> 1][v2 - 0x7fffffffffffffffLL + 3][v3 - 1] += 5.5;
But the "original" dump has this:
#pragma omp for collapse(3)
for (v1 = 0; v1 < 20; v1 = v1 + 2)
for (v2 = 9223372036854775818; v2 != 9223372036854775803; --v2)
for (v3 = 10; v3 > 0; v3-- )
{
b[v1 >> 1][v2 + 9223372036854775812][v3 + 4294967295] = b[v1 >> 1][v2 + 9223372036854775812][v3 + 4294967295] + 5.5e+0
}
So the -1 already appears as +4294967295 right out of the front end (which is
in the x86_64 compiler, BTW). That is harmless as long as the index arithmetic
stays 32-bit, where the addition wraps, but it does not work when the offset
gets zero-extended to 64-bit in the AMDGCN back end.
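
To illustrate the arithmetic, here is a minimal sketch. It assumes v3 is a 32-bit unsigned type, which the 4294967295 constant suggests but the excerpt above does not show. The function names are hypothetical and are not taken from GCC or from the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Index computed entirely in 32-bit unsigned arithmetic, as written in the
   "original" dump: v3 + 4294967295 wraps modulo 2^32 and equals v3 - 1. */
uint32_t index_32(uint32_t v3)
{
    return v3 + UINT32_C(4294967295);
}

/* Same constant, but both operands are zero-extended to 64 bits before the
   add.  Nothing wraps any more, so the result is off by 2^32. */
uint64_t index_zext(uint32_t v3)
{
    return (uint64_t)v3 + (uint64_t)UINT32_C(4294967295);
}

/* Same bit pattern treated as a signed 32-bit value and sign-extended.
   The unsigned-to-signed conversion is implementation-defined in C; GCC
   defines it as reduction modulo 2^32, which yields -1 here. */
int64_t index_sext(uint32_t v3)
{
    return (int64_t)v3 + (int64_t)(int32_t)UINT32_C(4294967295);
}
```

For v3 = 10, index_32 and index_sext both give 9, while index_zext gives 4294967305, which is far outside the innermost dimension of b.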