https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
First FAIL minimizes to:
...
typedef __uint128_t T;

union u {
  T t;
  struct { unsigned long long x; unsigned long long y; } xy;
};

#define PRINT(VAR) \
  do \
    { \
      __builtin_printf (#VAR ": lo: %llx\n", VAR.xy.x); \
      __builtin_printf (#VAR ": hi: %llx\n", VAR.xy.y); \
    } \
  while (0)

extern T __udivmodti4 (T, T, T *);

int
main (void)
{
  union u a, b, mod, div;
  a.t = -4;
  b.t = 1;
  PRINT (a);
  PRINT (b);
  div.t = __udivmodti4 (a.t, b.t, &mod.t);
  PRINT (div);
  PRINT (mod);
  if (mod.t != 0)
    __builtin_abort ();
  return 0;
}
...

Fails like this:
...
$ ./install/bin/nvptx-none-run ./pr97459-1.exe
a: lo: fffffffffffffffc
a: hi: ffffffffffffffff
b: lo: 1
b: hi: 0
div: lo: fffffffffffffffd
div: hi: ffffffffffffffff
mod: lo: ffffffffffff
mod: hi: 0
nvptx-run: error getting kernel result: unspecified launch failure
(CUDA_ERROR_LAUNCH_FAILED, 719)
$
...

With -O0 JIT instead:
...
$ ./install/bin/nvptx-none-run -O0 ./pr97459-1.exe
a: lo: fffffffffffffffc
a: hi: ffffffffffffffff
b: lo: 1
b: hi: 0
div: lo: fffffffffffffffc
div: hi: ffffffffffffffff
mod: lo: 0
mod: hi: 0
...
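
For reference, the expected result of the division above can be cross-checked on a host compiler. The following is only a sketch, not part of the original test case: it assumes a GCC or Clang host with __uint128_t support, and the helper names (ref_udivmod, lo64, hi64, divmod_ok, check_reproducer_case) are made up for illustration. It computes quotient and remainder with the native / and % operators and checks the usual invariant a == q * b + r with r < b, which the -O0 JIT output satisfies and the default JIT output does not.

```c
#include <assert.h>
#include <stdint.h>

/* Same 128-bit type as in the reproducer (GCC/Clang extension).  */
typedef __uint128_t T;

/* Hypothetical reference helper: quotient and remainder via the
   native operators instead of calling __udivmodti4 directly.  */
static T
ref_udivmod (T a, T b, T *rem)
{
  *rem = a % b;
  return a / b;
}

/* Low and high 64-bit halves, matching the lo/hi lines printed above.  */
static uint64_t lo64 (T v) { return (uint64_t) v; }
static uint64_t hi64 (T v) { return (uint64_t) (v >> 64); }

/* Nonzero iff (q, r) is a valid quotient/remainder pair for a / b.
   Testing r < b first rules out the bogus remainder seen in the
   failing run before the multiplication is looked at.  */
static int
divmod_ok (T a, T b, T q, T r)
{
  return r < b && q * b + r == a;
}

/* Self-check for the case from the reproducer: -4 / 1.  */
static void
check_reproducer_case (void)
{
  T a = (T) -4, b = 1, rem;
  T quot = ref_udivmod (a, b, &rem);

  assert (divmod_ok (a, b, quot, rem));
  assert (lo64 (quot) == 0xfffffffffffffffcULL);
  assert (hi64 (quot) == 0xffffffffffffffffULL);
  assert (rem == 0);
}
```

Calling check_reproducer_case () from a host-side main should pass silently. Feeding the values from the failing run (div lo fffffffffffffffd, mod lo ffffffffffff) into divmod_ok makes it return 0, since the remainder is not smaller than the divisor.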