https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84041
Bug ID: 84041
Summary: [nvptx] Hang in for-3.c
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---

The following testcase, minimized from for-3.c, hangs for nvptx (but passes
with GOMP_NVPTX_JIT=-O[01]):
...
extern void abort ();

#define CAT2(A, B) A ## B
#define CAT(A, B) CAT2(A, B)
#define ULL(a) CAT (a, ULL)

#define n 10

#pragma omp declare target
int a[n];

// HANGS:
#define c 0x7ffffffffffffff6ULL
//PASSES:
//#define c 0x7ffffffffffffff5ULL

__attribute__((noinline, noclone)) void
test_dpf_runtime (void)
{
  unsigned long long i;

#pragma omp distribute parallel for schedule(dynamic,1)
  for (i = c + ULL (n); i > c; i -= ULL (1))
    ;
}
#pragma omp end declare target

int
main (void)
{
#pragma omp target teams
  {
    test_dpf_runtime ();
  }

  return 0;
}
...

When setting c to 0x7ffffffffffffff5ULL instead, the test-case passes.

In fact, code generation is significantly different between the two cases,
because fd->iter_type is long int in the passing case, and long long in the
hanging case. This causes a difference in region->is_combined_parallel, and so
we take a different path through expand_omp_for_generic.

As I understand it, region->is_combined_parallel triggers an optimization,
which can be switched off by always returning false in
workshare_safe_to_combine_p. If I do that, I can reproduce the same hang with
c == 0x7ffffffffffffff5ULL, or with c == 0.
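
The two values of c differ by one, and that step happens to be where the loop's start value c + n stops fitting in a signed 64-bit integer. The sketch below only checks that arithmetic. It is an illustration, not code from GCC: the helper name start_fits_signed and the macro N_ITER are hypothetical, and the connection to the choice of fd->iter_type is an inference from the description above, not something the report confirms.

```c
#include <limits.h>

/* Hypothetical stand-in for the testcase's "n" (10 iterations).  */
#define N_ITER 10ULL

/* Return 1 if the loop's start value c + N_ITER is still representable
   as a signed long long, 0 otherwise.  The addition is done in unsigned
   arithmetic, so it is well defined for the values used here.  */
static int
start_fits_signed (unsigned long long c)
{
  return c + N_ITER <= (unsigned long long) LLONG_MAX;
}
```

With c == 0x7ffffffffffffff5ULL the start value is 0x7fffffffffffffff, which equals LLONG_MAX on targets with a 64-bit long long, so the helper returns 1. With c == 0x7ffffffffffffff6ULL the start value is 0x8000000000000000, one past LLONG_MAX, so it returns 0.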