https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84041

            Bug ID: 84041
           Summary: [nvptx] Hang in for-3.c
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

The following testcase, minimized from for-3.c, hangs for nvptx (but passes with
GOMP_NVPTX_JIT=-O[01]):
...
extern void abort ();

#define CAT2(A, B) A ## B
#define CAT(A, B) CAT2(A, B)
#define ULL(a) CAT (a, ULL)

#define n 10
#pragma omp declare target
int a[n];

// HANGS:
#define c 0x7ffffffffffffff6ULL

//PASSES:
//#define c 0x7ffffffffffffff5ULL

__attribute__((noinline, noclone)) void
test_dpf_runtime (void)
{
  unsigned long long i;
#pragma omp distribute parallel for schedule(dynamic,1)
  for (i = c + ULL (n);
       i > c;
       i -= ULL (1))
    ;
}
#pragma omp end declare target

int
main (void)
{
#pragma omp target teams
  {
    test_dpf_runtime ();
  }

  return 0;
}
...

When setting c to 0x7ffffffffffffff5ULL instead, the test-case passes.

In fact, code generation is significantly different, because fd->iter_type is
long int in the passing case, and long long in the hanging case. This causes a
difference in region->is_combined_parallel, and so we take a different path
through expand_omp_for_generic.
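
The two values of c differ only in whether the loop's starting value,
c + n, still fits in a signed 64-bit long. That is presumably what decides
fd->iter_type, though the report does not say so. A minimal sketch of the
arithmetic, assuming an LP64 host; LONG_MAX_LP64 and start_fits_in_long are
hypothetical names for illustration, not GCC functions:

```c
#include <stdbool.h>

/* Assumption: LP64 host, so the largest value of 'long' is 2^63 - 1.  */
#define LONG_MAX_LP64 0x7fffffffffffffffULL

/* Hypothetical helper, not part of GCC: return true if the loop's
   starting value c + n is representable in a signed 64-bit long.
   The sum is computed in unsigned long long, so it would wrap for
   c close to ULLONG_MAX; the values in this report do not wrap.  */
static bool
start_fits_in_long (unsigned long long c, unsigned long long n)
{
  return c + n <= LONG_MAX_LP64;
}
```

With n == 10, c == 0x7ffffffffffffff5ULL gives a start of 0x7fffffffffffffff,
which fits, while c == 0x7ffffffffffffff6ULL gives 0x8000000000000000, which
does not.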

As I understand it, region->is_combined_parallel triggers an
optimization, which can be switched off by always returning false in
workshare_safe_to_combine_p.

If I do that, I can reproduce the same hang with c == 0x7ffffffffffffff5ULL, or
with c == 0.
