https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, the general OpenMP restrictions say roughly that if there is any overflow
during the computation of the number of loops, then the behavior is
unspecified.
So e.g. a loop
#pragma omp for
  for (int i = -100; i < INT_MAX - 50; i++)
is UB, because the number of iterations overflows in int.
Though, in your case it is just the chunk_size computation that is done later
and  nothing talks about that being undefined.
In the library for static,chunk_size the code handling it e.g. for long
iterators (or whatever fits into it) is:
      unsigned long n, s0, e0, i, c;
      long s, e;

      /* Otherwise, each thread gets exactly chunk_size iterations
         (if available) each time through the loop.  */

      s = ws->incr + (ws->incr > 0 ? -1 : 1);
      n = (ws->end - ws->next + s) / ws->incr;
      i = thr->ts.team_id;
      c = ws->chunk_size;

      /* Initial guess is a C sized chunk positioned nthreads iterations
         in, offset by our thread number.  */
      s0 = (thr->ts.static_trip * nthreads + i) * c;
      e0 = s0 + c;

      /* Detect overflow.  */
      if (s0 >= n)
        return 1;
      if (e0 > n)
        e0 = n;

      /* Transform these to the actual start and end numbers.  */
      s = (long)s0 * ws->incr + ws->next;
      e = (long)e0 * ws->incr + ws->next;

      *pstart = s;
      *pend = e;

      if (e0 == n)
        thr->ts.static_trip = -1;
      else
        thr->ts.static_trip++;
      return 0;

Any overflow in the s and n's computation are user's fault.
thr->ts.static_trip says how many times the current thread got a chunk assigned
before (with -1 means no more work for the current thread), so for e.g. very
large values of chunk_size it will be at most 1 for some threads and thread
numbers need to fit into int, so we can also assume there won't be more than
INT_MAX threads.
I guess for * c and + c we should use __builtin_mul_overflow and
__builtin_add_overflow.
And of course another thing is the inline expansion in omp-expand.c which could
do the same thing.
  • [Bug c/102461] Ne... michelemartone at users dot sourceforge.net via Gcc-bugs
    • [Bug c/10246... michelemartone at users dot sourceforge.net via Gcc-bugs
    • [Bug c/10246... michelemartone at users dot sourceforge.net via Gcc-bugs
    • [Bug c/10246... michelemartone at users dot sourceforge.net via Gcc-bugs
    • [Bug middle-... jakub at gcc dot gnu.org via Gcc-bugs

Reply via email to