[clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293)

Michael Kruse via cfe-commits Mon, 28 Jul 2025 05:39:14 -0700

Meinersbur wrote:

> Maybe there is no need to recursively represent all the nested transformation?


I don't see a need to represent the entire loop structure. The only info that I 
think is potentially needed is the nest depth for transformations that generate 
a loop nest (e.g. tile), and for transformations that generate loop sequences 
it is the number of loops in that sequence (e.g. `split` for 
index-set-splitting, and by construction also every construct with a 
`looprange` clause).

The only use case for 2130117f92e51df73ac8c4b7e37f7f89178a89f2 was to identify 
whether -- while being in a non-fully instantiated template context -- there 
are additional generated loops or not. Within a dependend context 
`getTransformedStmt()` may return null because e.g. arguments of the 
`looprange()` clause might be template parameters. 
```c++
      Stmt *TransformedStmt = Dir->getTransformedStmt();
      if (!TransformedStmt) {
        unsigned NumGeneratedLoops = Dir->getNumGeneratedLoops();
        if (NumGeneratedLoops == 0) {
          // May happen if the loop transformation does not result in a
          // generated loop (such as full unrolling).
          break;
        }
        if (NumGeneratedLoops > 0) {
          // The loop transformation construct has generated loops, but these
          // may not have been generated yet due to being in a dependent
          // context.
          return true;
        }
      }
```

Outside of a template context `getTransformedStmt()` returns the fully expanded 
loops which contain the nesting information that can be analyzed by recursing 
into it, so no separate generated loop info should be necessary. Eventually the 
template will be used in a instantiated context (or it's completly unused) 
where it can be analyzed. Until then, all analysis can be defererred (which ` 
return true;` does). 

Since it is typically easily obtainable (e.g. by the number of items in a 
`sizes` clause for tile), the number of generated loops can be used for 
additional semantic checking even in a templated (/dependent) context. This is 
what the `break` above does: Since we know there are no further loops even in 
an instantiated context, we do not need to defer semantic checks until later, 
and analyze the loop nesting we already have.

I imagine those numbers could be used for better diagnostic output in general, 
e.g. when using `collapse(2)` on a `#pragma omp unroll partial`, hinting the 
user that the transformation does not produce a sufficient number of loops, 
instead of pointing out that the unrolled loop body is not a valid canonical 
loop[^1]. However, I think tracking the entire loop nest for better diagnostic 
messages is overblown, and may not even be accurate. For instance:
```c
for (int i = 1; i < 20; i+=1)
  #pragma omp tile sizes(i)
  for (int j = 1; j < n; j+=1)
```
Just recording the loop nesting of this may result in `gls = [ 3 ]`. Ovever, 
the tile size depends on the outer loop which the OpenMP restrictions on 
canonical loop nests do not allow. One may track the dependencies through all 
the clauses and loop bounds, but I think it is sufficient to let 
`checkOpenMPLoop` diagnose all those restrictions once `getTransformedStmt()` 
is guarenteed to be available.

[^1]: In the OpenMP spec the unrolled body is never a valid canonical loop, 
even with `partial(1)`, nor a loop sequence, a concept that was added after 
unroll. I am not sure what the use case would be for considering the unrolled 
loop body to be a canonical loop sequence.

https://github.com/llvm/llvm-project/pull/139293
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293)

Reply via email to