https://github.com/Meinersbur commented:

Please remove "[LoopTiling]" tag from the title, it is not a component of Clang.

I would expect more testing in transform/tile, for instance:
1. Multi-dimensional tiling
2. collapse that only affects some, but not all of the intratile-loops of a 
multi-dimensional tiling
3. loops on user-defined iterators (like iterfor.cpp, but also collapsing some 
of the intratile-loops)
4. A check whether vectorization takes place on a non-affected innermost 
intra-tile loop is taking place
5. Composition of multiple loop transformation (tile and others)
6. ...

There is no explanation of the overall design of this change which makes it 
very hard to review. The PR summary as well as a big comments at an central 
location need to explain it, where other comments that mention "omp-tile 
body-guard marker" can point to. Someone who encounters it while reading 
non-OpenMP-related code will be puzzled by what it is about. For instance, it 
should sketch how the AST structure looks like and what the semantics of 
OMPInvariantPredicateBoundAttr are.

I tried to understand it anyway and here is what I got:
 1. If all loops are known-multiples of the tile size, no predicate is emitted 
(Review comment: could we do it per-loop?) 
 2. The intratile ForStmt condition is emitted without partial tile check. That 
is, without additional code, it will overshoot the iteration space
 3. The body is surrounded with an IfStmt with an AttributedStmt of an 
OMPInvariantPredicateBoundAttr checking for the iteration space. 

That is, without additional code in CGStmt that catches the AttributedStmt, a 
predicate is emitted. I would have done it the other way around: Make the 
default AST emit the vectorizable code (so no additional handling in CGStmt is 
needed; AttributedStmt can just be ignored, no change in output), and make 
`checkOpenMPLoop` recognize the AttributedStmt: If present, it knows it can 
convert the non-canonical ForStmt condition into a canonical one by remembering 
it still has to emit a predicate around the body in TransformedStmt.
The main benefit is that this is more localized to the handling of OpenMP loops 
than dragging its complexity into the general-purpose CodeGen. Downside might 
be that the predicate needs to be remembered through the composition of 
multiple loop transformations (altough I think this approach might have the 
same problem; tests for it are missing).
 

https://github.com/llvm/llvm-project/pull/191114
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to