jtb20 wrote:

> > I don't think synthesizing a critical region inside the collapsed loop will
> > be a win overall, and also I don't quite see how it helps in this case. A
> > correct version would be doing something like serialising the loop, I
> > think, which I suppose sort of bypasses the problem, but doesn't really
> > solve it. (When would we do such a transformation? How much analysis do we
> > need to avoid killing performance in non-contrived cases?)
>
> It should be done in the frontend. And here we should care about correctness,
> not performance. Sure, this can be optimized for better performance later in
> OpenMPOpt pass, but from the frontend we should get it working correctly.
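For context, here is a minimal sketch of the kind of imperfectly nested loop under discussion. It is a hypothetical example, not the test case from the PR: the names `fill_nested`, `fill_collapsed` and `row_base` are made up for illustration, and the collapse is written out by hand rather than produced by the compiler.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical imperfectly nested loop: the statement between the two loops
// (the "intervening code") sets row_base, which the inner loop then reads.
std::vector<int> fill_nested(int n, int m) {
    std::vector<int> out(static_cast<std::size_t>(n) * m);
    for (int i = 0; i < n; ++i) {
        int row_base = i * 100;  // intervening code
        for (int j = 0; j < m; ++j)
            out[static_cast<std::size_t>(i) * m + j] = row_base + j;
    }
    return out;
}

// Hand-written equivalent of collapsing both loops into one iteration space.
// The intervening code is re-evaluated in every flat iteration, so each
// iteration is independent of the others.
std::vector<int> fill_collapsed(int n, int m) {
    std::vector<int> out(static_cast<std::size_t>(n) * m);
    const long total = static_cast<long>(n) * m;
    for (long k = 0; k < total; ++k) {
        const int i = static_cast<int>(k / m);  // recover outer index
        const int j = static_cast<int>(k % m);  // recover inner index
        int row_base = i * 100;  // intervening code, repeated per (i, j)
        out[static_cast<std::size_t>(k)] = row_base + j;
    }
    return out;
}
```

Here `row_base` is local to each iteration, so repeating the intervening code is harmless. If it were instead a variable shared across iterations, the repeated write would presumably become the kind of inter-iteration dependency that a critical region or a serialised loop would be needed to protect.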
Apologies -- I meant "when would we do such a transformation" as "under what precise conditions would we do it" rather than "at what point in the compilation pipeline".

I think you're suggesting that we should only perform a genuine loop collapse if we can prove that it is safe (else falling back to some other code transformation*). IIUC that sort of thing isn't normally done with OpenMP -- the compiler does what the user tells it to. Any parallel loop with inter-iteration dependencies will probably compile to wrong code. It's just that for this particular case it seems especially easy to shoot oneself in the foot.

(*) Indeed, maybe just backing out of collapsing the loop nest altogether might be the quickest/safest option. Doing that conservatively might mean essentially never collapsing imperfectly-nested loops though. Is that what we want?

https://github.com/llvm/llvm-project/pull/96087
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits