hfinkel added inline comments.

================
Comment at: include/clang/Basic/AttrDocs.td:1330
@@ -1328,1 +1329,3 @@
+the loop unroller tries to unroll the loop by a power-of-two factor
+subject to a limit on unrolled code size.
 
----------------
meheff wrote:
> hfinkel wrote:
> > meheff wrote:
> > > broune wrote:
> > > > Is #pragma unroll different from #pragma unroll(full) ? If not, why not 
> > > > make the two doc strings the same?
> > > In AttrDocs.td there is not documentation for "#pragma clang loop 
> > > unroll".  Instead it just mentions "#pragma clang loop" and points to 
> > > LanguageExtensions for details.  Regarding LanguageExtensions 
> > > documentation of "#pragma clang loop unroll" and AttrDocs documentation 
> > > for "#pragma unroll", I believe they do both say the same thing (or at 
> > > least aren't conflicting with each other) with the LanguageExtensions 
> > > documentation providing more context and describing the limits on 
> > > unrolling.
> > > 
> > > Little bit of background here.  There are two forms of the loop unroll 
> > > pragmas:
> > > 
> > > #pragma unroll
> > > #pragma unroll N
> > > 
> > > and:
> > > 
> > > #pragma clang loop unroll(full)
> > > #pragma clang loop unroll_count(N)
> > > 
> > > The "#pragma unroll" forms are supported for compatibilty with other 
> > > compilers (such as nvcc CUDA compiler) where this form is common while 
> > > "#pragma clang .." is a more cannonical form specific to clang.  "#pragma 
> > > unroll" and "#pragma clang loop unroll(full)" are equivalent.  What I 
> > > propose in the related optimization patch 
> > > (http://reviews.llvm.org/D10854) is that "#pragma unroll" should enable 
> > > loop unrolling with a runtime trip count of the loop.  So if you have:
> > > 
> > > void foo(int n) {
> > > #pragma unroll
> > >   for (int i=0; i<n; i++) { ... }
> > > }
> > > 
> > > Then the optimizer will unroll the loop with some reasonable power-of-two 
> > > factor.  One potentially funny bit about this approach is that "#pragma 
> > > clang loop unroll(full)" will also enable unrolling of a loop with a 
> > > runtime trip count.  This may be surprising... or maybe not. Full 
> > > unrolling is clearly not possible in this case so just being more 
> > > aggressive with unrolling may be the sensible thing to do.
> > > One potentially funny bit about this approach is that "#pragma clang loop 
> > > unroll(full)" will also enable unrolling of a loop with a runtime trip 
> > > count. This may be surprising... or maybe not. Full unrolling is clearly 
> > > not possible in this case so just being more aggressive with unrolling 
> > > may be the sensible thing to do.
> > 
> > I really don't like this. If the user is requesting 'full' unrolling, I'd 
> > rather we do only that (at least up to some really high limit, as we do 
> > currently). Runtime unrolling adds extra expense in the low-trip-count case 
> > (because of the various checks involved) and has a different set of 
> > trade-offs than simpler unrolling.
> > 
> > I think that we should have 'full' unrolling add 
> > "llvm.loop.unroll.runtime.disable" metadata. We should add some other mode, 
> > "aggressive" for example, which is like full unrolling but also allows for 
> > runtime unrolling. I think the desired end state here is that -Rpass=unroll 
> > would provide some specific feedback in the case where full unrolling was 
> > requested, but not possible because of a runtime trip count.
> > 
> I think that makes sense. The issue we were seeing was a loop with runtime 
> trip count behaving badly because it was marked '#pragma unroll', had a 
> runtime trip count, and we'd like to enable runtime unrolling for the NVPTX 
> target (http://reviews.llvm.org/D10855).  IIRC, the loop sometimes is 
> compile-time trip count depending upon template magic.  Anyway, your 
> suggestion (unroll(full) implies llvm.loop.unroll.runtime.disable) fixes that 
> issue.  One question is how do you feel about enabling runtime unrolling with 
> '#pragma unroll N"?  This gives more benefit to loops with higher trip 
> counts, but I could be convinced either way.
> One question is how do you feel about enabling runtime unrolling with 
> '#pragma unroll N"? This gives more benefit to loops with higher trip counts, 
> but I could be convinced either way.

I think it makes sense to enable it for '#pragma unroll N', since to user 
likely expects that behavior in the case of a runtime trip count.



Repository:
  rL LLVM

http://reviews.llvm.org/D10857




_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to