efriedma added inline comments.
================ Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825 + // The local stack holds all alloca instructions and all byval arguments. + AllocaDerivedValueTracker Tracker; + for (Argument &Arg : F.args()) { ---------------- avl wrote: > efriedma wrote: > > avl wrote: > > > efriedma wrote: > > > > avl wrote: > > > > > efriedma wrote: > > > > > > avl wrote: > > > > > > > efriedma wrote: > > > > > > > > avl wrote: > > > > > > > > > efriedma wrote: > > > > > > > > > > avl wrote: > > > > > > > > > > > efriedma wrote: > > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker > > > > > > > > > > > > analysis? Is it not enough that the call you're trying > > > > > > > > > > > > to TRE is marked "tail"? > > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker > > > > > > > > > > > >analysis? > > > > > > > > > > > > > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) > > > > > > > > > > > could be reused here. > > > > > > > > > > > But marking, done in markTails(), looks like separate > > > > > > > > > > > tasks. i.e. it is better > > > > > > > > > > > to make TRE not depending on markTails(). There is a > > > > > > > > > > > review for this - https://reviews.llvm.org/D60031 > > > > > > > > > > > Thus such separation looks useful(To not reuse result of > > > > > > > > > > > markTails but have it computed inplace). > > > > > > > > > > > > > > > > > > > > > > > Is it not enough that the call you're trying to TRE is > > > > > > > > > > > > marked "tail"? > > > > > > > > > > > > > > > > > > > > > > It is not enough that call which is subject to TRE is > > > > > > > > > > > marked "Tail". > > > > > > > > > > > It also should be checked that other calls does not > > > > > > > > > > > capture pointer to local stack: > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > // do not do TRE if any pointer to local stack has > > > > > > > > > > > escaped. > > > > > > > > > > > if (!Tracker.EscapePoints.empty()) > > > > > > > > > > > return false; > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > It is not enough that call which is subject to TRE is > > > > > > > > > > > marked "Tail". It also should be checked that other calls > > > > > > > > > > > does not capture pointer to local stack: > > > > > > > > > > > > > > > > > > > > If there's an escaped pointer to the local stack, we > > > > > > > > > > wouldn't infer "tail" in the first place, would we? > > > > > > > > > If function receives pointer to alloca then it would not be > > > > > > > > > marked with "Tail". Then we do not have a possibility to > > > > > > > > > understand whether this function receives pointer to alloca > > > > > > > > > but does not capture it: > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > void test(int recurseCount) > > > > > > > > > { > > > > > > > > > if (recurseCount == 0) return; > > > > > > > > > int temp = 10; > > > > > > > > > globalIncrement(&temp); > > > > > > > > > test(recurseCount - 1); > > > > > > > > > } > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > test - marked with Tail. > > > > > > > > > globalIncrement - not marked with Tail. But TRE could be done > > > > > > > > > since it does not capture pointer. But if it will capture the > > > > > > > > > pointer then we could not do TRE. So we need to check > > > > > > > > > !Tracker.EscapePoints.empty(). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > test - marked with Tail. > > > > > > > > > > > > > > > > For the given code, TRE won't mark the recursive call "tail". > > > > > > > > That transform isn't legal: the recursive call could access the > > > > > > > > caller's version of "temp". > > > > > > > >For the given code, TRE won't mark the recursive call "tail". > > > > > > > >That transform isn't legal: the recursive call could access the > > > > > > > >caller's version of "temp". > > > > > > > > > > > > > > it looks like recursive call could NOT access the caller's > > > > > > > version of "temp": > > > > > > > > > > > > > > ``` > > > > > > > test(recurseCount - 1); > > > > > > > ``` > > > > > > > > > > > > > > Caller`s version of temp is accessed by non-recursive call: > > > > > > > > > > > > > > ``` > > > > > > > globalIncrement(&temp); > > > > > > > ``` > > > > > > > > > > > > > > If globalIncrement does not capture the "&temp" then TRE looks to > > > > > > > be legal for that case. > > > > > > > > > > > > > > globalIncrement() would not be marked with "Tail". test() would > > > > > > > be marked with Tail. > > > > > > > > > > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must > > > > > > > not receive pointer to local stack(Tail) and non-recursive calls > > > > > > > must not capture the pointer to local stack. > > > > > > Can you give a complete IR example where we infer "tail", but TRE > > > > > > is illegal? > > > > > > > > > > > > Can you give a complete IR example, we we don't infer "tail", but > > > > > > we still do the TRE transform here? > > > > > >Can you give a complete IR example where we infer "tail", but TRE is > > > > > >illegal? > > > > > > > > > > there is no such example. Currently all cases where we infer "tail" > > > > > would be valid for TRE. > > > > > > > > > > >Can you give a complete IR example, we we don't infer "tail", but we > > > > > >still do the TRE transform here? > > > > > > > > > > For the following example current code base would not infer "tail" > > > > > for _Z15globalIncrementPKi and as the result would not do TRE for > > > > > _Z4testi. > > > > > This patch changes this behavior: so that if _Z15globalIncrementPKi > > > > > is not marked with "tail" and does not capture its pointer argument - > > > > > TRE would be allowed for _Z4testi. > > > > > > > > > > > > > > > ``` > > > > > @count = dso_local local_unnamed_addr global i32 0, align 4 > > > > > > > > > > ; Function Attrs: nofree noinline norecurse nounwind uwtable > > > > > define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly > > > > > %param) local_unnamed_addr #0 { > > > > > entry: > > > > > %0 = load i32, i32* %param, align 4 > > > > > %1 = load i32, i32* @count, align 4 > > > > > %add = add nsw i32 %1, %0 > > > > > store i32 %add, i32* @count, align 4 > > > > > ret void > > > > > } > > > > > > > > > > ; Function Attrs: nounwind uwtable > > > > > define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr > > > > > #1 { > > > > > entry: > > > > > %temp = alloca i32, align 4 > > > > > %cmp = icmp eq i32 %recurseCount, 0 > > > > > br i1 %cmp, label %return, label %if.end > > > > > > > > > > if.end: ; preds = %entry > > > > > %0 = bitcast i32* %temp to i8* > > > > > call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6 > > > > > store i32 10, i32* %temp, align 4 > > > > > call void @_Z15globalIncrementPKi(i32* nonnull %temp) > > > > > %sub = add nsw i32 %recurseCount, -1 > > > > > call void @_Z4testi(i32 %sub) > > > > > call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6 > > > > > br label %return > > > > > > > > > > return: ; preds = %entry, > > > > > %if.end > > > > > ret void > > > > > } > > > > > > > > > > ; Function Attrs: argmemonly nounwind willreturn > > > > > declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2 > > > > > > > > > > ; Function Attrs: argmemonly nounwind willreturn > > > > > declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2 > > > > > > > > > > attributes #0 = { nofree noinline norecurse nounwind uwtable } > > > > > attributes #1 = { nounwind uwtable } > > > > > attributes #2 = { argmemonly nounwind willreturn } > > > > > > > > > > ``` > > > > In your example, we don't infer "tail" for globalIncrement... but we do > > > > infer it for the recursive call to test(). I'm suggesting you could > > > > just check for "tail" on test(), instead of using > > > > AllocaDerivedValueTracker. > > > Checking only test() is not enough. There additionally should be checked > > > globalIncrement(). > > > > > > It is correct that while checking test() there could be checked "Tail" > > > flag. Which does not require using AllocaDerivedValueTracker. > > > > > > But while checking globalIncrement() there should be checked whether some > > > alloca value escaped or not. "Tail" flag could not be used for that. > > > AllocaDerivedValueTracker allow to do such check: > > > > > > Tracker.EscapePoints.empty() > > > > > > If we would not do check for globalIncrement then it is not valid to do > > > TRE. Thus it seems we need to check globalIncrement for escaping pointer > > > and we need to use AllocaDerivedValueTracker for that. > > > Checking only test() is not enough. There additionally should be checked > > > globalIncrement(). > > > > >> Can you give a complete IR example where we infer "tail", but TRE is > > >> illegal? > > > there is no such example. Currently all cases where we infer "tail" would > > > be valid for TRE. > > > > I'm not sure how to reconcile this. Are you saying we could infer "tail" > > in some case where TRE is illegal, but don't right now? Or are you saying > > that you plan to extend TRE to handle cases where we can't infer "tail" on > > the recursive call? > >Or are you saying that you plan to extend TRE to handle cases where we can't > >infer "tail" on the recursive call? > > I think not exactly. more precise would probably be : "I am saying that plan > to extend TRE to handle cases where we can't infer "tail" on the > NON-recursive NON-last call" > > globalIncrement() is non-recursive non-last call in above example. But we > need to check whether it captures argument pointer or not to decide whether > it is OK to do TRE. > > To make things clear - I am suggesting instead of current pre-requisite for > TRE : > > "All call sites are marked with Tail" > > to make following: > > "Recursive last calls are marked with "Tail", non-recursive non-last calls > are proved to not capture alloca". > > For the above example it means : the requirement for test() should stay the > same(should be marked with Tail). The requirement for globalIncrement() > should be "does not capture alloca". > "Recursive last calls are marked with 'tail'" implies "non-recursive non-last calls are proved to not capture alloca". Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82085/new/ https://reviews.llvm.org/D82085 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits