tzik added inline comments.
================ Comment at: lib/Sema/SemaDecl.cpp:12760 // to deduce an implicit return type. - if (FD->getReturnType()->isRecordType() && - (!getLangOpts().CPlusPlus || !FD->isDependentContext())) + if (!FD->getReturnType()->isScalarType()) computeNRVO(Body, getCurFunction()); ---------------- Quuxplusone wrote: > rsmith wrote: > > tzik wrote: > > > Quuxplusone wrote: > > > > What is the purpose of this change? > > > > If it's "no functional change" it should be done separately IMHO; if it > > > > is supposed to change codegen, then it needs some codegen tests. (From > > > > looking at the code: maybe this affects codegen on functions that > > > > return member function pointers by value?) > > > I think the previous implementation was incorrect. > > > Though computeNRVO clears ReturnStmt::NRVOCandidate when the > > > corresponding variable is not NRVO variable, CGStmt checks both of > > > ReturnStmt::NRVOCandidate and VarDecl::NRVOVariable anyway. > > > So, computeNRVO took no effect in the previous code, and the absence of > > > computeNRVO here on function templates did not matter. > > > Note that there was no chance to call computeNRVO on function template > > > elsewhere too. > > > > > > OTOH in the new implementation, computeNRVO is necessary, and its absence > > > on function templates matters. > > > > > > We can remove computeNRVO here separately as a NFC patch and readd the > > > new impl to the same place, but it's probably less readable, IMO. > > What happens if we can't tell whether we have an NRVO candidate or not > > until instantiation: > > > > ``` > > template<typename T> X f() { > > T v; > > return v; // is an NRVO candidate if T is X, otherwise is not > > } > > ``` > > > > (I think this is not hard to handle: the dependent construct here can only > > affect whether you get NRVO at all, not which variable you perform NRVO on, > > but I want to check that it is handled properly.) > Ah, so if I understand correctly — which I probably don't —... > the parts of this diff related to `isRecordType()` make NRVO more reliable on > template functions? Because the existing code has lots of checks for > `isRecordType()` which means that they don't run for dependent > (as-yet-unknown) types, even if those types are going to turn out to be > record types occasionally? > > I see that line 12838 runs `computeNRVO` unconditionally for *all* > ObjCMethodDecls, //even// ones with scalar return types. So maybe running > `computeNRVO` unconditionally right here would also be correct? > Datapoint: I made a patch to do nothing but replace this condition with `if > (Body)` and to remove the similar condition guarding `computeNRVO` in > `Sema::ActOnBlockStmtExpr`. It passed the Clang test suite with flying > colors. I don't know if this means that the test suite is missing some tests, > or if it means that the conditions are unnecessary. > > IMHO if you're changing this anyway, it would be nice to put the remaining > condition/optimization `if (getReturnType().isScalarType()) return;` inside > `computeNRVO` itself, instead of repeating that condition at every call site. `v` here is marked as an NRVO variable in its function template AST, and that propagates to the instantiation if the return type is compatible. https://github.com/llvm-mirror/clang/blob/9c82d4ff6015e4477e924c8a495a834c2fced29e/lib/Sema/SemaTemplateInstantiateDecl.cpp#L743 ================ Comment at: lib/Sema/SemaDecl.cpp:12760 // to deduce an implicit return type. - if (FD->getReturnType()->isRecordType() && - (!getLangOpts().CPlusPlus || !FD->isDependentContext())) + if (!FD->getReturnType()->isScalarType()) computeNRVO(Body, getCurFunction()); ---------------- tzik wrote: > Quuxplusone wrote: > > rsmith wrote: > > > tzik wrote: > > > > Quuxplusone wrote: > > > > > What is the purpose of this change? > > > > > If it's "no functional change" it should be done separately IMHO; if > > > > > it is supposed to change codegen, then it needs some codegen tests. > > > > > (From looking at the code: maybe this affects codegen on functions > > > > > that return member function pointers by value?) > > > > I think the previous implementation was incorrect. > > > > Though computeNRVO clears ReturnStmt::NRVOCandidate when the > > > > corresponding variable is not NRVO variable, CGStmt checks both of > > > > ReturnStmt::NRVOCandidate and VarDecl::NRVOVariable anyway. > > > > So, computeNRVO took no effect in the previous code, and the absence of > > > > computeNRVO here on function templates did not matter. > > > > Note that there was no chance to call computeNRVO on function template > > > > elsewhere too. > > > > > > > > OTOH in the new implementation, computeNRVO is necessary, and its > > > > absence on function templates matters. > > > > > > > > We can remove computeNRVO here separately as a NFC patch and readd the > > > > new impl to the same place, but it's probably less readable, IMO. > > > What happens if we can't tell whether we have an NRVO candidate or not > > > until instantiation: > > > > > > ``` > > > template<typename T> X f() { > > > T v; > > > return v; // is an NRVO candidate if T is X, otherwise is not > > > } > > > ``` > > > > > > (I think this is not hard to handle: the dependent construct here can > > > only affect whether you get NRVO at all, not which variable you perform > > > NRVO on, but I want to check that it is handled properly.) > > Ah, so if I understand correctly — which I probably don't —... > > the parts of this diff related to `isRecordType()` make NRVO more reliable > > on template functions? Because the existing code has lots of checks for > > `isRecordType()` which means that they don't run for dependent > > (as-yet-unknown) types, even if those types are going to turn out to be > > record types occasionally? > > > > I see that line 12838 runs `computeNRVO` unconditionally for *all* > > ObjCMethodDecls, //even// ones with scalar return types. So maybe running > > `computeNRVO` unconditionally right here would also be correct? > > Datapoint: I made a patch to do nothing but replace this condition with `if > > (Body)` and to remove the similar condition guarding `computeNRVO` in > > `Sema::ActOnBlockStmtExpr`. It passed the Clang test suite with flying > > colors. I don't know if this means that the test suite is missing some > > tests, or if it means that the conditions are unnecessary. > > > > IMHO if you're changing this anyway, it would be nice to put the remaining > > condition/optimization `if (getReturnType().isScalarType()) return;` inside > > `computeNRVO` itself, instead of repeating that condition at every call > > site. > `v` here is marked as an NRVO variable in its function template AST, and that > propagates to the instantiation if the return type is compatible. > https://github.com/llvm-mirror/clang/blob/9c82d4ff6015e4477e924c8a495a834c2fced29e/lib/Sema/SemaTemplateInstantiateDecl.cpp#L743 > the parts of this diff related to `isRecordType()` [...] Right, they look not RecordTypes in the template context. > Because the existing code has lots of checks for `isRecordType()` [...] Hm, maybe. Some of `isRecordType` don't have dependent type handling around them... > running `computeNRVO` unconditionally right here would also be correct? Updated. Agree, scalar types should work, IIUC. ================ Comment at: test/CodeGenCXX/nrvo.cpp:139 } - // FIXME: we should NRVO this variable too. - X x; + // CHECK: tail call {{.*}} @_ZN1XC1ERKS_ return x; ---------------- rsmith wrote: > tzik wrote: > > Quuxplusone wrote: > > > You've changed this function from testing one thing with a FIXME, to > > > testing a completely different thing. Could you add your new code as a > > > new function, and leave the old FIXME alone until it's fixed? > > > Alternatively, if your patch actually fixes the FIXME, could you just > > > replace the FIXME comment with a CHECK and leave the rest of this > > > function alone? > > My patch fixes the FIXME. > > However, on the resulting code of NRVOing, > > ``` > > X y; > > return y; > > ``` > > and > > ``` > > X x; > > return x; > > ``` > > get to the same code and unified. And, the function is simplified as > > ``` > > X test3(bool B) { > > X x; > > return x; > > } > > ``` > > without the if statement. > > So, there will be nothing to CHECK left here if I leave the code as-is. I > > think that does not fit to the test scenario. > ... and this is one reason why we don't generally like clang's IR generation > tests to enable optimizations. Please consider starting a new test file to > test the output from clang's frontend rather than the output from -O1. OK, I added one. Repository: rC Clang https://reviews.llvm.org/D47067 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits