ChuanqiXu9 wrote:

> @usx95 may be able to help with the reproducer.
> 
> In the meantime, I'm trying to collect some information on the compile times. 
> So far it looks like we have a ~10-15x compile time regression on some 
> translation units. Without this patch `-ftime-report` shows:
> 
> ```
> ===-------------------------------------------------------------------------===
>                           Clang front-end time report
> ===-------------------------------------------------------------------------===
>   Total Execution Time: 39.1940 seconds (39.7238 wall clock)
> 
>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>   28.2611 ( 77.5%)   1.8439 ( 67.3%)  30.1050 ( 76.8%)  30.5230 ( 76.8%)  Clang front-end timer
>    8.1911 ( 22.5%)   0.8980 ( 32.7%)   9.0891 ( 23.2%)   9.2009 ( 23.2%)  Reading modules
>   36.4522 (100.0%)   2.7419 (100.0%)  39.1940 (100.0%)  39.7238 (100.0%)  Total
> ```
> 
> With it:
> 
> ```
> ===-------------------------------------------------------------------------===
> Clang front-end time report
> ===-------------------------------------------------------------------------===
> Total Execution Time: 466.7373 seconds (1251.6300 wall clock)
> ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>  404.7200 ( 96.1%)  40.6383 ( 88.8%)  445.3583 ( 95.4%)  471.9647 ( 37.7%)  Clang front-end timer
>  15.2098 (  3.6%)   3.3586 (  7.3%)  18.5684 (  4.0%)  398.1242 ( 31.8%)  Reading modules
>  420.9899 (100.0%)  45.7474 (100.0%)  466.7373 (100.0%)  1251.6300 (100.0%)  Total
> ```
> 
> `perf record -g` / `perf report` give the following picture:
> 
> ```
>   Children      Self  Command  Shared Object       Symbol
> +   94.85%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformCallExpr(clang::CallExpr*) [clone .__uniq.16014532493918845222783194145290083557]
> +   93.47%     0.00%  clang    clang               [.] clang::Sema::InstantiateFunctionDefinition(clang::SourceLocation, clang::FunctionDecl*, bool, bool, bool)
> +   93.37%    83.51%  clang    clang               [.] clang::ASTReader::LoadExternalSpecializations(clang::Decl const*, bool)
> +   93.19%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformCompoundStmt(clang::CompoundStmt*, bool) [clone .__uniq.16014532493918845222783194
> +   93.08%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformUnresolvedLookupExpr(clang::UnresolvedLookupExpr*, bool) [clone .__uniq.1601453249
> +   92.98%     0.00%  clang    clang               [.] clang::Sema::BuildTemplateIdExpr(clang::CXXScopeSpec const&, clang::SourceLocation, clang::LookupResult&, bool, clang::TemplateArgumentListInfo const*)
> +   92.44%     0.00%  clang    clang               [.] clang::Sema::CheckVarTemplateId(clang::VarTemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::TemplateArgumentListInfo const&)
> +   92.08%     0.00%  clang    clang               [.] clang::Sema::InstantiateVariableInitializer(clang::VarDecl*, clang::VarDecl*, clang::MultiLevelTemplateArgumentList const&)
> +   91.87%     0.00%  clang    clang               [.] clang::VarTemplateDecl::getPartialSpecializations(llvm::SmallVectorImpl<clang::VarTemplatePartialSpecializationDecl*>&) const
> +   91.18%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformBinaryOperator(clang::BinaryOperator*) [clone .__uniq.1601453249391884522278319414
> +   91.07%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformExprs(clang::Expr* const*, unsigned int, bool, llvm::SmallVectorImpl<clang::Expr*>
> +   90.70%     0.01%  clang    clang               [.] clang::Sema::InstantiateVariableDefinition(clang::SourceLocation, clang::VarDecl*, bool, bool, bool)
> +   90.41%     0.01%  clang    clang               [.] clang::Sema::BuildDeclarationNameExpr(clang::CXXScopeSpec const&, clang::DeclarationNameInfo const&, clang::NamedDecl*, clang::NamedDecl*, clang::TemplateArgu
> +   90.29%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformInitListExpr(clang::InitListExpr*) [clone .__uniq.16014532493918845222783194145290
> +   89.92%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformParenExpr(clang::ParenExpr*) [clone .__uniq.16014532493918845222783194145290083557
> +   89.23%     0.00%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformConditionalOperator(clang::ConditionalOperator*) [clone .__uniq.160145324939188452
> +   84.49%     0.02%  clang    clang               [.] clang::Sema::RequireCompleteTypeImpl(clang::SourceLocation, clang::QualType, clang::Sema::CompleteTypeKind, clang::Sema::TypeDiagnoser*)
> +   84.47%     0.00%  clang    clang               [.] clang::Sema::InstantiateClassTemplateSpecialization(clang::SourceLocation, clang::ClassTemplateSpecializationDecl*, clang::TemplateSpecializationKind, bool)
> +   84.07%     0.01%  clang    clang               [.] clang::Sema::InstantiateClass(clang::SourceLocation, clang::CXXRecordDecl*, clang::CXXRecordDecl*, clang::MultiLevelTemplateArgumentList const&, clang::Templa
> +   82.84%     0.02%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) [clone .__uniq.1601453249391884522278
> +   82.23%     0.02%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTy
> +   81.99%     0.01%  clang    clang               [.] (anonymous namespace)::TemplateInstantiator::TransformTemplateArgument(clang::TemplateArgumentLoc const&, clang::TemplateArgumentLoc&, bool) [clone .__uniq.16
> +   81.54%     0.00%  clang    clang               [.] clang::Sema::RequireCompleteDeclContext(clang::CXXScopeSpec&, clang::DeclContext*)
> +   80.18%     0.01%  clang    clang               [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeSourceInfo*) [clone .__uniq.16014532493918845222783194145290083557
> +   79.88%     0.12%  clang    clang               [.] clang::Sema::CheckTemplateIdType(clang::TemplateName, clang::SourceLocation, clang::TemplateArgumentListInfo&)
> ```
> 
> I can try to build clang with better debug information and get a higher 
> fidelity profile, but hopefully this already shows the direction to look at.

Thanks. It looks like `ASTReader::LoadExternalSpecializations(const Decl *D, 
bool OnlyPartial)` is the hot spot, which I hadn't anticipated. Maybe the 
problem is `findAll()`, since we would always load all the specializations. Or 
maybe we call `findAll()` too many times. I'll take a look. A profile with 
more detail would definitely be helpful.
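
To make the second hypothesis concrete, here is a toy model of the cost 
difference between reloading every specialization on each query and loading 
once and reusing the result. This is only a sketch and not Clang code: 
`ToySource`, `loadAll`, `visitsWhenReloading` and `visitsWhenCached` are 
hypothetical names, and the real `findAll()` / `LoadExternalSpecializations` 
may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an external specialization store. This is NOT Clang's
// ASTReader; every name in this sketch is hypothetical.
struct ToySource {
  std::vector<int> stored;         // each int plays the role of one specialization
  std::size_t entriesVisited = 0;  // counts how much work loading has done

  // Models a "load everything" call: touches every stored entry on each call.
  void loadAll(std::vector<int> &out) {
    for (int s : stored) {
      ++entriesVisited;
      out.push_back(s);
    }
  }
};

// Strategy A: every query reloads all entries from the source.
std::size_t visitsWhenReloading(std::size_t numStored, std::size_t numQueries) {
  ToySource src;
  src.stored.assign(numStored, 0);
  for (std::size_t q = 0; q < numQueries; ++q) {
    std::vector<int> scratch;
    src.loadAll(scratch);
    assert(scratch.size() == numStored);
  }
  return src.entriesVisited;
}

// Strategy B: the first query loads all entries; later queries reuse them.
std::size_t visitsWhenCached(std::size_t numStored, std::size_t numQueries) {
  ToySource src;
  src.stored.assign(numStored, 0);
  std::vector<int> cache;
  bool loaded = false;
  for (std::size_t q = 0; q < numQueries; ++q) {
    if (!loaded) {
      src.loadAll(cache);
      loaded = true;
    }
    assert(cache.size() == numStored);
  }
  return src.entriesVisited;
}
```

In this model, reloading costs `numStored * numQueries` visits while the 
cached variant costs `numStored` visits once there is at least one query, 
which is the kind of gap that could explain a large `Self` time in a single 
loading function if it is called repeatedly during instantiation.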

https://github.com/llvm/llvm-project/pull/83237
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
