https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/136720
From 011008bd03f66e9afdfb3eec24a1c1c8a4018f52 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com> Date: Tue, 22 Apr 2025 16:46:05 +0200 Subject: [PATCH 1/5] [analyzer] Workaround for unintended slowdown (scope increase) Recently some users reported that the runtime of the static analyzer drastically increased on their codebase when they upgraded to a more recent (sligthly patched, internal) clang version: some of their translation units got more than +600% runtime (yes, with two zeroes) and IIUC the average increase was also high. Bisection revealed that the bulk of the increase was caused by my earlier commit bb27d5e5c6b194a1440b8ac4e5ace68d0ee2a849 ("Don't assume third iteration in loops") and I verified that the slowdown affects the downstream clang. (I used tmux as an arbitrary easy-to-analyze open source project and measured that reverting this commit reduces the total runtime by more than 50% compared to a recent main revision.) Further profiling and investigation proved that this increase is caused by an _increase of analysis scope_ because there was an arbitrary heuristic that placed functions on a "don't inline this" blacklist if they reached the `-analyzer-max-loop` limit (anywhere, on any one execution path) -- which became significantly rarer when my commit ensured the analyzer no longer "just assumes" four iterations. (With more inlining significantly more entry points use up their allocated budgets, which leads to the increased runtime.) I feel that this heuristic for the "don't inline" blacklist is unjustified, because reaching the "retry without inlining" limit on one path does not imply that inlining the function won't be valuable on other paths -- so I hope that we can eventually replace it with more "natural" limits of the analysis scope. However, the brutal jump of analysis runtime significantly degrades the user experience, so I created this quick workaround commit that approximates the "don't inline" blacklist effects of opaque loops (where the analyzer doesn't understand the loop condition) without fully reverting the "Don't assume third iteration" commit (to avoid reintroducing the false positives that were eliminated by it). --- .../StaticAnalyzer/Core/AnalyzerOptions.def | 14 ++ .../Core/PathSensitive/FunctionSummary.h | 4 - clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 60 +++++- clang/test/Analysis/analyzer-config.c | 1 + .../Analysis/legacy-inlining-prevention.c | 198 ++++++++++++++++++ clang/test/Analysis/loop-unrolling.cpp | 30 ++- 6 files changed, 285 insertions(+), 22 deletions(-) create mode 100644 clang/test/Analysis/legacy-inlining-prevention.c diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def index f9f22a9ced650..0ac85b14147e0 100644 --- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def +++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def @@ -414,6 +414,20 @@ ANALYZER_OPTION( "any target.", true) +ANALYZER_OPTION( + bool, LegacyInliningPrevention, "legacy-inlining-prevention", + "If enabled, the analyzer puts functions on a \"do not inline this\" list " + "if it finds an execution path within that function that may potentially " + "perform 'analyzer-max-loop' (= 4 by default) iterations in a loop. " + "Note that functions that _definitely_ reach the loop limit on some " + "execution path are currently marked as \"do not inline\" even if this " + "option is disabled (but this may change in future versions). This option " + "is a dumb and arbitrary restriction on inlining, but disabling it would " + "significantly increase the analysis workload (and the time taken) " + "compared to older clang versions because more top-level functions can " + "use up their 'max-nodes' limit if inlining is not restricted.", + true) + //===----------------------------------------------------------------------===// // Unsigned analyzer options. //===----------------------------------------------------------------------===// diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h index 3ee0d229cfc29..761395260a0cf 100644 --- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h +++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h @@ -81,10 +81,6 @@ class FunctionSummariesTy { I->second.MayInline = 0; } - void markReachedMaxBlockCount(const Decl *D) { - markShouldNotInline(D); - } - std::optional<bool> mayInline(const Decl *D) { MapTy::const_iterator I = Map.find(D); if (I != Map.end() && I->second.InlineChecked) diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp index 86e2e8f634bfd..1ff36eb83dbe7 100644 --- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -2523,6 +2523,20 @@ bool ExprEngine::replayWithoutInlining(ExplodedNode *N, return true; } +/// Return the innermost location context which is inlined at `Node`, unless +/// it's the top-level (entry point) location context. +static const LocationContext *getInlinedLocationContext(ExplodedNode *Node, + ExplodedGraph &G) { + const LocationContext *CalleeLC = Node->getLocation().getLocationContext(); + const LocationContext *RootLC = + (*G.roots_begin())->getLocation().getLocationContext(); + + if (CalleeLC->getStackFrame() == RootLC->getStackFrame()) + return nullptr; + + return CalleeLC; +} + /// Block entrance. (Update counters). void ExprEngine::processCFGBlockEntrance(const BlockEdge &L, NodeBuilderWithSinks &nodeBuilder, @@ -2570,21 +2584,24 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L, const ExplodedNode *Sink = nodeBuilder.generateSink(Pred->getState(), Pred, &tag); - // Check if we stopped at the top level function or not. - // Root node should have the location context of the top most function. - const LocationContext *CalleeLC = Pred->getLocation().getLocationContext(); - const LocationContext *CalleeSF = CalleeLC->getStackFrame(); - const LocationContext *RootLC = - (*G.roots_begin())->getLocation().getLocationContext(); - if (RootLC->getStackFrame() != CalleeSF) { - Engine.FunctionSummaries->markReachedMaxBlockCount(CalleeSF->getDecl()); + if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) { + // FIXME: This will unconditionally prevent inlining this function (even + // from other entrypoints), which is not a reasonable heuristic: even if + // we reached max block count on this particular execution path, there + // may be other execution paths (especially with other parametrizations) + // where the analyzer can reach the end of the function (so there is no + // natural reason to avoid inlining it). However, disabling this would + // significantly increase the analysis time (because more entrypoints + // would exhaust their allocated budget), so it must be compensated by a + // different (more reasonable) reduction of analysis scope. + Engine.FunctionSummaries->markShouldNotInline( + LC->getStackFrame()->getDecl()); // Re-run the call evaluation without inlining it, by storing the // no-inlining policy in the state and enqueuing the new work item on // the list. Replay should almost never fail. Use the stats to catch it // if it does. - if ((!AMgr.options.NoRetryExhausted && - replayWithoutInlining(Pred, CalleeLC))) + if ((!AMgr.options.NoRetryExhausted && replayWithoutInlining(Pred, LC))) return; NumMaxBlockCountReachedInInlined++; } else @@ -2856,8 +2873,29 @@ void ExprEngine::processBranch( // conflicts with the widen-loop analysis option (which is off by // default). If we intend to support and stabilize the loop widening, // we must ensure that it 'plays nicely' with this logic. - if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) + if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) { Builder.generateNode(StTrue, true, PredN); + } else if (AMgr.options.LegacyInliningPrevention) { + // FIXME: There is an ancient and very arbitrary heuristic in + // `ExprEngine::processCFGBlockEntrance` which prevents all further + // inlining of a function if it finds an execution path within that + // function which reaches the `MaxBlockVisitOnPath` limit (a/k/a + // `analyzer-max-loop`, by default four iterations in a loop). Adding + // this "don't assume third iteration" logic significantly increased + // the analysis runtime on some inputs because less functions were + // arbitrarily excluded from being inlined, so more entrypoints used + // up their full allocated budget. As a hacky compensation for this, + // here we apply the "should not inline" mark in cases when the loop + // could potentially reach the `MaxBlockVisitOnPath` limit without the + // "don't assume third iteration" logic. This slightly overcompensates + // (activates if the third iteration can be entered, and will not + // recognize cases where the fourth iteration would't be completed), but + // should be good enough for practical purposes. + if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) { + Engine.FunctionSummaries->markShouldNotInline( + LC->getStackFrame()->getDecl()); + } + } } if (StFalse) { diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c index 80cad54b039f4..42c622f2cd51b 100644 --- a/clang/test/Analysis/analyzer-config.c +++ b/clang/test/Analysis/analyzer-config.c @@ -92,6 +92,7 @@ // CHECK-NEXT: inline-lambdas = true // CHECK-NEXT: ipa = dynamic-bifurcate // CHECK-NEXT: ipa-always-inline-size = 3 +// CHECK-NEXT: legacy-inlining-prevention = true // CHECK-NEXT: max-inlinable-size = 100 // CHECK-NEXT: max-nodes = 225000 // CHECK-NEXT: max-symbol-complexity = 35 diff --git a/clang/test/Analysis/legacy-inlining-prevention.c b/clang/test/Analysis/legacy-inlining-prevention.c new file mode 100644 index 0000000000000..976e2ba6ddb5d --- /dev/null +++ b/clang/test/Analysis/legacy-inlining-prevention.c @@ -0,0 +1,198 @@ +// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -verify=expected,default %s +// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config legacy-inlining-prevention=false -verify=expected,disabled %s + +int getNum(void); // Get an opaque number. + +void clang_analyzer_numTimesReached(void); +void clang_analyzer_dump(int arg); + +//----------------------------------------------------------------------------- +// Simple case: inlined function never reaches `analyzer-max-loop`. + +int inner_simple(void) { + clang_analyzer_numTimesReached(); // expected-warning {{2}} + return 42; +} + +int outer_simple(void) { + int x = inner_simple(); + int y = inner_simple(); + return 53 / (x - y); // expected-warning {{Division by zero}} +} + +//----------------------------------------------------------------------------- +// Inlined function always reaches `analyzer-max-loop`. + +int inner_fixed_loop_1(void) { + int i; + clang_analyzer_numTimesReached(); // expected-warning {{1}} + for (i = 0; i < 10; i++); + clang_analyzer_numTimesReached(); // no-warning + return 42; +} + +int outer_fixed_loop_1(void) { + int x = inner_fixed_loop_1(); + int y = inner_fixed_loop_1(); + return 53 / (x - y); // no-warning +} + +//----------------------------------------------------------------------------- +// Inlined function always reaches `analyzer-max-loop`; inlining is prevented +// even for different entry points. +// This test uses `clang_analyzer_dump` and distinct `arg` values because +// `clang_analyzer_numTimesReached` only counts the paths reaching that node +// during the analysis of one particular entry point, so it cannot distinguish +// "two entry points reached this, both with one path" (where the two reports +// are unified as duplicates by the generic report postprocessing) and "one +// entry point reached this with one path" (where naturally nothing shows that +// the second entry point _tried_ to reach it). + +int inner_fixed_loop_2(int arg) { + // Identical copy of inner_fixed_loop_1 + int i; + clang_analyzer_dump(arg); // expected-warning {{2}} + for (i = 0; i < 10; i++); + clang_analyzer_dump(arg); // no-warning + return 42; +} + +int outer_1_fixed_loop_2(void) { + return inner_fixed_loop_2(1); +} + +int outer_2_fixed_loop_2(void) { + return inner_fixed_loop_2(2); +} + +//----------------------------------------------------------------------------- +// Inlined function reaches `analyzer-max-loop` only in its second call. The +// function is inlined twice but the second call doesn't finish and ends up +// being conservatively evaluated. + +int inner_parametrized_loop_1(int count) { + int i; + clang_analyzer_numTimesReached(); // expected-warning {{2}} + for (i = 0; i < count; i++); + clang_analyzer_numTimesReached(); // expected-warning {{1}} + return 42; +} + +int outer_parametrized_loop_1(void) { + int x = inner_parametrized_loop_1(2); + int y = inner_parametrized_loop_1(10); + return 53 / (x - y); // no-warning +} + +//----------------------------------------------------------------------------- +// Inlined function reaches `analyzer-max-loop` on its first call, so the +// second call isn't inlined (although it could be fully evaluated). + +int inner_parametrized_loop_2(int count) { + int i; + clang_analyzer_numTimesReached(); // expected-warning {{1}} + for (i = 0; i < count; i++); + clang_analyzer_numTimesReached(); // no-warning + return 42; +} + +int outer_parametrized_loop_2(void) { + int y = inner_parametrized_loop_2(10); + int x = inner_parametrized_loop_2(2); + return 53 / (x - y); // no-warning +} + +//----------------------------------------------------------------------------- +// Inlined function may or may not reach `analyzer-max-loop` depending on an +// opaque check before the loop. This is very similar to the "fixed loop" +// cases: the function is placed on the "don't inline" list when any execution +// path reaches `analyzer-max-loop` (even if other execution paths reach the +// end of the function). + +int inner_conditional_loop(void) { + int i; + clang_analyzer_numTimesReached(); // expected-warning {{1}} + if (getNum() == 777) { + for (i = 0; i < 10; i++); + } + clang_analyzer_numTimesReached(); // expected-warning {{1}} + return 42; +} + +int outer_1_conditional_loop(void) { + return inner_conditional_loop(); +} + +int outer_2_conditional_loop(void) { + return inner_conditional_loop(); +} + +//----------------------------------------------------------------------------- +// Inlined function executes an opaque loop that may or may not reach +// `analyzer-max-loop`. Historically, before the "don't assume third iteration" +// commit (bb27d5e5c6b194a1440b8ac4e5ace68d0ee2a849) this worked like the +// `conditional_loop` cases: the analyzer was able to find a path reaching +// `analyzer-max-loop` so inlining was disabled. After that commit the analyzer +// does not _assume_ a third (or later) iteration (i.e. does not enter those +// iterations if the loop condition is an unknown value), so e.g. this test +// function does not reach `analyzer-max-loop` iterations and the inlining is +// not disabled. +// Unfortunately this change significantly increased the workload and +// runtime of the analyzer (more entry points used up their budget), so the +// option `legacy-inlining-prevention` was introduced and enabled by default to +// suppress the inlining in situations where the "don't assume third iteration" +// logic activates. +// This testcase demonstrate that the inlining is prevented with the default +// `legacy-inlining-prevention=true` config, but is not prevented when this +// option is disabled (set to false). + +int inner_opaque_loop_1(void) { + int i; + clang_analyzer_numTimesReached(); // default-warning {{1}} disabled-warning {{2}} + for (i = 0; i < getNum(); i++); + return i; +} + +int outer_opaque_loop_1(void) { + int iterCount = inner_opaque_loop_1(); + + // The first call to `inner_opaque_loop_1()` splits three execution paths that + // differ in the number of performed iterations (0, 1 or 2). The function + // `inner_opaque_loop_1` is added to the "do not inline this" list when the + // path that performed two iterations tries to enter the third iteration (and + // the "don't assume third iteration" logic prevents this) -- but the other + // two paths (which performed 0 and 1 iterations) would reach and inline the + // second `inner_opaque_loop_1()` before this would happen (because the + // default traversal is a complex heuristic that happens to prefer this). The + // following `if` will discard these "early exit" paths to highlight the + // difference between the default and disabled state: + if (iterCount < 2) + return 0; + + return inner_opaque_loop_1(); +} + +//----------------------------------------------------------------------------- +// Another less contrived testcase that demonstrates the difference between the +// enabled (default) and disabled state of `legacy-inlining-prevention`. +// Here the two calls to `inner_opaque_loop_2()` are in different entry points +// so the first call is fully analyzed (and can put the function on the "do +// not inline" list) before reaching the second call. +// This test uses `clang_analyzer_dump` because (as explained in an earlier +// comment block) `clang_analyzer_numTimesReached` is not suitable for counting +// visits from separate entry points. + +int inner_opaque_loop_2(int arg) { + int i; + clang_analyzer_dump(arg); // default-warning {{2}} + // disabled-warning@-1 {{1}} disabled-warning@-1 {{2}} + for (i = 0; i < getNum(); i++); + return i; +} + +int outer_1_opaque_loop_2(void) { + return inner_opaque_loop_2(1); +} +int outer_2_opaque_loop(void) { + return inner_opaque_loop_2(2); +} diff --git a/clang/test/Analysis/loop-unrolling.cpp b/clang/test/Analysis/loop-unrolling.cpp index bf05a7739ce48..ebae81e000c7a 100644 --- a/clang/test/Analysis/loop-unrolling.cpp +++ b/clang/test/Analysis/loop-unrolling.cpp @@ -1,5 +1,5 @@ -// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s -// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify -std=c++14 -DDFS=1 %s +// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify=expected,default -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s +// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify=expected,dfs -std=c++14 %s void clang_analyzer_numTimesReached(); void clang_analyzer_warnIfReached(); @@ -337,6 +337,7 @@ int nested_both_unrolled() { } int simple_known_bound_loop() { + // Iteration count visible: can be unrolled and fully executed. for (int i = 2; i < 12; i++) { // This function is inlined in nested_inlined_unroll1() clang_analyzer_numTimesReached(); // expected-warning {{90}} @@ -345,27 +346,42 @@ int simple_known_bound_loop() { } int simple_unknown_bound_loop() { + // Iteration count unknown: unrolling won't happen and the execution will be + // split two times: + // (1) split between skipped loop (immediate exit) and entering the loop + // (2) split between exit after 1 iteration and entering the second iteration + // After these there is no third state split because the "don't assume third + // iteration" logic in `ExprEngine::processBranch` prevents it; but the + // `legacy-inlining-prevention` logic will put this function onto the list of + // functions that may not be inlined in the future. + // The exploration strategy apparently influences the number of times this + // function can be inlined before it's placed on the "don't inline" list. for (int i = 2; i < getNum(); i++) { - clang_analyzer_numTimesReached(); // expected-warning {{8}} + clang_analyzer_numTimesReached(); // default-warning {{4}} dfs-warning {{8}} } return 0; } int nested_inlined_unroll1() { + // Here the analyzer can unroll and fully execute both the outer loop and the + // inner loop within simple_known_bound_loop(). int k; for (int i = 0; i < 9; i++) { clang_analyzer_numTimesReached(); // expected-warning {{9}} - k = simple_known_bound_loop(); // no reevaluation without inlining + k = simple_known_bound_loop(); } int a = 22 / k; // expected-warning {{Division by zero}} return 0; } int nested_inlined_no_unroll1() { + // Here no unrolling happens and we only run `analyzer-max-loop` (= 4) + // iterations of the loop within this function, but some state splits happen + // in `simple_unknown_bound_loop()` calls. int k; - for (int i = 0; i < 9; i++) { - clang_analyzer_numTimesReached(); // expected-warning {{10}} - k = simple_unknown_bound_loop(); // reevaluation without inlining, splits the state as well + for (int i = 0; i < 40; i++) { + clang_analyzer_numTimesReached(); // default-warning {{9}} dfs-warning {{12}} + k = simple_unknown_bound_loop(); } int a = 22 / k; // no-warning return 0; From d57946b9b26e9244f5d3819956ffc767d0e0b611 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com> Date: Thu, 24 Apr 2025 15:20:27 +0200 Subject: [PATCH 2/5] Use clang_analyzer_dump in tests Note that we need to expect e.g. {{2 S32}} because the pattern {{2}} would match any other 32-bit integer as well. The last two testcases in the file were testing the same thing in different ways, so this commit only keeps the one that used `clang_analyzer_dump` (which was previously an exception). --- .../Analysis/legacy-inlining-prevention.c | 125 +++++++----------- 1 file changed, 48 insertions(+), 77 deletions(-) diff --git a/clang/test/Analysis/legacy-inlining-prevention.c b/clang/test/Analysis/legacy-inlining-prevention.c index 976e2ba6ddb5d..022a5e9dca005 100644 --- a/clang/test/Analysis/legacy-inlining-prevention.c +++ b/clang/test/Analysis/legacy-inlining-prevention.c @@ -3,57 +3,55 @@ int getNum(void); // Get an opaque number. -void clang_analyzer_numTimesReached(void); void clang_analyzer_dump(int arg); //----------------------------------------------------------------------------- -// Simple case: inlined function never reaches `analyzer-max-loop`. +// Simple case: inlined function never reaches `analyzer-max-loop`, so it is +// always inlined. -int inner_simple(void) { - clang_analyzer_numTimesReached(); // expected-warning {{2}} +int inner_simple(int callIdx) { + clang_analyzer_dump(callIdx); // expected-warning {{1 S32}} + // expected-warning@-1 {{2 S32}} return 42; } int outer_simple(void) { - int x = inner_simple(); - int y = inner_simple(); + int x = inner_simple(1); + int y = inner_simple(2); return 53 / (x - y); // expected-warning {{Division by zero}} } //----------------------------------------------------------------------------- -// Inlined function always reaches `analyzer-max-loop`. +// Inlined function always reaches `analyzer-max-loop`, which stops the +// analysis on that path and puts the function on the "do not inline" list. -int inner_fixed_loop_1(void) { +int inner_fixed_loop_1(int callIdx) { int i; - clang_analyzer_numTimesReached(); // expected-warning {{1}} + clang_analyzer_dump(callIdx); // expected-warning {{1 S32}} for (i = 0; i < 10; i++); - clang_analyzer_numTimesReached(); // no-warning + clang_analyzer_dump(callIdx); // no-warning return 42; } int outer_fixed_loop_1(void) { - int x = inner_fixed_loop_1(); - int y = inner_fixed_loop_1(); + int x = inner_fixed_loop_1(1); + int y = inner_fixed_loop_1(2); return 53 / (x - y); // no-warning } //----------------------------------------------------------------------------- // Inlined function always reaches `analyzer-max-loop`; inlining is prevented // even for different entry points. -// This test uses `clang_analyzer_dump` and distinct `arg` values because -// `clang_analyzer_numTimesReached` only counts the paths reaching that node -// during the analysis of one particular entry point, so it cannot distinguish -// "two entry points reached this, both with one path" (where the two reports -// are unified as duplicates by the generic report postprocessing) and "one -// entry point reached this with one path" (where naturally nothing shows that -// the second entry point _tried_ to reach it). - -int inner_fixed_loop_2(int arg) { +// NOTE: the analyzer happens to analyze the entry points in a reversed order, +// so `outer_2_fixed_loop_2` is analyzed first and it will be the one which is +// able to inline the inner function. + +int inner_fixed_loop_2(int callIdx) { // Identical copy of inner_fixed_loop_1 int i; - clang_analyzer_dump(arg); // expected-warning {{2}} + clang_analyzer_dump(callIdx); // expected-warning {{2 S32}} for (i = 0; i < 10; i++); - clang_analyzer_dump(arg); // no-warning + clang_analyzer_dump(callIdx); // no-warning return 42; } @@ -72,9 +70,10 @@ int outer_2_fixed_loop_2(void) { int inner_parametrized_loop_1(int count) { int i; - clang_analyzer_numTimesReached(); // expected-warning {{2}} + clang_analyzer_dump(count); // expected-warning {{2 S32}} + // expected-warning@-1 {{10 S32}} for (i = 0; i < count; i++); - clang_analyzer_numTimesReached(); // expected-warning {{1}} + clang_analyzer_dump(count); // expected-warning {{2 S32}} return 42; } @@ -90,9 +89,9 @@ int outer_parametrized_loop_1(void) { int inner_parametrized_loop_2(int count) { int i; - clang_analyzer_numTimesReached(); // expected-warning {{1}} + clang_analyzer_dump(count); // expected-warning {{10 S32}} for (i = 0; i < count; i++); - clang_analyzer_numTimesReached(); // no-warning + clang_analyzer_dump(count); // no-warning return 42; } @@ -108,23 +107,28 @@ int outer_parametrized_loop_2(void) { // cases: the function is placed on the "don't inline" list when any execution // path reaches `analyzer-max-loop` (even if other execution paths reach the // end of the function). +// NOTE: This is tested with two separate entry points to ensure that one +// inlined call is fully evaluated before we try to inline the other call. +// NOTE: the analyzer happens to analyze the entry points in a reversed order, +// so `outer_2_conditional_loop` is analyzed first and it will be the one which +// is able to inline the inner function. -int inner_conditional_loop(void) { +int inner_conditional_loop(int callIdx) { int i; - clang_analyzer_numTimesReached(); // expected-warning {{1}} + clang_analyzer_dump(callIdx); // expected-warning {{2 S32}} if (getNum() == 777) { for (i = 0; i < 10; i++); } - clang_analyzer_numTimesReached(); // expected-warning {{1}} + clang_analyzer_dump(callIdx); // expected-warning {{2 S32}} return 42; } int outer_1_conditional_loop(void) { - return inner_conditional_loop(); + return inner_conditional_loop(1); } int outer_2_conditional_loop(void) { - return inner_conditional_loop(); + return inner_conditional_loop(2); } //----------------------------------------------------------------------------- @@ -142,57 +146,24 @@ int outer_2_conditional_loop(void) { // option `legacy-inlining-prevention` was introduced and enabled by default to // suppress the inlining in situations where the "don't assume third iteration" // logic activates. -// This testcase demonstrate that the inlining is prevented with the default -// `legacy-inlining-prevention=true` config, but is not prevented when this -// option is disabled (set to false). - -int inner_opaque_loop_1(void) { - int i; - clang_analyzer_numTimesReached(); // default-warning {{1}} disabled-warning {{2}} - for (i = 0; i < getNum(); i++); - return i; -} +// NOTE: This is tested with two separate entry points to ensure that one +// inlined call is fully evaluated before we try to inline the other call. +// NOTE: the analyzer happens to analyze the entry points in a reversed order, +// so `outer_2_opaque_loop` is analyzed first and it will be the one which is +// able to inline the inner function. -int outer_opaque_loop_1(void) { - int iterCount = inner_opaque_loop_1(); - - // The first call to `inner_opaque_loop_1()` splits three execution paths that - // differ in the number of performed iterations (0, 1 or 2). The function - // `inner_opaque_loop_1` is added to the "do not inline this" list when the - // path that performed two iterations tries to enter the third iteration (and - // the "don't assume third iteration" logic prevents this) -- but the other - // two paths (which performed 0 and 1 iterations) would reach and inline the - // second `inner_opaque_loop_1()` before this would happen (because the - // default traversal is a complex heuristic that happens to prefer this). The - // following `if` will discard these "early exit" paths to highlight the - // difference between the default and disabled state: - if (iterCount < 2) - return 0; - - return inner_opaque_loop_1(); -} - -//----------------------------------------------------------------------------- -// Another less contrived testcase that demonstrates the difference between the -// enabled (default) and disabled state of `legacy-inlining-prevention`. -// Here the two calls to `inner_opaque_loop_2()` are in different entry points -// so the first call is fully analyzed (and can put the function on the "do -// not inline" list) before reaching the second call. -// This test uses `clang_analyzer_dump` because (as explained in an earlier -// comment block) `clang_analyzer_numTimesReached` is not suitable for counting -// visits from separate entry points. - -int inner_opaque_loop_2(int arg) { +int inner_opaque_loop(int callIdx) { int i; - clang_analyzer_dump(arg); // default-warning {{2}} - // disabled-warning@-1 {{1}} disabled-warning@-1 {{2}} + clang_analyzer_dump(callIdx); // default-warning {{2 S32}} + // disabled-warning@-1 {{1 S32}} + // disabled-warning@-2 {{2 S32}} for (i = 0; i < getNum(); i++); return i; } -int outer_1_opaque_loop_2(void) { - return inner_opaque_loop_2(1); +int outer_1_opaque_loop(void) { + return inner_opaque_loop(1); } int outer_2_opaque_loop(void) { - return inner_opaque_loop_2(2); + return inner_opaque_loop(2); } From ba6a2cedb1110a0d91d897c3ff25ab557828dc99 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com> Date: Thu, 24 Apr 2025 18:28:31 +0200 Subject: [PATCH 3/5] Correct docs and comments --- .../clang/StaticAnalyzer/Core/AnalyzerOptions.def | 10 +++++----- clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 8 ++++---- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def index 0ac85b14147e0..bbdd68539c79b 100644 --- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def +++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def @@ -421,11 +421,11 @@ ANALYZER_OPTION( "perform 'analyzer-max-loop' (= 4 by default) iterations in a loop. " "Note that functions that _definitely_ reach the loop limit on some " "execution path are currently marked as \"do not inline\" even if this " - "option is disabled (but this may change in future versions). This option " - "is a dumb and arbitrary restriction on inlining, but disabling it would " - "significantly increase the analysis workload (and the time taken) " - "compared to older clang versions because more top-level functions can " - "use up their 'max-nodes' limit if inlining is not restricted.", + "option is disabled. This option is an arbitrary restriction on inlining, " + "but disabling this significantly increases the analysis workload (and " + "the time taken) compared to older clang versions because more top-level " + "functions can use up their 'max-nodes' budget if inlining is not " + "restricted.", true) //===----------------------------------------------------------------------===// diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp index 1ff36eb83dbe7..bdf904d9fd04e 100644 --- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -2586,12 +2586,12 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L, if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) { // FIXME: This will unconditionally prevent inlining this function (even - // from other entrypoints), which is not a reasonable heuristic: even if + // from other entry points), which is not a reasonable heuristic: even if // we reached max block count on this particular execution path, there // may be other execution paths (especially with other parametrizations) // where the analyzer can reach the end of the function (so there is no // natural reason to avoid inlining it). However, disabling this would - // significantly increase the analysis time (because more entrypoints + // significantly increase the analysis time (because more entry points // would exhaust their allocated budget), so it must be compensated by a // different (more reasonable) reduction of analysis scope. Engine.FunctionSummaries->markShouldNotInline( @@ -2876,14 +2876,14 @@ void ExprEngine::processBranch( if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) { Builder.generateNode(StTrue, true, PredN); } else if (AMgr.options.LegacyInliningPrevention) { - // FIXME: There is an ancient and very arbitrary heuristic in + // FIXME: There is an ancient and arbitrary heuristic in // `ExprEngine::processCFGBlockEntrance` which prevents all further // inlining of a function if it finds an execution path within that // function which reaches the `MaxBlockVisitOnPath` limit (a/k/a // `analyzer-max-loop`, by default four iterations in a loop). Adding // this "don't assume third iteration" logic significantly increased // the analysis runtime on some inputs because less functions were - // arbitrarily excluded from being inlined, so more entrypoints used + // arbitrarily excluded from being inlined, so more entry points used // up their full allocated budget. As a hacky compensation for this, // here we apply the "should not inline" mark in cases when the loop // could potentially reach the `MaxBlockVisitOnPath` limit without the From 4140f53765874dcc1bea239ee27f6796c77a41cc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com> Date: Fri, 25 Apr 2025 13:08:33 +0200 Subject: [PATCH 4/5] Rename the new test file --- ...acy-inlining-prevention.c => loop-based-inlining-prevention.c} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename clang/test/Analysis/{legacy-inlining-prevention.c => loop-based-inlining-prevention.c} (100%) diff --git a/clang/test/Analysis/legacy-inlining-prevention.c b/clang/test/Analysis/loop-based-inlining-prevention.c similarity index 100% rename from clang/test/Analysis/legacy-inlining-prevention.c rename to clang/test/Analysis/loop-based-inlining-prevention.c From 6c238f8cff73dad26db529d92d4adaaf8be1d06e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com> Date: Fri, 25 Apr 2025 13:09:13 +0200 Subject: [PATCH 5/5] Document loop handling limitations that are required by the inlining prevention --- .../Analysis/loop-based-inlining-prevention.c | 37 +++++++++++++++++-- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/clang/test/Analysis/loop-based-inlining-prevention.c b/clang/test/Analysis/loop-based-inlining-prevention.c index 022a5e9dca005..678cac8ccad35 100644 --- a/clang/test/Analysis/loop-based-inlining-prevention.c +++ b/clang/test/Analysis/loop-based-inlining-prevention.c @@ -1,6 +1,28 @@ // RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -verify=expected,default %s // RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config legacy-inlining-prevention=false -verify=expected,disabled %s +// This file tests some heuristics in the engine that put functions on a +// "do not inline" list if their analyisis reaches the `analyzer-max-loop` +// limit (by default 4 iterations) in a loop. This was almost surely intended +// as memoization optimization for the "retry without inlining" fallback (if we +// had to retry once, next time don't even try inlining), but aggressively +// oversteps the "natural" scope: reaching 4 iterations on _one particular_ +// execution path does not imply that each path would need "retry without +// inlining" especially if a different call receives different arguments. +// +// This heuristic significantly affects the scope/depth of the analysis (and +// therefore the execution time) because without this limitation on the +// inlining significantly more entry points would be able to exhaust their +// `max-nodes` quota. (Trivial thin wrappers around big complex functions are +// common in many projects.) +// +// Unfortunately, this arbitrary heuristic strongly relies on the current loop +// handling model and its many limitations, so improvements in loop handling +// can cause surprising slowdowns by reducing the "do not inline" blacklist. +// In the tests "FIXME-BUT-NEEDED" comments mark "problematic" (aka buggy) +// analyzer behavior which cannot be fixed without also improving the +// heuristics for (not) inlining large functions. + int getNum(void); // Get an opaque number. void clang_analyzer_dump(int arg); @@ -28,7 +50,7 @@ int outer_simple(void) { int inner_fixed_loop_1(int callIdx) { int i; clang_analyzer_dump(callIdx); // expected-warning {{1 S32}} - for (i = 0; i < 10; i++); + for (i = 0; i < 10; i++); // FIXME-BUT-NEEDED: This stops the analysis. clang_analyzer_dump(callIdx); // no-warning return 42; } @@ -36,6 +58,8 @@ int inner_fixed_loop_1(int callIdx) { int outer_fixed_loop_1(void) { int x = inner_fixed_loop_1(1); int y = inner_fixed_loop_1(2); + + // FIXME-BUT-NEEDED: The analysis doesn't reach this zero division. return 53 / (x - y); // no-warning } @@ -47,10 +71,10 @@ int outer_fixed_loop_1(void) { // able to inline the inner function. int inner_fixed_loop_2(int callIdx) { - // Identical copy of inner_fixed_loop_1 + // Identical copy of inner_fixed_loop_1. int i; clang_analyzer_dump(callIdx); // expected-warning {{2 S32}} - for (i = 0; i < 10; i++); + for (i = 0; i < 10; i++); // FIXME-BUT-NEEDED: This stops the analysis. clang_analyzer_dump(callIdx); // no-warning return 42; } @@ -73,6 +97,7 @@ int inner_parametrized_loop_1(int count) { clang_analyzer_dump(count); // expected-warning {{2 S32}} // expected-warning@-1 {{10 S32}} for (i = 0; i < count; i++); + // FIXME-BUT-NEEDED: This loop stops the analysis when count >=4. clang_analyzer_dump(count); // expected-warning {{2 S32}} return 42; } @@ -80,6 +105,8 @@ int inner_parametrized_loop_1(int count) { int outer_parametrized_loop_1(void) { int x = inner_parametrized_loop_1(2); int y = inner_parametrized_loop_1(10); + + // FIXME-BUT-NEEDED: The analysis doesn't reach this zero division. return 53 / (x - y); // no-warning } @@ -88,9 +115,11 @@ int outer_parametrized_loop_1(void) { // second call isn't inlined (although it could be fully evaluated). int inner_parametrized_loop_2(int count) { + // Identical copy of inner_parametrized_loop_1. int i; clang_analyzer_dump(count); // expected-warning {{10 S32}} for (i = 0; i < count; i++); + // FIXME-BUT-NEEDED: This loop stops the analysis when count >=4. clang_analyzer_dump(count); // no-warning return 42; } @@ -98,6 +127,8 @@ int inner_parametrized_loop_2(int count) { int outer_parametrized_loop_2(void) { int y = inner_parametrized_loop_2(10); int x = inner_parametrized_loop_2(2); + + // FIXME-BUT-NEEDED: The analysis doesn't reach this zero division. return 53 / (x - y); // no-warning } _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits