whisperity added inline comments.
================
Comment at: clang-tools-extra/clang-tidy/misc/DiscardedReturnValueCheck.cpp:181
+
+  static const auto Decltype = decltypeType(hasUnderlyingExpr(Call));
+  static const auto TemplateArg =
----------------
whisperity wrote:
> aaron.ballman wrote:
> > aaron.ballman wrote:
> > > whisperity wrote:
> > > > whisperity wrote:
> > > > > whisperity wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > So, I'm not super keen on this approach of having to try to identify every single place in which an expression is considered to be "used" -- this is going to be fragile because we'll miss places, and it's going to be a maintenance burden because new places will be added as the languages evolve.
> > > > > > >
> > > > > > > For example, if we're handling `decltype` as a use, why not `noexcept`? Or conditional `explicit`? What about a `co_return` statement?
> > > > > > >
> > > > > > > I'm not certain what we can do to improve this, but I think it's worth exploring options to see if we can generalize what constitutes a use, so that we can write a few custom matchers to do the heavy lifting instead of trying to play whack-a-mole.
> > > > > > I've been having other thoughts about this `decltype` here... Actually, neither `decltype` nor `noexcept` should be handled as a //"use"// at all, while `co_return` should be treated the same as a `return` -- however, I think the reason implementing that failed was the lack of projects where such a missed case could be meaningfully measured.
> > > > > >
> > > > > > For `decltype`, `typedef`, and `noexcept` (and perhaps several others), the good solution would be having a third route: calls that //should not be counted//. Neither as a "consumed call", nor as a "bare call". Ignored, from both calculations. Maybe even for template arguments below.
> > > > > As for better matching... Unfortunately, types in the AST are so varied, and `hasDescendant` is too generic, for something like `stmt(anyOf(ifStmt(), forStmt(), switchStmt()), hasDescendant(Call))` to express uses in a single matcher expression. The conditions are not always direct children of the outer node, while `hasDescendant` will match not just the condition but the entire subtree, resulting in things like //both// calls in
> > > > >
> > > > > ```lang=cpp
> > > > > if (foo())
> > > > >   bar()
> > > > > ```
> > > > >
> > > > > matching.
> > > > >
> > > > > Well... generalisation... I can throw in a formal fluke:
> > > > >
> > > > > > A **use** is a //context// for a specific `CallExpr C` in which we can reasonably assume that the value produced by evaluating `C` is loaded by another expression.
> > > > >
> > > > > Now what I found is `-Wunused-result`, aka `SemaDiagnostics::warn_unused_expr`, which is triggered in the function `ExprResult Sema::ActOnFinishFullExpr(Expr* FE, SourceLocation CC, bool DiscardedValue, bool IsConstexpr);`. This function itself does //some// heuristics inside (with a **lot** of `FIXME`s as of rGdab5e10ea5dbc2e6314e0e7ce54a9c51fbcb44bd), but notably, `DiscardedValue` is a parameter.
> > > > > According to a quick search, this function (and its overloads) has **82** call sites within `Sema`, many of them just tougher to decipher than others. Some of the other places this function is called from, e.g. `ActOnStmtExprResult`, have code like this:
> > > > >
> > > > > ```lang=cpp
> > > > > IsStmtExprResult = GetLookAheadToken(LookAhead).is(tok::r_brace) &&
> > > > >                    GetLookAheadToken(LookAhead + 1).is(tok::r_paren);
> > > > > ```
> > > > >
> > > > > So I would say most of the logic there is **very** parsing-specific, and requires information that is only available during the parsing descent, and not later when someone tries to consume a `const AST`.
> > > > @aaron.ballman There is a `bugprone-unused-return-value` since mid 2018, in which the matched function set is configurable with a hardcoded default, and the matching logic is also... verbose.
> > > >
> > > > [[ http://github.com/llvm/llvm-project/blob/c1a9d14982f887355da1959eba3a47b952fc6e7a/clang-tools-extra/clang-tidy/bugprone/UnusedReturnValueCheck.cpp#L144-L165 | Source ]]
> > > >
> > > > ```lang=cpp
> > > > auto UnusedInIfStmt =
> > > >     ifStmt(eachOf(hasThen(MatchedCallExpr), hasElse(MatchedCallExpr)));
> > > > auto UnusedInWhileStmt = whileStmt(hasBody(MatchedCallExpr));
> > > > auto UnusedInDoStmt = doStmt(hasBody(MatchedCallExpr));
> > > > ```
> > > >
> > > > Agreed, this is seemingly a subset of the inverse match.
> > > >
> > > > For decltype, typedef, and noexcept (and perhaps several others), the good solution would be having a third route: calls that should not be counted. Neither as a "consumed call", nor as a "bare call". Ignored, from both calculations. Maybe even for template arguments below.
> > > I agree, these cases seem like they're not a use in the sense of discarding the return value.
> > > @aaron.ballman There is a bugprone-unused-return-value since mid 2018, in which the matched function set is configurable with a hardcoded default, and the matching logic is also... verbose.
> > Yeah, but this one is even more verbose than that one.
> >
> > That said, I'm really not seeing a better approach to this problem. I'd be very curious how the performance of this check compares to that of the `bugprone-unused-return-value` check, and to running no checks at all. Maybe the performance spread isn't nearly as bad as I'm worried?
> While I am not familiar with all the intricacies, I know Clang-Tidy does a lot of memoisation for matches. At some point I really need to dig into it to understand it, but it might be reasonable to say that there is some caching for the matcher expressions.
>
> I do not have a direct comparison ready against that check, but for @bahramib's thesis work (which he is putting together right now, hence why I'm responding 🙂) I have done the usual CI runs to gather data, and the run times for the analysis (at `-j 16`) are as follows. (The `50%` and `80%` are the threshold values.)
>
> ```
> bitcoin_v0.20.1 50% (single) : 0:01:02 - 140 reports
> bitcoin_v0.20.1 80% (single) : 0:00:29 - 25 reports
> codechecker_v6.17.0 50% (single) : 0:00:00 - 2 reports
> codechecker_v6.17.0 80% (single) : 0:00:00 - 0 reports
> contour_v0.2.0.173 50% (single) : 0:00:41 - 38 reports
> contour_v0.2.0.173 80% (single) : 0:00:26 - 10 reports
> curl_curl-7 50% (single) : 0:00:09 - 55 reports
> curl_curl-7 80% (single) : 0:00:09 - 6 reports
> ffmpeg_n4.3.1 50% (single) : 0:00:28 - 904 reports
> ffmpeg_n4.3.1 80% (single) : 0:00:27 - 148 reports
> libwebm_libwebm-1.0.0.27 50% (single) : 0:00:00 - 9 reports
> libwebm_libwebm-1.0.0.27 80% (single) : 0:00:00 - 0 reports
> llvm-project_llvmorg-12.0.0 50% (single) : 0:15:16 - 4615 reports
> llvm-project_llvmorg-12.0.0 80% (single) : 0:14:34 - 1077 reports
> memcached_1.6.8 50% (single) : 0:00:00 - 32 reports
> memcached_1.6.8 80% (single) : 0:00:00 - 5 reports
> mongo_r4.4.6 50% (single) : 0:20:23 - 1671 reports
> mongo_r4.4.6 80% (single) : 0:20:13 - 370 reports
> openssl_openssl-3.0.0-alpha7 50% (single) : 0:01:28 - 427 reports
> openssl_openssl-3.0.0-alpha7 80% (single) : 0:00:41 - 56 reports
> postgres_REL 50% (single) : 0:00:29 - 799 reports
> postgres_REL 80% (single) : 0:00:21 - 62 reports
> protobuf_v3.13.0-CSA-patched 50% (single) : 0:00:30 - 51 reports
> protobuf_v3.13.0-CSA-patched 80% (single) : 0:00:20 - 20 reports
> qtbase_v6.2.0 50% (single) : 0:04:12 - 1195 reports
> qtbase_v6.2.0 80% (single) : 0:04:15 - 200 reports
> sqlite_version-3.33.0 50% (single) : 0:00:05 - 284 reports
> sqlite_version-3.33.0 80% (single) : 0:00:02 - 42 reports
> tmux_2.6 50% (single) : 0:00:02 - 35 reports
> tmux_2.6 80% (single) : 0:00:01 - 1 reports
> twin_v0.8.1 50% (single) : 0:00:01 - 54 reports
> twin_v0.8.1 80% (single) : 0:00:01 - 9 reports
> vim_v8.2.1920 50% (single) : 0:00:02 - 227 reports
> vim_v8.2.1920 80% (single) : 0:00:03 - 32 reports
> xerces_v3.2.3 50% (single) : 0:00:08 - 125 reports
> xerces_v3.2.3 80% (single) : 0:00:08 - 9 reports
> ```
>
> -----
>
> For completeness, here is the same report for the "multi TU" or "project level" mode. This measurement **only** includes the time it took for the `diagnose` phase to match, load the persistence, and emit diagnostics, and **DOES NOT** include the "collect" phase! However, it must be noted that the diagnose phase of this check only deals with printing diagnostics for matches, and no longer "counts", hence sometimes the numbers are lower than in the single-TU mode.
>
> ```
> bitcoin_v0.20.1 50% (multi) : 0:00:50 - 320 reports
> bitcoin_v0.20.1 80% (multi) : 0:00:31 - 92 reports
> codechecker_v6.17.0 50% (multi) : 0:00:00 - 5 reports
> codechecker_v6.17.0 80% (multi) : 0:00:00 - 0 reports
> contour_v0.2.0.173 50% (multi) : 0:00:27 - 107 reports
> contour_v0.2.0.173 80% (multi) : 0:00:27 - 8 reports
> curl_curl-7 50% (multi) : 0:00:09 - 210 reports
> curl_curl-7 80% (multi) : 0:00:10 - 104 reports
> ffmpeg_n4.3.1 50% (multi) : 0:00:32 - 1790 reports
> ffmpeg_n4.3.1 80% (multi) : 0:00:31 - 613 reports
> libwebm_libwebm-1.0.0.27 50% (multi) : 0:00:00 - 10 reports
> libwebm_libwebm-1.0.0.27 80% (multi) : 0:00:01 - 1 reports
> llvm-project_llvmorg-12.0.0 50% (multi) : 0:16:58 - 6837 reports
> llvm-project_llvmorg-12.0.0 80% (multi) : 0:16:54 - 2150 reports
> memcached_1.6.8 50% (multi) : 0:00:00 - 49 reports
> memcached_1.6.8 80% (multi) : 0:00:01 - 16 reports
> mongo_r4.4.6 50% (multi) : 0:21:58 - 2463 reports
> mongo_r4.4.6 80% (multi) : 0:22:26 - 1223 reports
> openssl_openssl-3.0.0-alpha7 50% (multi) : 0:01:33 - 1060 reports
> openssl_openssl-3.0.0-alpha7 80% (multi) : 0:01:05 - 244 reports
> postgres_REL 50% (multi) : 0:00:26 - 644 reports
> postgres_REL 80% (multi) : 0:00:31 - 215 reports
> protobuf_v3.13.0-CSA-patched 50% (multi) : 0:00:21 - 122 reports
> protobuf_v3.13.0-CSA-patched 80% (multi) : 0:00:20 - 75 reports
> qtbase_v6.2.0 50% (multi) : 0:04:27 - 2764 reports
> qtbase_v6.2.0 80% (multi) : 0:04:09 - 1093 reports
> sqlite_version-3.33.0 50% (multi) : 0:00:05 - 288 reports
> sqlite_version-3.33.0 80% (multi) : 0:00:03 - 48 reports
> tmux_2.6 50% (multi) : 0:00:01 - 48 reports
> tmux_2.6 80% (multi) : 0:00:05 - 1 reports
> twin_v0.8.1 50% (multi) : 0:00:01 - 69 reports
> twin_v0.8.1 80% (multi) : 0:00:03 - 22 reports
> vim_v8.2.1920 50% (multi) : 0:00:03 - 383 reports
> vim_v8.2.1920 80% (multi) : 0:00:06 - 54 reports
> xerces_v3.2.3 50% (multi) : 0:00:09 - 197 reports
> xerces_v3.2.3 80% (multi) : 0:00:23 - 38 reports
> ```

Also, the aforementioned analyses were done running **only** this check, and only Clang-Tidy. 🙂 I think I can put in a measurement job in other configurations, but I need to handshake with the others first, because the measurement CI is a sensitive shared resource.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124446/new/

https://reviews.llvm.org/D124446

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
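-----

[Editor's note] To make the matcher discussion in the quoted thread more concrete, here is a minimal, hedged sketch of the "condition-only" matching idea, i.e. how a use in a control-flow condition could be matched without `hasDescendant` over-matching (where both `foo()` and `bar()` in `if (foo()) bar();` would count). This is an editorial illustration, not code from D124446 or `bugprone-unused-return-value`; the `Call` binding name is assumed for the example.

```lang=cpp
// Illustrative sketch only: treat a call as "used" when it appears (possibly
// behind parentheses or implicit casts) as the condition of a control-flow
// statement, rather than anywhere in the statement's subtree.
#include "clang/ASTMatchers/ASTMatchers.h"

using namespace clang::ast_matchers;

// The call expression whose result we want to see consumed.
static const auto Call = callExpr().bind("call");

// Matches only when the bound call is the condition itself, so a call in the
// body (e.g. `bar()` in `if (foo()) bar();`) is not considered a use.
static const auto UsedAsCondition = stmt(anyOf(
    ifStmt(hasCondition(ignoringParenImpCasts(Call))),
    whileStmt(hasCondition(ignoringParenImpCasts(Call))),
    doStmt(hasCondition(ignoringParenImpCasts(Call))),
    forStmt(hasCondition(ignoringParenImpCasts(Call))),
    switchStmt(hasCondition(ignoringParenImpCasts(Call)))));
```

Such a matcher would still have to be extended for the other contexts discussed above (return statements, call arguments, initializers, and the "ignored" third route for `decltype`/`noexcept`); it only illustrates how the condition case can be expressed without `hasDescendant`.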