whisperity added inline comments.

================
Comment at: clang-tools-extra/clang-tidy/misc/DiscardedReturnValueCheck.cpp:181
+
+  static const auto Decltype = decltypeType(hasUnderlyingExpr(Call));
+  static const auto TemplateArg =
----------------
whisperity wrote:
> aaron.ballman wrote:
> > aaron.ballman wrote:
> > > whisperity wrote:
> > > > whisperity wrote:
> > > > > whisperity wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > So, I'm not super keen on this approach of having to try to 
> > > > > > > identify every single place in which an expression is considered 
> > > > > > > to be "used" -- this is going to be fragile because we'll miss 
> > > > > > > places and it's going to be a maintenance burden because new 
> > > > > > > places will be added as the languages evolve.
> > > > > > > 
> > > > > > > For example, if we're handling `decltype` as a use, why not 
> > > > > > > `noexcept`? Or conditional `explicit`? What about a `co_return` 
> > > > > > > statement?
> > > > > > > 
> > > > > > > I'm not certain what we can do to improve this, but I think it's 
> > > > > > > worth trying to explore options to see if we can generalize what 
> > > > > > > constitutes a use so that we can write a few custom matchers to 
> > > > > > > do the heavy lifting instead of trying to play whack-a-mole.
> > > > > > I've been having other thoughts about this `decltype` here... 
> > > > > > Actually, neither `decltype` nor `noexcept` should be handled as a 
> > > > > > //"use"// at all, while `co_return` should be the same as a 
> > > > > > `return` -- however, I think it was due to lack of projects where 
> > > > > > such could be meaningfully measured as a missed case was why 
> > > > > > implementing that failed.
> > > > > > 
> > > > > > For `decltype`, `typedef`, and `noexcept` (and perhaps several 
> > > > > > others), the good solution would be having a third route: calls 
> > > > > > that //should not be counted//. Neither as a "consumed call", nor 
> > > > > > as a "bare call". Ignored, from both calculations. Maybe even for 
> > > > > > template arguments below.
> > > > > As for better matching... Unfortunately, types in the AST are so 
> > > > > varied and `hasDescendant` is too generic to express something like 
> > > > > `stmt(anyOf(ifStmt(), forStmt(), switchStmt()), hasDescendant(Call))` 
> > > > > to express in a single expression matching uses... The conditions are 
> > > > > not always direct children of the outer node, while `hasDescendant` 
> > > > > will match not just the condition but the entire tree... resulting in 
> > > > > things like //both// functions in
> > > > > 
> > > > > ```lang=cpp
> > > > > if (foo())
> > > > >   bar()
> > > > > ```
> > > > > 
> > > > > matching.
> > > > > 
> > > > > Well... generalisation... I can throw in a formal fluke:
> > > > > 
> > > > > > A **use** is a //context// for a specific `CallExpr C` in which we 
> > > > > > can reasonably assume that the value produced by evaluating `C` is 
> > > > > > loaded by another expression.
> > > > > 
> > > > > Now what I found is `-Wunused-result`, aka 
> > > > > `SemaDiagnostics::warn_unused_expr`, which is triggered in the 
> > > > > function `ExprResult Sema::ActOnFinishFullExpr(Expr* FE, 
> > > > > SourceLocation CC, bool DiscardedValue, bool IsConstexpr);`. Now this 
> > > > > function itself does //some// heuristics inside (with a **lot** of 
> > > > > `FIXME`s as of rGdab5e10ea5dbc2e6314e0e7ce54a9c51fbcb44bd), but 
> > > > > notably, `DiscardedValue` is a parameter. According to a quick 
> > > > > search, this function (and its overloads) have **82** callsites 
> > > > > within `Sema`, with many of them just tougher to decipher than 
> > > > > others. Some of the other ways this function is called, e.g. 
> > > > > `ActOnStmtExprResult`, have codes like this:
> > > > > 
> > > > > ```lang=cpp
> > > > > IsStmtExprResult = GetLookAheadToken(LookAhead).is(tok::r_brace) && 
> > > > > GetLookAheadToken(LookAhead + 1).is(tok::r_paren);
> > > > > ```
> > > > > 
> > > > > So I would say most of the logic there is **very** parsing specific, 
> > > > > and requires information that is only available during the parsing 
> > > > > descent, and not later when someone tries to consume a `const AST`.
> > > > @aaron.ballman There is a `bugprone-unused-return-value` since mid 
> > > > 2018, in which the matched function set is configurable with a 
> > > > hardcoded default, and the matching logic is also... verbose.
> > > > 
> > > > [[ 
> > > > http://github.com/llvm/llvm-project/blob/c1a9d14982f887355da1959eba3a47b952fc6e7a/clang-tools-extra/clang-tidy/bugprone/UnusedReturnValueCheck.cpp#L144-L165
> > > >  | Source ]]
> > > > 
> > > > ```lang=cpp
> > > > auto UnusedInIfStmt =
> > > >       ifStmt(eachOf(hasThen(MatchedCallExpr), 
> > > > hasElse(MatchedCallExpr)));
> > > >   auto UnusedInWhileStmt = whileStmt(hasBody(MatchedCallExpr));
> > > >   auto UnusedInDoStmt = doStmt(hasBody(MatchedCallExpr));
> > > > ```
> > > > 
> > > > Agreed, this is seemingly a subset of the inverse match.
> > > > For decltype, typedef, and noexcept (and perhaps several others), the 
> > > > good solution would be having a third route: calls that should not be 
> > > > counted. Neither as a "consumed call", nor as a "bare call". Ignored, 
> > > > from both calculations. Maybe even for template arguments below.
> > > 
> > > I agree, these cases seem like they're not a use in the sense of 
> > > discarding the return value.
> > > @aaron.ballman There is a bugprone-unused-return-value since mid 2018, in 
> > > which the matched function set is configurable with a hardcoded default, 
> > > and the matching logic is also... verbose.
> > 
> > Yeah, but this one is even more verbose than that one.
> > 
> > That said, I'm really not seeing a better approach to this problem. I'd be 
> > very curious how the performance of this check compares to the performance 
> > of the `bugprone-unused-return-value` check compared to running no checks. 
> > Maybe the performance spread isn't nearly as bad as I'm worried?
> While I am not familiar with all the intricacies, I know Clang-Tidy does a 
> lot of memoisation for matches. At some point I really need to dig into it to 
> understand it, but it might be reasonable to say that there are some caching 
> for the matcher expressions.
> 
> I do not have a direct comparison ready against that check, but for 
> @bahramib's thesis work (which he is putting together right now, hence why 
> I'm responding 🙂) I have done the usual CI runs to gather data, and the run 
> times for the analysis (at `-j 16`) are. (The `50%` and `80%` are the 
> threshold values.)
> 
> ```
> bitcoin_v0.20.1 50% (single) : 0:01:02 - 140 reports
> bitcoin_v0.20.1 80% (single) : 0:00:29 - 25 reports
> codechecker_v6.17.0 50% (single) : 0:00:00 - 2 reports
> codechecker_v6.17.0 80% (single) : 0:00:00 - 0 reports
> contour_v0.2.0.173 50% (single) : 0:00:41 - 38 reports
> contour_v0.2.0.173 80% (single) : 0:00:26 - 10 reports
> curl_curl-7 50% (single) : 0:00:09 - 55 reports
> curl_curl-7 80% (single) : 0:00:09 - 6 reports
> ffmpeg_n4.3.1 50% (single) : 0:00:28 - 904 reports
> ffmpeg_n4.3.1 80% (single) : 0:00:27 - 148 reports
> libwebm_libwebm-1.0.0.27 50% (single) : 0:00:00 - 9 reports
> libwebm_libwebm-1.0.0.27 80% (single) : 0:00:00 - 0 reports
> llvm-project_llvmorg-12.0.0 50% (single) : 0:15:16 - 4615 reports
> llvm-project_llvmorg-12.0.0 80% (single) : 0:14:34 - 1077 reports
> memcached_1.6.8 50% (single) : 0:00:00 - 32 reports
> memcached_1.6.8 80% (single) : 0:00:00 - 5 reports
> mongo_r4.4.6 50% (single) : 0:20:23 - 1671 reports
> mongo_r4.4.6 80% (single) : 0:20:13 - 370 reports
> openssl_openssl-3.0.0-alpha7 50% (single) : 0:01:28 - 427 reports
> openssl_openssl-3.0.0-alpha7 80% (single) : 0:00:41 - 56 reports
> postgres_REL 50% (single) : 0:00:29 - 799 reports
> postgres_REL 80% (single) : 0:00:21 - 62 reports
> protobuf_v3.13.0-CSA-patched 50% (single) : 0:00:30 - 51 reports
> protobuf_v3.13.0-CSA-patched 80% (single) : 0:00:20 - 20 reports
> qtbase_v6.2.0 50% (single) : 0:04:12 - 1195 reports
> qtbase_v6.2.0 80% (single) : 0:04:15 - 200 reports
> sqlite_version-3.33.0 50% (single) : 0:00:05 - 284 reports
> sqlite_version-3.33.0 80% (single) : 0:00:02 - 42 reports
> tmux_2.6 50% (single) : 0:00:02 - 35 reports
> tmux_2.6 80% (single) : 0:00:01 - 1 reports
> twin_v0.8.1 50% (single) : 0:00:01 - 54 reports
> twin_v0.8.1 80% (single) : 0:00:01 - 9 reports
> vim_v8.2.1920 50% (single) : 0:00:02 - 227 reports
> vim_v8.2.1920 80% (single) : 0:00:03 - 32 reports
> xerces_v3.2.3 50% (single) : 0:00:08 - 125 reports
> xerces_v3.2.3 80% (single) : 0:00:08 - 9 reports
> ```
> 
> -----
> 
> For completeness, here is the same report for the "multi TU" or "project 
> level" mode. This measurement **only** includes the time it took for the 
> `diagnose` phase to match, load the persistence, and emit diagnostics, and 
> **DOES NOT** include the "collect" phase! However, it must be noted that the 
> diagnose phase of this check only deals with printing diagnostics for 
> matches, and no longer "counts", hence sometimes the numbers are lower than 
> in the single-TU mode.
> 
> ```
> bitcoin_v0.20.1 50% (multi) : 0:00:50 - 320 reports
> bitcoin_v0.20.1 80% (multi) : 0:00:31 - 92 reports
> codechecker_v6.17.0 50% (multi) : 0:00:00 - 5 reports
> codechecker_v6.17.0 80% (multi) : 0:00:00 - 0 reports
> contour_v0.2.0.173 50% (multi) : 0:00:27 - 107 reports
> contour_v0.2.0.173 80% (multi) : 0:00:27 - 8 reports
> curl_curl-7 50% (multi) : 0:00:09 - 210 reports
> curl_curl-7 80% (multi) : 0:00:10 - 104 reports
> ffmpeg_n4.3.1 50% (multi) : 0:00:32 - 1790 reports
> ffmpeg_n4.3.1 80% (multi) : 0:00:31 - 613 reports
> libwebm_libwebm-1.0.0.27 50% (multi) : 0:00:00 - 10 reports
> libwebm_libwebm-1.0.0.27 80% (multi) : 0:00:01 - 1 reports
> llvm-project_llvmorg-12.0.0 50% (multi) : 0:16:58 - 6837 reports
> llvm-project_llvmorg-12.0.0 80% (multi) : 0:16:54 - 2150 reports
> memcached_1.6.8 50% (multi) : 0:00:00 - 49 reports
> memcached_1.6.8 80% (multi) : 0:00:01 - 16 reports
> mongo_r4.4.6 50% (multi) : 0:21:58 - 2463 reports
> mongo_r4.4.6 80% (multi) : 0:22:26 - 1223 reports
> openssl_openssl-3.0.0-alpha7 50% (multi) : 0:01:33 - 1060 reports
> openssl_openssl-3.0.0-alpha7 80% (multi) : 0:01:05 - 244 reports
> postgres_REL 50% (multi) : 0:00:26 - 644 reports
> postgres_REL 80% (multi) : 0:00:31 - 215 reports
> protobuf_v3.13.0-CSA-patched 50% (multi) : 0:00:21 - 122 reports
> protobuf_v3.13.0-CSA-patched 80% (multi) : 0:00:20 - 75 reports
> qtbase_v6.2.0 50% (multi) : 0:04:27 - 2764 reports
> qtbase_v6.2.0 80% (multi) : 0:04:09 - 1093 reports
> sqlite_version-3.33.0 50% (multi) : 0:00:05 - 288 reports
> sqlite_version-3.33.0 80% (multi) : 0:00:03 - 48 reports
> tmux_2.6 50% (multi) : 0:00:01 - 48 reports
> tmux_2.6 80% (multi) : 0:00:05 - 1 reports
> twin_v0.8.1 50% (multi) : 0:00:01 - 69 reports
> twin_v0.8.1 80% (multi) : 0:00:03 - 22 reports
> vim_v8.2.1920 50% (multi) : 0:00:03 - 383 reports
> vim_v8.2.1920 80% (multi) : 0:00:06 - 54 reports
> xerces_v3.2.3 50% (multi) : 0:00:09 - 197 reports
> xerces_v3.2.3 80% (multi) : 0:00:23 - 38 reports
> ```
Also, the aforementioned analyses were done with running **only** this check, 
and only Clang-Tidy. 🙂  I think I can put in a measurement job in other 
configurations but I need to first handshake with the others because the 
measurement CI is a sensitive shared resource.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124446/new/

https://reviews.llvm.org/D124446

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to