=?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com>, =?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com>, =?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com>, =?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com>, =?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com>, =?utf-8?q?Donát?= Nagy <donat.n...@ericsson.com> Message-ID: In-Reply-To: <llvm.org/llvm/llvm-project/pull/139...@github.com>
steakhal wrote: > For a moment I was very happy to see this solution, but unfortunately XXX is > **not** the name of the checker class (= backend = full family), but instead > it is just yet another kind of "internal" name that is assigned to each > individual checker (= frontend = one checker from the family). > > You're right that for the simple single-part checkers this kind of name > usually coincides with the checker class name, but it is still not a single > name for the full checker family. For example consider the class > `DivZeroChecker` which implements `core.DivideZero` and > `optin.taint.TaintedDiv` -- this is introduced in `Checkers.td` as > > ``` > let ParentPackage = Core in { > //... > def DivZeroChecker : Checker<"DivideZero">, > HelpText<"Check for division by zero">, > Documentation<HasDocumentation>; > //... > } > let ParentPackage = TaintOptIn in { > //... > def TaintedDivChecker: Checker<"TaintedDiv">, > HelpText<"Check for divisions where the denominator is tainted " > "(attacker controlled) and might be 0.">, > Dependencies<[TaintPropagationChecker]>, > Documentation<HasDocumentation>; > //... > } > ``` > > In this particular case using "debug name = internal name of the _first_ > checker part" would work (and it is a core checker so we can say that it > won't be disabled), but if we have a checker family where all the parts are > non-core checkers, then this solution cannot assign a single stable name to > the whole family. To me what is important with this debug name is to be able to easily connect it with an implementation file to look at. Given that many times the checker name matches this debug name, and that is also spelled at least partially in the file name (even if this isn't quite true all the cases), its a much better choice than hard-coding something. This is the reason why I wanted to avoid "magic" prefixing it with "family-of-XXX" because at that point we would lose the ability to directly grep for this exact string and expect a match. This is a weak argument though because at this point we can expect that devs would be able to find the right checker even with the family prefix. This is why I don't have hard feelings about having this or not having this prefix. I'm good either way. This debug name should be the XXX part of a registerXXX fn generated for each checker (let it be modeling or reporting), and if it's a checker family, then the debug name of all the "parts" should be the name of the family. This would directly point to what implementation to look at no matter how the checker is surfaced to the users, under what aliases or what have we. In what I proposed as a quick demonstration is the user facing checker name, so that's not quite aligned with what I'd want to ideally see. --- Reflecting on your proposal: First, let's align on why we currently have the concept of "modeling" and "reporting" checkers. I think you are probably already aware of this, but let's clarify this. It's a great property if the exploded graph remains sort of the same no matter what checkers are enabled. It makes it easier to reason about it and when and why we have sink nodes, affecting what paths explored etc. In the past (and also still today AFAIK in some cases) checkers checked some property and reported a fatal error node. So there was a sink node if the checker was enabled, but as soon as it was disabled, the sink node was missing, discovering different issues in other checkers down the line in the exploded graph. To solve this, was the effort of splitting the checkers into modeling and reporting and emit the sink node regardless. This way only the user-facing diagnostics would be different, but the underlying egraph would remain the same regardless if we enable or disable a checker. (This statement isn't quite true, but you get the principle). So the problem was that we had to hack together multiple "checker frontends" and conditionally report a fatal bug or a sink node on an error. In theory, a "checker frontend" should only ever depend on a set of "checker backends" aka. "modeling checkers". I wish we had some better way of doing this. I think we are now better than ever to achieve this that we tie together the BugType and a checker frontend that we can query. I reread both outlines solutions and both seemed better than what we have today. Neither are for free, but the second resonated more with me, and was aligned with my expectations. This makes me question why we want to improve the ergonomics of defining and organizing checkers? Is this the most important pain point we want to focus on right now? To me btw, this checker families and what we had before with raw bool flags are the same thing. It's just now a lot less ugly. I'll be honest with you, that right now I can't see if the currently more appealing direction with option 2 would be really the better path to take (or something along that line). So I think it's a risky direction to have so many changes on the flight in the subject before consolidating a statusquo first with checker families. After that we may have a better view what's next, if we still believe that the ergonomics is the best place to invest. https://github.com/llvm/llvm-project/pull/139256 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits