[PATCH] D77150: [Analyzer] New Option for ContainerModeling: AggressiveEraseModeling

Artem Dergachev via Phabricator via cfe-commits Tue, 07 Jul 2020 09:35:21 -0700

NoQ added inline comments.

================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:647-653
     CmdLineOption<Boolean,
                   "AggressiveStdFindModeling",
                   "Enables exploration of the failure branch in std::find-like 
"
                   "functions.",
                   "false",
                   Released>
   ]>,
----------------
baloghadamsoftware wrote:
> Szelethus wrote:
> > Szelethus wrote:
> > > NoQ wrote:
> > > > baloghadamsoftware wrote:
> > > > > Szelethus wrote:
> > > > > > Ah, okay, I see which one you refer to. We should totally make this 
> > > > > > non-user facing as well. 
> > > > > The option is not about state split! It is for choosing between the 
> > > > > (default) conservative approach and a more aggressive one. It is 
> > > > > absolutely for the user to set. Some users prefer less false positive 
> > > > > for the price of losing true positives. However, some other users 
> > > > > prefer more true positives for the price of additional false 
> > > > > positives. This is why we have checker options to be able to serve 
> > > > > both groups.
> > > > > 
> > > > > This feature was explicitly requested by our users who first disabled 
> > > > > iterator checkers because of the too many false positives but later 
> > > > > re-enabled them because they run into a bug in a production system 
> > > > > which could have been prevented by enabling them. However, they run 
> > > > > into another bug that our checker did not find because of its 
> > > > > conservative behavior. They requested a more aggressive one but we 
> > > > > must not do it for everyone. The concept of the Analyzer is that we 
> > > > > apply the conservative approach by default and the aggressive can be 
> > > > > enabled by analyzer and checker options.
> > > > These options are unlikely to be on-by-default in vanilla clang because 
> > > > of having a fundamentally unfixable class of false positives associated 
> > > > with them. Honestly, i've never seen users who are actually ok with 
> > > > false positives. I've regularly seen users say that they are ok with 
> > > > false positives when they wanted their false negatives fixed, but it 
> > > > always turned out that they don't understand what they're asking for 
> > > > and they quickly changed their mind when they saw the result. But you 
> > > > guys basically vend your own tool to your own customers and bear your 
> > > > own responsibility and i therefore can't tell you what to do in your 
> > > > realm, and i'm ok with having options that allow you to disagree with 
> > > > my choices.
> > > > 
> > > > inb4, "//`-analyzer-config` is not user-facing!!!11//"
> > > >inb4, "-analyzer-config is not user-facing!!!11"
> > > I was starting to breath really heavy and type really fast :D
> > > 
> > > Jokes aside, I'll spill the beans, we don't have a really convenient way 
> > > to tweak these options through CodeChecker just yet, so for the time 
> > > being, they really aren't, but maybe one day.
> > > 
> > > With that said, I also agree with you regarding options that 
> > > overapproximate the analysis (greater code coverage with more false and 
> > > true positives). User facing options should be the ones where the nature 
> > > of the bug itself if subjective, like whether we want to check the 
> > > initializedness of pointees. We sure can regard them as unwanted, or we 
> > > could regard them as totally fine, so an option is a great compromise.
> > > 
> > > >[...] it always turned out that they don't understand what they're 
> > > >asking for [...]
> > > 
> > > This is why I think this shouldn't be user-facing. You need to be quite 
> > > knowledgeable about the analyzer, and IteratorChecker in particular to 
> > > make good judgement about enabling the option added in this patch. That 
> > > doesn't mean you shouldn't experiment with it, but I'd prefer to not 
> > > expose it too much.
> > > 
> > > I just don't see this feature being that desired by many if they knew the 
> > > costs, even if one user really wants it. We could however, in that case, 
> > > say
> > > "Hey, there is this option you can enable that might help you hunt down 
> > > more bugs. It does change the actual analysis itself, and experience 
> > > showed that it might result in more false positives and could impact 
> > > performance, so we opted not to make it user-facing, but it is there, if 
> > > you feel adventurous."
> > Granted, I might be demonizing the cons of these options too much, but the 
> > description doesn't help on understanding it that much either O:)
> So you both are saying that I should tell the user that we cannot help, the 
> community does not support that he catches this bug in early phase with the 
> analyzer. He must catch it in the testing phase for higher costs. This is not 
> the best way to build confidence in the analyzer.
> 
> In my perception options are used to serve one users need without harming 
> others. I never understand why there is so much resistance against 
> introducing an option that does not change the default behavior at all.
1. That's right, implementing every single suggestion users have is a terrible 
idea. Not everything is possible, not everything is feasible. Populism should 
not be put ahead of common sense. As an extreme example, imagine you're in the 
80's and a user complained that they need a voice recognition feature in their 
computer ("Why do I have to type all this text, it's so slow???"). Would you be 
ready to inform them that it'll take 30 years and a few revolutionary 
breakthroughs in computer science to implement? Would you be ready to tell them 
that learning how to touch-type will make their typing faster than anything 
they could ever achieve with dictation? Because that's your job as a programmer 
to be assertive and to be able to say "no" or "maybe later" or "you're not 
using the program correctly" when user demands don't make sense or are 
impossible to implement or if the user doesn't know what they're asking for. 
Developers of every single program that has any users at all constantly 
struggle with this.

2. Supporting multiple configurations is much harder than supporting a single 
configuration. I wish I could say that it will be entirely up to you to 
maintain these configurations, including the situations when it doesn't cause 
problems yet but obstructs future development. However, I can't say that, due 
to 3.

3. You can't control who uses your option once it's landed in open-source. By 
committing the option to the upstream LLVM, you're not shipping this option to 
just one user, you're shipping it to the entire world.

4. You can't simply say "I know it may sometimes work incorrectly but I tested 
it and it works for me". The option may behave fine on codebases in your 
company but completely ruin reputation of the static analyzer by producing a 
lot of false positives on a famous open-source project that you didn't test it 
on. This is why we care a lot about correctness and not introducing new sources 
of false positives.

5. You may say that the users know what they're doing when they turn the option 
on, and that they're willing to accept some false positives. That's not true 
though; you don't personally brief every user of your option. They may try it 
simply because they heard about it from someone else. They're not always 
willing to read documentation. One user may enable the option for all other 
developers and they may never learn that this was an option to begin with.

6. In particular, i believe that false positives destroy confidence in the 
analyzer much more than false negatives. I also don't trust the users when they 
say they don't care about false positives. I heard this many times from a 
variety of people but when I gave them what they wanted they immediately 
started complaining about false positives every single time.

7. Static analysis is not a one-size-fits-all solution to bug finding. Instead 
of trying to find a way to find every kind of bug statically, you should be 
trying to find the best tool for the job. There are areas in which static 
analysis excels, like finding exotic execution paths not covered by tests, but 
it's not good for everything.

8. Static analysis is not necessarily cheaper than testing or code review or 
dynamic analysis. Understanding a single analyzer report may take hours.

9. False positives reduce cost-effectiveness of static analysis dramatically. 
They may cause the user to spend a lot of time trying to understand an 
incorrect warning, or the users may introduce a new bug when they try to 
silence the warning without understanding the warning and/or the surrounding 
code.

10. Last but not least, honestly, we have significantly more important problems 
to fix. If you claim that adding an option makes a few users happy without 
changing anything for anybody else, then you must also admit that you're doing 
a lot of work for just a small portion of users, when you could spend your time 
(and our time) benefiting *every* user instead.

So, like, options aren't too bad on their own but they're definitely not as 
free as you describe. You still need to think about *all* users who can 
potentially access the option, not just the one who's in front of you. "The 
analyzer produces false positives under an option" still means that the 
analyzer produces false positives in one of the supported configurations. 
Marking features as opt-in doesn't give you an excuse for shipping bad features.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77150/new/

https://reviews.llvm.org/D77150

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D77150: [Analyzer] New Option for ContainerModeling: AggressiveEraseModeling

Reply via email to