[PATCH] D53076: [analyzer] Enhance ConditionBRVisitor to write out more information

Artem Dergachev via Phabricator via cfe-commits Thu, 01 Nov 2018 20:53:07 -0700

NoQ added a comment.

@Charusso, i believe you have some misconception about what constraints mean 
and how they work. Not sure what this misconception is, so i'll make a blind 
guess and annoy you a little bit with a narcissistic rant and you'll have to 
bear me, sry!

The most important thing to know about constraints is that they are applied to 
"symbols", not to variables. We explain it to the user in terms of variables 
(when possible) because symbols are immaterial in the code and as such cannot 
be easily explained to the user, but in reality constraints are entirely about 
symbols.

Let me explain this with an example. If you say "let x denote the number of 
sheep in a herd" and then a new sheep is born and joins the herd, the number of 
sheep becomes (x + 1) and no longer remains equal to x, while x can still be 
used to represent the *original* number of sheep. In this example, the number 
of sheep is the variable, and x is the symbol.

The good thing about symbols is that they denote the same unknown value 
throughout the whole analysis. Values of variables may change (hence the 
nomenclature), but values denoted by symbols never change. They are kinda like 
Static Analyzer's substitute for transforming the program into SSA notation. 
When constraints are assumed, we make assumptions about the unknown value 
denoted by the symbol and it becomes "less unknown", but it is still the same 
value, it's just we know more about it. Which means that:

- //"The Golden Rule of Constraint Solving":// (i came up with this title 3 
seconds ago) In any two points A and B during analysis (i.e., nodes in the 
exploded graph), if A precedes B along any execution path (i.e., there's a 
directed path from A to B in the exploded graph), constraints for any symbol x 
in point B are a subset of constraints for x in point A.

For example, if Static Analyzer assumes that a symbol x is greater than 10, it 
cannot later assume that x is less than 5, because [-∞, 4] is not a subset of 
[11, +∞]. Therefore Static Analyzer will not explore the path on which x is 
less than 5, and it will not update range constraints for the symbol. But it 
will explore the path on which x is also less than 15 by setting range 
constraints for x to [11, 14]  because [11, 14] is indeed a subset of [11, +∞].

The formal definition of "Assuming" pieces is that they indicate that the 
constraints change for the symbol (eg., for the symbol that represents the 
*current* value of the variable mentioned in the message of the event piece). 
This is the precise meaning of "Static Analyzer is assuming". And when 
explained in terms of variables, this precise meaning is comprehend-able by the 
user and should be conveyed properly. The distinction between presence and 
absence of assumptions is crucial to convey. Therefore, if the branch condition 
is denoted by a symbol and constraints over that symbol do not change, there 
should be no "Assuming..." piece for the branch condition symbol, and the 
respective mechanism should not be removed. But there may be a new sort of 
event piece, eg. "Knowing...", if you still want to highlight the motivation 
behind "Taking..." a specific branch, which seems to be the purpose of your 
patch.

"Assuming..." pieces over the same variable may contradict each other (i.e., 
look as if they violate the Golden Rule) when contents of the variable change. 
So, yeah, in this case another "Assuming..." piece will be necessary in some 
cases. But it doesn't mean that the Golden Rule is actually violated; it just 
means that the variable now has different value and different assumptions can 
be made about it. This doesn't happen in the examples that i pointed out, so 
there should not be an "Assuming..." piece.

In an inline comment i show what this means on the test case that seems to be 
the easiest.

================
Comment at: test/Analysis/MisusedMovedObject.cpp:187
     A a;
-    if (i == 1) { // expected-note {{Taking false branch}} expected-note 
{{Taking false branch}}
+    if (i == 1) { // expected-note {{Assuming 'i' is not equal to 1}} 
expected-note {{Taking false branch}}
+      // expected-note@-1 {{Assuming 'i' is not equal to 1}} expected-note@-1 
{{Taking false branch}}
----------------
Charusso wrote:
> NoQ wrote:
> > These assumptions were already made on the previous branches. There should 
> > be no extra assumptions here.
> Agree but only if there is no extra constraint EventPiece between them.
I think i answered this concern in the out-of-line comment. Because constraints 
never change in a contradictory manner but only grow shorter, a much stronger 
statement can be made here: there are also no extra assumptions when the 
variable is not reassigned since the last "Assuming..." piece. Of course, even 
if it is reassigned, we may still not want to have the "Assuming..." piece - it 
depends entirely on the current constraints over the symbol that represents the 
branch condition.

================
Comment at: test/Analysis/NewDelete-path-notes.cpp:10
   if (p)
-    // expected-note@-1 {{Taking true branch}}
+    // expected-note@-1 {{Assuming 'p' is non-null}}
+    // expected-note@-2 {{Taking true branch}}
----------------
Charusso wrote:
> NoQ wrote:
> > Static Analyzer knows that the standard operator new never returns null. 
> > Therefore no assumption is being made here.
> As I see SA knows nothing. Where to teach it?
It does know it. The ultimate way to observe what Static Analyzer knows or 
thinks at any point during analysis and how its knowledge evolves is to [[ 
http://clang-analyzer.llvm.org/checker_dev_manual.html#visualizing | dump the 
"exploded graph" ]]:

{F7485724}

As you see, as soon as operator new is evaluated, even before the if-statement 
is evaluated, the symbol `conj_$0{int, LC1, S772, #1}` that represents the 
unknown value of the pointer produced by operator new becomes constrained to 
[1, 2⁶⁴ - 1] (assuming the target system is 64-bit), which means exactly that: 
it can take any value except 0.

Therefore when Static Analyzer reaches the if-statement later, it has no choice 
but to proceed to the true-branch, and not only the false branch is entirely 
skipped (the execution path doesn't split in two), but also the set of 
constraints remains unchanged, indicating that no assumption is being made:

{F7485753}

For the reference, here's the complete exploded graph for the test:

{F7485762}

https://reviews.llvm.org/D53076

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D53076: [analyzer] Enhance ConditionBRVisitor to write out more information

Reply via email to