https://github.com/NagyDonat commented:
I see that the old behavior had problems, but the new code also has a
problematic aspect: in the case when the pointers may or may not be same (that
is, when `StSameBuf && StNotSameBuf` is true) it always assumes that the
pointers are not equal (because it assigns `state = StNotSameBuf;` in the line
just below the diff context displayed by github).
To test this, you should replace the `// ...` placeholders in your new tests
with `clang_analyzer_eval(a == b); // expected-warning{{UNKNOWN}}`. Ideally
these checks should pass (if the strings are identical, the pointers may or may
not be equal) but with your code they will produce FALSE (because the
`StSameBuf` branch is silently discarded).
To avoid losing the "pointers may be equal" case you should remove the
statement `state = StNotSameBuf;`. Unfortunately, deleting this statement also
has a serious side effect: after it `clang_analyzer_eval(a == b);` would also
be `UNKNOWN` on the branch where `strcmp` is nonzero. (However, this side
effect is just "not recording knowledge" which is IMO less problematic than
"arbitrarily discarding a case that would be possible".)
This side effect can be easily mitigated in the case when the analyzer can
evaluate the comparison of known strings (just re-add `state = StNotSameBuf;`
on the branch where the checker records the assumption that the result is GT or
LT compared to zero).
Mitigating the side effect is also possible in the case when the analyzer
cannot evaluate the comparison (which includes e.g. the tests added in this
PR): to do this we should:
- call `auto [StResultEqual, StResultNonEqual] = state->assume(resultVal)`
(where `resultVal` is the freshly conjured otherwise unconstrained result value)
- these states will be both non-null because the value was unconstrained
- add a transition to `StResultEqual` (i.e. the string values equal, pointers
may or may not be equal)
- starting from `StResultNotEqual` (when the string values differ) assume that
the two pointers are different, then add a second transition to the resulting
state (if it is non-null ... in theory it should be always non-null, but there
might be some rare corner case were it isn't).
This second mitigation introduces a state split, but IMO this is a much more
"benign" state split than the one that was introduced by the old code.
Separating "`strcmp` returns zero" and "`strcmp` returns nonzero" is a
reasonable after a call to `strcmp` (in fact, this is very similar to the
`eagerly-assume` heuristic, which works correctly and is enabled by default) --
while the old "pointers are equal" and "pointers are different" state split was
indeed a surprising and unjustified consequence of calling `strcmp`.
_I admit that these suggestions are a bit complex, feel free to ask for details
if you don't understand something..._
https://github.com/llvm/llvm-project/pull/161644
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits