[clang] 25b9696 - [analyzer] Upstream BitwiseShiftChecker

2023-08-18 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-08-18T10:47:05+02:00
New Revision: 25b9696b61e53a958e217bb3d0eab66350dc187f

URL: 
https://github.com/llvm/llvm-project/commit/25b9696b61e53a958e217bb3d0eab66350dc187f
DIFF: 
https://github.com/llvm/llvm-project/commit/25b9696b61e53a958e217bb3d0eab66350dc187f.diff

LOG: [analyzer] Upstream BitwiseShiftChecker

This commit releases a checker that was developed to a stable level in
the Ericsson-internal fork of Clang Static Analyzer.

Note that the functionality of this checker overlaps with
core.UndefinedBinaryOperatorResult ("UBOR"), but there are several
differences between them:
(1) UBOR is only triggered when the constant folding performed by the
Clang Static Analyzer engine determines that the value of a binary
operator expression is undefined; this checker can report issues where
the operands are not constants.
(2) UBOR has unrelated checks for handling other binary operators, this
checker only examines bitwise shifts.
(3) This checker has a Pedantic flag and by default does not report
expressions (e.g. -2 << 2) that're undefined by the standard but
consistently supported in practice.
(4) UBOR exhibits buggy behavior in code that involves cast expressions,
e.g.
void foo(unsigned short s) {
  if (s == 2) {
(void) ((unsigned int) s) << 16;
  }
}

Later it would be good to eliminate this overlap (perhaps by deprecating
and then eliminating the bitwise shift handling in UBOR), but in my
opinion that belongs to separate commits.

Differential Revision: https://reviews.llvm.org/D156312

Co-authored-by: Endre Fulop 

Added: 
clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp
clang/test/Analysis/bitwise-shift-common.c
clang/test/Analysis/bitwise-shift-pedantic.c
clang/test/Analysis/bitwise-shift-sanity-checks.c
clang/test/Analysis/bitwise-shift-state-update.c

Modified: 
clang/docs/ReleaseNotes.rst
clang/docs/analyzer/checkers.rst
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
clang/test/Analysis/analyzer-config.c
clang/test/Analysis/analyzer-enabled-checkers.c
clang/test/Analysis/bitwise-ops-nocrash.c
clang/test/Analysis/bitwise-ops.c
clang/test/Analysis/casts.c
clang/test/Analysis/diagnostics/track_subexpressions.cpp
clang/test/Analysis/left-shift-cxx2a.cpp
clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c
clang/test/Analysis/symbol-simplification-nonloc-loc.cpp

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 4571f16c37b5eb..30a8d57b34384e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -283,6 +283,10 @@ libclang
 Static Analyzer
 ---
 
+- Added a new checker ``core.BitwiseShift`` which reports situations where
+  bitwise shift operators produce undefined behavior (because some operand is
+  negative or too large).
+
 .. _release-notes-sanitizers:
 
 Sanitizers

diff  --git a/clang/docs/analyzer/checkers.rst 
b/clang/docs/analyzer/checkers.rst
index 98cf799ae46280..ac6919e4c98297 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -29,6 +29,56 @@ Models core language features and contains general-purpose 
checkers such as divi
 null pointer dereference, usage of uninitialized values, etc.
 *These checkers must be always switched on as other checker rely on them.*
 
+.. _core-BitwiseShift:
+
+core.BitwiseShift (C, C++)
+""
+
+Finds undefined behavior caused by the bitwise left- and right-shift operator
+operating on integer types.
+
+By default, this checker only reports situations when the right operand is
+either negative or larger than the bit width of the type of the left operand;
+these are logically unsound.
+
+Moreover, if the pedantic mode is activated by
+``-analyzer-config core.BitwiseShift:Pedantic=true``, then this checker also
+reports situations where the _left_ operand of a shift operator is negative or
+overflow occurs during the right shift of a signed value. (Most compilers
+handle these predictably, but the C standard and the C++ standards before C++20
+say that they're undefined behavior. In the C++20 standard these constructs are
+well-defined, so activating pedantic mode in C++20 has no effect.)
+
+**Examples**
+
+.. code-block:: cpp
+
+ static_assert(sizeof(int) == 4, "assuming 32-bit int")
+
+ void basic_examples(int a, int b) {
+   if (b < 0) {
+ b = a << b; // warn: right operand is negative in left shift
+   } else if (b >= 32) {
+ b = a >> b; // warn: right shift overflows the capacity of 'int'
+   }
+ }
+
+ int pedantic_examples(int a, int b) {
+   if (a < 0) {
+ return a >> b; // warn: left operand is negative in right shift
+   }
+   a = 1000u << 31; // OK, overflow of unsigned value is well-defined, a == 0
+   if (b > 10) {
+ a =

[clang] 3e01403 - [analyzer] Improve underflow handling in ArrayBoundV2

2023-08-21 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-08-21T17:17:02+02:00
New Revision: 3e014038b373e5a4a96d89d46cea17e4d2456a04

URL: 
https://github.com/llvm/llvm-project/commit/3e014038b373e5a4a96d89d46cea17e4d2456a04
DIFF: 
https://github.com/llvm/llvm-project/commit/3e014038b373e5a4a96d89d46cea17e4d2456a04.diff

LOG: [analyzer] Improve underflow handling in ArrayBoundV2

This minor change ensures that underflow errors are reported on memory
regions that are in unknown space but have a well-defined beginning.

As a concrete example, the following test case did not produce a warning
previously, but will produce a warning after this patch:

  typedef struct {
int id;
char name[256];
  } user_t;

  user_t *get_symbolic_user(void);
  char test_underflow_symbolic_2() {
user_t *user = get_symbolic_user();
return user->name[-1];
  }

Differential Revision: https://reviews.llvm.org/D157104

Added: 


Modified: 
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
clang/test/Analysis/out-of-bounds.c

Removed: 




diff  --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 269277aaf357f0..93da34ed5da4c5 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -172,13 +172,18 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 return;
 
   NonLoc ByteOffset = RawOffset->getByteOffset();
+  const SubRegion *Reg = RawOffset->getRegion();
 
   // CHECK LOWER BOUND
-  const MemSpaceRegion *SR = RawOffset->getRegion()->getMemorySpace();
-  if (!llvm::isa(SR)) {
-// A pointer to UnknownSpaceRegion may point to the middle of
-// an allocated region.
-
+  const MemSpaceRegion *Space = Reg->getMemorySpace();
+  if (!(isa(Reg) && isa(Space))) {
+// A symbolic region in unknown space represents an unknown pointer that
+// may point into the middle of an array, so we don't look for underflows.
+// Both conditions are significant because we want to check underflows in
+// symbolic regions on the heap (which may be introduced by checkers like
+// MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
+// non-symbolic regions (e.g. a field subregion of a symbolic region) in
+// unknown space.
 auto [state_precedesLowerBound, state_withinLowerBound] =
 compareValueToThreshold(state, ByteOffset,
 svalBuilder.makeZeroArrayIndex(), svalBuilder);
@@ -194,8 +199,7 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool 
isLoad,
   }
 
   // CHECK UPPER BOUND
-  DefinedOrUnknownSVal Size =
-  getDynamicExtent(state, RawOffset->getRegion(), svalBuilder);
+  DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder);
   if (auto KnownSize = Size.getAs()) {
 auto [state_withinUpperBound, state_exceedsUpperBound] =
 compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);

diff  --git a/clang/test/Analysis/out-of-bounds.c 
b/clang/test/Analysis/out-of-bounds.c
index edf70085e18fae..cf8e60f66ac063 100644
--- a/clang/test/Analysis/out-of-bounds.c
+++ b/clang/test/Analysis/out-of-bounds.c
@@ -151,11 +151,23 @@ void test_assume_after_access(unsigned long x) {
 // Don't warn when indexing below the start of a symbolic region's whose
 // base extent we don't know.
 int *get_symbolic(void);
-void test_index_below_symboloc(void) {
+void test_underflow_symbolic(void) {
   int *buf = get_symbolic();
   buf[-1] = 0; // no-warning;
 }
 
+// But warn if we understand the internal memory layout of a symbolic region.
+typedef struct {
+  int id;
+  char name[256];
+} user_t;
+
+user_t *get_symbolic_user(void);
+char test_underflow_symbolic_2() {
+  user_t *user = get_symbolic_user();
+  return user->name[-1]; // expected-warning{{Out of bound memory access}}
+}
+
 void test_incomplete_struct(void) {
   extern struct incomplete incomplete;
   int *p = (int *)&incomplete;



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 8a5cfdf - [analyzer][NFC] Remove useless class BuiltinBug

2023-08-28 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-08-28T15:20:14+02:00
New Revision: 8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4

URL: 
https://github.com/llvm/llvm-project/commit/8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4
DIFF: 
https://github.com/llvm/llvm-project/commit/8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4.diff

LOG: [analyzer][NFC] Remove useless class BuiltinBug

...because it provides no useful functionality compared to its base
class `BugType`.

A long time ago there were substantial differences between `BugType` and
`BuiltinBug`, but they were eliminated by commit 1bd58233 in 2009 (!).
Since then the only functionality provided by `BuiltinBug` was that it
specified `categories::LogicError` as the bug category and it stored an
extra data member `desc`.

This commit sets `categories::LogicError` as the default value of the
third argument (bug category) in the constructors of BugType and
replaces use of the `desc` field with simpler logic.

Note that `BugType` has a data member `Description` and a non-virtual
method `BugType::getDescription()` which queries it; these are distinct
from the member `desc` of `BuiltinBug` and the identically named method
`BuiltinBug::getDescription()` which queries it. This confusing name
collision was a major motivation for the elimination of `BuiltinBug`.

As this commit touches many files, I avoided functional changes and left
behind FIXME notes to mark minor issues that should be fixed later.

Differential Revision: https://reviews.llvm.org/D158855

Added: 


Modified: 
clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h
clang/lib/StaticAnalyzer/Checkers/ArrayBoundChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
clang/lib/StaticAnalyzer/Checkers/BoolAssignmentChecker.cpp
clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
clang/lib/StaticAnalyzer/Checkers/CastSizeChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ChrootChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ConversionChecker.cpp
clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp
clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp
clang/lib/StaticAnalyzer/Checkers/FixedAddressChecker.cpp
clang/lib/StaticAnalyzer/Checkers/NonNullParamChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ObjCAtSyncChecker.cpp
clang/lib/StaticAnalyzer/Checkers/PointerArithChecker.cpp
clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ReturnPointerRangeChecker.cpp
clang/lib/StaticAnalyzer/Checkers/ReturnUndefChecker.cpp
clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
clang/lib/StaticAnalyzer/Checkers/TestAfterDivZeroChecker.cpp
clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp
clang/lib/StaticAnalyzer/Checkers/UndefCapturedBlockVarChecker.cpp
clang/lib/StaticAnalyzer/Checkers/UndefResultChecker.cpp
clang/lib/StaticAnalyzer/Checkers/UndefinedArraySubscriptChecker.cpp
clang/lib/StaticAnalyzer/Checkers/UndefinedAssignmentChecker.cpp

clang/lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
clang/lib/StaticAnalyzer/Checkers/VforkChecker.cpp
clang/lib/StaticAnalyzer/Core/BugReporter.cpp
clang/unittests/StaticAnalyzer/CallEventTest.cpp
clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp
clang/unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp

Removed: 




diff  --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h 
b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h
index 5588811763273b..e50afd6d0da7e5 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h
@@ -35,13 +35,13 @@ class BugType {
   virtual void anchor();
 
 public:
-  BugType(CheckerNameRef CheckerName, StringRef Name, StringRef Cat,
-  bool SuppressOnSink = false)
-  : CheckerName(CheckerName), Description(Name), Category(Cat),
+  BugType(CheckerNameRef CheckerName, StringRef Desc,
+  StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
+  : CheckerName(CheckerName), Description(Desc), Category(Cat),
 Checker(nullptr), SuppressOnSink(SuppressOnSink) {}
-  BugType(const CheckerBase *Checker, StringRef Name, StringRef Cat,
-  bool SuppressOnSink = false)
-  : CheckerName(Checker->getCheckerName()), Description(Name),
+  BugType(const CheckerBase *Checker, StringRef Desc,
+  StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
+  : CheckerName(Checker->getCheckerName()), Description(Desc),
 Category(Cat), Checker(Checker), SuppressOnSink(SuppressOnSink) {}
   virtual ~BugType() = default;
 
@@ -64,27 +64,6 @@ class BugType {
   bool isSuppressOnSink() const { return SuppressOnSin

[clang] cec30e2 - [dataflow] Fix complie on gcc7 part 2

2023-06-30 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-06-30T16:27:36+02:00
New Revision: cec30e2b190bd58a26f54b5fd4232d3828632466

URL: 
https://github.com/llvm/llvm-project/commit/cec30e2b190bd58a26f54b5fd4232d3828632466
DIFF: 
https://github.com/llvm/llvm-project/commit/cec30e2b190bd58a26f54b5fd4232d3828632466.diff

LOG: [dataflow] Fix complie on gcc7 part 2

Change c2bb68078eb9 triggered a bug in gcc7 by disabling a copy
constructor. Commit 2f7d30dee826 fixed one occurrence of this issue, but
after that the compilation still failed with another very similar error.

This commit eliminates this second compilation error, and ensures that
`ninja check-clang-analysis` can be successfully executed with gcc 7.5.

Added: 


Modified: 
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h

Removed: 




diff  --git a/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h 
b/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
index c6e9f848477f8a..33eb42897b8a1a 100644
--- a/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
+++ b/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
@@ -235,7 +235,7 @@ runDataflowAnalysis(
   std::move(State.Env)};
 });
   });
-  return BlockStates;
+  return std::move(BlockStates);
 }
 
 /// Abstract base class for dataflow "models": reusable analysis components 
that



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 1d75b18 - [analyzer][NFC] Fix dangling StringRef in barely used code

2023-06-30 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-06-30T17:15:43+02:00
New Revision: 1d75b18843fbca52655e240a120b5fdeeef17c0e

URL: 
https://github.com/llvm/llvm-project/commit/1d75b18843fbca52655e240a120b5fdeeef17c0e
DIFF: 
https://github.com/llvm/llvm-project/commit/1d75b18843fbca52655e240a120b5fdeeef17c0e.diff

LOG: [analyzer][NFC] Fix dangling StringRef in barely used code

CheckerContext::getNoteTag has a shorthand version that takes a
plain 'StringRef Note' instead of a lambda that calculates the note.

The old implementation of this method was incorrect because it created a
lambda that captured the StringRef, which was dereferenced much later,
when the NoteTags were visited.

In the current codebase this does not cause errors because this method
is called only once, and there the `Note` argument is a string literal
that remains valid. However, I tried to use this method in a checker
that I was prototyping, and there it printed random memory junk (instead
of the message that I composed in a local variable).

Differential Revision: https://reviews.llvm.org/D153889

Added: 


Modified: 
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h

Removed: 




diff  --git 
a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h 
b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
index 9bd5a802d5d64e..9923c41e6ad2d1 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
@@ -316,8 +316,8 @@ class CheckerContext {
   ///bug path significantly shorter.
   const NoteTag *getNoteTag(StringRef Note, bool IsPrunable = false) {
 return getNoteTag(
-[Note](BugReporterContext &,
-   PathSensitiveBugReport &) { return std::string(Note); },
+[Note = std::string(Note)](BugReporterContext &,
+   PathSensitiveBugReport &) { return Note; },
 IsPrunable);
   }
 



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] de25473 - [analyzer] Fix comparison logic in ArrayBoundCheckerV2

2023-04-26 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-04-26T15:02:23+02:00
New Revision: de2547329b41ad6ea4ea876d12731bde5a6b64c5

URL: 
https://github.com/llvm/llvm-project/commit/de2547329b41ad6ea4ea876d12731bde5a6b64c5
DIFF: 
https://github.com/llvm/llvm-project/commit/de2547329b41ad6ea4ea876d12731bde5a6b64c5.diff

LOG: [analyzer] Fix comparison logic in ArrayBoundCheckerV2

The prototype checker alpha.security.ArrayBoundV2 performs two
comparisons to check that in an expression like Array[Index]
0 <= Index < length(Array)
holds. These comparisons are handled by almost identical logic: the
inequality is first rearranged by getSimplifiedOffsets(), then evaluated
with evalBinOpNN().

However the simplification used "naive" elementary mathematical
schematics, but evalBinOpNN() performed the signed -> unsigned
conversions described in the C/C++ standards, and this confusion led to
wildly inaccurate results: false positives from the lower bound check
and false negatives from the upper bound check.

This commit eliminates the code duplication by moving the comparison
logic into a separate function, then adds an explicit check to this
unified code path, which handles the problematic case separately.

In addition to this, the commit also cleans up a testcase that was
demonstrating the presence of this problem. Note that while that
testcase was failing with an overflow error, its actual problem was in
the underflow handler logic:
(0) The testcase introduces a five-element array "char a[5]" and an
unknown argument "size_t len"; then evaluates "a[len+1]".
(1) The underflow check tries to determine whether "len+1 < 0" holds.
(2) This inequality is rearranged to "len < -1".
(3) evalBinOpNN() evaluates this with the schematics of C/C++ and
converts -1 to the size_t value SIZE_MAX.
(4) The engine concludes that len == SIZE_MAX, because otherwise we'd
have an underflow here.
(5) The overflow check tries to determine whether "len+1 >= 5".
(6) This inequality is rearranged to "len >= 4".
(7) The engine substitutes len == SIZE_MAX and reports that we have
an overflow.

Differential Revision: https://reviews.llvm.org/D135375

Added: 
clang/test/Analysis/array-bound-v2-constraint-check.c

Modified: 
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp

Removed: 
clang/test/Analysis/out-of-bounds-false-positive.c



diff  --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 142577174ead8..a476613b6dcc7 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -57,8 +57,8 @@ class RegionRawOffsetV2 {
 : baseRegion(nullptr), byteOffset(UnknownVal()) {}
 
 public:
-  RegionRawOffsetV2(const SubRegion* base, SVal offset)
-: baseRegion(base), byteOffset(offset) {}
+  RegionRawOffsetV2(const SubRegion *base, NonLoc offset)
+  : baseRegion(base), byteOffset(offset) { assert(base); }
 
   NonLoc getByteOffset() const { return byteOffset.castAs(); }
   const SubRegion *getRegion() const { return baseRegion; }
@@ -72,19 +72,12 @@ class RegionRawOffsetV2 {
 };
 }
 
-static SVal computeExtentBegin(SValBuilder &svalBuilder,
-   const MemRegion *region) {
-  const MemSpaceRegion *SR = region->getMemorySpace();
-  if (SR->getKind() == MemRegion::UnknownSpaceRegionKind)
-return UnknownVal();
-  else
-return svalBuilder.makeZeroArrayIndex();
-}
-
 // TODO: once the constraint manager is smart enough to handle non simplified
 // symbolic expressions remove this function. Note that this can not be used in
 // the constraint manager as is, since this does not handle overflows. It is
 // safe to assume, however, that memory offsets will not overflow.
+// NOTE: callers of this function need to be aware of the effects of overflows
+// and signed<->unsigned conversions!
 static std::pair
 getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent,
  SValBuilder &svalBuilder) {
@@ -117,6 +110,38 @@ getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt 
extent,
   return std::pair(offset, extent);
 }
 
+// Evaluate the comparison Value < Threshold with the help of the custom
+// simplification algorithm defined for this checker. Return a pair of states,
+// where the first one corresponds to "value below threshold" and the second
+// corresponds to "value at or above threshold". Returns {nullptr, nullptr} in
+// the case when the evaluation fails.
+static std::pair
+compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold,
+SValBuilder &SVB) {
+  if (auto ConcreteThreshold = Threshold.getAs()) {
+std::tie(Value, Threshold) = getSimplifiedOffsets(Value, 
*ConcreteThreshold, SVB);
+  }
+  if (auto ConcreteThreshold = Threshold.getAs()) {
+QualType T = Value.getType(SVB.getContex

[clang] 8c22cbe - [analyzer] ArrayBoundCheckerV2: suppress false positives from ctype macros

2023-05-03 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-05-03T18:52:27+02:00
New Revision: 8c22cbea87beb74da3dc5891c40cdf574cd5fe56

URL: 
https://github.com/llvm/llvm-project/commit/8c22cbea87beb74da3dc5891c40cdf574cd5fe56
DIFF: 
https://github.com/llvm/llvm-project/commit/8c22cbea87beb74da3dc5891c40cdf574cd5fe56.diff

LOG: [analyzer] ArrayBoundCheckerV2: suppress false positives from ctype macros

The checker alpha.security.ArrayBoundV2 created bug reports in
situations when the (tainted) result of fgetc() or getchar() was passed
to one of the isX() macros from ctype.h.

This is a common input handling pattern (within the limited toolbox of
the C language) and several open source projects contained code where it
led to false positive reports; so this commit suppresses ArrayBoundV2
reports generated within the isX() macros.

Note that here even true positive reports would be difficult to
understand, as they'd refer to the implementation details of these
macros.

Differential Revision: https://reviews.llvm.org/D149460

Added: 


Modified: 
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
clang/test/Analysis/taint-generic.c

Removed: 




diff  --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index a476613b6dcc7..6d0cc505756b8 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -42,8 +42,10 @@ class ArrayBoundCheckerV2 :
   void reportTaintOOB(CheckerContext &C, ProgramStateRef errorState,
   SVal TaintedSVal) const;
 
+  static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
+
 public:
-  void checkLocation(SVal l, bool isLoad, const Stmt*S,
+  void checkLocation(SVal l, bool isLoad, const Stmt *S,
  CheckerContext &C) const;
 };
 
@@ -155,6 +157,15 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
   // memory access is within the extent of the base region.  Since we
   // have some flexibility in defining the base region, we can achieve
   // various levels of conservatism in our buffer overflow checking.
+
+  // The header ctype.h (from e.g. glibc) implements the isX() macros as
+  //   #define isX(arg) (LOOKUP_TABLE[arg] & BITMASK_FOR_X)
+  // and incomplete analysis of these leads to false positives. As even
+  // accurate reports would be confusing for the users, just disable reports
+  // from these macros:
+  if (isFromCtypeMacro(LoadS, checkerContext.getASTContext()))
+return;
+
   ProgramStateRef state = checkerContext.getState();
 
   SValBuilder &svalBuilder = checkerContext.getSValBuilder();
@@ -267,6 +278,25 @@ void ArrayBoundCheckerV2::reportOOB(CheckerContext 
&checkerContext,
   checkerContext.emitReport(std::move(BR));
 }
 
+bool ArrayBoundCheckerV2::isFromCtypeMacro(const Stmt *S, ASTContext &ACtx) {
+  SourceLocation Loc = S->getBeginLoc();
+  if (!Loc.isMacroID())
+return false;
+
+  StringRef MacroName = Lexer::getImmediateMacroName(
+  Loc, ACtx.getSourceManager(), ACtx.getLangOpts());
+
+  if (MacroName.size() < 7 || MacroName[0] != 'i' || MacroName[1] != 's')
+return false;
+
+  return ((MacroName == "isalnum") || (MacroName == "isalpha") ||
+  (MacroName == "isblank") || (MacroName == "isdigit") ||
+  (MacroName == "isgraph") || (MacroName == "islower") ||
+  (MacroName == "isnctrl") || (MacroName == "isprint") ||
+  (MacroName == "ispunct") || (MacroName == "isspace") ||
+  (MacroName == "isupper") || (MacroName == "isxdigit"));
+}
+
 #ifndef NDEBUG
 LLVM_DUMP_METHOD void RegionRawOffsetV2::dump() const {
   dumpToStream(llvm::errs());

diff  --git a/clang/test/Analysis/taint-generic.c 
b/clang/test/Analysis/taint-generic.c
index 626e01e39d158..62e1f570b6622 100644
--- a/clang/test/Analysis/taint-generic.c
+++ b/clang/test/Analysis/taint-generic.c
@@ -156,6 +156,28 @@ void bufferGetchar(int x) {
   Buffer[m] = 1;  //expected-warning {{Out of bound memory access (index is 
tainted)}}
 }
 
+extern const unsigned short int **__ctype_b_loc (void);
+enum { _ISdigit = 2048 };
+# define isdigit(c) ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) 
_ISdigit)
+
+int isdigitImplFalsePositive(void) {
+  // If this code no longer produces a bug report, then consider removing the
+  // special case that disables buffer overflow reports coming from the isX
+  // macros in ctypes.h.
+  int c = getchar();
+  return ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) _ISdigit);
+  //expected-warning@-1 {{Out of bound memory access (index is tainted)}}
+}
+
+int isdigitSuppressed(void) {
+  // Same code as above, but reports are suppressed based on macro name:
+  int c = getchar();
+  return isdigit(c); //no-warning
+}
+
+// Some later tests use isdigit as a function, so we need to undef it:
+#undef isdig

[clang] b88023c - [analyzer][NFC] Use std::optional instead of custom "empty" state

2023-05-04 Thread Donát Nagy via cfe-commits

Author: Donát Nagy
Date: 2023-05-04T12:56:15+02:00
New Revision: b88023c25729f4dd4548a25e5e12d6624e2adbaf

URL: 
https://github.com/llvm/llvm-project/commit/b88023c25729f4dd4548a25e5e12d6624e2adbaf
DIFF: 
https://github.com/llvm/llvm-project/commit/b88023c25729f4dd4548a25e5e12d6624e2adbaf.diff

LOG: [analyzer][NFC] Use std::optional instead of custom "empty" state

This commit eliminates the uninitialized error state from the class
RegionRawOffsetV2 (which is locally used by the Clang Static Analyzer
checker alpha.security.ArrayBoundV2) and replaces its use with
std::optional.

Motivated by https://reviews.llvm.org/D148355#inline-1437928

Moreover, the code of RegionRawOffsetV2::computeOffset() is rearranged
to clarify its behavior. The helper function getValue() was eliminated
by picking a better initial value for the variable Offset; two other
helper functions were replaced by the lambda function Calc() because
this way it doesn't need to take the "context" objects as parameters.

This reorganization revealed some surprising (but not outright buggy)
behavior that's marked by a FIXME and will be revisited in a separate
commit.

Differential Revision: https://reviews.llvm.org/D149259

Added: 


Modified: 
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp

Removed: 




diff  --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 6d0cc505756b8..269277aaf357f 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -53,21 +53,17 @@ class ArrayBoundCheckerV2 :
 class RegionRawOffsetV2 {
 private:
   const SubRegion *baseRegion;
-  SVal byteOffset;
-
-  RegionRawOffsetV2()
-: baseRegion(nullptr), byteOffset(UnknownVal()) {}
+  NonLoc byteOffset;
 
 public:
   RegionRawOffsetV2(const SubRegion *base, NonLoc offset)
   : baseRegion(base), byteOffset(offset) { assert(base); }
 
-  NonLoc getByteOffset() const { return byteOffset.castAs(); }
+  NonLoc getByteOffset() const { return byteOffset; }
   const SubRegion *getRegion() const { return baseRegion; }
 
-  static RegionRawOffsetV2 computeOffset(ProgramStateRef state,
- SValBuilder &svalBuilder,
- SVal location);
+  static std::optional
+  computeOffset(ProgramStateRef State, SValBuilder &SVB, SVal Location);
 
   void dump() const;
   void dumpToStream(raw_ostream &os) const;
@@ -169,16 +165,16 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
   ProgramStateRef state = checkerContext.getState();
 
   SValBuilder &svalBuilder = checkerContext.getSValBuilder();
-  const RegionRawOffsetV2 &rawOffset =
-RegionRawOffsetV2::computeOffset(state, svalBuilder, location);
+  const std::optional &RawOffset =
+  RegionRawOffsetV2::computeOffset(state, svalBuilder, location);
 
-  if (!rawOffset.getRegion())
+  if (!RawOffset)
 return;
 
-  NonLoc ByteOffset = rawOffset.getByteOffset();
+  NonLoc ByteOffset = RawOffset->getByteOffset();
 
   // CHECK LOWER BOUND
-  const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace();
+  const MemSpaceRegion *SR = RawOffset->getRegion()->getMemorySpace();
   if (!llvm::isa(SR)) {
 // A pointer to UnknownSpaceRegion may point to the middle of
 // an allocated region.
@@ -199,7 +195,7 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool 
isLoad,
 
   // CHECK UPPER BOUND
   DefinedOrUnknownSVal Size =
-  getDynamicExtent(state, rawOffset.getRegion(), svalBuilder);
+  getDynamicExtent(state, RawOffset->getRegion(), svalBuilder);
   if (auto KnownSize = Size.getAs()) {
 auto [state_withinUpperBound, state_exceedsUpperBound] =
 compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
@@ -307,84 +303,55 @@ void RegionRawOffsetV2::dumpToStream(raw_ostream &os) 
const {
 }
 #endif
 
-// Lazily computes a value to be used by 'computeOffset'.  If 'val'
-// is unknown or undefined, we lazily substitute '0'.  Otherwise,
-// return 'val'.
-static inline SVal getValue(SVal val, SValBuilder &svalBuilder) {
-  return val.isUndef() ? svalBuilder.makeZeroArrayIndex() : val;
-}
-
-// Scale a base value by a scaling factor, and return the scaled
-// value as an SVal.  Used by 'computeOffset'.
-static inline SVal scaleValue(ProgramStateRef state,
-  NonLoc baseVal, CharUnits scaling,
-  SValBuilder &sb) {
-  return sb.evalBinOpNN(state, BO_Mul, baseVal,
-sb.makeArrayIndex(scaling.getQuantity()),
-sb.getArrayIndexType());
-}
-
-// Add an SVal to another, treating unknown and undefined values as
-// summing to UnknownVal.  Used by 'computeOffset'.
-static SVal addValue(ProgramStateRef state, SVal x, SVal y,
- 

[clang] b16a593 - [analyzer][NFC] Pass the diagnostic message to the TrackConstraintBRVisitor

2023-06-06 Thread Donát Nagy via cfe-commits

Author: Endre Fulop
Date: 2023-06-06T16:28:31+02:00
New Revision: b16a59328fc5120006aeac501637229cc7e30357

URL: 
https://github.com/llvm/llvm-project/commit/b16a59328fc5120006aeac501637229cc7e30357
DIFF: 
https://github.com/llvm/llvm-project/commit/b16a59328fc5120006aeac501637229cc7e30357.diff

LOG: [analyzer][NFC] Pass the diagnostic message to the TrackConstraintBRVisitor

The `TrackConstraintBRVisitor` should accept a message for the note
instead of creating one. It would let us inject domain-specific
knowledge in a non-intrusive way, leading to a more generic visitor.

Differential Revision: https://reviews.llvm.org/D152255

Added: 


Modified: 
clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp

Removed: 




diff  --git 
a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h 
b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
index 9c81e02245ecd..d9b3d9352d322 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
@@ -392,19 +392,19 @@ const Expr *getDerefExpr(const Stmt *S);
 } // namespace bugreporter
 
 class TrackConstraintBRVisitor final : public BugReporterVisitor {
-  DefinedSVal Constraint;
-  bool Assumption;
+  const SmallString<64> Message;
+  const DefinedSVal Constraint;
+  const bool Assumption;
   bool IsSatisfied = false;
-  bool IsZeroCheck;
 
   /// We should start tracking from the last node along the path in which the
   /// value is constrained.
   bool IsTrackingTurnedOn = false;
 
 public:
-  TrackConstraintBRVisitor(DefinedSVal constraint, bool assumption)
-  : Constraint(constraint), Assumption(assumption),
-IsZeroCheck(!Assumption && isa(Constraint)) {}
+  TrackConstraintBRVisitor(DefinedSVal constraint, bool assumption,
+   StringRef Message)
+  : Message(Message), Constraint(constraint), Assumption(assumption) {}
 
   void Profile(llvm::FoldingSetNodeID &ID) const override;
 
@@ -417,6 +417,9 @@ class TrackConstraintBRVisitor final : public 
BugReporterVisitor {
PathSensitiveBugReport &BR) override;
 
 private:
+  /// Checks if the constraint refers to a null-location.
+  bool isZeroCheck() const;
+
   /// Checks if the constraint is valid in the current state.
   bool isUnderconstrained(const ExplodedNode *N) const;
 };

diff  --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp 
b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index 2b461acf9a731..42d03f67510cf 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -1786,6 +1786,7 @@ PathDiagnosticPieceRef StoreSiteFinder::VisitNode(const 
ExplodedNode *Succ,
 void TrackConstraintBRVisitor::Profile(llvm::FoldingSetNodeID &ID) const {
   static int tag = 0;
   ID.AddPointer(&tag);
+  ID.AddString(Message);
   ID.AddBoolean(Assumption);
   ID.Add(Constraint);
 }
@@ -1796,8 +1797,12 @@ const char *TrackConstraintBRVisitor::getTag() {
   return "TrackConstraintBRVisitor";
 }
 
+bool TrackConstraintBRVisitor::isZeroCheck() const {
+  return !Assumption && Constraint.getAs();
+}
+
 bool TrackConstraintBRVisitor::isUnderconstrained(const ExplodedNode *N) const 
{
-  if (IsZeroCheck)
+  if (isZeroCheck())
 return N->getState()->isNull(Constraint).isUnderconstrained();
   return (bool)N->getState()->assume(Constraint, !Assumption);
 }
@@ -1827,19 +1832,6 @@ PathDiagnosticPieceRef 
TrackConstraintBRVisitor::VisitNode(
 // the transition point.
 assert(!isUnderconstrained(N));
 
-// We found the transition point for the constraint.  We now need to
-// pretty-print the constraint. (work-in-progress)
-SmallString<64> sbuf;
-llvm::raw_svector_ostream os(sbuf);
-
-if (isa(Constraint)) {
-  os << "Assuming pointer value is ";
-  os << (Assumption ? "non-null" : "null");
-}
-
-if (os.str().empty())
-  return nullptr;
-
 // Construct a new PathDiagnosticPiece.
 ProgramPoint P = N->getLocation();
 
@@ -1854,7 +1846,7 @@ PathDiagnosticPieceRef 
TrackConstraintBRVisitor::VisitNode(
 if (!L.isValid())
   return nullptr;
 
-auto X = std::make_shared(L, os.str());
+auto X = std::make_shared(L, Message);
 X->setTag(getTag());
 return std::move(X);
   }
@@ -2366,8 +2358,9 @@ class InterestingLValueHandler final : public 
ExpressionHandler {
 // null.
 if (V.getAsLocSymbol(/*IncludeBaseRegions=*/true))
   if (LVState->isNull(V).isConstrainedTrue())
-
Report.addVisitor(V.castAs(),
-false);
+Report.addVisitor(
+V.castAs(),
+/*Assumption=*/fals

[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat requested changes to this pull request.

Unfortunately this PR is not a full solution, because e.g. the following test 
code still triggers the crash (if it is appended to the test file `stream.c`):
```c
struct zerosized {
int foo[0];
};

void fread_zerosized(struct zerosized *buffer) {
  FILE *f = fopen("/tmp/foo.txt", "r");
  fread(buffer, 1, 1, f);
  fclose(f);
}
```

(I know that zero-sized arrays are not standard-compliant, but they are a 
widespread extension: e.g. both clang and gcc accept this `struct zerosized` 
with the default settings.)

Overall, I'd say that it's futile to try to recognize zero-sized types with a 
"canonical type equal to" check, so you should just check whether 
`ElemSizeInChars` is zero and do something based on that. (Either an early 
return, or you can say `ElemSizeInChars = 1` at that point if you think that 
that's the logically correct solution.)

``This way you could also avoid the immediately invoked lambda in 
`getPointeeType` which is really ugly in my opinion.``

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-07-01 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> Even protobuf contains this type of code: 
> https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=protobuf_v3.13.0_pointersub1&is-unique=on&diff-type=New&checker-name=alpha.core.PointerSub&report-id=5545776&report-hash=1bcd310fbaeccbcc13645b9b277239a2&report-filepath=%2adescriptor.pb.cc

I still think that this
 (1) is undeniably undefined behavior
 (2) isn't common, so won't cause "spam"  problems and
 (3( can be replaced by standard-compliant code (`offsetof`)
so there is no need to introduce a special case for it.

https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

_I only noticed that this PR was already merged after posting the review. There 
is no need to revert the commit -- it's better than nothing -- but I'd be happy 
if you created a followup change that also handles the testcase that I 
mentioned._

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, straightforward change. Thanks!

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -21,30 +21,55 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
 
 using namespace clang;
 using namespace ento;
 
 namespace {
-class MmapWriteExecChecker : public Checker {
+class MmapWriteExecChecker
+: public Checker {
   CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6};
   CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3};
-  static int ProtWrite;
-  static int ProtExec;
-  static int ProtRead;
   const BugType BT{this, "W^X check fails, Write Exec prot flags set",
"Security"};
 
+  mutable bool FlagsInitialized = false;
+  mutable int ProtRead = 0x01;
+  mutable int ProtWrite = 0x02;
+  mutable int ProtExec = 0x04;
+
 public:
+  void checkBeginFunction(CheckerContext &C) const;
   void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
+
   int ProtExecOv;
   int ProtReadOv;
 };
 }
 
-int MmapWriteExecChecker::ProtWrite = 0x02;
-int MmapWriteExecChecker::ProtExec  = 0x04;
-int MmapWriteExecChecker::ProtRead  = 0x01;
+void MmapWriteExecChecker::checkBeginFunction(CheckerContext &C) const {
+  if (FlagsInitialized)
+return;
+
+  FlagsInitialized = true;
+
+  const std::optional FoundProtRead =
+  tryExpandAsInteger("PROT_READ", C.getPreprocessor());
+  const std::optional FoundProtWrite =
+  tryExpandAsInteger("PROT_WRITE", C.getPreprocessor());
+  const std::optional FoundProtExec =
+  tryExpandAsInteger("PROT_EXEC", C.getPreprocessor());
+  if (FoundProtRead && FoundProtWrite && FoundProtExec) {
+ProtRead = *FoundProtRead;
+ProtWrite = *FoundProtWrite;
+ProtExec = *FoundProtExec;
+  } else {
+// FIXME: Are these useful?

NagyDonat wrote:

They are not too useful (and it's a little bit surprising that there is no 
"ProtWriteOv"), but they're harmless and they don't require complex code, so 
let's keep them for now.

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Let's merge this again, after the Z3 version there shouldn't be any additional 
problems. 

https://github.com/llvm/llvm-project/pull/97265
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97265
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -21,30 +21,55 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
 
 using namespace clang;
 using namespace ento;
 
 namespace {
-class MmapWriteExecChecker : public Checker {
+class MmapWriteExecChecker
+: public Checker {
   CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6};
   CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3};
-  static int ProtWrite;
-  static int ProtExec;
-  static int ProtRead;
   const BugType BT{this, "W^X check fails, Write Exec prot flags set",
"Security"};
 
+  mutable bool FlagsInitialized = false;
+  mutable int ProtRead = 0x01;
+  mutable int ProtWrite = 0x02;
+  mutable int ProtExec = 0x04;
+
 public:
+  void checkBeginFunction(CheckerContext &C) const;
   void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
+
   int ProtExecOv;
   int ProtReadOv;
 };
 }
 
-int MmapWriteExecChecker::ProtWrite = 0x02;
-int MmapWriteExecChecker::ProtExec  = 0x04;
-int MmapWriteExecChecker::ProtRead  = 0x01;
+void MmapWriteExecChecker::checkBeginFunction(CheckerContext &C) const {
+  if (FlagsInitialized)
+return;
+
+  FlagsInitialized = true;
+
+  const std::optional FoundProtRead =
+  tryExpandAsInteger("PROT_READ", C.getPreprocessor());
+  const std::optional FoundProtWrite =
+  tryExpandAsInteger("PROT_WRITE", C.getPreprocessor());
+  const std::optional FoundProtExec =
+  tryExpandAsInteger("PROT_EXEC", C.getPreprocessor());
+  if (FoundProtRead && FoundProtWrite && FoundProtExec) {
+ProtRead = *FoundProtRead;
+ProtWrite = *FoundProtWrite;
+ProtExec = *FoundProtExec;
+  } else {
+// FIXME: Are these useful?
+ProtRead = ProtReadOv;
+ProtExec = ProtExecOv;

NagyDonat wrote:

This code fragment lets the user specify the value of the `PROT_READ` and 
`PROT_EXEC` macros as checker options:
```cpp
Mwec->ProtExecOv =
  mgr.getAnalyzerOptions()
.getCheckerIntegerOption(Mwec, "MmapProtExec");
Mwec->ProtReadOv =
  mgr.getAnalyzerOptions()
.getCheckerIntegerOption(Mwec, "MmapProtRead");
```
This is a very marginal feature and I wouldn't spend time on implementing it; 
but now that it's already implemented I'm not sure that we should delete it.

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Reland "[analyzer] Harden safeguards for Z3 query times" (PR #97298)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

This should be re-landed as well; when it was merged earlier the only issue was 
the incompatibility with old Z3 and that was addressed since then (by bumping 
the version requirement).

https://github.com/llvm/llvm-project/pull/97298
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Thanks for the updates! I added few minor comments, but the PR is already good 
enough to be merged.

> >  I only noticed that this PR was already merged after posting the review. 
> > There is no need to revert the commit -- it's better than nothing -- but 
> > I'd be happy if you created a followup change that also handles the 
> > testcase that I mentioned.
>
> I'm not sure what you refer to. This PR is not approved, hence not merged. 
> Please continue with the review.

Oops, sorry -- I was probably confused by the purple "merged" icon in 
![image](https://github.com/llvm/llvm-project/assets/43410265/e4b494d4-014f-4b03-942b-00debb90a709)
...but the real reason is that I got up at 4 AM today :sleeping: 


https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent &Call,
   C.addTransition(State);
 }
 
-static std::optional getPointeeType(const MemRegion *R) {
+static QualType getPointeeType(const MemRegion *R) {
   if (!R)
-return std::nullopt;
+return {};

NagyDonat wrote:

It's a bit strange that here you return `{}`, while at the end of the function 
`QualType{}` is explicitly specified.

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent &Call,
   C.addTransition(State);
 }
 
-static std::optional getPointeeType(const MemRegion *R) {
+static QualType getPointeeType(const MemRegion *R) {
   if (!R)
-return std::nullopt;
+return {};
   if (const auto *ER = dyn_cast(R))
-return ER->getElementType();
+return ER->getElementType()->getCanonicalTypeUnqualified();

NagyDonat wrote:

Consider moving the `->getCanonicalTypeUnqualified()` calls into 
`escapeByStartIndexAndCount()` because that's the only place where it will be 
relevant. (This would eliminate the code duplication.)


https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,37 @@
+Obtaining the Static Analyzer
+=
+
+This page describes how to download and install the analyzer. Once the 
analyzer is installed, follow the asdf :doc:`CommandLineUsage` on using the 
commandline to get started analyzing your code.

NagyDonat wrote:

Nice catch! :laughing: 

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/95550

From 06adc063c2388ea534537f5a417751fdf64b22cd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 14 Jun 2024 15:16:34 +0200
Subject: [PATCH 1/4] [clang-tidy] Clarify diagnostics of
 bugprone-sizeof-expression

...becasue they were strangely worded and in a few cases outright
incorrect.
---
 .../bugprone/SizeofExpressionCheck.cpp| 22 ++---
 .../checkers/bugprone/sizeof-expression-2.c   | 12 +--
 .../sizeof-expression-any-pointer.cpp | 86 +--
 ...on-warn-on-sizeof-pointer-to-aggregate.cpp |  4 +-
 .../checkers/bugprone/sizeof-expression.cpp   | 84 +-
 5 files changed, 105 insertions(+), 103 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp 
b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
index c25ee42d0899a..3afa0e5c39882 100644
--- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
@@ -296,7 +296,7 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult &Result) {
   } else if (const auto *E =
  Result.Nodes.getNodeAs("sizeof-integer-call")) {
 diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression "
-   "that results in an integer")
+   "of integer type")
 << E->getSourceRange();
   } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-this")) {
 diag(E->getBeginLoc(),
@@ -314,7 +314,7 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult &Result) {
   << E->getSourceRange();
 } else {
   diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression "
- "that results in a pointer")
+ "of pointer type")
   << E->getSourceRange();
 }
   } else if (const auto *E = Result.Nodes.getNodeAs(
@@ -342,31 +342,33 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult &Result) {
 
 if (DenominatorSize > CharUnits::Zero() &&
 !NumeratorSize.isMultipleOf(DenominatorSize)) {
-  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)';"
+  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)':"
 " numerator is not a multiple of denominator")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (ElementSize > CharUnits::Zero() &&
DenominatorSize > CharUnits::Zero() &&
ElementSize != DenominatorSize) {
-  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)';"
-" numerator is not a multiple of denominator")
+  diag(E->getOperatorLoc(),
+   "suspicious usage of 'sizeof(array)/sizeof(...)':"
+   " denominator differs from the size of array elements")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (NumTy && DenomTy && NumTy == DenomTy) {
-  // FIXME: This message is wrong, it should not refer to sizeof "pointer"
-  // usage (and by the way, it would be to clarify all the messages).
   diag(E->getOperatorLoc(),
-   "suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'")
+   "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions "
+   "have the same type")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (!WarnOnSizeOfPointer) {
   // When 'WarnOnSizeOfPointer' is enabled, these messages become 
redundant:
   if (PointedTy && DenomTy && PointedTy == DenomTy) {
 diag(E->getOperatorLoc(),
- "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'")
+ "suspicious usage of 'sizeof(...)/sizeof(...)': size of pointer "
+ "is divided by size of pointed type")
 << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
   } else if (NumTy && DenomTy && NumTy->isPointerType() &&
  DenomTy->isPointerType()) {
 diag(E->getOperatorLoc(),
- "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'")
+ "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions "
+ "have pointer types")
 << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
   }
 }
diff --git 
a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c 
b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
index aef930f2c8fda..b898071a56613 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
@@ -34,24 +34,24 @@ int Test5() {
 
   int sum = 0;
   sum +=

[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

I mentioned this change in the paragraph that was describing my earlier commit 
(that was also modifying this check).

https://github.com/llvm/llvm-project/pull/95550
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/95550
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate checkers-related docs from HTML to RST (PR #97032)

2024-07-02 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

:thinking: Deleting the html files could break some links on external sites, so 
I think it would be better to replace them with a very simple "This content was 
moved to " placeholder.

https://github.com/llvm/llvm-project/pull/97032
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,238 @@
+Command-Line Usage: CodeChecker and scan-build
+===
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the commandline.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.

NagyDonat wrote:

It's still strange that there is only one underscore which is _after_ the word 
'CodeChecker'. I'd guess that the intent was to italicize 'CodeChecker' and we 
should add another underscore _before_ this word for the sake of clarity (even 
if the actual HTML generation can work with this half-broken situation.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, with the disclaimer that I'd still prefer e.g. "second argument instead 
of "2nd argument" -- but I won't block the review with this bikeshedding.

https://github.com/llvm/llvm-project/pull/95408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,238 @@
+Command-Line Usage: CodeChecker and scan-build
+===
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the commandline.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.

NagyDonat wrote:

Oh, that's surprising -- I was not familiar with that feature of RST. Naturally 
this invalidates my concern.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

I'm really happy to see that this duplicated documentation is finally cleaned 
up after so many years :partying_face:

I added a few minor remarks in inline comments, but the change looks good 
overall.

As we discussed privately, this change should be (mostly) limited to the format 
changes, and then separate follow-up commit should eliminate the outdated 
content (and, perhaps, add new information if that's necessary). I'll update 
the PR description to emphasize this.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,239 @@
+Command Line Usage: scan-build and CodeChecker
+==
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.

NagyDonat wrote:

Is this still true?

(Even if it's obsolete, we should probably leave it for a follow-up commit.)

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,239 @@
+Command Line Usage: scan-build and CodeChecker
+==
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the command 
line.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. scan-build_ is an old and simple command line tool that emits static 
analyzer warnings as HTML files while compiling your project. You can view the 
analysis results in your web browser.
+- Useful for individual developers who simply want to view static analysis 
results at their desk, or in a very simple collaborative environment.
+- Works on all major platforms (Windows, Linux, macOS) and is available as 
a package in many Linux distributions.
+- Does not include support for cross-translation-unit analysis.
+
+2. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.
+- Perfect for managing large amounts of Static Analyzer warnings in a 
collaborative environment.
+- Generally much more feature-rich than scan-build.
+- Supports incremental analysis: Results can be stored in a database, 
subsequent analysis runs can be compared to list the newly added defects.
+- :doc:`CrossTranslationUnit` is supported fully on Linux via CodeChecker.
+- Can run clang-tidy checkers too.
+- Open source, but out-of-tree, i.e. not part of the LLVM project.
+
+scan-build
+--
+
+**scan-build** is a command line utility that enables a user to run the static 
analyzer over their codebase as part of performing a regular build (from the 
command line).

NagyDonat wrote:

The use of "the Static Analyzer" vs "the static analyzer" is inconsistent -- it 
would be good to pick one of them and use it consistently.


https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

I'm really happy that you decided to document these :smile: 

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

Is it truly a _copy_ of the store object? This seems to contradict the "is a 
very cheap operation" line in the next paragraph...?

Or is it a _copied store object_ that _shares the same bulk allocations_ with 
the original one (but differs in minor details?) 

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown

NagyDonat wrote:

```suggestion
/// being referred to as "concrete", as many fields inside it would be unknown
```

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

Perhaps we should mention that `Store` is just a typedef for `const void *`...

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-03 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

I do understand functional programming (I had a phase when I learnt Haskell and 
for a few years I thought that it's the absolutely perfect language 
:star_struck: -- since then I'm more realistic but I still like it), I was just 
confused by the fact that this line emphasizes "a copy of the whole Store 
object" as opposed to the "reference to" in the previous line.

In a functional language the word "copy" is practically non-existent (we speak 
about values, not objects and their identity, so there is no reason to say that 
this is _a copy of_ e.g. that list -- we simply say that "this is that list"), 
while in C++ saying "copy" instead of "move" or "pointer/reference" is an 
important synonym of "we allocate lots of new memory", so I'd still prefer 
tweaking this line.

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-03 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in
+/// which it was created.

NagyDonat wrote:

```suggestion
///  * a reference to the whole Store object, obtained from the ProgramState in
///which the nonloc::LazyCompoundVal was created.
///
/// Note that the old ProgramState and its Store is kept alive during the
/// analysis because these are immutable functional data structures and each new
/// Store value is represented as "earlier Store" + "additional binding".
```

What would you think about this phrasing?

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] [MallocChecker] suspect all release functions as candidate for suppression (PR #104599)

2024-09-04 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

I'm satisfied with the current state of this commit, but let's wait a few days 
for a review from @Szelethus (or anybody else interested).

https://github.com/llvm/llvm-project/pull/104599
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat requested changes to this pull request.

Unfortunately I found several fundamental issues within the implementation of 
`extractActualValueFrom()` -- see inline comments for details.

Moreover I'm not convinced that this "replace `LazyCompoundVal`s with actual 
values" logic is the right approach for your goals.

When we discussed this yesterday, you told me that this logic was intended to 
handle the situations where a `LazyCompoundVal` represents a non-aggregate type 
(like an integer) -- but I think that we should catch that kind of 
non-canonical representation at an earlier point (e.g. within some `getSVal` 
method). Hiding an integer behind this kind of lazy wrapper is a serious defect 
of our store model which inhibits many operations -- not just `evalBinOp`.

On the other hand, this "replace `LazyCompoundVal`s with a better 
representation" approach could be a good idea in the cases when the 
`LazyCompoundVal` represents an aggregate. However in this case I don't think 
that you can get a non-lazy representation by simply selecting the right 
binding (either via `iterBinding`s or via a direct `getBinding` call) -- you 
will need to find or write an explicit "convert this `LazyCompoundVal` into a 
non-lazy `CompoundVal`" function.

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits


@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, 
UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret)
+  : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+ const MemRegion *Region, SVal Val) override {
+if (const MemRegion *MR = Val.getAsRegion()) {
+  if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
+Ret = Val;
+return false;
+  }
+}
+return true;
+  }
+};
+} // namespace
+
+/// Find the actual value behind a LazyCompoundVal. May return none if the
+/// query fails, or only finds another LazyCompoundVal (e.g. circular
+/// reference).
+static std::optional extractActualValueFrom(ProgramStateRef State,
+  nonloc::LazyCompoundVal L) {
+  std::optional Ret;
+  LazyHandler handler{L, Ret};
+  State->getStateManager().getStoreManager().iterBindings(State->getStore(),
+  handler);
+  if (Ret) {
+std::optional RVal = Ret = State->getSVal(Ret->getAsRegion());
+// If our best efforts lead us back to another LazyCompoundValue, give up.
+if (Ret == RVal)
+  return {};

NagyDonat wrote:

Something is fishy here -- after `RVal = Ret` the condition `Ret == RVal` 
should always be true.

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits


@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, 
UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret)
+  : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+ const MemRegion *Region, SVal Val) override {
+if (const MemRegion *MR = Val.getAsRegion()) {
+  if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
+Ret = Val;
+return false;
+  }
+}
+return true;
+  }
+};
+} // namespace
+
+/// Find the actual value behind a LazyCompoundVal. May return none if the
+/// query fails, or only finds another LazyCompoundVal (e.g. circular
+/// reference).
+static std::optional extractActualValueFrom(ProgramStateRef State,
+  nonloc::LazyCompoundVal L) {
+  std::optional Ret;
+  LazyHandler handler{L, Ret};
+  State->getStateManager().getStoreManager().iterBindings(State->getStore(),

NagyDonat wrote:

I checked the implementation of `iterBindings` and the other methods of 
`StoreManager` and I still think that `iterBindings` is not the right tool for 
this job and you should use `SVal StoreManager::getBinding(Store S, Loc L, 
QualType T)` instead.

I know that the documentation of `LazyCompoundVal` suggests using 
`iterBindings` and that those docs were blessed by several reviewers (including 
me), but that doesn't change the fact that there are other more practical 
methods _and_ `iterBindings` doesn't always provide a complete answer, because 
it only iterates over _direct_ bindings and skips the so-called default 
bindings.

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits


@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, 
UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret)
+  : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+ const MemRegion *Region, SVal Val) override {
+if (const MemRegion *MR = Val.getAsRegion()) {
+  if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
+Ret = Val;
+return false;
+  }
+}
+return true;
+  }
+};
+} // namespace
+
+/// Find the actual value behind a LazyCompoundVal. May return none if the
+/// query fails, or only finds another LazyCompoundVal (e.g. circular
+/// reference).
+static std::optional extractActualValueFrom(ProgramStateRef State,
+  nonloc::LazyCompoundVal L) {
+  std::optional Ret;
+  LazyHandler handler{L, Ret};
+  State->getStateManager().getStoreManager().iterBindings(State->getStore(),

NagyDonat wrote:

Instead of `State->getStore()` (the current state of the memory) you must use 
`L->getStore()` (the old memory snapshot which defines the value represented by 
the LazyCompoundVal `L`).

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits


@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, 
UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret)
+  : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+ const MemRegion *Region, SVal Val) override {
+if (const MemRegion *MR = Val.getAsRegion()) {
+  if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {
+Ret = Val;
+return false;
+  }
+}
+return true;
+  }
+};
+} // namespace
+
+/// Find the actual value behind a LazyCompoundVal. May return none if the
+/// query fails, or only finds another LazyCompoundVal (e.g. circular
+/// reference).
+static std::optional extractActualValueFrom(ProgramStateRef State,
+  nonloc::LazyCompoundVal L) {
+  std::optional Ret;
+  LazyHandler handler{L, Ret};
+  State->getStateManager().getStoreManager().iterBindings(State->getStore(),
+  handler);
+  if (Ret) {
+std::optional RVal = Ret = State->getSVal(Ret->getAsRegion());

NagyDonat wrote:

Why are you calling `State->getSval()` within this function?

This method queries the _value currently stored within that region_ which is as 
far as I see completely irrelevant for the goals of this function.

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits


@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, 
UnaryOperator::Opcode opc,
   llvm_unreachable("Unexpected unary operator");
 }
 
+namespace {
+/// Iterate through to store to find the actual value this LazyCompoundVal
+/// corresponds to. Further reading is in LazyCompoundVal's docs.
+struct LazyHandler : public StoreManager::BindingsHandler {
+  nonloc::LazyCompoundVal l;
+  std::optional &Ret;
+
+  LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret)
+  : l(l), Ret(Ret) {}
+
+  virtual bool HandleBinding(StoreManager &SMgr, Store Store,
+ const MemRegion *Region, SVal Val) override {
+if (const MemRegion *MR = Val.getAsRegion()) {
+  if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) {

NagyDonat wrote:

Instead of introducing this `MR`, you should check 
`Region->isSubRegionOf(l.getRegion())` or something similar!

When `iterBindings()` calls this callback, it will pass `[Region, value bound 
to that region]` pairs (as the third and fourth arguments), and if you want to 
only consider the values stored within the region `l.getRegion()`, you need to 
check that the key (i.e. `Region` i.e. the third argument) is a subregion of 
the region whose contents were "frozen" as a `LazyCompoundVal`.

Your current code looks for values _stored anywhere within the frozen old 
state_ that are pointer-typed and _point to a subregion of `l.getRegion()`_.

https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)

2024-09-04 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/106982
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

I investigated this situation and I found that this crash is not limited to 
empty source files -- I'd guess that the analyzer would crash on any input if 
it's executed as
```
// RUN: %clang_analyze_cc1 -w -analyzer-checker=nullability \
// RUN:   -analyzer-output=text -verify %s
```

The real reason why this crash is rare is that two core checkers 
(`core.NullDereference` and `core.NonNullParamChecker`) are derived from 
`EventDispatcher` so the assertion is not triggered in 
the "normal" case when the core checkers are enabled.

Note that the documentation of core checkers says that _"These checkers must be 
always switched on as other checker rely on them."_; however we should still 
eliminate this assertion failure because it's ugly.

Registering `NullabilityChecker` as an `EventDispatcher` (which happens to 
never dispatch any events) definitely works, but I think it would be more 
elegant to simply remove the assertion that caused the crash. Registering a 
handler for an event which cannot be emitted (under the current unusual config) 
is not an error, it should not trigger an assertion failure.

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,10 @@
+// RUN: %clang_analyze_cc1 -w -analyzer-checker=nullability \
+// RUN:   -analyzer-output=text -verify %s
+//
+// expected-no-diagnostics
+//
+// This case previously crashed because of an assert in CheckerManager.cpp,
+// checking for registered event dispatchers. This check is too strict so
+// was removed by this commit. This test case covers the previous crash,
+// and is expected to simply not crash. The source file can be anything,
+// and does not need to be empty.

NagyDonat wrote:

```suggestion
// Previously there was an assertion requiring that if an Event is handled by
// some enabled checker, then there must be at least one enabled checker which
// can emit that kind of Event.
// This assertion failed when NullabilityChecker (which is a subclass of
// check::Event) was enabled, but the checkers
// inheriting from EventDispatcher were all disabled.
// This test file validates that enabling the nullability checkers (without any
// other checkers) no longer causes a crash.
```
I rewrote this explanation to make it (hopefully) more useful for someone who 
encounters this test file without any context.

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

Hmm, I see that this assertion was hidden behind an `#ifndef NDEBUG` so it is 
only active within debug builds of the analyzer.

Nevertheless, I still think that it's better to remove it, because
1. it does not offer significant protection against introducing bugs (it's 
unlikely that somebody would develop an event handler without writing a 
corresponding event dispatcher);
2. it can cause trouble in test files where it's often good to enable a very 
minimal set of checkers.

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits


@@ -48,15 +48,7 @@ bool CheckerManager::hasPathSensitiveCheckers() const {
   EvalCallCheckers, EndOfTranslationUnitCheckers);
 }
 
-void CheckerManager::finishedCheckerRegistration() {
-#ifndef NDEBUG
-  // Make sure that for every event that has listeners, there is at least
-  // one dispatcher registered for it.
-  for (const auto &Event : Events)
-assert(Event.second.HasDispatcher &&
-   "No dispatcher registered for an event");
-#endif
-}
+void CheckerManager::finishedCheckerRegistration() {}

NagyDonat wrote:

Let's just remove this method altogether -- there is no reason to keep it. (If 
somebody needs some "run this after checker registration" logic in the future, 
they can easily re-add something like this.)

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)

2024-09-05 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

@vabridgers I rewrote the title and description of this PR to describe the 
current approach. Feel free to adjust (or partially restore) it if you'd prefer 
a different phrasing.

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model constructor initializer for an array member (PR #107537)

2024-09-06 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> LGTM. FYI "modelled" should contain only 1 "l" if I'm not mistaken.

British English uses "modeled", while American uses "modelled". I don't know 
which is preferred in LLVM. 

https://github.com/llvm/llvm-project/pull/107537
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)

2024-09-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Looks good to me :)

https://github.com/llvm/llvm-project/pull/107294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Fix #embed crash (PR #107764)

2024-09-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

LGTM, there is no reason to crash on `#embed`.

I think it would be nice to have a few testcases that show the behavior of the 
analyzer around `#embed`:
- Can we produce bug reports if there is an (unrelated) `#embed` expression on 
the execution path? Or do we abort the analysis there?
- If the analysis is not aborted, then how do we represent the contents of a 
memory region that's initialized by `#embed`? Is it `UnknownVal`? Or 
`UndefinedVal` (that may be problematic)?

https://github.com/llvm/llvm-project/pull/107764
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)

2024-09-09 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

To me this solution seems to be a bit hacky -- I don't like that we need to 
scatter "handle `void *` as if it was `char *`" special cases in various parts 
of the analyzer (I vaguely recall that I have also seen similar hacks 
elsewhere).

I'd prefer solutions that are as generic as possible and ensure that `void *` 
is consistently handled like `char *` if that's what we want. (By the way, will 
the analyzer be able to use this `UnknownVal` bound to the `void *` pointer? If 
not, then just avoid the binding.)

These are not blocking issues, feel free to merge this commit to fix the crash 
(instead of waiting for a theoretically better solution).

https://github.com/llvm/llvm-project/pull/107572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve handling of unsigned values in ArrayBoundCheckerV2 (PR #81034)

2024-09-09 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

These reports are definitely FPs caused by buggy number handling, so it would 
be good to suppress them.

I was planning to rewrite both `alpha.security.ReturnPtrRange` and 
`alpha.unix.cstring.OutOfBounds` to rely on the "backend" prototyped within 
ArrayBoundV2 instead of the current logic (which is AFAIK equivalent to 
ArrayBound V1). In fact, `alpha.security.ReturnPtrRange` is so close to array 
bounds checking, that probably it would be good to implement it as a subchecker 
within the `ArrayBoundCheckerV2` checker class (which can be enabled/disabled 
independently, but uses the same infrastructure). 

However, I was planning to do these improvements after bringing `ArrayBoundV2` 
out of the alpha state, and unfortunately that's severely delayed because I'm 
bogged down with the loop handling improvements.

If you want to suppress these reports, it would be very nice if you could do it 
by refactoring these checkers to reuse the bounds checking logic that's defined 
within ArrayBoundsV2. You could also tweak the current implementation of these 
checkers, but I'll probably throw away those changes when I'll (hopefully) 
switch to using the logic of ArrayBoundV2.

https://github.com/llvm/llvm-project/pull/81034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)

2024-09-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/107572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)

2024-09-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Bikeshedding: let's delete that empty line.

Otherwise LGTM.

https://github.com/llvm/llvm-project/pull/107572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)

2024-09-09 Thread Donát Nagy via cfe-commits


@@ -3,6 +3,9 @@
 
 int clang_analyzer_eval(int);
 

NagyDonat wrote:

```suggestion
```

https://github.com/llvm/llvm-project/pull/107572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

In the PR/commit message you write that

> At least theoretically the array bounds checker (when finalized) should find 
> the same cases that were detected by the PointerSubChecker.

but I'm pretty sure that the array bound checker already does find all these 
cases (and much more complex cases as well). (The only remaining reason why 
ArrayBoundV2 is in alpha is that it emits too many false positives, because 
engine issues like the inaccurate modeling of loops feed it with incorrect 
information.)

Instead of deleting the array bound testcases, you could also enable 
ArrayBoundV2 in the pointer-sub test files and validate that it does find those 
reports.

Also note that the array bounds checker is named `ArrayBoundV2` with no "s" in 
the name.

https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits


@@ -2501,7 +2501,14 @@ alpha.core.PointerSub (C)
 Check for pointer subtractions on two pointers pointing to different memory
 chunks. According to the C standard §6.5.6 only subtraction of pointers that
 point into (or one past the end) the same array object is valid (for this
-purpose non-array variables are like arrays of size 1).
+purpose non-array variables are like arrays of size 1). This checker only
+searches for different memory objects at subtraction, but does not check if the
+array index is correct (
+:ref:`alpha.security.ArrayBoundsV2 ` checks the
+index to some extent).

NagyDonat wrote:

```suggestion
reports subtraction between different memory objects and does not check whether
the index (or more generally, memory offset) is within bounds. Bounds checking
is done by :ref:`alpha.security.ArrayBoundV2 `.
```

https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add more notes to PointerSubChecker (PR #102432)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat requested changes to this pull request.

Thanks!

We should close this when that other PR is accepted and merged. Until then I'm 
putting a "Request changes" mark on this to prevent an accidental merge.

https://github.com/llvm/llvm-project/pull/102432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

Oh, you're right, invalid pointer arithmetic like `(&x - 1)` is not handled by 
ArrayBoundV2, because right now that's the responsibility of a THIRD checker, 
`alpha.core.PointerArithm`.

However, directly after bringing `ArrayBoundV2` out of alpha, I'll continue 
with working on this `PointerArithm` and either I'll completely merge it with 
`ArrayBoundV2` or I'll keep it as a separate "frontend" which also calls the 
bounds-checking logic that's behind `ArrayBoundV2`.

Based on this I'd suggest that you should
- delete the bounds checking logic from `PointerSubChecker`,
- delete the testcases that were testing it (because it would be too 
complicated to bring in *two* other checkers + AFAIK `PointerArithm` is in a 
bad shape currently),
- refer to both ArrayBoundV2 and PointerArithm in the documentation.

https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

Thanks for adding support for these functions! The code LGTM overall, except 
for one mostly theoretical issue (about the use of `assert`) which I described 
in an inline comment.

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits


@@ -21,16 +21,67 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
 
 using namespace clang;
 using namespace ento;
 
 namespace {
 
+QualType getOverflowBuiltinResultType(const CallEvent &Call) {
+  assert(Call.getNumArgs() == 3);
+
+  return Call.getArgExpr(2)->getType()->getPointeeType();
+}
+
+QualType getOverflowBuiltinResultType(const CallEvent &Call, CheckerContext &C,
+  unsigned BI) {
+  assert(Call.getNumArgs() == 3);
+
+  ASTContext &Ast = C.getASTContext();

NagyDonat wrote:

``
Checker code usually uses the name `ACtx` for 'the' `ASTContext` object, so 
consider using that here. However, `Ast` is also a completely correct name, so 
if you prefer it, feel free to keep it.
``

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits


@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+   CheckerContext &C,
+   BinaryOperator::Opcode Op,
+   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);

NagyDonat wrote:

The use of `assert()` should be limited to situations where we realize that the 
_analyzer code_ is buggy (we reached an "impossible" situation). If the 
_analyzed code_ contains an invalid call like `__builtin_add_overflow(10, 20, 
&res, "spam")`, then the analyzer _may_ report an error, but must not crash 
with an assertion failure.

Unfortunately, this particular checker is a so-called "modeling" checker, so it 
is hidden from the user (as an undocumented implementation detail), and 
therefore it cannot create bug reports.

This means that if we encounter an invalid `__builtin_*_overflow` call, then we 
should probably just ignore it, because we should not assert and we cannot 
create a bug report. I'd assume that this is an extremely rare situation (if 
someone uses these builtin function, they're unlikely to mess up the argument 
count), so a more complex solution (e.g. introducing a new non-modeling checker 
which creates bug reports) is probably not worth the effort.

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits


@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+   CheckerContext &C,
+   BinaryOperator::Opcode Op,
+   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);
+
+  ProgramStateRef State = C.getState();
+  SValBuilder &SVB = C.getSValBuilder();
+  const Expr *CE = Call.getOriginExpr();
+
+  SVal Arg1 = Call.getArgSVal(0);
+  SVal Arg2 = Call.getArgSVal(1);
+
+  SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType);
+
+  // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 
1
+  // should not produce state for non-overflow case and threat it as

NagyDonat wrote:

I thought about this for some time and checked the implementation of 
`evalBinOp` (more precisely `SValBuilder:evalBinOpNN`, the case that's relevant 
here), but unfortunately I don't see a clear solution.

You are right that `SVB` _should_ take care of this, but unfortunately that 
does not happen currently.

Currently the analyzer (and `SValBuilder` in particular) has its own rigid 
opinion about overflow handling: it allows overflow on all numeric operations 
(including signed ones, where it's undefined behavior according to the 
standard) and models it without splitting the state, so e.g. if `x` and `y` are 
`signed char` values within the range `50 <= x, y <= 100`, then it calculates 
that the possible values of `x+y` are $[-128, -56] \cup [100, 127]$.

It would be good to enhance this behavior by tagging the ranges within the 
`RangeSet` with "did overflow happen?" bits that would let us perform a state 
split. This is theoretically doable (I don't see anything that would block 
this), but would be a complex refactoring effort (we'd need to pass along this 
"did we overflow?" bit through all the spaghetti code).

Without this, you can implement a workaround that calls 
`SValBuilder::getMinValue()` and `SValBuilder::getMaxValue()` to get the 
minimal and maximal values of the operands (as `llvm::APSInt` values, that is, 
**a**rbitrary **p**recision **s**igned **int**egers) and then uses these to 
determine which of the two branches (overflow, no-overflow) are possible.

This workaround solution would be better than the current "always act as if 
both branches were possible", but it is not sufficient for deducing the set of 
possible results on the no-overflow branch. (That is, if `x` and `y` are 
`signed char` values within the range `50 <= x, y <= 100`, then this workaround 
would create two branches: one where `__builtin_add_overflow` returns true, and 
and one where it returns `false` and stores a symbol with range $[-128, -56] 
\cup [100, 127]$ within the result pointer; while the correct assumption would 
be that the result has range $[100, 127]$ on this no-overflow branch.)

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Thanks for the updates, I'm satisfied with the state of this commit. However 
let's wait for an independent approval from @steakhal @haoNoQ or someone else, 
because this change does reduce the scope of this checker.

https://github.com/llvm/llvm-project/pull/102580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] New fix for default template parameter values. (PR #101836)

2024-08-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

This change seems to be correct, but I don't know enough to dig into the 
details and provide a confident review. However, I as far as I know there are 
no significantly better reviewers, so I can give you the formal approval.

I think you can merge this commit once you update the comment that you marked 
as inaccurate.

https://github.com/llvm/llvm-project/pull/101836
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Trivial refactoring of region invalidation (PR #102456)

2024-08-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/102456
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-12 Thread Donát Nagy via cfe-commits


@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+   CheckerContext &C,
+   BinaryOperator::Opcode Op,
+   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);

NagyDonat wrote:

> I've tried to catch this assert locally by calling __builtin_*_overflow with 
> 2 arguments, but clang just rejects it.

Oh, I see. Perhaps add a comment saying that "Calling a builtin with an 
incorrect argument count produces compiler error (...)", because e.g. I didn't 
know that this is the case.

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Improve documentation of `invalidateRegion` methods (PR #102477)

2024-08-12 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> Do you plan to apply more refactors to invalidation and Store?

These were opportunistic unplanned refactors, I was just familiarizing myself 
with the invalidation logic (because I'll need to use it in loop widening), and 
I quickly fixed some minor issues that I spotted.

However, I'll probably need one change to make more features of the 
invalidation logic accessible without a `CallEvent`. (Currently the `CallEvent` 
is passed down even to `RegionStoreManager::invalidateRegions` where it's only 
used to calculate an `enum GlobalsFilterKind` value; while I'd  like to have a 
variant of `ProgramState::invalidateRegions` that directly takes a 
`GlobalFiltersKind` instead of a `CallEvent`.)

However I'm not planning any functional/algorithmic changes in the invalidation 
logic, and I think and hope that I won't need to touch other parts of the 
`State`.

What can I do to help you avoid merge conflicts?



https://github.com/llvm/llvm-project/pull/102477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Improve documentation of `invalidateRegion` methods (PR #102477)

2024-08-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/102477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-12 Thread Donát Nagy via cfe-commits


@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+   CheckerContext &C,
+   BinaryOperator::Opcode Op,
+   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);
+
+  ProgramStateRef State = C.getState();
+  SValBuilder &SVB = C.getSValBuilder();
+  const Expr *CE = Call.getOriginExpr();
+
+  SVal Arg1 = Call.getArgSVal(0);
+  SVal Arg2 = Call.getArgSVal(1);
+
+  SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType);
+
+  // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 
1
+  // should not produce state for non-overflow case and threat it as

NagyDonat wrote:

> `clang_analyzer_eval(a + b < 30); <--- Prints 1 and 0, but why ???`

Assuming that `a` and `b` are signed integers, they can be very negative, and 
then their sum can be a positive value above 30 (after an overflow). This means 
that both boolean values are possible for the expression `a + b < 30`, and the 
analyzer represents this by printing both 1 and 0.

(If I understand this correctly, we get two definite numbers instead of one 
range because the on-by-default `eagerlyAssume` mode causes a state split when 
it sees the comparison operator in `a + b < 30`, despite the fact that this is 
not in a conditional expression.)




https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-13 Thread Donát Nagy via cfe-commits


@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+   CheckerContext &C,
+   BinaryOperator::Opcode Op,
+   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);
+
+  ProgramStateRef State = C.getState();
+  SValBuilder &SVB = C.getSValBuilder();
+  const Expr *CE = Call.getOriginExpr();
+
+  SVal Arg1 = Call.getArgSVal(0);
+  SVal Arg2 = Call.getArgSVal(1);
+
+  SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType);
+
+  // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 
1
+  // should not produce state for non-overflow case and threat it as

NagyDonat wrote:

I did some digging in `RangedConstraintManager` and it seems that algebraic 
expressions that involve two symbolic values **are handled as black boxes** by 
the analyzer: in your example it creates a `SymSymExpr` that is known to be the 
sum of `a` and `b`, but we don't have any logic which would constrain the 
potential values of this `SymSymExpr` based on the constraints known about the 
operands. (The analyzer can propagate information in the easier case when it 
knows the exact value of one of the operands.)

I don't know why do we have this shameful limitation... I'd guess that 
implementing the necessary `RangeSet` operation wouldn't be prohibitively 
difficult, but there may be performance implications, and perhaps 
well-constrained (but not exactly known) values are not common enough to make 
this a valuable investment.

https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-13 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model overflow builtins (PR #102602)

2024-08-13 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/102602
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-13 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat created 
https://github.com/llvm/llvm-project/pull/103059

...because it is too noisy to be useful right now, and its architecture is 
terrible, so it can't act a starting point of future development.

The main problem with this checker is that it tries to do (or at least fake) 
path-sensitive analysis without actually using the established path-sensitive 
analysis engine.

Instead of actually tracking the symbolic values and the known constraints on 
them, this checker blindly gropes the AST and uses heuristics like "this 
variable was seen in a comparison operator expression that is not a loop 
condition, so it's probably not too large" (which was improved in a separate 
commit to at least ignore comparison operators that appear after the actual 
`malloc()` call).

This might have been acceptable in 2011 (when this checker was added), but 
since then we developed a significantly better standard approach for analysis 
and this old relic doesn't deserve to remain in the codebase.

Needless to say, this primitive approach causes lots of false positives (and 
presumably false negatives as well), which ensures that this alpha checker 
won't be missed by the users.

It would be good to eventually have a stable, path-sensitive checker that could 
succeed in the task where this hacky implementation fails, but the first step 
towards that goal is removing this old garbage, which just confuses the 
potential users or contributors.

(I don't have plans for reimplementing the goals of this checker. It could 
happen eventually, but right now I have many goals that are higher priority 
than this.)

From 36821708145587553f13df8648920f281b318240 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Tue, 13 Aug 2024 14:50:17 +0200
Subject: [PATCH] [analyzer] Delete alpha.security.MallocOverflow

...because it is too noisy to be useful right now, and its architecture
is terrible, so it can't act a starting point of future development.

The main problem with this checker is that it tries to do (or at least
fake) path-sensitive analysis without actually using the established
path-sensitive analysis engine.

Instead of actually tracking the symbolic values and the known
constraints on them, this checker blindly gropes the AST and uses
heuristics like "this variable was seen in a comparison operator
expression that is not a loop condition, so it's probably not too
large" (which was improved in a separate commit to at least ignore
comparison operators that appear after the actual `malloc()` call).

This might have been acceptable in 2011 (when this checker was added),
but since then we developed a significantly better standard approach for
analysis and this old relic doesn't deserve to remain in the codebase.

Needless to say, this primitive approach causes lots of false positives
(and presumably false negatives as well), which ensures that this
alpha checker won't be missed by the users.

It would be good to eventually have a stable, path-sensitive checker
that could succeed in the task where this hacky implementation fails,
but the first step towards that goal is taking out this old garbage,
which confuses the potential users or contributors.

(I don't have plans for reimplementing the goals of this checker. It
could happen eventually, but right now I have many plans that are higher
priority than this.)
---
 clang/docs/analyzer/checkers.rst  |  43 ---
 .../clang/StaticAnalyzer/Checkers/Checkers.td |   4 -
 .../StaticAnalyzer/Checkers/CMakeLists.txt|   1 -
 .../MallocOverflowSecurityChecker.cpp | 341 --
 clang/test/Analysis/malloc-overflow.c | 150 
 clang/test/Analysis/malloc-overflow.cpp   |  12 -
 clang/test/Analysis/malloc-overflow2.c|  40 --
 clang/www/analyzer/potential_checkers.html|   2 -
 .../lib/StaticAnalyzer/Checkers/BUILD.gn  |   1 -
 9 files changed, 594 deletions(-)
 delete mode 100644 
clang/lib/StaticAnalyzer/Checkers/MallocOverflowSecurityChecker.cpp
 delete mode 100644 clang/test/Analysis/malloc-overflow.c
 delete mode 100644 clang/test/Analysis/malloc-overflow.cpp
 delete mode 100644 clang/test/Analysis/malloc-overflow2.c

diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index 46b0b7b9c82376..0bfbc995579d41 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -2951,49 +2951,6 @@ Warn about buffer overflows (newer checker).
char c = s[x]; // warn: index is tainted
  }
 
-.. _alpha-security-MallocOverflow:
-
-alpha.security.MallocOverflow (C)
-"
-Check for overflows in the arguments to ``malloc()``.
-It tries to catch ``malloc(n * c)`` patterns, where:
-
- - ``n``: a variable or member access of an object
- - ``c``: a constant foldable integral
-
-This checker was designed for code audits, so expect false-positive reports.
-One is supposed to silence this checker by ensuring proper bounds c

[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-14 Thread Donát Nagy via cfe-commits


@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">,
   HelpText<"Warn about buffer overflows (newer checker)">,
   Documentation;
 
-def MallocOverflowSecurityChecker : Checker<"MallocOverflow">,

NagyDonat wrote:

Is it possible to write a "this checker no longer exists, but references to its 
name should not be an error" placeholder in this TD file?

If you readily know a solution, I'd be happy to add it, but I don't want to 
spend time on researching it.

https://github.com/llvm/llvm-project/pull/103059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-14 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/103059

From 36821708145587553f13df8648920f281b318240 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Tue, 13 Aug 2024 14:50:17 +0200
Subject: [PATCH 1/2] [analyzer] Delete alpha.security.MallocOverflow

...because it is too noisy to be useful right now, and its architecture
is terrible, so it can't act a starting point of future development.

The main problem with this checker is that it tries to do (or at least
fake) path-sensitive analysis without actually using the established
path-sensitive analysis engine.

Instead of actually tracking the symbolic values and the known
constraints on them, this checker blindly gropes the AST and uses
heuristics like "this variable was seen in a comparison operator
expression that is not a loop condition, so it's probably not too
large" (which was improved in a separate commit to at least ignore
comparison operators that appear after the actual `malloc()` call).

This might have been acceptable in 2011 (when this checker was added),
but since then we developed a significantly better standard approach for
analysis and this old relic doesn't deserve to remain in the codebase.

Needless to say, this primitive approach causes lots of false positives
(and presumably false negatives as well), which ensures that this
alpha checker won't be missed by the users.

It would be good to eventually have a stable, path-sensitive checker
that could succeed in the task where this hacky implementation fails,
but the first step towards that goal is taking out this old garbage,
which confuses the potential users or contributors.

(I don't have plans for reimplementing the goals of this checker. It
could happen eventually, but right now I have many plans that are higher
priority than this.)
---
 clang/docs/analyzer/checkers.rst  |  43 ---
 .../clang/StaticAnalyzer/Checkers/Checkers.td |   4 -
 .../StaticAnalyzer/Checkers/CMakeLists.txt|   1 -
 .../MallocOverflowSecurityChecker.cpp | 341 --
 clang/test/Analysis/malloc-overflow.c | 150 
 clang/test/Analysis/malloc-overflow.cpp   |  12 -
 clang/test/Analysis/malloc-overflow2.c|  40 --
 clang/www/analyzer/potential_checkers.html|   2 -
 .../lib/StaticAnalyzer/Checkers/BUILD.gn  |   1 -
 9 files changed, 594 deletions(-)
 delete mode 100644 
clang/lib/StaticAnalyzer/Checkers/MallocOverflowSecurityChecker.cpp
 delete mode 100644 clang/test/Analysis/malloc-overflow.c
 delete mode 100644 clang/test/Analysis/malloc-overflow.cpp
 delete mode 100644 clang/test/Analysis/malloc-overflow2.c

diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index 46b0b7b9c82376..0bfbc995579d41 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -2951,49 +2951,6 @@ Warn about buffer overflows (newer checker).
char c = s[x]; // warn: index is tainted
  }
 
-.. _alpha-security-MallocOverflow:
-
-alpha.security.MallocOverflow (C)
-"
-Check for overflows in the arguments to ``malloc()``.
-It tries to catch ``malloc(n * c)`` patterns, where:
-
- - ``n``: a variable or member access of an object
- - ``c``: a constant foldable integral
-
-This checker was designed for code audits, so expect false-positive reports.
-One is supposed to silence this checker by ensuring proper bounds checking on
-the variable in question using e.g. an ``assert()`` or a branch.
-
-.. code-block:: c
-
- void test(int n) {
-   void *p = malloc(n * sizeof(int)); // warn
- }
-
- void test2(int n) {
-   if (n > 100) // gives an upper-bound
- return;
-   void *p = malloc(n * sizeof(int)); // no warning
- }
-
- void test3(int n) {
-   assert(n <= 100 && "Contract violated.");
-   void *p = malloc(n * sizeof(int)); // no warning
- }
-
-Limitations:
-
- - The checker won't warn for variables involved in explicit casts,
-   since that might limit the variable's domain.
-   E.g.: ``(unsigned char)int x`` would limit the domain to ``[0,255]``.
-   The checker will miss the true-positive cases when the explicit cast would
-   not tighten the domain to prevent the overflow in the subsequent
-   multiplication operation.
-
- - It is an AST-based checker, thus it does not make use of the
-   path-sensitive taint-analysis.
-
 .. _alpha-security-MmapWriteExec:
 
 alpha.security.MmapWriteExec (C)
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td 
b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 38b55a0eb0a7b0..fb4114619ac3d3 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">,
   HelpText<"Warn about buffer overflows (newer checker)">,
   Documentation;
 
-def MallocOverflowSecurityChecker : Checker<"MallocOverflow">,
-  HelpText<"Check for overflows in th

[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-14 Thread Donát Nagy via cfe-commits


@@ -1,40 +0,0 @@
-// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown 
-analyzer-checker=alpha.security.MallocOverflow,unix -verify %s
-// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown 
-analyzer-checker=alpha.security.MallocOverflow,unix,optin.portability 
-DPORTABILITY -verify %s

NagyDonat wrote:

I agree, that this might have some minor usefulness by accident, but I still 
think that it's better to remove it. (After all, we are not copying random "no 
results found" source files into our test directory...)

https://github.com/llvm/llvm-project/pull/103059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-14 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/103059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)

2024-08-14 Thread Donát Nagy via cfe-commits


@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">,
   HelpText<"Warn about buffer overflows (newer checker)">,
   Documentation;
 
-def MallocOverflowSecurityChecker : Checker<"MallocOverflow">,

NagyDonat wrote:

Now that I think about it, we already have lots of precedent for moving and 
renaming alpha checkers without prior warning, so the "updated to the new 
version and the old command line invocation doesn't work without changes" 
situation is not a problem.

The only difference is that after this change they won't be able to restore the 
behavior by using a new name for the checker, but I think that this is 
acceptable, because:
1. this checker is so unreliable that it's extremely unlikely that somebody was 
relying on it;
2. deleting a checker is a _conservative_ change: it removes the results of 
that particular checker, but it doesn't affects the results of the other 
checkers and doesn't hinder the investigation of the results by introducing 
lots of spammy results.

Based on this, I'm merging this commit now.

https://github.com/llvm/llvm-project/pull/103059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   5   6   7   8   9   10   >