[clang] 25b9696 - [analyzer] Upstream BitwiseShiftChecker
Author: Donát Nagy Date: 2023-08-18T10:47:05+02:00 New Revision: 25b9696b61e53a958e217bb3d0eab66350dc187f URL: https://github.com/llvm/llvm-project/commit/25b9696b61e53a958e217bb3d0eab66350dc187f DIFF: https://github.com/llvm/llvm-project/commit/25b9696b61e53a958e217bb3d0eab66350dc187f.diff LOG: [analyzer] Upstream BitwiseShiftChecker This commit releases a checker that was developed to a stable level in the Ericsson-internal fork of Clang Static Analyzer. Note that the functionality of this checker overlaps with core.UndefinedBinaryOperatorResult ("UBOR"), but there are several differences between them: (1) UBOR is only triggered when the constant folding performed by the Clang Static Analyzer engine determines that the value of a binary operator expression is undefined; this checker can report issues where the operands are not constants. (2) UBOR has unrelated checks for handling other binary operators, this checker only examines bitwise shifts. (3) This checker has a Pedantic flag and by default does not report expressions (e.g. -2 << 2) that're undefined by the standard but consistently supported in practice. (4) UBOR exhibits buggy behavior in code that involves cast expressions, e.g. void foo(unsigned short s) { if (s == 2) { (void) ((unsigned int) s) << 16; } } Later it would be good to eliminate this overlap (perhaps by deprecating and then eliminating the bitwise shift handling in UBOR), but in my opinion that belongs to separate commits. Differential Revision: https://reviews.llvm.org/D156312 Co-authored-by: Endre Fulop Added: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp clang/test/Analysis/bitwise-shift-common.c clang/test/Analysis/bitwise-shift-pedantic.c clang/test/Analysis/bitwise-shift-sanity-checks.c clang/test/Analysis/bitwise-shift-state-update.c Modified: clang/docs/ReleaseNotes.rst clang/docs/analyzer/checkers.rst clang/include/clang/StaticAnalyzer/Checkers/Checkers.td clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt clang/test/Analysis/analyzer-config.c clang/test/Analysis/analyzer-enabled-checkers.c clang/test/Analysis/bitwise-ops-nocrash.c clang/test/Analysis/bitwise-ops.c clang/test/Analysis/casts.c clang/test/Analysis/diagnostics/track_subexpressions.cpp clang/test/Analysis/left-shift-cxx2a.cpp clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c clang/test/Analysis/symbol-simplification-nonloc-loc.cpp Removed: diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 4571f16c37b5eb..30a8d57b34384e 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -283,6 +283,10 @@ libclang Static Analyzer --- +- Added a new checker ``core.BitwiseShift`` which reports situations where + bitwise shift operators produce undefined behavior (because some operand is + negative or too large). + .. _release-notes-sanitizers: Sanitizers diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index 98cf799ae46280..ac6919e4c98297 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -29,6 +29,56 @@ Models core language features and contains general-purpose checkers such as divi null pointer dereference, usage of uninitialized values, etc. *These checkers must be always switched on as other checker rely on them.* +.. _core-BitwiseShift: + +core.BitwiseShift (C, C++) +"" + +Finds undefined behavior caused by the bitwise left- and right-shift operator +operating on integer types. + +By default, this checker only reports situations when the right operand is +either negative or larger than the bit width of the type of the left operand; +these are logically unsound. + +Moreover, if the pedantic mode is activated by +``-analyzer-config core.BitwiseShift:Pedantic=true``, then this checker also +reports situations where the _left_ operand of a shift operator is negative or +overflow occurs during the right shift of a signed value. (Most compilers +handle these predictably, but the C standard and the C++ standards before C++20 +say that they're undefined behavior. In the C++20 standard these constructs are +well-defined, so activating pedantic mode in C++20 has no effect.) + +**Examples** + +.. code-block:: cpp + + static_assert(sizeof(int) == 4, "assuming 32-bit int") + + void basic_examples(int a, int b) { + if (b < 0) { + b = a << b; // warn: right operand is negative in left shift + } else if (b >= 32) { + b = a >> b; // warn: right shift overflows the capacity of 'int' + } + } + + int pedantic_examples(int a, int b) { + if (a < 0) { + return a >> b; // warn: left operand is negative in right shift + } + a = 1000u << 31; // OK, overflow of unsigned value is well-defined, a == 0 + if (b > 10) { + a =
[clang] 3e01403 - [analyzer] Improve underflow handling in ArrayBoundV2
Author: Donát Nagy Date: 2023-08-21T17:17:02+02:00 New Revision: 3e014038b373e5a4a96d89d46cea17e4d2456a04 URL: https://github.com/llvm/llvm-project/commit/3e014038b373e5a4a96d89d46cea17e4d2456a04 DIFF: https://github.com/llvm/llvm-project/commit/3e014038b373e5a4a96d89d46cea17e4d2456a04.diff LOG: [analyzer] Improve underflow handling in ArrayBoundV2 This minor change ensures that underflow errors are reported on memory regions that are in unknown space but have a well-defined beginning. As a concrete example, the following test case did not produce a warning previously, but will produce a warning after this patch: typedef struct { int id; char name[256]; } user_t; user_t *get_symbolic_user(void); char test_underflow_symbolic_2() { user_t *user = get_symbolic_user(); return user->name[-1]; } Differential Revision: https://reviews.llvm.org/D157104 Added: Modified: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp clang/test/Analysis/out-of-bounds.c Removed: diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index 269277aaf357f0..93da34ed5da4c5 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -172,13 +172,18 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, return; NonLoc ByteOffset = RawOffset->getByteOffset(); + const SubRegion *Reg = RawOffset->getRegion(); // CHECK LOWER BOUND - const MemSpaceRegion *SR = RawOffset->getRegion()->getMemorySpace(); - if (!llvm::isa(SR)) { -// A pointer to UnknownSpaceRegion may point to the middle of -// an allocated region. - + const MemSpaceRegion *Space = Reg->getMemorySpace(); + if (!(isa(Reg) && isa(Space))) { +// A symbolic region in unknown space represents an unknown pointer that +// may point into the middle of an array, so we don't look for underflows. +// Both conditions are significant because we want to check underflows in +// symbolic regions on the heap (which may be introduced by checkers like +// MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and +// non-symbolic regions (e.g. a field subregion of a symbolic region) in +// unknown space. auto [state_precedesLowerBound, state_withinLowerBound] = compareValueToThreshold(state, ByteOffset, svalBuilder.makeZeroArrayIndex(), svalBuilder); @@ -194,8 +199,7 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, } // CHECK UPPER BOUND - DefinedOrUnknownSVal Size = - getDynamicExtent(state, RawOffset->getRegion(), svalBuilder); + DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder); if (auto KnownSize = Size.getAs()) { auto [state_withinUpperBound, state_exceedsUpperBound] = compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder); diff --git a/clang/test/Analysis/out-of-bounds.c b/clang/test/Analysis/out-of-bounds.c index edf70085e18fae..cf8e60f66ac063 100644 --- a/clang/test/Analysis/out-of-bounds.c +++ b/clang/test/Analysis/out-of-bounds.c @@ -151,11 +151,23 @@ void test_assume_after_access(unsigned long x) { // Don't warn when indexing below the start of a symbolic region's whose // base extent we don't know. int *get_symbolic(void); -void test_index_below_symboloc(void) { +void test_underflow_symbolic(void) { int *buf = get_symbolic(); buf[-1] = 0; // no-warning; } +// But warn if we understand the internal memory layout of a symbolic region. +typedef struct { + int id; + char name[256]; +} user_t; + +user_t *get_symbolic_user(void); +char test_underflow_symbolic_2() { + user_t *user = get_symbolic_user(); + return user->name[-1]; // expected-warning{{Out of bound memory access}} +} + void test_incomplete_struct(void) { extern struct incomplete incomplete; int *p = (int *)&incomplete; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 8a5cfdf - [analyzer][NFC] Remove useless class BuiltinBug
Author: Donát Nagy Date: 2023-08-28T15:20:14+02:00 New Revision: 8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4 URL: https://github.com/llvm/llvm-project/commit/8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4 DIFF: https://github.com/llvm/llvm-project/commit/8a5cfdf7851dcdb4e16c510b133d7d0e79e43fc4.diff LOG: [analyzer][NFC] Remove useless class BuiltinBug ...because it provides no useful functionality compared to its base class `BugType`. A long time ago there were substantial differences between `BugType` and `BuiltinBug`, but they were eliminated by commit 1bd58233 in 2009 (!). Since then the only functionality provided by `BuiltinBug` was that it specified `categories::LogicError` as the bug category and it stored an extra data member `desc`. This commit sets `categories::LogicError` as the default value of the third argument (bug category) in the constructors of BugType and replaces use of the `desc` field with simpler logic. Note that `BugType` has a data member `Description` and a non-virtual method `BugType::getDescription()` which queries it; these are distinct from the member `desc` of `BuiltinBug` and the identically named method `BuiltinBug::getDescription()` which queries it. This confusing name collision was a major motivation for the elimination of `BuiltinBug`. As this commit touches many files, I avoided functional changes and left behind FIXME notes to mark minor issues that should be fixed later. Differential Revision: https://reviews.llvm.org/D158855 Added: Modified: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h clang/lib/StaticAnalyzer/Checkers/ArrayBoundChecker.cpp clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp clang/lib/StaticAnalyzer/Checkers/BoolAssignmentChecker.cpp clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp clang/lib/StaticAnalyzer/Checkers/CastSizeChecker.cpp clang/lib/StaticAnalyzer/Checkers/ChrootChecker.cpp clang/lib/StaticAnalyzer/Checkers/ConversionChecker.cpp clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp clang/lib/StaticAnalyzer/Checkers/FixedAddressChecker.cpp clang/lib/StaticAnalyzer/Checkers/NonNullParamChecker.cpp clang/lib/StaticAnalyzer/Checkers/ObjCAtSyncChecker.cpp clang/lib/StaticAnalyzer/Checkers/PointerArithChecker.cpp clang/lib/StaticAnalyzer/Checkers/PointerSubChecker.cpp clang/lib/StaticAnalyzer/Checkers/ReturnPointerRangeChecker.cpp clang/lib/StaticAnalyzer/Checkers/ReturnUndefChecker.cpp clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp clang/lib/StaticAnalyzer/Checkers/TestAfterDivZeroChecker.cpp clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp clang/lib/StaticAnalyzer/Checkers/UndefCapturedBlockVarChecker.cpp clang/lib/StaticAnalyzer/Checkers/UndefResultChecker.cpp clang/lib/StaticAnalyzer/Checkers/UndefinedArraySubscriptChecker.cpp clang/lib/StaticAnalyzer/Checkers/UndefinedAssignmentChecker.cpp clang/lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp clang/lib/StaticAnalyzer/Checkers/VforkChecker.cpp clang/lib/StaticAnalyzer/Core/BugReporter.cpp clang/unittests/StaticAnalyzer/CallEventTest.cpp clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp clang/unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp Removed: diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h index 5588811763273b..e50afd6d0da7e5 100644 --- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h +++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h @@ -35,13 +35,13 @@ class BugType { virtual void anchor(); public: - BugType(CheckerNameRef CheckerName, StringRef Name, StringRef Cat, - bool SuppressOnSink = false) - : CheckerName(CheckerName), Description(Name), Category(Cat), + BugType(CheckerNameRef CheckerName, StringRef Desc, + StringRef Cat = categories::LogicError, bool SuppressOnSink = false) + : CheckerName(CheckerName), Description(Desc), Category(Cat), Checker(nullptr), SuppressOnSink(SuppressOnSink) {} - BugType(const CheckerBase *Checker, StringRef Name, StringRef Cat, - bool SuppressOnSink = false) - : CheckerName(Checker->getCheckerName()), Description(Name), + BugType(const CheckerBase *Checker, StringRef Desc, + StringRef Cat = categories::LogicError, bool SuppressOnSink = false) + : CheckerName(Checker->getCheckerName()), Description(Desc), Category(Cat), Checker(Checker), SuppressOnSink(SuppressOnSink) {} virtual ~BugType() = default; @@ -64,27 +64,6 @@ class BugType { bool isSuppressOnSink() const { return SuppressOnSin
[clang] cec30e2 - [dataflow] Fix complie on gcc7 part 2
Author: Donát Nagy Date: 2023-06-30T16:27:36+02:00 New Revision: cec30e2b190bd58a26f54b5fd4232d3828632466 URL: https://github.com/llvm/llvm-project/commit/cec30e2b190bd58a26f54b5fd4232d3828632466 DIFF: https://github.com/llvm/llvm-project/commit/cec30e2b190bd58a26f54b5fd4232d3828632466.diff LOG: [dataflow] Fix complie on gcc7 part 2 Change c2bb68078eb9 triggered a bug in gcc7 by disabling a copy constructor. Commit 2f7d30dee826 fixed one occurrence of this issue, but after that the compilation still failed with another very similar error. This commit eliminates this second compilation error, and ensures that `ninja check-clang-analysis` can be successfully executed with gcc 7.5. Added: Modified: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h Removed: diff --git a/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h b/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h index c6e9f848477f8a..33eb42897b8a1a 100644 --- a/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h +++ b/clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h @@ -235,7 +235,7 @@ runDataflowAnalysis( std::move(State.Env)}; }); }); - return BlockStates; + return std::move(BlockStates); } /// Abstract base class for dataflow "models": reusable analysis components that ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 1d75b18 - [analyzer][NFC] Fix dangling StringRef in barely used code
Author: Donát Nagy Date: 2023-06-30T17:15:43+02:00 New Revision: 1d75b18843fbca52655e240a120b5fdeeef17c0e URL: https://github.com/llvm/llvm-project/commit/1d75b18843fbca52655e240a120b5fdeeef17c0e DIFF: https://github.com/llvm/llvm-project/commit/1d75b18843fbca52655e240a120b5fdeeef17c0e.diff LOG: [analyzer][NFC] Fix dangling StringRef in barely used code CheckerContext::getNoteTag has a shorthand version that takes a plain 'StringRef Note' instead of a lambda that calculates the note. The old implementation of this method was incorrect because it created a lambda that captured the StringRef, which was dereferenced much later, when the NoteTags were visited. In the current codebase this does not cause errors because this method is called only once, and there the `Note` argument is a string literal that remains valid. However, I tried to use this method in a checker that I was prototyping, and there it printed random memory junk (instead of the message that I composed in a local variable). Differential Revision: https://reviews.llvm.org/D153889 Added: Modified: clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h Removed: diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h index 9bd5a802d5d64e..9923c41e6ad2d1 100644 --- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h +++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h @@ -316,8 +316,8 @@ class CheckerContext { ///bug path significantly shorter. const NoteTag *getNoteTag(StringRef Note, bool IsPrunable = false) { return getNoteTag( -[Note](BugReporterContext &, - PathSensitiveBugReport &) { return std::string(Note); }, +[Note = std::string(Note)](BugReporterContext &, + PathSensitiveBugReport &) { return Note; }, IsPrunable); } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] de25473 - [analyzer] Fix comparison logic in ArrayBoundCheckerV2
Author: Donát Nagy Date: 2023-04-26T15:02:23+02:00 New Revision: de2547329b41ad6ea4ea876d12731bde5a6b64c5 URL: https://github.com/llvm/llvm-project/commit/de2547329b41ad6ea4ea876d12731bde5a6b64c5 DIFF: https://github.com/llvm/llvm-project/commit/de2547329b41ad6ea4ea876d12731bde5a6b64c5.diff LOG: [analyzer] Fix comparison logic in ArrayBoundCheckerV2 The prototype checker alpha.security.ArrayBoundV2 performs two comparisons to check that in an expression like Array[Index] 0 <= Index < length(Array) holds. These comparisons are handled by almost identical logic: the inequality is first rearranged by getSimplifiedOffsets(), then evaluated with evalBinOpNN(). However the simplification used "naive" elementary mathematical schematics, but evalBinOpNN() performed the signed -> unsigned conversions described in the C/C++ standards, and this confusion led to wildly inaccurate results: false positives from the lower bound check and false negatives from the upper bound check. This commit eliminates the code duplication by moving the comparison logic into a separate function, then adds an explicit check to this unified code path, which handles the problematic case separately. In addition to this, the commit also cleans up a testcase that was demonstrating the presence of this problem. Note that while that testcase was failing with an overflow error, its actual problem was in the underflow handler logic: (0) The testcase introduces a five-element array "char a[5]" and an unknown argument "size_t len"; then evaluates "a[len+1]". (1) The underflow check tries to determine whether "len+1 < 0" holds. (2) This inequality is rearranged to "len < -1". (3) evalBinOpNN() evaluates this with the schematics of C/C++ and converts -1 to the size_t value SIZE_MAX. (4) The engine concludes that len == SIZE_MAX, because otherwise we'd have an underflow here. (5) The overflow check tries to determine whether "len+1 >= 5". (6) This inequality is rearranged to "len >= 4". (7) The engine substitutes len == SIZE_MAX and reports that we have an overflow. Differential Revision: https://reviews.llvm.org/D135375 Added: clang/test/Analysis/array-bound-v2-constraint-check.c Modified: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp Removed: clang/test/Analysis/out-of-bounds-false-positive.c diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index 142577174ead8..a476613b6dcc7 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -57,8 +57,8 @@ class RegionRawOffsetV2 { : baseRegion(nullptr), byteOffset(UnknownVal()) {} public: - RegionRawOffsetV2(const SubRegion* base, SVal offset) -: baseRegion(base), byteOffset(offset) {} + RegionRawOffsetV2(const SubRegion *base, NonLoc offset) + : baseRegion(base), byteOffset(offset) { assert(base); } NonLoc getByteOffset() const { return byteOffset.castAs(); } const SubRegion *getRegion() const { return baseRegion; } @@ -72,19 +72,12 @@ class RegionRawOffsetV2 { }; } -static SVal computeExtentBegin(SValBuilder &svalBuilder, - const MemRegion *region) { - const MemSpaceRegion *SR = region->getMemorySpace(); - if (SR->getKind() == MemRegion::UnknownSpaceRegionKind) -return UnknownVal(); - else -return svalBuilder.makeZeroArrayIndex(); -} - // TODO: once the constraint manager is smart enough to handle non simplified // symbolic expressions remove this function. Note that this can not be used in // the constraint manager as is, since this does not handle overflows. It is // safe to assume, however, that memory offsets will not overflow. +// NOTE: callers of this function need to be aware of the effects of overflows +// and signed<->unsigned conversions! static std::pair getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent, SValBuilder &svalBuilder) { @@ -117,6 +110,38 @@ getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent, return std::pair(offset, extent); } +// Evaluate the comparison Value < Threshold with the help of the custom +// simplification algorithm defined for this checker. Return a pair of states, +// where the first one corresponds to "value below threshold" and the second +// corresponds to "value at or above threshold". Returns {nullptr, nullptr} in +// the case when the evaluation fails. +static std::pair +compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, +SValBuilder &SVB) { + if (auto ConcreteThreshold = Threshold.getAs()) { +std::tie(Value, Threshold) = getSimplifiedOffsets(Value, *ConcreteThreshold, SVB); + } + if (auto ConcreteThreshold = Threshold.getAs()) { +QualType T = Value.getType(SVB.getContex
[clang] 8c22cbe - [analyzer] ArrayBoundCheckerV2: suppress false positives from ctype macros
Author: Donát Nagy Date: 2023-05-03T18:52:27+02:00 New Revision: 8c22cbea87beb74da3dc5891c40cdf574cd5fe56 URL: https://github.com/llvm/llvm-project/commit/8c22cbea87beb74da3dc5891c40cdf574cd5fe56 DIFF: https://github.com/llvm/llvm-project/commit/8c22cbea87beb74da3dc5891c40cdf574cd5fe56.diff LOG: [analyzer] ArrayBoundCheckerV2: suppress false positives from ctype macros The checker alpha.security.ArrayBoundV2 created bug reports in situations when the (tainted) result of fgetc() or getchar() was passed to one of the isX() macros from ctype.h. This is a common input handling pattern (within the limited toolbox of the C language) and several open source projects contained code where it led to false positive reports; so this commit suppresses ArrayBoundV2 reports generated within the isX() macros. Note that here even true positive reports would be difficult to understand, as they'd refer to the implementation details of these macros. Differential Revision: https://reviews.llvm.org/D149460 Added: Modified: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp clang/test/Analysis/taint-generic.c Removed: diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index a476613b6dcc7..6d0cc505756b8 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -42,8 +42,10 @@ class ArrayBoundCheckerV2 : void reportTaintOOB(CheckerContext &C, ProgramStateRef errorState, SVal TaintedSVal) const; + static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC); + public: - void checkLocation(SVal l, bool isLoad, const Stmt*S, + void checkLocation(SVal l, bool isLoad, const Stmt *S, CheckerContext &C) const; }; @@ -155,6 +157,15 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // memory access is within the extent of the base region. Since we // have some flexibility in defining the base region, we can achieve // various levels of conservatism in our buffer overflow checking. + + // The header ctype.h (from e.g. glibc) implements the isX() macros as + // #define isX(arg) (LOOKUP_TABLE[arg] & BITMASK_FOR_X) + // and incomplete analysis of these leads to false positives. As even + // accurate reports would be confusing for the users, just disable reports + // from these macros: + if (isFromCtypeMacro(LoadS, checkerContext.getASTContext())) +return; + ProgramStateRef state = checkerContext.getState(); SValBuilder &svalBuilder = checkerContext.getSValBuilder(); @@ -267,6 +278,25 @@ void ArrayBoundCheckerV2::reportOOB(CheckerContext &checkerContext, checkerContext.emitReport(std::move(BR)); } +bool ArrayBoundCheckerV2::isFromCtypeMacro(const Stmt *S, ASTContext &ACtx) { + SourceLocation Loc = S->getBeginLoc(); + if (!Loc.isMacroID()) +return false; + + StringRef MacroName = Lexer::getImmediateMacroName( + Loc, ACtx.getSourceManager(), ACtx.getLangOpts()); + + if (MacroName.size() < 7 || MacroName[0] != 'i' || MacroName[1] != 's') +return false; + + return ((MacroName == "isalnum") || (MacroName == "isalpha") || + (MacroName == "isblank") || (MacroName == "isdigit") || + (MacroName == "isgraph") || (MacroName == "islower") || + (MacroName == "isnctrl") || (MacroName == "isprint") || + (MacroName == "ispunct") || (MacroName == "isspace") || + (MacroName == "isupper") || (MacroName == "isxdigit")); +} + #ifndef NDEBUG LLVM_DUMP_METHOD void RegionRawOffsetV2::dump() const { dumpToStream(llvm::errs()); diff --git a/clang/test/Analysis/taint-generic.c b/clang/test/Analysis/taint-generic.c index 626e01e39d158..62e1f570b6622 100644 --- a/clang/test/Analysis/taint-generic.c +++ b/clang/test/Analysis/taint-generic.c @@ -156,6 +156,28 @@ void bufferGetchar(int x) { Buffer[m] = 1; //expected-warning {{Out of bound memory access (index is tainted)}} } +extern const unsigned short int **__ctype_b_loc (void); +enum { _ISdigit = 2048 }; +# define isdigit(c) ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) _ISdigit) + +int isdigitImplFalsePositive(void) { + // If this code no longer produces a bug report, then consider removing the + // special case that disables buffer overflow reports coming from the isX + // macros in ctypes.h. + int c = getchar(); + return ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) _ISdigit); + //expected-warning@-1 {{Out of bound memory access (index is tainted)}} +} + +int isdigitSuppressed(void) { + // Same code as above, but reports are suppressed based on macro name: + int c = getchar(); + return isdigit(c); //no-warning +} + +// Some later tests use isdigit as a function, so we need to undef it: +#undef isdig
[clang] b88023c - [analyzer][NFC] Use std::optional instead of custom "empty" state
Author: Donát Nagy Date: 2023-05-04T12:56:15+02:00 New Revision: b88023c25729f4dd4548a25e5e12d6624e2adbaf URL: https://github.com/llvm/llvm-project/commit/b88023c25729f4dd4548a25e5e12d6624e2adbaf DIFF: https://github.com/llvm/llvm-project/commit/b88023c25729f4dd4548a25e5e12d6624e2adbaf.diff LOG: [analyzer][NFC] Use std::optional instead of custom "empty" state This commit eliminates the uninitialized error state from the class RegionRawOffsetV2 (which is locally used by the Clang Static Analyzer checker alpha.security.ArrayBoundV2) and replaces its use with std::optional. Motivated by https://reviews.llvm.org/D148355#inline-1437928 Moreover, the code of RegionRawOffsetV2::computeOffset() is rearranged to clarify its behavior. The helper function getValue() was eliminated by picking a better initial value for the variable Offset; two other helper functions were replaced by the lambda function Calc() because this way it doesn't need to take the "context" objects as parameters. This reorganization revealed some surprising (but not outright buggy) behavior that's marked by a FIXME and will be revisited in a separate commit. Differential Revision: https://reviews.llvm.org/D149259 Added: Modified: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp Removed: diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index 6d0cc505756b8..269277aaf357f 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -53,21 +53,17 @@ class ArrayBoundCheckerV2 : class RegionRawOffsetV2 { private: const SubRegion *baseRegion; - SVal byteOffset; - - RegionRawOffsetV2() -: baseRegion(nullptr), byteOffset(UnknownVal()) {} + NonLoc byteOffset; public: RegionRawOffsetV2(const SubRegion *base, NonLoc offset) : baseRegion(base), byteOffset(offset) { assert(base); } - NonLoc getByteOffset() const { return byteOffset.castAs(); } + NonLoc getByteOffset() const { return byteOffset; } const SubRegion *getRegion() const { return baseRegion; } - static RegionRawOffsetV2 computeOffset(ProgramStateRef state, - SValBuilder &svalBuilder, - SVal location); + static std::optional + computeOffset(ProgramStateRef State, SValBuilder &SVB, SVal Location); void dump() const; void dumpToStream(raw_ostream &os) const; @@ -169,16 +165,16 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, ProgramStateRef state = checkerContext.getState(); SValBuilder &svalBuilder = checkerContext.getSValBuilder(); - const RegionRawOffsetV2 &rawOffset = -RegionRawOffsetV2::computeOffset(state, svalBuilder, location); + const std::optional &RawOffset = + RegionRawOffsetV2::computeOffset(state, svalBuilder, location); - if (!rawOffset.getRegion()) + if (!RawOffset) return; - NonLoc ByteOffset = rawOffset.getByteOffset(); + NonLoc ByteOffset = RawOffset->getByteOffset(); // CHECK LOWER BOUND - const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace(); + const MemSpaceRegion *SR = RawOffset->getRegion()->getMemorySpace(); if (!llvm::isa(SR)) { // A pointer to UnknownSpaceRegion may point to the middle of // an allocated region. @@ -199,7 +195,7 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // CHECK UPPER BOUND DefinedOrUnknownSVal Size = - getDynamicExtent(state, rawOffset.getRegion(), svalBuilder); + getDynamicExtent(state, RawOffset->getRegion(), svalBuilder); if (auto KnownSize = Size.getAs()) { auto [state_withinUpperBound, state_exceedsUpperBound] = compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder); @@ -307,84 +303,55 @@ void RegionRawOffsetV2::dumpToStream(raw_ostream &os) const { } #endif -// Lazily computes a value to be used by 'computeOffset'. If 'val' -// is unknown or undefined, we lazily substitute '0'. Otherwise, -// return 'val'. -static inline SVal getValue(SVal val, SValBuilder &svalBuilder) { - return val.isUndef() ? svalBuilder.makeZeroArrayIndex() : val; -} - -// Scale a base value by a scaling factor, and return the scaled -// value as an SVal. Used by 'computeOffset'. -static inline SVal scaleValue(ProgramStateRef state, - NonLoc baseVal, CharUnits scaling, - SValBuilder &sb) { - return sb.evalBinOpNN(state, BO_Mul, baseVal, -sb.makeArrayIndex(scaling.getQuantity()), -sb.getArrayIndexType()); -} - -// Add an SVal to another, treating unknown and undefined values as -// summing to UnknownVal. Used by 'computeOffset'. -static SVal addValue(ProgramStateRef state, SVal x, SVal y, -
[clang] b16a593 - [analyzer][NFC] Pass the diagnostic message to the TrackConstraintBRVisitor
Author: Endre Fulop Date: 2023-06-06T16:28:31+02:00 New Revision: b16a59328fc5120006aeac501637229cc7e30357 URL: https://github.com/llvm/llvm-project/commit/b16a59328fc5120006aeac501637229cc7e30357 DIFF: https://github.com/llvm/llvm-project/commit/b16a59328fc5120006aeac501637229cc7e30357.diff LOG: [analyzer][NFC] Pass the diagnostic message to the TrackConstraintBRVisitor The `TrackConstraintBRVisitor` should accept a message for the note instead of creating one. It would let us inject domain-specific knowledge in a non-intrusive way, leading to a more generic visitor. Differential Revision: https://reviews.llvm.org/D152255 Added: Modified: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp Removed: diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h index 9c81e02245ecd..d9b3d9352d322 100644 --- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h +++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h @@ -392,19 +392,19 @@ const Expr *getDerefExpr(const Stmt *S); } // namespace bugreporter class TrackConstraintBRVisitor final : public BugReporterVisitor { - DefinedSVal Constraint; - bool Assumption; + const SmallString<64> Message; + const DefinedSVal Constraint; + const bool Assumption; bool IsSatisfied = false; - bool IsZeroCheck; /// We should start tracking from the last node along the path in which the /// value is constrained. bool IsTrackingTurnedOn = false; public: - TrackConstraintBRVisitor(DefinedSVal constraint, bool assumption) - : Constraint(constraint), Assumption(assumption), -IsZeroCheck(!Assumption && isa(Constraint)) {} + TrackConstraintBRVisitor(DefinedSVal constraint, bool assumption, + StringRef Message) + : Message(Message), Constraint(constraint), Assumption(assumption) {} void Profile(llvm::FoldingSetNodeID &ID) const override; @@ -417,6 +417,9 @@ class TrackConstraintBRVisitor final : public BugReporterVisitor { PathSensitiveBugReport &BR) override; private: + /// Checks if the constraint refers to a null-location. + bool isZeroCheck() const; + /// Checks if the constraint is valid in the current state. bool isUnderconstrained(const ExplodedNode *N) const; }; diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp index 2b461acf9a731..42d03f67510cf 100644 --- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp +++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp @@ -1786,6 +1786,7 @@ PathDiagnosticPieceRef StoreSiteFinder::VisitNode(const ExplodedNode *Succ, void TrackConstraintBRVisitor::Profile(llvm::FoldingSetNodeID &ID) const { static int tag = 0; ID.AddPointer(&tag); + ID.AddString(Message); ID.AddBoolean(Assumption); ID.Add(Constraint); } @@ -1796,8 +1797,12 @@ const char *TrackConstraintBRVisitor::getTag() { return "TrackConstraintBRVisitor"; } +bool TrackConstraintBRVisitor::isZeroCheck() const { + return !Assumption && Constraint.getAs(); +} + bool TrackConstraintBRVisitor::isUnderconstrained(const ExplodedNode *N) const { - if (IsZeroCheck) + if (isZeroCheck()) return N->getState()->isNull(Constraint).isUnderconstrained(); return (bool)N->getState()->assume(Constraint, !Assumption); } @@ -1827,19 +1832,6 @@ PathDiagnosticPieceRef TrackConstraintBRVisitor::VisitNode( // the transition point. assert(!isUnderconstrained(N)); -// We found the transition point for the constraint. We now need to -// pretty-print the constraint. (work-in-progress) -SmallString<64> sbuf; -llvm::raw_svector_ostream os(sbuf); - -if (isa(Constraint)) { - os << "Assuming pointer value is "; - os << (Assumption ? "non-null" : "null"); -} - -if (os.str().empty()) - return nullptr; - // Construct a new PathDiagnosticPiece. ProgramPoint P = N->getLocation(); @@ -1854,7 +1846,7 @@ PathDiagnosticPieceRef TrackConstraintBRVisitor::VisitNode( if (!L.isValid()) return nullptr; -auto X = std::make_shared(L, os.str()); +auto X = std::make_shared(L, Message); X->setTag(getTag()); return std::move(X); } @@ -2366,8 +2358,9 @@ class InterestingLValueHandler final : public ExpressionHandler { // null. if (V.getAsLocSymbol(/*IncludeBaseRegions=*/true)) if (LVState->isNull(V).isConstrainedTrue()) - Report.addVisitor(V.castAs(), -false); +Report.addVisitor( +V.castAs(), +/*Assumption=*/fals
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat requested changes to this pull request. Unfortunately this PR is not a full solution, because e.g. the following test code still triggers the crash (if it is appended to the test file `stream.c`): ```c struct zerosized { int foo[0]; }; void fread_zerosized(struct zerosized *buffer) { FILE *f = fopen("/tmp/foo.txt", "r"); fread(buffer, 1, 1, f); fclose(f); } ``` (I know that zero-sized arrays are not standard-compliant, but they are a widespread extension: e.g. both clang and gcc accept this `struct zerosized` with the default settings.) Overall, I'd say that it's futile to try to recognize zero-sized types with a "canonical type equal to" check, so you should just check whether `ElemSizeInChars` is zero and do something based on that. (Either an early return, or you can say `ElemSizeInChars = 1` at that point if you think that that's the logically correct solution.) ``This way you could also avoid the immediately invoked lambda in `getPointeeType` which is really ugly in my opinion.`` https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
NagyDonat wrote: > Even protobuf contains this type of code: > https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=protobuf_v3.13.0_pointersub1&is-unique=on&diff-type=New&checker-name=alpha.core.PointerSub&report-id=5545776&report-hash=1bcd310fbaeccbcc13645b9b277239a2&report-filepath=%2adescriptor.pb.cc I still think that this (1) is undeniably undefined behavior (2) isn't common, so won't cause "spam" problems and (3( can be replaced by standard-compliant code (`offsetof`) so there is no need to introduce a special case for it. https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
NagyDonat wrote: _I only noticed that this PR was already merged after posting the review. There is no need to revert the commit -- it's better than nothing -- but I'd be happy if you created a followup change that also handles the testcase that I mentioned._ https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat approved this pull request. LGTM, straightforward change. Thanks! https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
@@ -21,30 +21,55 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h" using namespace clang; using namespace ento; namespace { -class MmapWriteExecChecker : public Checker { +class MmapWriteExecChecker +: public Checker { CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6}; CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3}; - static int ProtWrite; - static int ProtExec; - static int ProtRead; const BugType BT{this, "W^X check fails, Write Exec prot flags set", "Security"}; + mutable bool FlagsInitialized = false; + mutable int ProtRead = 0x01; + mutable int ProtWrite = 0x02; + mutable int ProtExec = 0x04; + public: + void checkBeginFunction(CheckerContext &C) const; void checkPreCall(const CallEvent &Call, CheckerContext &C) const; + int ProtExecOv; int ProtReadOv; }; } -int MmapWriteExecChecker::ProtWrite = 0x02; -int MmapWriteExecChecker::ProtExec = 0x04; -int MmapWriteExecChecker::ProtRead = 0x01; +void MmapWriteExecChecker::checkBeginFunction(CheckerContext &C) const { + if (FlagsInitialized) +return; + + FlagsInitialized = true; + + const std::optional FoundProtRead = + tryExpandAsInteger("PROT_READ", C.getPreprocessor()); + const std::optional FoundProtWrite = + tryExpandAsInteger("PROT_WRITE", C.getPreprocessor()); + const std::optional FoundProtExec = + tryExpandAsInteger("PROT_EXEC", C.getPreprocessor()); + if (FoundProtRead && FoundProtWrite && FoundProtExec) { +ProtRead = *FoundProtRead; +ProtWrite = *FoundProtWrite; +ProtExec = *FoundProtExec; + } else { +// FIXME: Are these useful? NagyDonat wrote: They are not too useful (and it's a little bit surprising that there is no "ProtWriteOv"), but they're harmless and they don't require complex code, so let's keep them for now. https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)
https://github.com/NagyDonat approved this pull request. Let's merge this again, after the Z3 version there shouldn't be any additional problems. https://github.com/llvm/llvm-project/pull/97265 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97265 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
@@ -21,30 +21,55 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h" using namespace clang; using namespace ento; namespace { -class MmapWriteExecChecker : public Checker { +class MmapWriteExecChecker +: public Checker { CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6}; CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3}; - static int ProtWrite; - static int ProtExec; - static int ProtRead; const BugType BT{this, "W^X check fails, Write Exec prot flags set", "Security"}; + mutable bool FlagsInitialized = false; + mutable int ProtRead = 0x01; + mutable int ProtWrite = 0x02; + mutable int ProtExec = 0x04; + public: + void checkBeginFunction(CheckerContext &C) const; void checkPreCall(const CallEvent &Call, CheckerContext &C) const; + int ProtExecOv; int ProtReadOv; }; } -int MmapWriteExecChecker::ProtWrite = 0x02; -int MmapWriteExecChecker::ProtExec = 0x04; -int MmapWriteExecChecker::ProtRead = 0x01; +void MmapWriteExecChecker::checkBeginFunction(CheckerContext &C) const { + if (FlagsInitialized) +return; + + FlagsInitialized = true; + + const std::optional FoundProtRead = + tryExpandAsInteger("PROT_READ", C.getPreprocessor()); + const std::optional FoundProtWrite = + tryExpandAsInteger("PROT_WRITE", C.getPreprocessor()); + const std::optional FoundProtExec = + tryExpandAsInteger("PROT_EXEC", C.getPreprocessor()); + if (FoundProtRead && FoundProtWrite && FoundProtExec) { +ProtRead = *FoundProtRead; +ProtWrite = *FoundProtWrite; +ProtExec = *FoundProtExec; + } else { +// FIXME: Are these useful? +ProtRead = ProtReadOv; +ProtExec = ProtExecOv; NagyDonat wrote: This code fragment lets the user specify the value of the `PROT_READ` and `PROT_EXEC` macros as checker options: ```cpp Mwec->ProtExecOv = mgr.getAnalyzerOptions() .getCheckerIntegerOption(Mwec, "MmapProtExec"); Mwec->ProtReadOv = mgr.getAnalyzerOptions() .getCheckerIntegerOption(Mwec, "MmapProtRead"); ``` This is a very marginal feature and I wouldn't spend time on implementing it; but now that it's already implemented I'm not sure that we should delete it. https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reland "[analyzer] Harden safeguards for Z3 query times" (PR #97298)
https://github.com/NagyDonat approved this pull request. This should be re-landed as well; when it was merged earlier the only issue was the incompatibility with old Z3 and that was addressed since then (by bumping the version requirement). https://github.com/llvm/llvm-project/pull/97298 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat approved this pull request. Thanks for the updates! I added few minor comments, but the PR is already good enough to be merged. > > I only noticed that this PR was already merged after posting the review. > > There is no need to revert the commit -- it's better than nothing -- but > > I'd be happy if you created a followup change that also handles the > > testcase that I mentioned. > > I'm not sure what you refer to. This PR is not approved, hence not merged. > Please continue with the review. Oops, sorry -- I was probably confused by the purple "merged" icon in  ...but the real reason is that I got up at 4 AM today :sleeping: https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, const CallEvent &Call, C.addTransition(State); } -static std::optional getPointeeType(const MemRegion *R) { +static QualType getPointeeType(const MemRegion *R) { if (!R) -return std::nullopt; +return {}; NagyDonat wrote: It's a bit strange that here you return `{}`, while at the end of the function `QualType{}` is explicitly specified. https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, const CallEvent &Call, C.addTransition(State); } -static std::optional getPointeeType(const MemRegion *R) { +static QualType getPointeeType(const MemRegion *R) { if (!R) -return std::nullopt; +return {}; if (const auto *ER = dyn_cast(R)) -return ER->getElementType(); +return ER->getElementType()->getCanonicalTypeUnqualified(); NagyDonat wrote: Consider moving the `->getCanonicalTypeUnqualified()` calls into `escapeByStartIndexAndCount()` because that's the only place where it will be relevant. (This would eliminate the code duplication.) https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,37 @@ +Obtaining the Static Analyzer += + +This page describes how to download and install the analyzer. Once the analyzer is installed, follow the asdf :doc:`CommandLineUsage` on using the commandline to get started analyzing your code. NagyDonat wrote: Nice catch! :laughing: https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/95550 From 06adc063c2388ea534537f5a417751fdf64b22cd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 14 Jun 2024 15:16:34 +0200 Subject: [PATCH 1/4] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression ...becasue they were strangely worded and in a few cases outright incorrect. --- .../bugprone/SizeofExpressionCheck.cpp| 22 ++--- .../checkers/bugprone/sizeof-expression-2.c | 12 +-- .../sizeof-expression-any-pointer.cpp | 86 +-- ...on-warn-on-sizeof-pointer-to-aggregate.cpp | 4 +- .../checkers/bugprone/sizeof-expression.cpp | 84 +- 5 files changed, 105 insertions(+), 103 deletions(-) diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp index c25ee42d0899a..3afa0e5c39882 100644 --- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp +++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp @@ -296,7 +296,7 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult &Result) { } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-integer-call")) { diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression " - "that results in an integer") + "of integer type") << E->getSourceRange(); } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-this")) { diag(E->getBeginLoc(), @@ -314,7 +314,7 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult &Result) { << E->getSourceRange(); } else { diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression " - "that results in a pointer") + "of pointer type") << E->getSourceRange(); } } else if (const auto *E = Result.Nodes.getNodeAs( @@ -342,31 +342,33 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult &Result) { if (DenominatorSize > CharUnits::Zero() && !NumeratorSize.isMultipleOf(DenominatorSize)) { - diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)';" + diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)':" " numerator is not a multiple of denominator") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (ElementSize > CharUnits::Zero() && DenominatorSize > CharUnits::Zero() && ElementSize != DenominatorSize) { - diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)';" -" numerator is not a multiple of denominator") + diag(E->getOperatorLoc(), + "suspicious usage of 'sizeof(array)/sizeof(...)':" + " denominator differs from the size of array elements") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (NumTy && DenomTy && NumTy == DenomTy) { - // FIXME: This message is wrong, it should not refer to sizeof "pointer" - // usage (and by the way, it would be to clarify all the messages). diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions " + "have the same type") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (!WarnOnSizeOfPointer) { // When 'WarnOnSizeOfPointer' is enabled, these messages become redundant: if (PointedTy && DenomTy && PointedTy == DenomTy) { diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': size of pointer " + "is divided by size of pointed type") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (NumTy && DenomTy && NumTy->isPointerType() && DenomTy->isPointerType()) { diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions " + "have pointer types") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } } diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c index aef930f2c8fda..b898071a56613 100644 --- a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c +++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c @@ -34,24 +34,24 @@ int Test5() { int sum = 0; sum +=
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
NagyDonat wrote: I mentioned this change in the paragraph that was describing my earlier commit (that was also modifying this check). https://github.com/llvm/llvm-project/pull/95550 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/95550 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate checkers-related docs from HTML to RST (PR #97032)
NagyDonat wrote: :thinking: Deleting the html files could break some links on external sites, so I think it would be better to replace them with a very simple "This content was moved to " placeholder. https://github.com/llvm/llvm-project/pull/97032 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,238 @@ +Command-Line Usage: CodeChecker and scan-build +=== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the commandline. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. NagyDonat wrote: It's still strange that there is only one underscore which is _after_ the word 'CodeChecker'. I'd guess that the intent was to italicize 'CodeChecker' and we should add another underscore _before_ this word for the sake of clarity (even if the actual HTML generation can work with this half-broken situation. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)
https://github.com/NagyDonat approved this pull request. LGTM, with the disclaimer that I'd still prefer e.g. "second argument instead of "2nd argument" -- but I won't block the review with this bikeshedding. https://github.com/llvm/llvm-project/pull/95408 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,238 @@ +Command-Line Usage: CodeChecker and scan-build +=== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the commandline. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. NagyDonat wrote: Oh, that's surprising -- I was not familiar with that feature of RST. Naturally this invalidates my concern. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat commented: I'm really happy to see that this duplicated documentation is finally cleaned up after so many years :partying_face: I added a few minor remarks in inline comments, but the change looks good overall. As we discussed privately, this change should be (mostly) limited to the format changes, and then separate follow-up commit should eliminate the outdated content (and, perhaps, add new information if that's necessary). I'll update the PR description to emphasize this. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,239 @@ +Command Line Usage: scan-build and CodeChecker +== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. NagyDonat wrote: Is this still true? (Even if it's obsolete, we should probably leave it for a follow-up commit.) https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,239 @@ +Command Line Usage: scan-build and CodeChecker +== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the command line. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. scan-build_ is an old and simple command line tool that emits static analyzer warnings as HTML files while compiling your project. You can view the analysis results in your web browser. +- Useful for individual developers who simply want to view static analysis results at their desk, or in a very simple collaborative environment. +- Works on all major platforms (Windows, Linux, macOS) and is available as a package in many Linux distributions. +- Does not include support for cross-translation-unit analysis. + +2. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. +- Perfect for managing large amounts of Static Analyzer warnings in a collaborative environment. +- Generally much more feature-rich than scan-build. +- Supports incremental analysis: Results can be stored in a database, subsequent analysis runs can be compared to list the newly added defects. +- :doc:`CrossTranslationUnit` is supported fully on Linux via CodeChecker. +- Can run clang-tidy checkers too. +- Open source, but out-of-tree, i.e. not part of the LLVM project. + +scan-build +-- + +**scan-build** is a command line utility that enables a user to run the static analyzer over their codebase as part of performing a regular build (from the command line). NagyDonat wrote: The use of "the Static Analyzer" vs "the static analyzer" is inconsistent -- it would be good to pick one of them and use it consistently. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
https://github.com/NagyDonat commented: I'm really happy that you decided to document these :smile: https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: Is it truly a _copy_ of the store object? This seems to contradict the "is a very cheap operation" line in the next paragraph...? Or is it a _copied store object_ that _shares the same bulk allocations_ with the original one (but differs in minor details?) https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown NagyDonat wrote: ```suggestion /// being referred to as "concrete", as many fields inside it would be unknown ``` https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: Perhaps we should mention that `Store` is just a typedef for `const void *`... https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: I do understand functional programming (I had a phase when I learnt Haskell and for a few years I thought that it's the absolutely perfect language :star_struck: -- since then I'm more realistic but I still like it), I was just confused by the fact that this line emphasizes "a copy of the whole Store object" as opposed to the "reference to" in the previous line. In a functional language the word "copy" is practically non-existent (we speak about values, not objects and their identity, so there is no reason to say that this is _a copy of_ e.g. that list -- we simply say that "this is that list"), while in C++ saying "copy" instead of "move" or "pointer/reference" is an important synonym of "we allocate lots of new memory", so I'd still prefer tweaking this line. https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in +/// which it was created. NagyDonat wrote: ```suggestion /// * a reference to the whole Store object, obtained from the ProgramState in ///which the nonloc::LazyCompoundVal was created. /// /// Note that the old ProgramState and its Store is kept alive during the /// analysis because these are immutable functional data structures and each new /// Store value is represented as "earlier Store" + "additional binding". ``` What would you think about this phrasing? https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] [MallocChecker] suspect all release functions as candidate for suppression (PR #104599)
https://github.com/NagyDonat approved this pull request. I'm satisfied with the current state of this commit, but let's wait a few days for a review from @Szelethus (or anybody else interested). https://github.com/llvm/llvm-project/pull/104599 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
https://github.com/NagyDonat requested changes to this pull request. Unfortunately I found several fundamental issues within the implementation of `extractActualValueFrom()` -- see inline comments for details. Moreover I'm not convinced that this "replace `LazyCompoundVal`s with actual values" logic is the right approach for your goals. When we discussed this yesterday, you told me that this logic was intended to handle the situations where a `LazyCompoundVal` represents a non-aggregate type (like an integer) -- but I think that we should catch that kind of non-canonical representation at an earlier point (e.g. within some `getSVal` method). Hiding an integer behind this kind of lazy wrapper is a serious defect of our store model which inhibits many operations -- not just `evalBinOp`. On the other hand, this "replace `LazyCompoundVal`s with a better representation" approach could be a good idea in the cases when the `LazyCompoundVal` represents an aggregate. However in this case I don't think that you can get a non-lazy representation by simply selecting the right binding (either via `iterBinding`s or via a direct `getBinding` call) -- you will need to find or write an explicit "convert this `LazyCompoundVal` into a non-lazy `CompoundVal`" function. https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc, llvm_unreachable("Unexpected unary operator"); } +namespace { +/// Iterate through to store to find the actual value this LazyCompoundVal +/// corresponds to. Further reading is in LazyCompoundVal's docs. +struct LazyHandler : public StoreManager::BindingsHandler { + nonloc::LazyCompoundVal l; + std::optional &Ret; + + LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret) + : l(l), Ret(Ret) {} + + virtual bool HandleBinding(StoreManager &SMgr, Store Store, + const MemRegion *Region, SVal Val) override { +if (const MemRegion *MR = Val.getAsRegion()) { + if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) { +Ret = Val; +return false; + } +} +return true; + } +}; +} // namespace + +/// Find the actual value behind a LazyCompoundVal. May return none if the +/// query fails, or only finds another LazyCompoundVal (e.g. circular +/// reference). +static std::optional extractActualValueFrom(ProgramStateRef State, + nonloc::LazyCompoundVal L) { + std::optional Ret; + LazyHandler handler{L, Ret}; + State->getStateManager().getStoreManager().iterBindings(State->getStore(), + handler); + if (Ret) { +std::optional RVal = Ret = State->getSVal(Ret->getAsRegion()); +// If our best efforts lead us back to another LazyCompoundValue, give up. +if (Ret == RVal) + return {}; NagyDonat wrote: Something is fishy here -- after `RVal = Ret` the condition `Ret == RVal` should always be true. https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc, llvm_unreachable("Unexpected unary operator"); } +namespace { +/// Iterate through to store to find the actual value this LazyCompoundVal +/// corresponds to. Further reading is in LazyCompoundVal's docs. +struct LazyHandler : public StoreManager::BindingsHandler { + nonloc::LazyCompoundVal l; + std::optional &Ret; + + LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret) + : l(l), Ret(Ret) {} + + virtual bool HandleBinding(StoreManager &SMgr, Store Store, + const MemRegion *Region, SVal Val) override { +if (const MemRegion *MR = Val.getAsRegion()) { + if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) { +Ret = Val; +return false; + } +} +return true; + } +}; +} // namespace + +/// Find the actual value behind a LazyCompoundVal. May return none if the +/// query fails, or only finds another LazyCompoundVal (e.g. circular +/// reference). +static std::optional extractActualValueFrom(ProgramStateRef State, + nonloc::LazyCompoundVal L) { + std::optional Ret; + LazyHandler handler{L, Ret}; + State->getStateManager().getStoreManager().iterBindings(State->getStore(), NagyDonat wrote: I checked the implementation of `iterBindings` and the other methods of `StoreManager` and I still think that `iterBindings` is not the right tool for this job and you should use `SVal StoreManager::getBinding(Store S, Loc L, QualType T)` instead. I know that the documentation of `LazyCompoundVal` suggests using `iterBindings` and that those docs were blessed by several reviewers (including me), but that doesn't change the fact that there are other more practical methods _and_ `iterBindings` doesn't always provide a complete answer, because it only iterates over _direct_ bindings and skips the so-called default bindings. https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc, llvm_unreachable("Unexpected unary operator"); } +namespace { +/// Iterate through to store to find the actual value this LazyCompoundVal +/// corresponds to. Further reading is in LazyCompoundVal's docs. +struct LazyHandler : public StoreManager::BindingsHandler { + nonloc::LazyCompoundVal l; + std::optional &Ret; + + LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret) + : l(l), Ret(Ret) {} + + virtual bool HandleBinding(StoreManager &SMgr, Store Store, + const MemRegion *Region, SVal Val) override { +if (const MemRegion *MR = Val.getAsRegion()) { + if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) { +Ret = Val; +return false; + } +} +return true; + } +}; +} // namespace + +/// Find the actual value behind a LazyCompoundVal. May return none if the +/// query fails, or only finds another LazyCompoundVal (e.g. circular +/// reference). +static std::optional extractActualValueFrom(ProgramStateRef State, + nonloc::LazyCompoundVal L) { + std::optional Ret; + LazyHandler handler{L, Ret}; + State->getStateManager().getStoreManager().iterBindings(State->getStore(), NagyDonat wrote: Instead of `State->getStore()` (the current state of the memory) you must use `L->getStore()` (the old memory snapshot which defines the value represented by the LazyCompoundVal `L`). https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc, llvm_unreachable("Unexpected unary operator"); } +namespace { +/// Iterate through to store to find the actual value this LazyCompoundVal +/// corresponds to. Further reading is in LazyCompoundVal's docs. +struct LazyHandler : public StoreManager::BindingsHandler { + nonloc::LazyCompoundVal l; + std::optional &Ret; + + LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret) + : l(l), Ret(Ret) {} + + virtual bool HandleBinding(StoreManager &SMgr, Store Store, + const MemRegion *Region, SVal Val) override { +if (const MemRegion *MR = Val.getAsRegion()) { + if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) { +Ret = Val; +return false; + } +} +return true; + } +}; +} // namespace + +/// Find the actual value behind a LazyCompoundVal. May return none if the +/// query fails, or only finds another LazyCompoundVal (e.g. circular +/// reference). +static std::optional extractActualValueFrom(ProgramStateRef State, + nonloc::LazyCompoundVal L) { + std::optional Ret; + LazyHandler handler{L, Ret}; + State->getStateManager().getStoreManager().iterBindings(State->getStore(), + handler); + if (Ret) { +std::optional RVal = Ret = State->getSVal(Ret->getAsRegion()); NagyDonat wrote: Why are you calling `State->getSval()` within this function? This method queries the _value currently stored within that region_ which is as far as I see completely irrelevant for the goals of this function. https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
@@ -490,6 +491,47 @@ SVal SValBuilder::evalUnaryOp(ProgramStateRef state, UnaryOperator::Opcode opc, llvm_unreachable("Unexpected unary operator"); } +namespace { +/// Iterate through to store to find the actual value this LazyCompoundVal +/// corresponds to. Further reading is in LazyCompoundVal's docs. +struct LazyHandler : public StoreManager::BindingsHandler { + nonloc::LazyCompoundVal l; + std::optional &Ret; + + LazyHandler(nonloc::LazyCompoundVal l, std::optional &Ret) + : l(l), Ret(Ret) {} + + virtual bool HandleBinding(StoreManager &SMgr, Store Store, + const MemRegion *Region, SVal Val) override { +if (const MemRegion *MR = Val.getAsRegion()) { + if (MR->isSubRegionOf(l.getRegion()) && MR != l.getAsRegion()) { NagyDonat wrote: Instead of introducing this `MR`, you should check `Region->isSubRegionOf(l.getRegion())` or something similar! When `iterBindings()` calls this callback, it will pass `[Region, value bound to that region]` pairs (as the third and fourth arguments), and if you want to only consider the values stored within the region `l.getRegion()`, you need to check that the key (i.e. `Region` i.e. the third argument) is a subregion of the region whose contents were "frozen" as a `LazyCompoundVal`. Your current code looks for values _stored anywhere within the frozen old state_ that are pointer-typed and _point to a subregion of `l.getRegion()`_. https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Implement binary operations on LazyCompoundVals (PR #106982)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/106982 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)
https://github.com/NagyDonat commented: I investigated this situation and I found that this crash is not limited to empty source files -- I'd guess that the analyzer would crash on any input if it's executed as ``` // RUN: %clang_analyze_cc1 -w -analyzer-checker=nullability \ // RUN: -analyzer-output=text -verify %s ``` The real reason why this crash is rare is that two core checkers (`core.NullDereference` and `core.NonNullParamChecker`) are derived from `EventDispatcher` so the assertion is not triggered in the "normal" case when the core checkers are enabled. Note that the documentation of core checkers says that _"These checkers must be always switched on as other checker rely on them."_; however we should still eliminate this assertion failure because it's ugly. Registering `NullabilityChecker` as an `EventDispatcher` (which happens to never dispatch any events) definitely works, but I think it would be more elegant to simply remove the assertion that caused the crash. Registering a handler for an event which cannot be emitted (under the current unusual config) is not an error, it should not trigger an assertion failure. https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)
@@ -0,0 +1,10 @@ +// RUN: %clang_analyze_cc1 -w -analyzer-checker=nullability \ +// RUN: -analyzer-output=text -verify %s +// +// expected-no-diagnostics +// +// This case previously crashed because of an assert in CheckerManager.cpp, +// checking for registered event dispatchers. This check is too strict so +// was removed by this commit. This test case covers the previous crash, +// and is expected to simply not crash. The source file can be anything, +// and does not need to be empty. NagyDonat wrote: ```suggestion // Previously there was an assertion requiring that if an Event is handled by // some enabled checker, then there must be at least one enabled checker which // can emit that kind of Event. // This assertion failed when NullabilityChecker (which is a subclass of // check::Event) was enabled, but the checkers // inheriting from EventDispatcher were all disabled. // This test file validates that enabling the nullability checkers (without any // other checkers) no longer causes a crash. ``` I rewrote this explanation to make it (hopefully) more useful for someone who encounters this test file without any context. https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)
https://github.com/NagyDonat commented: Hmm, I see that this assertion was hidden behind an `#ifndef NDEBUG` so it is only active within debug builds of the analyzer. Nevertheless, I still think that it's better to remove it, because 1. it does not offer significant protection against introducing bugs (it's unlikely that somebody would develop an event handler without writing a corresponding event dispatcher); 2. it can cause trouble in test files where it's often good to enable a very minimal set of checkers. https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Prevent crash due to missing EventDispatch in corner case (PR #107294)
@@ -48,15 +48,7 @@ bool CheckerManager::hasPathSensitiveCheckers() const { EvalCallCheckers, EndOfTranslationUnitCheckers); } -void CheckerManager::finishedCheckerRegistration() { -#ifndef NDEBUG - // Make sure that for every event that has listeners, there is at least - // one dispatcher registered for it. - for (const auto &Event : Events) -assert(Event.second.HasDispatcher && - "No dispatcher registered for an event"); -#endif -} +void CheckerManager::finishedCheckerRegistration() {} NagyDonat wrote: Let's just remove this method altogether -- there is no reason to keep it. (If somebody needs some "run this after checker registration" logic in the future, they can easily re-add something like this.) https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)
NagyDonat wrote: @vabridgers I rewrote the title and description of this PR to describe the current approach. Feel free to adjust (or partially restore) it if you'd prefer a different phrasing. https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model constructor initializer for an array member (PR #107537)
NagyDonat wrote: > LGTM. FYI "modelled" should contain only 1 "l" if I'm not mistaken. British English uses "modeled", while American uses "modelled". I don't know which is preferred in LLVM. https://github.com/llvm/llvm-project/pull/107537 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Remove overzealous "No dispatcher registered" assertion (PR #107294)
https://github.com/NagyDonat approved this pull request. Looks good to me :) https://github.com/llvm/llvm-project/pull/107294 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Fix #embed crash (PR #107764)
https://github.com/NagyDonat commented: LGTM, there is no reason to crash on `#embed`. I think it would be nice to have a few testcases that show the behavior of the analyzer around `#embed`: - Can we produce bug reports if there is an (unrelated) `#embed` expression on the execution path? Or do we abort the analysis there? - If the analysis is not aborted, then how do we represent the contents of a memory region that's initialized by `#embed`? Is it `UnknownVal`? Or `UndefinedVal` (that may be problematic)? https://github.com/llvm/llvm-project/pull/107764 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)
NagyDonat wrote: To me this solution seems to be a bit hacky -- I don't like that we need to scatter "handle `void *` as if it was `char *`" special cases in various parts of the analyzer (I vaguely recall that I have also seen similar hacks elsewhere). I'd prefer solutions that are as generic as possible and ensure that `void *` is consistently handled like `char *` if that's what we want. (By the way, will the analyzer be able to use this `UnknownVal` bound to the `void *` pointer? If not, then just avoid the binding.) These are not blocking issues, feel free to merge this commit to fix the crash (instead of waiting for a theoretically better solution). https://github.com/llvm/llvm-project/pull/107572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve handling of unsigned values in ArrayBoundCheckerV2 (PR #81034)
NagyDonat wrote: These reports are definitely FPs caused by buggy number handling, so it would be good to suppress them. I was planning to rewrite both `alpha.security.ReturnPtrRange` and `alpha.unix.cstring.OutOfBounds` to rely on the "backend" prototyped within ArrayBoundV2 instead of the current logic (which is AFAIK equivalent to ArrayBound V1). In fact, `alpha.security.ReturnPtrRange` is so close to array bounds checking, that probably it would be good to implement it as a subchecker within the `ArrayBoundCheckerV2` checker class (which can be enabled/disabled independently, but uses the same infrastructure). However, I was planning to do these improvements after bringing `ArrayBoundV2` out of the alpha state, and unfortunately that's severely delayed because I'm bogged down with the loop handling improvements. If you want to suppress these reports, it would be very nice if you could do it by refactoring these checkers to reuse the bounds checking logic that's defined within ArrayBoundsV2. You could also tweak the current implementation of these checkers, but I'll probably throw away those changes when I'll (hopefully) switch to using the logic of ArrayBoundV2. https://github.com/llvm/llvm-project/pull/81034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/107572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)
https://github.com/NagyDonat approved this pull request. Bikeshedding: let's delete that empty line. Otherwise LGTM. https://github.com/llvm/llvm-project/pull/107572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] fix crash on binding to symbolic region with `void *` type (PR #107572)
@@ -3,6 +3,9 @@ int clang_analyzer_eval(int); NagyDonat wrote: ```suggestion ``` https://github.com/llvm/llvm-project/pull/107572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
https://github.com/NagyDonat commented: In the PR/commit message you write that > At least theoretically the array bounds checker (when finalized) should find > the same cases that were detected by the PointerSubChecker. but I'm pretty sure that the array bound checker already does find all these cases (and much more complex cases as well). (The only remaining reason why ArrayBoundV2 is in alpha is that it emits too many false positives, because engine issues like the inaccurate modeling of loops feed it with incorrect information.) Instead of deleting the array bound testcases, you could also enable ArrayBoundV2 in the pointer-sub test files and validate that it does find those reports. Also note that the array bounds checker is named `ArrayBoundV2` with no "s" in the name. https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
@@ -2501,7 +2501,14 @@ alpha.core.PointerSub (C) Check for pointer subtractions on two pointers pointing to different memory chunks. According to the C standard §6.5.6 only subtraction of pointers that point into (or one past the end) the same array object is valid (for this -purpose non-array variables are like arrays of size 1). +purpose non-array variables are like arrays of size 1). This checker only +searches for different memory objects at subtraction, but does not check if the +array index is correct ( +:ref:`alpha.security.ArrayBoundsV2 ` checks the +index to some extent). NagyDonat wrote: ```suggestion reports subtraction between different memory objects and does not check whether the index (or more generally, memory offset) is within bounds. Bounds checking is done by :ref:`alpha.security.ArrayBoundV2 `. ``` https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Add more notes to PointerSubChecker (PR #102432)
https://github.com/NagyDonat requested changes to this pull request. Thanks! We should close this when that other PR is accepted and merged. Until then I'm putting a "Request changes" mark on this to prevent an accidental merge. https://github.com/llvm/llvm-project/pull/102432 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
NagyDonat wrote: Oh, you're right, invalid pointer arithmetic like `(&x - 1)` is not handled by ArrayBoundV2, because right now that's the responsibility of a THIRD checker, `alpha.core.PointerArithm`. However, directly after bringing `ArrayBoundV2` out of alpha, I'll continue with working on this `PointerArithm` and either I'll completely merge it with `ArrayBoundV2` or I'll keep it as a separate "frontend" which also calls the bounds-checking logic that's behind `ArrayBoundV2`. Based on this I'd suggest that you should - delete the bounds checking logic from `PointerSubChecker`, - delete the testcases that were testing it (because it would be too complicated to bring in *two* other checkers + AFAIK `PointerArithm` is in a bad shape currently), - refer to both ArrayBoundV2 and PointerArithm in the documentation. https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
https://github.com/NagyDonat commented: Thanks for adding support for these functions! The code LGTM overall, except for one mostly theoretical issue (about the use of `assert`) which I described in an inline comment. https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
@@ -21,16 +21,67 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h" #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h" using namespace clang; using namespace ento; namespace { +QualType getOverflowBuiltinResultType(const CallEvent &Call) { + assert(Call.getNumArgs() == 3); + + return Call.getArgExpr(2)->getType()->getPointeeType(); +} + +QualType getOverflowBuiltinResultType(const CallEvent &Call, CheckerContext &C, + unsigned BI) { + assert(Call.getNumArgs() == 3); + + ASTContext &Ast = C.getASTContext(); NagyDonat wrote: `` Checker code usually uses the name `ACtx` for 'the' `ASTContext` object, so consider using that here. However, `Ast` is also a completely correct name, so if you prefer it, feel free to keep it. `` https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker { } // namespace +void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call, + CheckerContext &C, + BinaryOperator::Opcode Op, + QualType ResultType) const { + // All __builtin_*_overflow functions take 3 argumets. + assert(Call.getNumArgs() == 3); NagyDonat wrote: The use of `assert()` should be limited to situations where we realize that the _analyzer code_ is buggy (we reached an "impossible" situation). If the _analyzed code_ contains an invalid call like `__builtin_add_overflow(10, 20, &res, "spam")`, then the analyzer _may_ report an error, but must not crash with an assertion failure. Unfortunately, this particular checker is a so-called "modeling" checker, so it is hidden from the user (as an undocumented implementation detail), and therefore it cannot create bug reports. This means that if we encounter an invalid `__builtin_*_overflow` call, then we should probably just ignore it, because we should not assert and we cannot create a bug report. I'd assume that this is an extremely rare situation (if someone uses these builtin function, they're unlikely to mess up the argument count), so a more complex solution (e.g. introducing a new non-modeling checker which creates bug reports) is probably not worth the effort. https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker { } // namespace +void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call, + CheckerContext &C, + BinaryOperator::Opcode Op, + QualType ResultType) const { + // All __builtin_*_overflow functions take 3 argumets. + assert(Call.getNumArgs() == 3); + + ProgramStateRef State = C.getState(); + SValBuilder &SVB = C.getSValBuilder(); + const Expr *CE = Call.getOriginExpr(); + + SVal Arg1 = Call.getArgSVal(0); + SVal Arg2 = Call.getArgSVal(1); + + SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType); + + // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 1 + // should not produce state for non-overflow case and threat it as NagyDonat wrote: I thought about this for some time and checked the implementation of `evalBinOp` (more precisely `SValBuilder:evalBinOpNN`, the case that's relevant here), but unfortunately I don't see a clear solution. You are right that `SVB` _should_ take care of this, but unfortunately that does not happen currently. Currently the analyzer (and `SValBuilder` in particular) has its own rigid opinion about overflow handling: it allows overflow on all numeric operations (including signed ones, where it's undefined behavior according to the standard) and models it without splitting the state, so e.g. if `x` and `y` are `signed char` values within the range `50 <= x, y <= 100`, then it calculates that the possible values of `x+y` are $[-128, -56] \cup [100, 127]$. It would be good to enhance this behavior by tagging the ranges within the `RangeSet` with "did overflow happen?" bits that would let us perform a state split. This is theoretically doable (I don't see anything that would block this), but would be a complex refactoring effort (we'd need to pass along this "did we overflow?" bit through all the spaghetti code). Without this, you can implement a workaround that calls `SValBuilder::getMinValue()` and `SValBuilder::getMaxValue()` to get the minimal and maximal values of the operands (as `llvm::APSInt` values, that is, **a**rbitrary **p**recision **s**igned **int**egers) and then uses these to determine which of the two branches (overflow, no-overflow) are possible. This workaround solution would be better than the current "always act as if both branches were possible", but it is not sufficient for deducing the set of possible results on the no-overflow branch. (That is, if `x` and `y` are `signed char` values within the range `50 <= x, y <= 100`, then this workaround would create two branches: one where `__builtin_add_overflow` returns true, and and one where it returns `false` and stores a symbol with range $[-128, -56] \cup [100, 127]$ within the result pointer; while the correct assumption would be that the result has range $[100, 127]$ on this no-overflow branch.) https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Remove array bounds check from PointerSubChecker (PR #102580)
https://github.com/NagyDonat approved this pull request. Thanks for the updates, I'm satisfied with the state of this commit. However let's wait for an independent approval from @steakhal @haoNoQ or someone else, because this change does reduce the scope of this checker. https://github.com/llvm/llvm-project/pull/102580 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] New fix for default template parameter values. (PR #101836)
https://github.com/NagyDonat approved this pull request. This change seems to be correct, but I don't know enough to dig into the details and provide a confident review. However, I as far as I know there are no significantly better reviewers, so I can give you the formal approval. I think you can merge this commit once you update the comment that you marked as inaccurate. https://github.com/llvm/llvm-project/pull/101836 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Trivial refactoring of region invalidation (PR #102456)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/102456 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker { } // namespace +void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call, + CheckerContext &C, + BinaryOperator::Opcode Op, + QualType ResultType) const { + // All __builtin_*_overflow functions take 3 argumets. + assert(Call.getNumArgs() == 3); NagyDonat wrote: > I've tried to catch this assert locally by calling __builtin_*_overflow with > 2 arguments, but clang just rejects it. Oh, I see. Perhaps add a comment saying that "Calling a builtin with an incorrect argument count produces compiler error (...)", because e.g. I didn't know that this is the case. https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Improve documentation of `invalidateRegion` methods (PR #102477)
NagyDonat wrote: > Do you plan to apply more refactors to invalidation and Store? These were opportunistic unplanned refactors, I was just familiarizing myself with the invalidation logic (because I'll need to use it in loop widening), and I quickly fixed some minor issues that I spotted. However, I'll probably need one change to make more features of the invalidation logic accessible without a `CallEvent`. (Currently the `CallEvent` is passed down even to `RegionStoreManager::invalidateRegions` where it's only used to calculate an `enum GlobalsFilterKind` value; while I'd like to have a variant of `ProgramState::invalidateRegions` that directly takes a `GlobalFiltersKind` instead of a `CallEvent`.) However I'm not planning any functional/algorithmic changes in the invalidation logic, and I think and hope that I won't need to touch other parts of the `State`. What can I do to help you avoid merge conflicts? https://github.com/llvm/llvm-project/pull/102477 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Improve documentation of `invalidateRegion` methods (PR #102477)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/102477 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker { } // namespace +void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call, + CheckerContext &C, + BinaryOperator::Opcode Op, + QualType ResultType) const { + // All __builtin_*_overflow functions take 3 argumets. + assert(Call.getNumArgs() == 3); + + ProgramStateRef State = C.getState(); + SValBuilder &SVB = C.getSValBuilder(); + const Expr *CE = Call.getOriginExpr(); + + SVal Arg1 = Call.getArgSVal(0); + SVal Arg2 = Call.getArgSVal(1); + + SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType); + + // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 1 + // should not produce state for non-overflow case and threat it as NagyDonat wrote: > `clang_analyzer_eval(a + b < 30); <--- Prints 1 and 0, but why ???` Assuming that `a` and `b` are signed integers, they can be very negative, and then their sum can be a positive value above 30 (after an overflow). This means that both boolean values are possible for the expression `a + b < 30`, and the analyzer represents this by printing both 1 and 0. (If I understand this correctly, we get two definite numbers instead of one range because the on-by-default `eagerlyAssume` mode causes a state split when it sees the comparison operator in `a + b < 30`, despite the fact that this is not in a conditional expression.) https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker { } // namespace +void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call, + CheckerContext &C, + BinaryOperator::Opcode Op, + QualType ResultType) const { + // All __builtin_*_overflow functions take 3 argumets. + assert(Call.getNumArgs() == 3); + + ProgramStateRef State = C.getState(); + SValBuilder &SVB = C.getSValBuilder(); + const Expr *CE = Call.getOriginExpr(); + + SVal Arg1 = Call.getArgSVal(0); + SVal Arg2 = Call.getArgSVal(1); + + SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType); + + // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 1 + // should not produce state for non-overflow case and threat it as NagyDonat wrote: I did some digging in `RangedConstraintManager` and it seems that algebraic expressions that involve two symbolic values **are handled as black boxes** by the analyzer: in your example it creates a `SymSymExpr` that is known to be the sum of `a` and `b`, but we don't have any logic which would constrain the potential values of this `SymSymExpr` based on the constraints known about the operands. (The analyzer can propagate information in the easier case when it knows the exact value of one of the operands.) I don't know why do we have this shameful limitation... I'd guess that implementing the necessary `RangeSet` operation wouldn't be prohibitively difficult, but there may be performance implications, and perhaps well-constrained (but not exactly known) values are not common enough to make this a valuable investment. https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model overflow builtins (PR #102602)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/102602 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
https://github.com/NagyDonat created https://github.com/llvm/llvm-project/pull/103059 ...because it is too noisy to be useful right now, and its architecture is terrible, so it can't act a starting point of future development. The main problem with this checker is that it tries to do (or at least fake) path-sensitive analysis without actually using the established path-sensitive analysis engine. Instead of actually tracking the symbolic values and the known constraints on them, this checker blindly gropes the AST and uses heuristics like "this variable was seen in a comparison operator expression that is not a loop condition, so it's probably not too large" (which was improved in a separate commit to at least ignore comparison operators that appear after the actual `malloc()` call). This might have been acceptable in 2011 (when this checker was added), but since then we developed a significantly better standard approach for analysis and this old relic doesn't deserve to remain in the codebase. Needless to say, this primitive approach causes lots of false positives (and presumably false negatives as well), which ensures that this alpha checker won't be missed by the users. It would be good to eventually have a stable, path-sensitive checker that could succeed in the task where this hacky implementation fails, but the first step towards that goal is removing this old garbage, which just confuses the potential users or contributors. (I don't have plans for reimplementing the goals of this checker. It could happen eventually, but right now I have many goals that are higher priority than this.) From 36821708145587553f13df8648920f281b318240 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Tue, 13 Aug 2024 14:50:17 +0200 Subject: [PATCH] [analyzer] Delete alpha.security.MallocOverflow ...because it is too noisy to be useful right now, and its architecture is terrible, so it can't act a starting point of future development. The main problem with this checker is that it tries to do (or at least fake) path-sensitive analysis without actually using the established path-sensitive analysis engine. Instead of actually tracking the symbolic values and the known constraints on them, this checker blindly gropes the AST and uses heuristics like "this variable was seen in a comparison operator expression that is not a loop condition, so it's probably not too large" (which was improved in a separate commit to at least ignore comparison operators that appear after the actual `malloc()` call). This might have been acceptable in 2011 (when this checker was added), but since then we developed a significantly better standard approach for analysis and this old relic doesn't deserve to remain in the codebase. Needless to say, this primitive approach causes lots of false positives (and presumably false negatives as well), which ensures that this alpha checker won't be missed by the users. It would be good to eventually have a stable, path-sensitive checker that could succeed in the task where this hacky implementation fails, but the first step towards that goal is taking out this old garbage, which confuses the potential users or contributors. (I don't have plans for reimplementing the goals of this checker. It could happen eventually, but right now I have many plans that are higher priority than this.) --- clang/docs/analyzer/checkers.rst | 43 --- .../clang/StaticAnalyzer/Checkers/Checkers.td | 4 - .../StaticAnalyzer/Checkers/CMakeLists.txt| 1 - .../MallocOverflowSecurityChecker.cpp | 341 -- clang/test/Analysis/malloc-overflow.c | 150 clang/test/Analysis/malloc-overflow.cpp | 12 - clang/test/Analysis/malloc-overflow2.c| 40 -- clang/www/analyzer/potential_checkers.html| 2 - .../lib/StaticAnalyzer/Checkers/BUILD.gn | 1 - 9 files changed, 594 deletions(-) delete mode 100644 clang/lib/StaticAnalyzer/Checkers/MallocOverflowSecurityChecker.cpp delete mode 100644 clang/test/Analysis/malloc-overflow.c delete mode 100644 clang/test/Analysis/malloc-overflow.cpp delete mode 100644 clang/test/Analysis/malloc-overflow2.c diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index 46b0b7b9c82376..0bfbc995579d41 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -2951,49 +2951,6 @@ Warn about buffer overflows (newer checker). char c = s[x]; // warn: index is tainted } -.. _alpha-security-MallocOverflow: - -alpha.security.MallocOverflow (C) -" -Check for overflows in the arguments to ``malloc()``. -It tries to catch ``malloc(n * c)`` patterns, where: - - - ``n``: a variable or member access of an object - - ``c``: a constant foldable integral - -This checker was designed for code audits, so expect false-positive reports. -One is supposed to silence this checker by ensuring proper bounds c
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">, HelpText<"Warn about buffer overflows (newer checker)">, Documentation; -def MallocOverflowSecurityChecker : Checker<"MallocOverflow">, NagyDonat wrote: Is it possible to write a "this checker no longer exists, but references to its name should not be an error" placeholder in this TD file? If you readily know a solution, I'd be happy to add it, but I don't want to spend time on researching it. https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/103059 From 36821708145587553f13df8648920f281b318240 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Tue, 13 Aug 2024 14:50:17 +0200 Subject: [PATCH 1/2] [analyzer] Delete alpha.security.MallocOverflow ...because it is too noisy to be useful right now, and its architecture is terrible, so it can't act a starting point of future development. The main problem with this checker is that it tries to do (or at least fake) path-sensitive analysis without actually using the established path-sensitive analysis engine. Instead of actually tracking the symbolic values and the known constraints on them, this checker blindly gropes the AST and uses heuristics like "this variable was seen in a comparison operator expression that is not a loop condition, so it's probably not too large" (which was improved in a separate commit to at least ignore comparison operators that appear after the actual `malloc()` call). This might have been acceptable in 2011 (when this checker was added), but since then we developed a significantly better standard approach for analysis and this old relic doesn't deserve to remain in the codebase. Needless to say, this primitive approach causes lots of false positives (and presumably false negatives as well), which ensures that this alpha checker won't be missed by the users. It would be good to eventually have a stable, path-sensitive checker that could succeed in the task where this hacky implementation fails, but the first step towards that goal is taking out this old garbage, which confuses the potential users or contributors. (I don't have plans for reimplementing the goals of this checker. It could happen eventually, but right now I have many plans that are higher priority than this.) --- clang/docs/analyzer/checkers.rst | 43 --- .../clang/StaticAnalyzer/Checkers/Checkers.td | 4 - .../StaticAnalyzer/Checkers/CMakeLists.txt| 1 - .../MallocOverflowSecurityChecker.cpp | 341 -- clang/test/Analysis/malloc-overflow.c | 150 clang/test/Analysis/malloc-overflow.cpp | 12 - clang/test/Analysis/malloc-overflow2.c| 40 -- clang/www/analyzer/potential_checkers.html| 2 - .../lib/StaticAnalyzer/Checkers/BUILD.gn | 1 - 9 files changed, 594 deletions(-) delete mode 100644 clang/lib/StaticAnalyzer/Checkers/MallocOverflowSecurityChecker.cpp delete mode 100644 clang/test/Analysis/malloc-overflow.c delete mode 100644 clang/test/Analysis/malloc-overflow.cpp delete mode 100644 clang/test/Analysis/malloc-overflow2.c diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index 46b0b7b9c82376..0bfbc995579d41 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -2951,49 +2951,6 @@ Warn about buffer overflows (newer checker). char c = s[x]; // warn: index is tainted } -.. _alpha-security-MallocOverflow: - -alpha.security.MallocOverflow (C) -" -Check for overflows in the arguments to ``malloc()``. -It tries to catch ``malloc(n * c)`` patterns, where: - - - ``n``: a variable or member access of an object - - ``c``: a constant foldable integral - -This checker was designed for code audits, so expect false-positive reports. -One is supposed to silence this checker by ensuring proper bounds checking on -the variable in question using e.g. an ``assert()`` or a branch. - -.. code-block:: c - - void test(int n) { - void *p = malloc(n * sizeof(int)); // warn - } - - void test2(int n) { - if (n > 100) // gives an upper-bound - return; - void *p = malloc(n * sizeof(int)); // no warning - } - - void test3(int n) { - assert(n <= 100 && "Contract violated."); - void *p = malloc(n * sizeof(int)); // no warning - } - -Limitations: - - - The checker won't warn for variables involved in explicit casts, - since that might limit the variable's domain. - E.g.: ``(unsigned char)int x`` would limit the domain to ``[0,255]``. - The checker will miss the true-positive cases when the explicit cast would - not tighten the domain to prevent the overflow in the subsequent - multiplication operation. - - - It is an AST-based checker, thus it does not make use of the - path-sensitive taint-analysis. - .. _alpha-security-MmapWriteExec: alpha.security.MmapWriteExec (C) diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td index 38b55a0eb0a7b0..fb4114619ac3d3 100644 --- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td +++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td @@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">, HelpText<"Warn about buffer overflows (newer checker)">, Documentation; -def MallocOverflowSecurityChecker : Checker<"MallocOverflow">, - HelpText<"Check for overflows in th
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
@@ -1,40 +0,0 @@ -// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -analyzer-checker=alpha.security.MallocOverflow,unix -verify %s -// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -analyzer-checker=alpha.security.MallocOverflow,unix,optin.portability -DPORTABILITY -verify %s NagyDonat wrote: I agree, that this might have some minor usefulness by accident, but I still think that it's better to remove it. (After all, we are not copying random "no results found" source files into our test directory...) https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">, HelpText<"Warn about buffer overflows (newer checker)">, Documentation; -def MallocOverflowSecurityChecker : Checker<"MallocOverflow">, NagyDonat wrote: Now that I think about it, we already have lots of precedent for moving and renaming alpha checkers without prior warning, so the "updated to the new version and the old command line invocation doesn't work without changes" situation is not a problem. The only difference is that after this change they won't be able to restore the behavior by using a new name for the checker, but I think that this is acceptable, because: 1. this checker is so unreliable that it's extremely unlikely that somebody was relying on it; 2. deleting a checker is a _conservative_ change: it removes the results of that particular checker, but it doesn't affects the results of the other checkers and doesn't hinder the investigation of the results by introducing lots of spammy results. Based on this, I'm merging this commit now. https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits