[clang] 30e5c7e - [analyzer] NFCi: Refactor CStringChecker: use strongly typed internal API
Author: Balazs Benics Date: 2020-04-09T16:06:32+02:00 New Revision: 30e5c7e82fa1c5318540feb83d54757c632e2599 URL: https://github.com/llvm/llvm-project/commit/30e5c7e82fa1c5318540feb83d54757c632e2599 DIFF: https://github.com/llvm/llvm-project/commit/30e5c7e82fa1c5318540feb83d54757c632e2599.diff LOG: [analyzer] NFCi: Refactor CStringChecker: use strongly typed internal API Summary: I wanted to extend the diagnostics of the CStringChecker with taintedness. This requires the CStringChecker to be refactored to support a more flexible reporting mechanism. This patch does only refactorings, such: - eliminates always false parameters (like WarnAboutSize) - reduces the number of parameters - makes strong types differentiating *source* and *destination* buffers (same with size expressions) - binds the argument expression and the index, making diagnostics accurate and easy to emit - removes a bunch of default parameters to make it more readable - remove random const char* warning message parameters, making clear where and what is going to be emitted Note that: - CheckBufferAccess now checks *only* one buffer, this removed about 100 LOC code duplication - not every function was refactored to use the /new/ strongly typed API, since the CString related functions are really closely coupled monolithic beasts, I will refactor them separately - all tests are preserved and passing; only the message changed at some places. In my opinion, these messages are holding the same information. I would also highlight that this refactoring caught a bug in clang/test/Analysis/string.c:454 where the diagnostic did not reflect reality. This catch backs my effort on simplifying this monolithic CStringChecker. Reviewers: NoQ, baloghadamsoftware, Szelethus, rengolin, Charusso Reviewed By: NoQ Subscribers: whisperity, xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, martong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D74806 Added: Modified: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp clang/test/Analysis/bsd-string.c clang/test/Analysis/bstring.c clang/test/Analysis/null-deref-ps-region.c clang/test/Analysis/string.c Removed: diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp index 4558b6936e7f..40467b346f69 100644 --- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp @@ -30,6 +30,47 @@ using namespace clang; using namespace ento; namespace { +struct AnyArgExpr { + // FIXME: Remove constructor in C++17 to turn it into an aggregate. + AnyArgExpr(const Expr *Expression, unsigned ArgumentIndex) + : Expression{Expression}, ArgumentIndex{ArgumentIndex} {} + const Expr *Expression; + unsigned ArgumentIndex; +}; + +struct SourceArgExpr : AnyArgExpr { + using AnyArgExpr::AnyArgExpr; // FIXME: Remove using in C++17. +}; + +struct DestinationArgExpr : AnyArgExpr { + using AnyArgExpr::AnyArgExpr; // FIXME: Same. +}; + +struct SizeArgExpr : AnyArgExpr { + using AnyArgExpr::AnyArgExpr; // FIXME: Same. +}; + +using ErrorMessage = SmallString<128>; +enum class AccessKind { write, read }; + +static ErrorMessage createOutOfBoundErrorMsg(StringRef FunctionDescription, + AccessKind Access) { + ErrorMessage Message; + llvm::raw_svector_ostream Os(Message); + + // Function classification like: Memory copy function + Os << toUppercase(FunctionDescription.front()) + << &FunctionDescription.data()[1]; + + if (Access == AccessKind::write) { +Os << " overflows the destination buffer"; + } else { // read access +Os << " accesses out-of-bound array element"; + } + + return Message; +} + enum class ConcatFnKind { none = 0, strcat = 1, strlcat = 2 }; class CStringChecker : public Checker< eval::Call, check::PreStmt, @@ -113,12 +154,9 @@ class CStringChecker : public Checker< eval::Call, void evalMemmove(CheckerContext &C, const CallExpr *CE) const; void evalBcopy(CheckerContext &C, const CallExpr *CE) const; void evalCopyCommon(CheckerContext &C, const CallExpr *CE, - ProgramStateRef state, - const Expr *Size, - const Expr *Source, - const Expr *Dest, - bool Restricted = false, - bool IsMempcpy = false) const; + ProgramStateRef state, SizeArgExpr Size, + DestinationArgExpr Dest, SourceArgExpr Source, + bool Restricted, bool IsMempcpy) const; void evalMemcmp(CheckerContext &C, const CallExpr *CE) const; @@ -195,40 +233,17 @@ class CStringChecker : public Checker< eval::Call,
[clang] d96a47c - [analyzer] ctu-on-demand-parsing tests: replace linux -> system-linux
Author: Balazs Benics Date: 2020-07-13T14:29:47+02:00 New Revision: d96a47c61625f853ec42a151ae3783e30a3943f3 URL: https://github.com/llvm/llvm-project/commit/d96a47c61625f853ec42a151ae3783e30a3943f3 DIFF: https://github.com/llvm/llvm-project/commit/d96a47c61625f853ec42a151ae3783e30a3943f3.diff LOG: [analyzer] ctu-on-demand-parsing tests: replace linux -> system-linux Differential Revision: https://reviews.llvm.org/D83555 Added: Modified: clang/test/Analysis/ctu-on-demand-parsing.c clang/test/Analysis/ctu-on-demand-parsing.cpp Removed: diff --git a/clang/test/Analysis/ctu-on-demand-parsing.c b/clang/test/Analysis/ctu-on-demand-parsing.c index 5adce7f36963..07a72a104646 100644 --- a/clang/test/Analysis/ctu-on-demand-parsing.c +++ b/clang/test/Analysis/ctu-on-demand-parsing.c @@ -19,7 +19,7 @@ // RUN: -verify ctu-on-demand-parsing.c // // FIXME: Path handling should work on all platforms. -// REQUIRES: linux +// REQUIRES: system-linux void clang_analyzer_eval(int); diff --git a/clang/test/Analysis/ctu-on-demand-parsing.cpp b/clang/test/Analysis/ctu-on-demand-parsing.cpp index 058269662fb3..e4e998c8f64c 100644 --- a/clang/test/Analysis/ctu-on-demand-parsing.cpp +++ b/clang/test/Analysis/ctu-on-demand-parsing.cpp @@ -30,7 +30,7 @@ // CHECK: CTU loaded AST file: {{.*}}ctu-chain.cpp // // FIXME: Path handling should work on all platforms. -// REQUIRES: linux +// REQUIRES: system-linux #include "ctu-hdr.h" ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] e22cae3 - [analyzer][NFC] Add unittest for FalsePositiveRefutationBRVisitor
Author: Balazs Benics Date: 2020-06-29T16:54:17+02:00 New Revision: e22cae32c5c4cf8c49b674cea34c105a6cb015f9 URL: https://github.com/llvm/llvm-project/commit/e22cae32c5c4cf8c49b674cea34c105a6cb015f9 DIFF: https://github.com/llvm/llvm-project/commit/e22cae32c5c4cf8c49b674cea34c105a6cb015f9.diff LOG: [analyzer][NFC] Add unittest for FalsePositiveRefutationBRVisitor Adds the test infrastructure for testing the FalsePositiveRefutationBRVisitor. It will be extended in the D78457 patch, which demonstrates and fixes a bug in the visitor. Differential Revision: https://reviews.llvm.org/D78704 Added: clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp Modified: clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp clang/unittests/StaticAnalyzer/CMakeLists.txt clang/unittests/StaticAnalyzer/CheckerRegistration.h Removed: diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp index ad79f7cb9359..719edfdb51e6 100644 --- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp +++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp @@ -52,6 +52,7 @@ #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/Statistic.h" #include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/Casting.h" @@ -2812,12 +2813,19 @@ UndefOrNullArgVisitor::VisitNode(const ExplodedNode *N, BugReporterContext &BRC, // Implementation of FalsePositiveRefutationBRVisitor. //===--===// +#define DEBUG_TYPE "FalsePositiveRefutationBRVisitor" +STATISTIC(CrosscheckedBugReports, + "The # of bug reports which were checked for infeasible constraints"); +STATISTIC(CrosscheckInvalidatedBugReports, + "The # of bug reports invalidated due to infeasible constraints"); + FalsePositiveRefutationBRVisitor::FalsePositiveRefutationBRVisitor() : Constraints(ConstraintRangeTy::Factory().getEmptyMap()) {} void FalsePositiveRefutationBRVisitor::finalizeVisitor( BugReporterContext &BRC, const ExplodedNode *EndPathNode, PathSensitiveBugReport &BR) { + ++CrosscheckedBugReports; // Collect new constraints VisitNode(EndPathNode, BRC, BR); @@ -2848,8 +2856,10 @@ void FalsePositiveRefutationBRVisitor::finalizeVisitor( if (!isSat.hasValue()) return; - if (!isSat.getValue()) + if (!isSat.getValue()) { +++CrosscheckInvalidatedBugReports; BR.markInvalid("Infeasible constraints", EndPathNode->getLocationContext()); + } } PathDiagnosticPieceRef FalsePositiveRefutationBRVisitor::VisitNode( diff --git a/clang/unittests/StaticAnalyzer/CMakeLists.txt b/clang/unittests/StaticAnalyzer/CMakeLists.txt index 564eba879f24..0e6d8763d96d 100644 --- a/clang/unittests/StaticAnalyzer/CMakeLists.txt +++ b/clang/unittests/StaticAnalyzer/CMakeLists.txt @@ -7,6 +7,7 @@ add_clang_unittest(StaticAnalysisTests AnalyzerOptionsTest.cpp CallDescriptionTest.cpp CallEventTest.cpp + FalsePositiveRefutationBRVisitorTest.cpp ParamRegionTest.cpp RangeSetTest.cpp RegisterCustomCheckersTest.cpp diff --git a/clang/unittests/StaticAnalyzer/CheckerRegistration.h b/clang/unittests/StaticAnalyzer/CheckerRegistration.h index 0bbed9b7784f..e02b856e0e9c 100644 --- a/clang/unittests/StaticAnalyzer/CheckerRegistration.h +++ b/clang/unittests/StaticAnalyzer/CheckerRegistration.h @@ -14,6 +14,7 @@ #include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h" #include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h" #include "clang/Tooling/Tooling.h" +#include "gtest/gtest.h" namespace clang { namespace ento { @@ -65,10 +66,21 @@ class TestAction : public ASTFrontendAction { } }; +inline SmallString<80> getCurrentTestNameAsFileName() { + const ::testing::TestInfo *Info = + ::testing::UnitTest::GetInstance()->current_test_info(); + + SmallString<80> FileName; + (Twine{Info->name()} + ".cc").toVector(FileName); + return FileName; +} + template bool runCheckerOnCode(const std::string &Code, std::string &Diags) { + const SmallVectorImpl &FileName = getCurrentTestNameAsFileName(); llvm::raw_string_ostream OS(Diags); - return tooling::runToolOnCode(std::make_unique>(OS), Code); + return tooling::runToolOnCode(std::make_unique>(OS), Code, +FileName); } template @@ -77,5 +89,22 @@ bool runCheckerOnCode(const std::string &Code) { return runCheckerOnCode(Code, Diags); } +template +bool runCheckerOnCodeWithArgs(const std::string &Code, + const std::vector &Args, + std::string &Diags) { + const SmallVectorImpl &FileName = getCurrentTestNameAsFileName(); + llvm::raw_string_ostream OS(Diags); + return tooling::runToolOnCodeWithArgs
[clang] fe0a555 - [analyzer][NFC] Add unittest for FalsePositiveRefutationBRVisitor
Author: Balazs Benics Date: 2020-06-29T18:18:43+02:00 New Revision: fe0a555aa3c6f37a5b5d79942215b07587893efa URL: https://github.com/llvm/llvm-project/commit/fe0a555aa3c6f37a5b5d79942215b07587893efa DIFF: https://github.com/llvm/llvm-project/commit/fe0a555aa3c6f37a5b5d79942215b07587893efa.diff LOG: [analyzer][NFC] Add unittest for FalsePositiveRefutationBRVisitor Adds the test infrastructure for testing the FalsePositiveRefutationBRVisitor. It will be extended in the D78457 patch, which demonstrates and fixes a bug in the visitor. Differential Revision: https://reviews.llvm.org/D78704 Added: clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp Modified: clang/unittests/StaticAnalyzer/CMakeLists.txt clang/unittests/StaticAnalyzer/CheckerRegistration.h Removed: diff --git a/clang/unittests/StaticAnalyzer/CMakeLists.txt b/clang/unittests/StaticAnalyzer/CMakeLists.txt index 564eba879f24..0e6d8763d96d 100644 --- a/clang/unittests/StaticAnalyzer/CMakeLists.txt +++ b/clang/unittests/StaticAnalyzer/CMakeLists.txt @@ -7,6 +7,7 @@ add_clang_unittest(StaticAnalysisTests AnalyzerOptionsTest.cpp CallDescriptionTest.cpp CallEventTest.cpp + FalsePositiveRefutationBRVisitorTest.cpp ParamRegionTest.cpp RangeSetTest.cpp RegisterCustomCheckersTest.cpp diff --git a/clang/unittests/StaticAnalyzer/CheckerRegistration.h b/clang/unittests/StaticAnalyzer/CheckerRegistration.h index 0bbed9b7784f..e02b856e0e9c 100644 --- a/clang/unittests/StaticAnalyzer/CheckerRegistration.h +++ b/clang/unittests/StaticAnalyzer/CheckerRegistration.h @@ -14,6 +14,7 @@ #include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h" #include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h" #include "clang/Tooling/Tooling.h" +#include "gtest/gtest.h" namespace clang { namespace ento { @@ -65,10 +66,21 @@ class TestAction : public ASTFrontendAction { } }; +inline SmallString<80> getCurrentTestNameAsFileName() { + const ::testing::TestInfo *Info = + ::testing::UnitTest::GetInstance()->current_test_info(); + + SmallString<80> FileName; + (Twine{Info->name()} + ".cc").toVector(FileName); + return FileName; +} + template bool runCheckerOnCode(const std::string &Code, std::string &Diags) { + const SmallVectorImpl &FileName = getCurrentTestNameAsFileName(); llvm::raw_string_ostream OS(Diags); - return tooling::runToolOnCode(std::make_unique>(OS), Code); + return tooling::runToolOnCode(std::make_unique>(OS), Code, +FileName); } template @@ -77,5 +89,22 @@ bool runCheckerOnCode(const std::string &Code) { return runCheckerOnCode(Code, Diags); } +template +bool runCheckerOnCodeWithArgs(const std::string &Code, + const std::vector &Args, + std::string &Diags) { + const SmallVectorImpl &FileName = getCurrentTestNameAsFileName(); + llvm::raw_string_ostream OS(Diags); + return tooling::runToolOnCodeWithArgs( + std::make_unique>(OS), Code, Args, FileName); +} + +template +bool runCheckerOnCodeWithArgs(const std::string &Code, + const std::vector &Args) { + std::string Diags; + return runCheckerOnCodeWithArgs(Code, Args, Diags); +} + } // namespace ento } // namespace clang diff --git a/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp b/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp new file mode 100644 index ..ad202f844597 --- /dev/null +++ b/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp @@ -0,0 +1,175 @@ +//===- unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#include "CheckerRegistration.h" +#include "Reusables.h" +#include "clang/Frontend/CompilerInstance.h" +#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h" +#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h" +#include "clang/StaticAnalyzer/Core/Checker.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h" +#include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h" +#include "llvm/Config/config.h" +#include "gtest/gtest.h" + +// FIXME: Use GTEST_SKIP() instead if GTest is updated to version 1.10.0 +#ifdef LLVM_WITH_Z3 +#define SKIP_WITHOUT_Z3 +#else +#define SKIP_WITHOUT_Z3 return +#endif + +namespace clang { +namespace ento { +namespace { + +class FalsePositiveGenerator : public Checker { + using Self = FalsePositiveGenerat
[clang] de361df - [analyzer][Z3-refutation] Fix a refutation BugReporterVisitor bug
Author: Balazs Benics Date: 2020-06-29T18:51:24+02:00 New Revision: de361df3f6d0f20bf16a8deb9e6219556d028b81 URL: https://github.com/llvm/llvm-project/commit/de361df3f6d0f20bf16a8deb9e6219556d028b81 DIFF: https://github.com/llvm/llvm-project/commit/de361df3f6d0f20bf16a8deb9e6219556d028b81.diff LOG: [analyzer][Z3-refutation] Fix a refutation BugReporterVisitor bug FalsePositiveRefutationBRVisitor had a bug where the constraints were not properly collected thus crosschecked with Z3. This patch demonstratest and fixes that bug. Bug: The visitor wanted to collect all the constraints on a BugPath. Since it is a visitor, it stated the visitation of the BugPath with the node before the ErrorNode. As a final step, it visited the ErrorNode explicitly, before it processed the collected constraints. In principle, the ErrorNode should have visited before every other node. Since the constraints were collected into a map, mapping each symbol to its RangeSet, if the map already had a mapping with the symbol, then it was skipped. This behavior was flawed if: We already had a constraint on a symbol, but at the end in the ErrorNode we have a tighter constraint on that. Therefore, this visitor would not utilize that tighter constraint during the crosscheck validation. Differential Revision: https://reviews.llvm.org/D78457 Added: Modified: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h clang/lib/StaticAnalyzer/Core/BugReporter.cpp clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp Removed: diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h index de0ee5de81b5..365b1ff1bfe3 100644 --- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h +++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h @@ -386,6 +386,8 @@ class FalsePositiveRefutationBRVisitor final : public BugReporterVisitor { void finalizeVisitor(BugReporterContext &BRC, const ExplodedNode *EndPathNode, PathSensitiveBugReport &BR) override; + void addConstraints(const ExplodedNode *N, + bool OverwriteConstraintsOnExistingSyms); }; diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp index efae511da83b..f4fd495956aa 100644 --- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp +++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp @@ -38,6 +38,7 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h" #include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h" #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/SMTConv.h" #include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h" #include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h" #include "llvm/ADT/ArrayRef.h" @@ -2783,7 +2784,7 @@ Optional PathDiagnosticBuilder::findValidReport( R->clearVisitors(); R->addVisitor(std::make_unique()); -// We don't overrite the notes inserted by other visitors because the +// We don't overwrite the notes inserted by other visitors because the // refutation manager does not add any new note to the path generateVisitorsDiagnostics(R, BugPath->ErrorNode, BRC); } diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp index ad79f7cb9359..ef4d38ff498f 100644 --- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp +++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp @@ -2819,7 +2819,7 @@ void FalsePositiveRefutationBRVisitor::finalizeVisitor( BugReporterContext &BRC, const ExplodedNode *EndPathNode, PathSensitiveBugReport &BR) { // Collect new constraints - VisitNode(EndPathNode, BRC, BR); + addConstraints(EndPathNode, /*OverwriteConstraintsOnExistingSyms=*/true); // Create a refutation manager llvm::SMTSolverRef RefutationSolver = llvm::CreateZ3Solver(); @@ -2830,30 +2830,30 @@ void FalsePositiveRefutationBRVisitor::finalizeVisitor( const SymbolRef Sym = I.first; auto RangeIt = I.second.begin(); -llvm::SMTExprRef Constraints = SMTConv::getRangeExpr( +llvm::SMTExprRef SMTConstraints = SMTConv::getRangeExpr( RefutationSolver, Ctx, Sym, RangeIt->From(), RangeIt->To(), /*InRange=*/true); while ((++RangeIt) != I.second.end()) { - Constraints = RefutationSolver->mkOr( - Constraints, SMTConv::getRangeExpr(RefutationSolver, Ctx, Sym, - RangeIt->From(), RangeIt->To(), - /*InRange=*/true)); + SMTConstraints
[clang] 63d3aeb - [analyzer] Fix out-of-tree only clang build by not relaying on private header
Author: Balazs Benics Date: 2020-07-31T10:28:14+02:00 New Revision: 63d3aeb529a7b0fb95c2092ca38ad21c1f5cfd74 URL: https://github.com/llvm/llvm-project/commit/63d3aeb529a7b0fb95c2092ca38ad21c1f5cfd74 DIFF: https://github.com/llvm/llvm-project/commit/63d3aeb529a7b0fb95c2092ca38ad21c1f5cfd74.diff LOG: [analyzer] Fix out-of-tree only clang build by not relaying on private header It turned out that the D78704 included a private LLVM header, which is excluded from the LLVM install target. I'm substituting that `#include` with the public one by moving the necessary `#define` into that. There was a discussion about this at D78704 and on the cfe-dev mailing list. I'm also placing a note to remind others of this pitfall. Reviewed By: mgorny Differential Revision: https://reviews.llvm.org/D84929 Added: Modified: clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp llvm/include/llvm/Config/config.h.cmake llvm/include/llvm/Config/llvm-config.h.cmake Removed: diff --git a/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp b/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp index 7c151c182113..e67dcacca0a9 100644 --- a/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp +++ b/clang/unittests/StaticAnalyzer/FalsePositiveRefutationBRVisitorTest.cpp @@ -16,7 +16,7 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" #include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h" #include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h" -#include "llvm/Config/config.h" +#include "llvm/Config/llvm-config.h" #include "gtest/gtest.h" // FIXME: Use GTEST_SKIP() instead if GTest is updated to version 1.10.0 diff --git a/llvm/include/llvm/Config/config.h.cmake b/llvm/include/llvm/Config/config.h.cmake index 4d76b27df6b6..b8c7e070eb34 100644 --- a/llvm/include/llvm/Config/config.h.cmake +++ b/llvm/include/llvm/Config/config.h.cmake @@ -1,6 +1,9 @@ #ifndef CONFIG_H #define CONFIG_H +// Include this header only under the llvm source tree. +// This is a private header. + /* Exported configuration */ #include "llvm/Config/llvm-config.h" @@ -335,9 +338,6 @@ /* Whether GlobalISel rule coverage is being collected */ #cmakedefine01 LLVM_GISEL_COV_ENABLED -/* Define if we have z3 and want to build it */ -#cmakedefine LLVM_WITH_Z3 ${LLVM_WITH_Z3} - /* Define to the default GlobalISel coverage file prefix */ #cmakedefine LLVM_GISEL_COV_PREFIX "${LLVM_GISEL_COV_PREFIX}" diff --git a/llvm/include/llvm/Config/llvm-config.h.cmake b/llvm/include/llvm/Config/llvm-config.h.cmake index 01892bc5b610..ee299876825e 100644 --- a/llvm/include/llvm/Config/llvm-config.h.cmake +++ b/llvm/include/llvm/Config/llvm-config.h.cmake @@ -79,6 +79,9 @@ */ #cmakedefine01 LLVM_FORCE_ENABLE_STATS +/* Define if we have z3 and want to build it */ +#cmakedefine LLVM_WITH_Z3 ${LLVM_WITH_Z3} + /* Define if LLVM was built with a dependency to the libtensorflow dynamic library */ #cmakedefine LLVM_HAVE_TF_API ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f8643a9 - [analyzer] Prefer wrapping SymbolicRegions by ElementRegions
Author: Balazs Benics Date: 2022-09-13T08:58:46+02:00 New Revision: f8643a9b31c4029942f67d4534c9139b45173504 URL: https://github.com/llvm/llvm-project/commit/f8643a9b31c4029942f67d4534c9139b45173504 DIFF: https://github.com/llvm/llvm-project/commit/f8643a9b31c4029942f67d4534c9139b45173504.diff LOG: [analyzer] Prefer wrapping SymbolicRegions by ElementRegions It turns out that in certain cases `SymbolRegions` are wrapped by `ElementRegions`; in others, it's not. This discrepancy can cause the analyzer not to recognize if the two regions are actually referring to the same entity, which then can lead to unreachable paths discovered. Consider this example: ```lang=C++ struct Node { int* ptr; }; void with_structs(Node* n1) { Node c = *n1; // copy Node* n2 = &c; clang_analyzer_dump(*n1); // lazy... clang_analyzer_dump(*n2); // lazy... clang_analyzer_dump(n1->ptr); // rval(n1->ptr): reg_$2}.ptr> clang_analyzer_dump(n2->ptr); // rval(n2->ptr): reg_$1},0 S64b,struct Node}.ptr> clang_analyzer_eval(n1->ptr != n2->ptr); // UNKNOWN, bad! (void)(*n1); (void)(*n2); } ``` The copy of `n1` will insert a new binding to the store; but for doing that it actually must create a `TypedValueRegion` which it could pass to the `LazyCompoundVal`. Since the memregion in question is a `SymbolicRegion` - which is untyped, it needs to first wrap it into an `ElementRegion` basically implementing this untyped -> typed conversion for the sake of passing it to the `LazyCompoundVal`. So, this is why we have `Element{SymRegion{.}, 0,struct Node}` for `n1`. The problem appears if the analyzer evaluates a read from the expression `n1->ptr`. The same logic won't apply for `SymbolRegionValues`, since they accept raw `SubRegions`, hence the `SymbolicRegion` won't be wrapped into an `ElementRegion` in that case. Later when we arrive at the equality comparison, we cannot prove that they are equal. For more details check the corresponding thread on discourse: https://discourse.llvm.org/t/are-symbolicregions-really-untyped/64406 --- In this patch, I'm eagerly wrapping each `SymbolicRegion` by an `ElementRegion`; basically canonicalizing to this form. It seems reasonable to do so since any object can be thought of as a single array of that object; so this should not make much of a difference. The tests also underpin this assumption, as only a few were broken by this change; and actually fixed a FIXME along the way. About the second example, which does the same copy operation - but on the heap - it will be fixed by the next patch. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D132142 Added: clang/test/Analysis/trivial-copy-struct.cpp Modified: clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp clang/lib/StaticAnalyzer/Checkers/NullabilityChecker.cpp clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp clang/lib/StaticAnalyzer/Core/ExprEngine.cpp clang/lib/StaticAnalyzer/Core/MemRegion.cpp clang/lib/StaticAnalyzer/Core/RegionStore.cpp clang/test/Analysis/ctor.mm clang/test/Analysis/expr-inspection.c clang/test/Analysis/memory-model.cpp clang/test/Analysis/ptr-arith.c clang/test/Analysis/ptr-arith.cpp Removed: diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h index 4c9c8239113e3..3ae28c1dba3eb 100644 --- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h +++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h @@ -42,6 +42,18 @@ class SValExplainer : public FullSValVisitor { return false; } + bool isThisObject(const ElementRegion *R) { +if (const auto *Idx = R->getIndex().getAsInteger()) { + if (const auto *SR = R->getSuperRegion()->getAs()) { +QualType Ty = SR->getPointeeStaticType(); +bool IsNotReinterpretCast = R->getValueType() == Ty; +if (Idx->isZero() && IsNotReinterpretCast) + return isThisObject(SR); + } +} +return false; + } + public: SValExplainer(ASTContext &Ctx) : ACtx(Ctx) {} @@ -144,7 +156,7 @@ class SValExplainer : public FullSValVisitor { // Add the relevant code once it does. std::string VisitSymbolicRegion(const SymbolicRegion *R) { -// Explain 'this' object here. +// Explain 'this' object here - if it's not wrapped by an ElementRegion. // TODO: Explain CXXThisRegion itself, find a way to test it. if (isThisObject(R)) return "'this' object"; @@ -174,6 +186,13 @@ class SValExplainer : public FullSValVisitor { std::string VisitElementRegion(const ElementRegion *R) { std::string Str; llvm::raw_string_ostream OS(Str); + +// Explain 'this' object here. +// They are represen
[clang] afcd862 - [analyzer] LazyCompoundVals should be always bound as default bindings
Author: Balazs Benics Date: 2022-09-13T08:58:46+02:00 New Revision: afcd862b2e0a561bf03b1e7b83e6eec8e7143098 URL: https://github.com/llvm/llvm-project/commit/afcd862b2e0a561bf03b1e7b83e6eec8e7143098 DIFF: https://github.com/llvm/llvm-project/commit/afcd862b2e0a561bf03b1e7b83e6eec8e7143098.diff LOG: [analyzer] LazyCompoundVals should be always bound as default bindings `LazyCompoundVals` should only appear as `default` bindings in the store. This fixes the second case in this patch-stack. Depends on: D132142 Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D132143 Added: Modified: clang/lib/StaticAnalyzer/Core/RegionStore.cpp clang/test/Analysis/trivial-copy-struct.cpp Removed: diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp index 58f3b1cebee6..eeadd1579a06 100644 --- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp +++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp @@ -2400,7 +2400,11 @@ RegionStoreManager::bind(RegionBindingsConstRef B, Loc L, SVal V) { // Clear out bindings that may overlap with this binding. RegionBindingsRef NewB = removeSubRegionBindings(B, cast(R)); - return NewB.addBinding(BindingKey::Make(R, BindingKey::Direct), V); + + // LazyCompoundVals should be always bound as 'default' bindings. + auto KeyKind = isa(V) ? BindingKey::Default + : BindingKey::Direct; + return NewB.addBinding(BindingKey::Make(R, KeyKind), V); } RegionBindingsRef diff --git a/clang/test/Analysis/trivial-copy-struct.cpp b/clang/test/Analysis/trivial-copy-struct.cpp index 85efe854c38e..b9a2b1d9b201 100644 --- a/clang/test/Analysis/trivial-copy-struct.cpp +++ b/clang/test/Analysis/trivial-copy-struct.cpp @@ -27,10 +27,10 @@ void copy_on_heap(Node* n1) { clang_analyzer_dump(n2); // expected-warning-re {{&HeapSymRegion{conj_${{[0-9]+}}{Node *, LC{{[0-9]+}}, S{{[0-9]+}}, #{{[0-9]+}} clang_analyzer_dump(n1->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}},0 S64b,struct Node}.ptr>}}} - clang_analyzer_dump(n2->ptr); // expected-warning {{Unknown}} FIXME: This should be the same as above. + clang_analyzer_dump(n2->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}},0 S64b,struct Node}.ptr>}}} if (n1->ptr != n2->ptr) -clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}} FIXME: This should not be reachable. +clang_analyzer_warnIfReached(); // unreachable (void)(n1->ptr); (void)(n2->ptr); } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 7cddf9c - [analyzer] Dump the environment entry kind as well
Author: Balazs Benics Date: 2022-09-13T09:04:27+02:00 New Revision: 7cddf9cad18a65217c8ba0661fefcf78d841a16b URL: https://github.com/llvm/llvm-project/commit/7cddf9cad18a65217c8ba0661fefcf78d841a16b DIFF: https://github.com/llvm/llvm-project/commit/7cddf9cad18a65217c8ba0661fefcf78d841a16b.diff LOG: [analyzer] Dump the environment entry kind as well By this change the `exploded-graph-rewriter` will display the class kind of the expression of the environment entry. It makes easier to decide if the given entry corresponds to the lvalue or to the rvalue of some expression. It turns out the rewriter already had support for visualizing it, but probably was never actually used? Reviewed By: martong Differential Revision: https://reviews.llvm.org/D132109 Added: Modified: clang/lib/StaticAnalyzer/Core/Environment.cpp clang/test/Analysis/expr-inspection.c Removed: diff --git a/clang/lib/StaticAnalyzer/Core/Environment.cpp b/clang/lib/StaticAnalyzer/Core/Environment.cpp index 719793fa224c..3d017b81762a 100644 --- a/clang/lib/StaticAnalyzer/Core/Environment.cpp +++ b/clang/lib/StaticAnalyzer/Core/Environment.cpp @@ -274,7 +274,8 @@ void Environment::printJson(raw_ostream &Out, const ASTContext &Ctx, const Stmt *S = I->first.getStmt(); Indent(Out, InnerSpace, IsDot) - << "{ \"stmt_id\": " << S->getID(Ctx) << ", \"pretty\": "; + << "{ \"stmt_id\": " << S->getID(Ctx) << ", \"kind\": \"" + << S->getStmtClassName() << "\", \"pretty\": "; S->printJson(Out, nullptr, PP, /*AddQuotes=*/true); Out << ", \"value\": "; diff --git a/clang/test/Analysis/expr-inspection.c b/clang/test/Analysis/expr-inspection.c index 47ac0c251c1f..91c27f5b149f 100644 --- a/clang/test/Analysis/expr-inspection.c +++ b/clang/test/Analysis/expr-inspection.c @@ -36,7 +36,7 @@ void foo(int x) { // CHECK-NEXT: ]}, // CHECK-NEXT: "environment": { "pointer": "{{0x[0-9a-f]+}}", "items": [ // CHECK-NEXT: { "lctx_id": {{[0-9]+}}, "location_context": "#0 Call", "calling": "foo", "location": null, "items": [ -// CHECK-NEXT: { "stmt_id": {{[0-9]+}}, "pretty": "clang_analyzer_printState", "value": "&code{clang_analyzer_printState}" } +// CHECK-NEXT: { "stmt_id": {{[0-9]+}}, "kind": "ImplicitCastExpr", "pretty": "clang_analyzer_printState", "value": "&code{clang_analyzer_printState}" } // CHECK-NEXT: ]} // CHECK-NEXT: ]}, // CHECK-NEXT: "constraints": [ ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] b8e1da0 - [analyzer] Initialize ShouldEmitErrorsOnInvalidConfigValue analyzer option
Author: Balazs Benics Date: 2022-09-14T16:45:44+02:00 New Revision: b8e1da050673470f20a75d4ecca2c0a00d1a8691 URL: https://github.com/llvm/llvm-project/commit/b8e1da050673470f20a75d4ecca2c0a00d1a8691 DIFF: https://github.com/llvm/llvm-project/commit/b8e1da050673470f20a75d4ecca2c0a00d1a8691.diff LOG: [analyzer] Initialize ShouldEmitErrorsOnInvalidConfigValue analyzer option Downstream users who doesn't make use of the clang cc1 frontend for commandline argument parsing, won't benefit from the Marshalling provided default initialization of the AnalyzerOptions entries. More about this later. Those analyzer option fields, as they are bitfields, cannot be default initialized at the declaration (prior c++20), hence they are initialized at the constructor. The only problem is that `ShouldEmitErrorsOnInvalidConfigValue` was forgotten. In this patch I'm proposing to initialize that field with the rest. Note that this value is read by `CheckerRegistry.cpp:insertAndValidate()`. The analyzer options are initialized by the marshalling at `CompilerInvocation.cpp:GenerateAnalyzerArgs()` by the expansion of the `ANALYZER_OPTION_WITH_MARSHALLING` xmacro to the appropriate default value regardless of the constructor initialized list which I'm touching. Due to that this only affects users using CSA as a library, without serious effort, I believe we cannot test this. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D133851 Added: Modified: clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h Removed: diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h index 2baa7f2e0475..ea80761fe31e 100644 --- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h +++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h @@ -289,7 +289,8 @@ class AnalyzerOptions : public RefCountedBase { ShowCheckerHelpAlpha(false), ShowCheckerHelpDeveloper(false), ShowCheckerOptionList(false), ShowCheckerOptionAlphaList(false), ShowCheckerOptionDeveloperList(false), ShowEnabledCheckerList(false), -ShowConfigOptionsList(false), AnalyzeAll(false), +ShowConfigOptionsList(false), +ShouldEmitErrorsOnInvalidConfigValue(false), AnalyzeAll(false), AnalyzerDisplayProgress(false), eagerlyAssumeBinOpBifurcation(false), TrimGraph(false), visualizeExplodedGraphWithGraphViz(false), UnoptimizedCFG(false), PrintStats(false), NoRetryExhausted(false), ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
steakhal wrote: > @DonatNagyE The most straightforward issue that I see is that (if I > understand the code correctly) the intersected constraint (the value of the > variable `Constraint` at the end of > `handleEquivalentAlternativeSymOperands()`) is just discarded after checking > that it's not empty. It's plausible that this discarded information will be > re-calculated by this process in the follow-up range calculations and don't > have a concrete testcase where it's relevant that we "leave food on the > table" (I can try to find one if you'd find it helpful) but I think that it > may be useful to think about "preserving" this `Constraint` that we > calculated. (This is not a blocking issue: I can support merging the current > logic if you don't see a reasonably simple improvement in this area.) I tried adding this hunk at the end of the function, but the assertion never fired: ```c++ // If we learned something, save it. if (Constraint != OriginalConstraint) { assert(false); State = assign(EquivalenceClass::find(State, SymSym), Constraint); return static_cast(State); } ``` > There is also the choice that you limited the runtime to `|eqclass(X)| + > |eqclass(Y)|` by only checking the alternative constraints where one of the > two representatives is the original one. I think that here it's a good > decision to ensure a reasonable runtime by potentially discarding some > information, but I feel that it'd be good to document this decision with a > testcase. For example, I think > > ``` > void gh_62215_left_and_right(int x, int y, int z, int w) { > if (x != y) return; // x == y > if (z != w) return; // z == w > if (z <= x) return; // z > x > if (w >= y) return; // w < y > } > ``` Fair point. Actually, originally I wanted a full cross-product of the alternatives, which I refined to a half-cross-product by following Gábor's comments. We shall come back here and do something more refined if there is a practical need. I added a FIXME comment explaining our options, along with the test case you proposed. Thanks. > Finally there is the "why do we need this trickery at all?" question that was > also raised by @Xazax-hun. I agree with him that on a longer term, it'd be > good to "strengthen" the equality classes and ensure that we assign the same > RangeSet to each member of an EQClass (or, in other words, we assign the > RangeSet to an EQClass instead of an individual symbol). This would > potentially improve the runtime (although I'm not sure that performance is > especially relevant here) and handle the left-and-right testcase adequately. I think we haven't discussed yet the approach of applying the constraint to every eqclass member. That would feel like defeating the purpose of eqclasses at all. To me, our best option so far is to keep the representative symbols alive and canonicalize (replace) all sub-symbols in constraints with representatives whenever we merge eqclasses or add constraints. Thanks for the reviews folks! https://github.com/llvm/llvm-project/pull/71284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/71284 >From 92ece501b340c3a2a52b5a4614ddb70bb3e35c93 Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Sat, 4 Nov 2023 13:44:28 +0100 Subject: [PATCH 1/3] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions The idea is that if we see a `X RELOP Y` being constrained to a RangeSet `S`, then check the eqclasses of X and Y respectively and for `X' RELOP Y'` SymSymExprs and try to infer their ranges. If there is no contradiction with any of the equivalent alternatives, then intersecting all these RangeSets should never be empty - aka. there should be a value satisfying the constraints we have. It costs around `|eqclass(X)| + |eqclass(y)|`. The approach has its limitations, as demonstrated by `gh_62215_contradicting_nested_right_equivalent`, where we would need to apply the same logic, but on a sub-expression of a direct operand. Before the patch, line 90, 100, and 112 would be reachable; and become unreachable after this. Line 127 will remain still reachable, but keep in mind that when cross-checking with Z3 (aka. Z3 refutation), then all 4 reports would be eliminated. The idea comes from https://github.com/llvm/llvm-project/issues/62215#issuecomment-1792878474 Fixes #62215 --- .../Core/RangeConstraintManager.cpp | 53 +++ clang/test/Analysis/constraint-assignor.c | 48 + 2 files changed, 101 insertions(+) diff --git a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp index 5de99384449a4c8..d631369e895d3a9 100644 --- a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp +++ b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp @@ -2067,6 +2067,12 @@ class ConstraintAssignor : public ConstraintAssignorBase { return Assignor.assign(CoS, NewConstraint); } + /// Check if using an equivalent operand alternative would lead to + /// contradiction. + /// If a contradiction is witnessed, returns false; otherwise returns true. + bool handleEquivalentAlternativeSymOperands(const SymSymExpr *SymSym, + RangeSet Constraint); + /// Handle expressions like: a % b != 0. template bool handleRemainderOp(const SymT *Sym, RangeSet Constraint) { @@ -2218,11 +2224,58 @@ bool ConstraintAssignor::assignSymExprToConst(const SymExpr *Sym, return true; } +bool ConstraintAssignor::handleEquivalentAlternativeSymOperands( +const SymSymExpr *SymSym, RangeSet Constraint) { + SymbolRef LHS = SymSym->getLHS(); + SymbolRef RHS = SymSym->getRHS(); + EquivalenceClass LHSClass = EquivalenceClass::find(State, LHS); + EquivalenceClass RHSClass = EquivalenceClass::find(State, RHS); + SymbolSet SymbolsEqWithLHS = LHSClass.getClassMembers(State); + SymbolSet SymbolsEqWithRHS = RHSClass.getClassMembers(State); + llvm::SmallVector AlternativeSymSyms; + + // Gather left alternatives. + for (SymbolRef AlternativeLHS : SymbolsEqWithLHS) { +if (AlternativeLHS == LHS) + continue; +AlternativeSymSyms.emplace_back(AlternativeLHS, SymSym->getOpcode(), RHS, +SymSym->getType()); + } + + // Gather right alternatives. + for (SymbolRef AlternativeRHS : SymbolsEqWithRHS) { +if (AlternativeRHS == RHS) + continue; +AlternativeSymSyms.emplace_back(LHS, SymSym->getOpcode(), AlternativeRHS, +SymSym->getType()); + } + + // Crosscheck the inferred ranges. + for (SymSymExpr AltSymSym : AlternativeSymSyms) { +RangeSet AltSymSymConstrant = +SymbolicRangeInferrer::inferRange(RangeFactory, State, &AltSymSym); +Constraint = intersect(RangeFactory, Constraint, AltSymSymConstrant); + +// Check if we witnessed a contradiction with the equivalent alternative +// operand. +if (Constraint.isEmpty()) { + State = nullptr; + return false; +} + } + return true; +} + bool ConstraintAssignor::assignSymSymExprToRangeSet(const SymSymExpr *Sym, RangeSet Constraint) { if (!handleRemainderOp(Sym, Constraint)) return false; + if (const auto *SymSym = dyn_cast(Sym); + SymSym && !handleEquivalentAlternativeSymOperands(SymSym, Constraint)) { +return false; + } + std::optional ConstraintAsBool = interpreteAsBool(Constraint); if (!ConstraintAsBool) diff --git a/clang/test/Analysis/constraint-assignor.c b/clang/test/Analysis/constraint-assignor.c index 8210e576c98bd21..d5078b406e12601 100644 --- a/clang/test/Analysis/constraint-assignor.c +++ b/clang/test/Analysis/constraint-assignor.c @@ -82,3 +82,51 @@ void remainder_with_adjustment_of_composit_lhs(int x, int y) { clang_analyzer_eval(x + y != -1);// expected-warning{{TRUE}} (void)(x * y); // keep the constraints alive. } + +void gh_62215(int x, int y, int z) { + if (x != y) return; // x ==
[clang] [analyzer] Trust base to derived casts for dynamic types (PR #69057)
@@ -492,11 +492,13 @@ void check_required_cast() { void check_cast_behavior(OSObject *obj) { OSArray *arr1 = OSDynamicCast(OSArray, obj); - clang_analyzer_eval(arr1 == obj); // expected-warning{{TRUE}} -// expected-note@-1{{TRUE}} -// expected-note@-2{{Assuming 'arr1' is not equal to 'obj'}} -// expected-warning@-3{{FALSE}} -// expected-note@-4 {{FALSE}} + clang_analyzer_eval(arr1 == obj); // #check_cast_behavior_1 + // expected-warning@#check_cast_behavior_1 {{TRUE}} + // expected-note@#check_cast_behavior_1{{TRUE}} + // expected-note@#check_cast_behavior_1{{Assuming 'arr1' is equal to 'obj'}} steakhal wrote: I've looked at the exploded-graph before and after the patch, and the values and the branches are all the same. There is only one difference: in the new graph, after the `(OSArray *)` cast in init-expr of the decl, we have the type info bound: `Dynamic Types`: ``` SymRegion{reg_$0} -> OSArray (or a sub-class) ``` Later, when we load for the analyzer_eval, we can see that on the branch where the `safeMetaCast` succeeds, `arr1` and `obj` refers to the same lvalue, thus already proven to be equal.  BTW if I change the test into this, it would pass on both the old and new versions: ```c++ void check_cast_behavior(OSObject *obj) { OSArray *arr1 = OSDynamicCast(OSArray, obj); if (!arr1) return; // expected-note@-1 {{'arr1' is non-null}} expected-note@-1 {{Taking false branch}} // expected-note@-2 {{'arr1' is non-null}} expected-note@-2 {{Taking false branch}} clang_analyzer_eval(arr1 == obj); // expected-warning{{TRUE}} expected-note{{TRUE}} OSArray *arr2 = OSRequiredCast(OSArray, obj); clang_analyzer_eval(arr2 == obj); // expected-warning{{TRUE}} expected-note{{TRUE}} } ``` I think this test would describe the behavior for the "cast-succeeds" branch, aka. the branch where `arr1` is non-null. Consequently, I believe this PR would not regress this. https://github.com/llvm/llvm-project/pull/69057 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve StdLibraryFunctionsChecker 'readlink' modeling. (PR #71373)
https://github.com/steakhal approved this pull request. LGTM, but please wait for @DonatNagyE to have a look. BTW I noticed that the note messages are not tested, but I'm okay to not test that I think. https://github.com/llvm/llvm-project/pull/71373 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][Analyzer][NFC] Simplify preDefault/preFseek/preFreadFwrite of StreamChecker (PR #71394)
https://github.com/steakhal requested changes to this pull request. I find this alternative less idiomatic (dissimilar to how we implement checks in other checkers), thus I find this less readable. I'm okay with that amount of redundancy as it was present. https://github.com/llvm/llvm-project/pull/71394 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
steakhal wrote: > > So, if I understand you correctly, at the 3rd if statement, we should > > canonicalize the symbol we are constraining by walking every sub-symbol and > > substituting it with its equivalent counterpart (if any), by basically with > > its eqclass' representative symbol. > > I am not 100% sure if I'd go that far (doing canonicalization). Symbols carry > a lot of information, like provenance. Bug reporters, or maybe even some > custom checker state might rely on this. I am afraid that replacing/rewriting > like that might have unexpected consequences, nothing we cannot solve, but I > am not sure whether we want to solve them. What I'd suggest is more like > always adding all the constraints to the representative, and lazily > propagating those constraints to the other members of the equivalence > classes, only when we mention them in a constraint. > > This might also be challenging for bug reporters to explain in some > scenarios, but at least we preserve the provenance information. > > @steakhal It wasn't actually what I had in mind. I wanted to assign constraints to canonicalized symexprs instead of to the symexpr the API gets from the assume call. And for lookups, we would again do the same thing, canonicalize and only then do the lookup - or let the lookup (rather infer do the canonicalization on the fly). So to be explicit, no other component would know what canonicalization happens within the range-based solver. https://github.com/llvm/llvm-project/pull/71284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Let the checkers query upper and lower bounds on symbols (PR #74141)
steakhal wrote: My take is that the z3-based solver is crashing all over the place. So its not just slower. We anyways don't have CI checks for it. Given all these, I'd rather not put more burden to the issue tracker regarding this. I'd consider it if these issues wouldn't be present though, but we are really far from that. https://github.com/llvm/llvm-project/pull/74141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Move alpha checker EnumCastOutOfRange to optin (PR #67157)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: steakhal wrote: > (Force pushed the branch because I wanted to rebase it onto a recent main and > fix the merge conflicts. Is there a better workflow than this?) I think git also offers conflict resolution for `git merge origin/main`. It should put the resolution into the merge commit. That being said, I usually just do the rebase/resolve approach and force-push; only after all conversations are resolved and I don't expect anyone reviewing at the time. But yea, a merge commit should be the way for GitHub AFAIK. https://github.com/llvm/llvm-project/pull/67157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Analyzer][NFC] Remove redundant function call (PR #75076)
https://github.com/steakhal commented: I would agree with @isuckatcs, and I'd be weak against this PR. Right now I don't see the benefit of asserting this. Consider downstream users that might use this reporting system and have their own trackers. (We don't at Sonar, but pretend), then they would need to remove one more unjust assert. Speaking of assert, [here](https://discourse.llvm.org/t/rfc-error-handling-in-release-builds-aka-can-i-use-lldbassert-or-not/74738/10) is my statement about them for the whole llvm prohect. We are on the cold path anyways here. It should not matter. https://github.com/llvm/llvm-project/pull/75076 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Trust base to derived casts for dynamic types (PR #69057)
@@ -492,11 +492,13 @@ void check_required_cast() { void check_cast_behavior(OSObject *obj) { OSArray *arr1 = OSDynamicCast(OSArray, obj); - clang_analyzer_eval(arr1 == obj); // expected-warning{{TRUE}} -// expected-note@-1{{TRUE}} -// expected-note@-2{{Assuming 'arr1' is not equal to 'obj'}} -// expected-warning@-3{{FALSE}} -// expected-note@-4 {{FALSE}} + clang_analyzer_eval(arr1 == obj); // #check_cast_behavior_1 + // expected-warning@#check_cast_behavior_1 {{TRUE}} + // expected-note@#check_cast_behavior_1{{TRUE}} + // expected-note@#check_cast_behavior_1{{Assuming 'arr1' is equal to 'obj'}} steakhal wrote: Ping @haoNoQ https://github.com/llvm/llvm-project/pull/69057 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Add std::variant checker (PR #66481)
=?utf-8?q?G=C3=A1bor?= Spaits,Gabor Spaits ,Gabor Spaits ,Gabor Spaits Message-ID: In-Reply-To: steakhal wrote: Yes, why not.. https://github.com/llvm/llvm-project/pull/66481 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][NFC] Add more tests of 'StreamChecker' about 'tmpfile' (PR #70540)
https://github.com/steakhal approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Simplify SVal for simple NonLoc->Loc casts (PR #66463)
steakhal wrote: This is a brief summary of my recent investigation, no direct action is required. I had a quick look at the issue differences between clang-17 and llvm/main as a preparation for the clang-18 release in early January and noticed that because of this patch, we have some unexpected improvements. FYI this FP was eliminated by Z3 refutation anyway if you enabled that, but now even the engine can avoid exploring that dead-code. --- Now that we simplify more often (see the patch), we can exclude more execution paths. Here is an [example](https://godbolt.org/z/x35MT33xT): ```C++ int fixedFP(int x) { int* z = 0; if ((x & 1) && ((x & 1) ^ 1)) { return *z; // Nullptr deref FP? Now it's eliminated. } return 0; } ``` Notice, that the FP is only eliminated in C++ mode, and still present in C mode for some reason. My theory is that we use `int` as a `bool` type, thus simplification can't exploit that a boolean expression can only be `1` or `0`; and fails to simplify the expression. The reason for having a wider impact than originally anticipated was that the function we modified `handleLValueBitCast()` suggested that it should be invoked only for lvalue bitcasts, however, actually, it gets invoked from a few more places including ExprEngineC.cpp ExprEngine::VisitCast: ```C++ case CK_IntegralToBoolean: case CK_IntegralToFloating: case CK_FloatingToIntegral: case CK_FloatingToBoolean: case CK_FloatingCast: case CK_FloatingRealToComplex: case CK_FloatingComplexToReal: case CK_FloatingComplexToBoolean: case CK_FloatingComplexCast: case CK_FloatingComplexToIntegralComplex: case CK_IntegralRealToComplex: case CK_IntegralComplexToReal: case CK_IntegralComplexToBoolean: case CK_IntegralComplexCast: case CK_IntegralComplexToFloatingComplex: case CK_CPointerToObjCPointerCast: case CK_BlockPointerToObjCPointerCast: case CK_AnyPointerToBlockPointerCast: case CK_ObjCObjectLValueCast: case CK_ZeroToOCLOpaqueType: case CK_IntToOCLSampler: case CK_LValueBitCast: case CK_FloatingToFixedPoint: case CK_FixedPointToFloating: case CK_FixedPointCast: case CK_FixedPointToBoolean: case CK_FixedPointToIntegral: case CK_IntegralToFixedPoint: { state = handleLValueBitCast(state, Ex, LCtx, T, ExTy, CastE, Bldr, Pred); continue; } ``` And this could have caused this "improvement". --- The exploded node of the `PostStmt` `((x & 1) ^ 1)` was this prior the patch:  And now it looks like this (notice that the expression value is now simplified to `0 U1b`):  Here is how a child node looks like for C now (notice that the constraint is `((reg_$0) & 1) ^ 1: { [-2147483648, -1], [1, 2147483647] }`. This is the same state we had before this patch for C and C++ as well.  https://github.com/llvm/llvm-project/pull/66463 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
https://github.com/steakhal requested changes to this pull request. I must admit that I didn't look at the issue too much, but this patch speaks for itself. Clean, to the point, and meets our conventions. Kudos. I only have minor remarks. And be sure to mention `Fixes #70464` in the PR/commit message to auto-close the issue. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -1222,6 +1222,15 @@ void ExprEngine::ProcessInitializer(const CFGInitializer CFGInit, PostInitializer PP(BMI, FieldLoc.getAsRegion(), stackFrame); evalBind(Tmp, Init, Pred, FieldLoc, InitVal, /*isInit=*/true, &PP); } + } else if (BMI->isBaseInitializer() && isa(Init)) { +// When the base class is initialized with an initialization list, there +// will not be a CXXConstructExpr to initialize the base region. Hence, we +// need to make the bind for it. +StoreManager &StoreMgr = State->getStateManager().getStoreManager(); +SVal BaseLoc = StoreMgr.evalDerivedToBase( steakhal wrote: ```suggestion SVal BaseLoc = getStoreManager()->evalDerivedToBase( ``` https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -0,0 +1,69 @@ +// Refer issue 70464 for more details. +// +// When the base class does not have a declared constructor, the base +// initializer in the constructor of the derived class should use the given +// initializer list to finish the initialization of the base class. +// +// Also testing base constructor and delegate constructor to make sure this +// change will not break these two cases when a CXXConstructExpr is available. + +// RUN: %clang_analyze_cc1 %s -verify -analyzer-checker=core,debug.ExprInspection steakhal wrote: I'd put this at the top, as that's the usual place for these RUN lines. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -0,0 +1,69 @@ +// Refer issue 70464 for more details. +// +// When the base class does not have a declared constructor, the base +// initializer in the constructor of the derived class should use the given +// initializer list to finish the initialization of the base class. +// +// Also testing base constructor and delegate constructor to make sure this +// change will not break these two cases when a CXXConstructExpr is available. + +// RUN: %clang_analyze_cc1 %s -verify -analyzer-checker=core,debug.ExprInspection + +void clang_analyzer_dump(int); + +namespace init_list { + +struct foo { + int foox; +}; + +class bar : public foo { +public: + bar() : foo{42} { +// The dereference to this->foox below should be initialized properly. +clang_analyzer_dump(this->foox); // expected-warning{{42 S32b}} steakhal wrote: ```suggestion clang_analyzer_dump(this->foox); // expected-warning{{42 S32b}} clang_analyzer_dump(foox); // expected-warning{{42 S32b}} ``` Let's just have both here to be sure. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -1222,6 +1222,15 @@ void ExprEngine::ProcessInitializer(const CFGInitializer CFGInit, PostInitializer PP(BMI, FieldLoc.getAsRegion(), stackFrame); evalBind(Tmp, Init, Pred, FieldLoc, InitVal, /*isInit=*/true, &PP); } + } else if (BMI->isBaseInitializer() && isa(Init)) { +// When the base class is initialized with an initialization list, there +// will not be a CXXConstructExpr to initialize the base region. Hence, we +// need to make the bind for it. +StoreManager &StoreMgr = State->getStateManager().getStoreManager(); +SVal BaseLoc = StoreMgr.evalDerivedToBase( +thisVal, QualType(BMI->getBaseClass(), 0), BMI->isBaseVirtual()); +SVal InitVal = State->getSVal(Init, stackFrame); +evalBind(Tmp, Init, Pred, BaseLoc, InitVal, true); steakhal wrote: BTW I can see that in the neighborhood, we also pass a `PostInitializer PP` as the last argument of `evalBind`. Why don't you do the same here? To me, we should deal with the same "event"/program point here as everywhere in this function. Maybe hoist `PostInitializer PP(BMI, FieldLoc.getAsRegion(), stackFrame);` and use `PP` everywhere. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -0,0 +1,69 @@ +// Refer issue 70464 for more details. +// +// When the base class does not have a declared constructor, the base +// initializer in the constructor of the derived class should use the given +// initializer list to finish the initialization of the base class. +// +// Also testing base constructor and delegate constructor to make sure this +// change will not break these two cases when a CXXConstructExpr is available. + +// RUN: %clang_analyze_cc1 %s -verify -analyzer-checker=core,debug.ExprInspection + +void clang_analyzer_dump(int); + +namespace init_list { + +struct foo { + int foox; +}; + +class bar : public foo { +public: + bar() : foo{42} { +// The dereference to this->foox below should be initialized properly. +clang_analyzer_dump(this->foox); // expected-warning{{42 S32b}} + } +}; + +void entry() { bar test; } + +} // namespace init_list + +namespace base_ctor_call { + +void clang_analyzer_dump(int); + +struct foo { + int foox; + foo(int x) : foox(x) {} +}; + +class bar : public foo { +public: + bar() : foo{42} { +clang_analyzer_dump(this->foox); // expected-warning{{42 S32b}} + } +}; + +void entry() { bar test; } + +} // namespace base_ctor_call + +namespace delegate_ctor_call { + +void clang_analyzer_dump(int); + +struct foo { + int foox; +}; + +struct bar : foo { steakhal wrote: For inheritance-related stuff, I'd advocate for using identifiers that carry some sort of meaning, possibly denoting their part of the relationship, such as `Derived` and `Base`. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -0,0 +1,69 @@ +// Refer issue 70464 for more details. steakhal wrote: Feel free to put a full link if you want. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -0,0 +1,69 @@ +// Refer issue 70464 for more details. +// +// When the base class does not have a declared constructor, the base +// initializer in the constructor of the derived class should use the given +// initializer list to finish the initialization of the base class. +// +// Also testing base constructor and delegate constructor to make sure this +// change will not break these two cases when a CXXConstructExpr is available. + +// RUN: %clang_analyze_cc1 %s -verify -analyzer-checker=core,debug.ExprInspection + +void clang_analyzer_dump(int); + +namespace init_list { + +struct foo { + int foox; +}; + +class bar : public foo { +public: + bar() : foo{42} { +// The dereference to this->foox below should be initialized properly. +clang_analyzer_dump(this->foox); // expected-warning{{42 S32b}} + } +}; + +void entry() { bar test; } + +} // namespace init_list + +namespace base_ctor_call { + +void clang_analyzer_dump(int); steakhal wrote: Only define these in the global namespace, please. That's the common pattern so far in the tests. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Update CallDescription of 'tmpfile' & 'fopen' in StreamChecker (PR #70540)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Update CallDescription of 'tmpfile' & 'fopen' in StreamChecker (PR #70540)
@@ -0,0 +1,16 @@ +// RUN: %clang_analyze_cc1 -fno-builtin -analyzer-checker=core,alpha.unix.Stream -verify %s +// expected-no-diagnostics + +typedef struct _FILE FILE; + +// These functions are not standard C library functions. +FILE *tmpfile(const char *restrict path); +FILE *fopen(const char *restrict path); steakhal wrote: ```suggestion // These functions are not standard C library functions. FILE *tmpfile(const char *restrict path); // Real 'tmpfile' should have exactly 0 formal parameters. FILE *fopen(const char *restrict path); // Real 'fopen' should have exactly 2 formal parameters. ``` https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Update CallDescription of 'tmpfile' & 'fopen' in StreamChecker (PR #70540)
https://github.com/steakhal requested changes to this pull request. Still looks good to me. I recommended a couple comments here and there to clarify the intent of the test, and to raise awareness. I'd suggest to reword the PR title (and the commit title ofc) to something like `[analyzer] Restrict fopen and tmpfile modeling to POSIX versions in StreamChecker` I'd also suggest renaming the `clang/test/Analysis/stream-wild-function.c` file to something more descriptive like `clang/test/Analysis/non-posix-stream-functions.c` https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Update CallDescription of 'tmpfile' & 'fopen' in StreamChecker (PR #70540)
@@ -0,0 +1,16 @@ +// RUN: %clang_analyze_cc1 -fno-builtin -analyzer-checker=core,alpha.unix.Stream -verify %s +// expected-no-diagnostics + +typedef struct _FILE FILE; + +// These functions are not standard C library functions. +FILE *tmpfile(const char *restrict path); +FILE *fopen(const char *restrict path); + +void test_fopen(void) { + FILE *fp = fopen("file"); +} + +void test_tmpfile(void) { + FILE *fp = tmpfile("file"); +} steakhal wrote: ```suggestion void non_posix_fopen(void) { FILE *fp = fopen("file"); // no-leak: this isn't the standard POSIX fopen, we don't the semantics of this call. } void non_posix_tmpfile(void) { FILE *fp = tmpfile("file"); // no-leak: this isn't the standard POSIX tmpfile, we don't the semantics of this call. } ``` https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + std::string RegName = Region->getDescriptiveName(); + if (!RegName.empty()) +return RegName; steakhal wrote: ```suggestion if (std::string RegName = Region->getDescriptiveName(); !RegName.empty()) return RegName; ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + std::string RegName = Region->getDescriptiveName(); + if (!RegName.empty()) +return RegName; + + // Field regions only have descriptive names when their parent has a + // descriptive name; so we provide a fallback representation for them: + if (const auto *FR = Region->getAs()) { +StringRef Name = FR->getDecl()->getName(); +if (!Name.empty()) + return formatv("the field '{0}'", Name); +return "the unnamed field"; + } + + if (isa(Region)) +return "the memory returned by 'alloca'"; + + if (isa(Region) && + isa(Region->getMemorySpace())) +return "the heap area"; + + if (isa(Region)) +return "the string literal"; + + return "the region"; +} + +static std::optional getConcreteValue(NonLoc SV) { + if (auto ConcreteVal = SV.getAs()) { +const llvm::APSInt &IntVal = ConcreteVal->getValue(); +return IntVal.tryExtValue(); + } + return std::nullopt; +} + +static const char *ShortMsgTemplates[] = { +"Out of bound access to memory preceding {0}", +"Out of bound access to memory after the end of {0}", +"Potential out of bound access to {0} with tainted offset"}; + +static std::string getShortMsg(OOB_Kind Kind, std::string RegName) { + return formatv(ShortMsgTemplates[Kind], RegName); steakhal wrote: You could have the `ShortMsgTemplates` defined within this function, closer to the place being used. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal requested changes to this pull request. Looks good. I only have minor remarks. Consider renaming the PR `Improve reports` -> `Improve messages`, or `diagnostics`, to highlight that the "messages" aspect is improved, not e.g. improving the FP/TP rate or something like that. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + std::string RegName = Region->getDescriptiveName(); + if (!RegName.empty()) +return RegName; + + // Field regions only have descriptive names when their parent has a + // descriptive name; so we provide a fallback representation for them: + if (const auto *FR = Region->getAs()) { +StringRef Name = FR->getDecl()->getName(); +if (!Name.empty()) + return formatv("the field '{0}'", Name); +return "the unnamed field"; steakhal wrote: ```suggestion if (StringRef Name = FR->getDecl()->getName(); !Name.empty()) return formatv("the field '{0}'", Name); return "the unnamed field"; ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + std::string RegName = Region->getDescriptiveName(); + if (!RegName.empty()) +return RegName; + + // Field regions only have descriptive names when their parent has a + // descriptive name; so we provide a fallback representation for them: + if (const auto *FR = Region->getAs()) { +StringRef Name = FR->getDecl()->getName(); +if (!Name.empty()) + return formatv("the field '{0}'", Name); +return "the unnamed field"; + } + + if (isa(Region)) +return "the memory returned by 'alloca'"; + + if (isa(Region) && + isa(Region->getMemorySpace())) +return "the heap area"; + + if (isa(Region)) +return "the string literal"; + + return "the region"; +} + +static std::optional getConcreteValue(NonLoc SV) { + if (auto ConcreteVal = SV.getAs()) { +const llvm::APSInt &IntVal = ConcreteVal->getValue(); +return IntVal.tryExtValue(); steakhal wrote: ```suggestion return ConcreteVal->getValue().tryExtValue(); ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve reports from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + std::string RegName = Region->getDescriptiveName(); + if (!RegName.empty()) +return RegName; + + // Field regions only have descriptive names when their parent has a + // descriptive name; so we provide a fallback representation for them: + if (const auto *FR = Region->getAs()) { +StringRef Name = FR->getDecl()->getName(); +if (!Name.empty()) + return formatv("the field '{0}'", Name); +return "the unnamed field"; + } + + if (isa(Region)) +return "the memory returned by 'alloca'"; + + if (isa(Region) && + isa(Region->getMemorySpace())) +return "the heap area"; + + if (isa(Region)) +return "the string literal"; + + return "the region"; +} + +static std::optional getConcreteValue(NonLoc SV) { + if (auto ConcreteVal = SV.getAs()) { +const llvm::APSInt &IntVal = ConcreteVal->getValue(); +return IntVal.tryExtValue(); + } + return std::nullopt; +} + +static const char *ShortMsgTemplates[] = { +"Out of bound access to memory preceding {0}", +"Out of bound access to memory after the end of {0}", +"Potential out of bound access to {0} with tainted offset"}; + +static std::string getShortMsg(OOB_Kind Kind, std::string RegName) { + return formatv(ShortMsgTemplates[Kind], RegName); steakhal wrote: I didnt mean local statics, simple locals to global cstring constant literals should be fine. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal (PR #70837)
https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/70837 Workaround the case when the `this` pointer is actually a `NonLoc`, by returning `Unknown` instead. The solution isn't ideal, as `this` should be really a `Loc`, but due to how casts work, I feel this is our easiest and best option. I've considered using `SVB.evalCast(ThisVal, Base->getType(), QualType())`, but it doesn't work as `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates casts that happen from NonLoc to NonLocs. When I tried to actually implement that case, I figured: 1) Create a SymbolicRegion from that nonloc::SymbolVal; but SymbolRegion ctor expects a pointer type for the symbol. 2) Okay, just have a SymbolCast, getting us the pointer type; but SymbolRegion expects SymbolData symbols, not generic SymExprs, as stated: > // Because pointer arithmetic is represented by ElementRegion layers, > // the base symbol here should not contain any arithmetic. 3) We can't use ElementRegions to perform this cast because to have an ElementRegion, you already have to have a SubRegion that you want to cast, but the point is that we don't have that. At this point, I gave up, and just returned `Unknown` xD IMO this is still better than having a crash. Fixes #69922 >From a7f64815f4986fad597b9cb2d1acce2de9ac20bf Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Mon, 23 Oct 2023 18:10:29 +0200 Subject: [PATCH] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal Workaround the case when the `this` pointer is actually a `NonLoc`, by returning `Unknown` instead. The solution isn't ideal, as `this` should be really a `Loc`, but due to how casts work, I feel this is our easiest and best option. I've considered using `SVB.evalCast(ThisVal, Base->getType(), QualType())`, but it doesn't work as `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates casts that happen from NonLoc to NonLocs. When I tried to actually implement that case, I figured: 1) Create a SymbolicRegion from that nonloc::SymbolVal; but SymbolRegion ctor expects a pointer type for the symbol. 2) Okay, just have a SymbolCast, getting us the pointer type; but SymbolRegion expects SymbolData symbols, not generic SymExprs, as stated: > // Because pointer arithmetic is represented by ElementRegion layers, > // the base symbol here should not contain any arithmetic. 3) We can't use ElementRegions to perform this cast because to have an ElementRegion, you already have to have a SubRegion that you want to cast, but the point is that we don't have that. At this point, I gave up, and just returned `Unknown` xD IMO this is still better than having a crash. Fixes #69922 --- clang/lib/StaticAnalyzer/Core/CallEvent.cpp | 5 ++--- clang/test/Analysis/builtin_bitcast.cpp | 21 + 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp index ad5bb66c4fff3c8..20bc64dc2631cec 100644 --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -715,10 +715,9 @@ void CXXInstanceCall::getExtraInvalidatedValues( SVal CXXInstanceCall::getCXXThisVal() const { const Expr *Base = getCXXThisExpr(); // FIXME: This doesn't handle an overloaded ->* operator. - if (!Base) + SVal ThisVal = Base ? getSVal(Base) : UnknownVal(); + if (isa(ThisVal)) return UnknownVal(); - - SVal ThisVal = getSVal(Base); assert(ThisVal.isUnknownOrUndef() || isa(ThisVal)); return ThisVal; } diff --git a/clang/test/Analysis/builtin_bitcast.cpp b/clang/test/Analysis/builtin_bitcast.cpp index 396e7caa45f6acd..13475694d287a22 100644 --- a/clang/test/Analysis/builtin_bitcast.cpp +++ b/clang/test/Analysis/builtin_bitcast.cpp @@ -30,3 +30,24 @@ void test(int i) { clang_analyzer_dump(g4); // expected-warning@-1 {{&i [as 64 bit integer]}} } + +struct A { + int n; + void set(int x) { +n = x; + } +}; +using ptr_size = decltype(sizeof(void *)); +void gh_69922(ptr_size p) { + // expected-warning-re@+1 {{(reg_${{[0-9]+}}) & 1U}} + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)); + + __builtin_bit_cast(A*, p & 1)->set(2); // no-crash + // However, since the `this` pointer is expected to be a Loc, but we have + // NonLoc there, we simply give up and resolve it as `Unknown`. + // Then, inside the `set()` member function call we can't evaluate the + // store to the member variable `n`. + + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)->n); // Ideally, this should print "2". + // expected-warning-re@-1 {{(reg_${{[0-9]+}}) & 1U}} +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and // non-symbolic regions (e.g. a field subregion of a symbolic region) in // unknown space. -auto [state_precedesLowerBound, state_withinLowerBound] = -compareValueToThreshold(state, ByteOffset, -svalBuilder.makeZeroArrayIndex(), svalBuilder); +auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold( +State, ByteOffset, SVB.makeZeroArrayIndex(), SVB); -if (state_precedesLowerBound && !state_withinLowerBound) { +if (PrecedesLowerBound && !WithinLowerBound) { // We know that the index definitely precedes the lower bound. - reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes); + std::string RegName = getRegionName(Reg); + std::string Msg = getPrecedesMsg(RegName, ByteOffset); + reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg); steakhal wrote: ```suggestion reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, std::move(RegName), std::move(Msg)); ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal approved this pull request. I only had a couple `std::move`s missing, an FYI comment, and one question about the diagnostics in the tests. Even in the current state, I think it's a good baby step in improving the status quo. Thank you for pushing this forward! --- One remark about the review workflow, I'd prefer if the conversation starter would resolve the conversations. Let me explain why: Given the amount of reviewers for CSA, I'd suggest explicitly supporting reviewer experience. Speaking of that, personally I find the following workflow to suite myself the best: - All comments needs a reaction, either by explict commenting on it or by putting there a thums-up emoji or something. This helps the author follow which comments were addressed downstream and is pending for submission. An explicit comment is expected for challenging a comment. I prefer emojies, becase they don't send an email to everyone subscribe - unlike for individual comments; and usually I also batch individual commits into a "self-review" as patch author when I reply for the same reason. - All comments needs to be reacted before requestion the next round of review from the person who added those comments. - The comment should be only marked resolved by the person who raised the concern. This ensures that the comment was adequately solved, thus the issue is no longer relevant. - To merge a PR, the PR should have no open conversations. Note that I'm not raising these because I feel we have to adjust, but rather because I find the reviews experience on GitHub really poor in general. Comments disappear, comments don't show up for the patch author, only if they look at the right pane like "Conversations" and "Files changed" - yes, not all comments are present on each -.- And I've been bitten by it as a patch author many many times, and expectations around "reacting on comments" helped to mitigate this pain to some degree. Note that this is only my personal preference, and if I'm not mistaken, there is no legitimized consensus around the project. At least, last time I checked the following relevant thread around GitHub review workflows: https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178/ https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -174,9 +176,116 @@ compareValueToThreshold(ProgramStateRef State, NonLoc Value, NonLoc Threshold, return {nullptr, nullptr}; } -void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, -const Stmt* LoadS, -CheckerContext &checkerContext) const { +static std::string getRegionName(const SubRegion *Region) { + if (std::string RegName = Region->getDescriptiveName(); !RegName.empty()) +return RegName; + + // Field regions only have descriptive names when their parent has a + // descriptive name; so we provide a fallback representation for them: + if (const auto *FR = Region->getAs()) { +if (StringRef Name = FR->getDecl()->getName(); !Name.empty()) + return formatv("the field '{0}'", Name); +return "the unnamed field"; + } + + if (isa(Region)) +return "the memory returned by 'alloca'"; + + if (isa(Region) && + isa(Region->getMemorySpace())) +return "the heap area"; + + if (isa(Region)) +return "the string literal"; + + return "the region"; +} + +static std::optional getConcreteValue(NonLoc SV) { + if (auto ConcreteVal = SV.getAs()) { +return ConcreteVal->getValue().tryExtValue(); + } + return std::nullopt; +} + +static std::string getShortMsg(OOB_Kind Kind, std::string RegName) { + static const char *ShortMsgTemplates[] = { + "Out of bound access to memory preceding {0}", + "Out of bound access to memory after the end of {0}", + "Potential out of bound access to {0} with tainted offset"}; + + return formatv(ShortMsgTemplates[Kind], RegName); +} + +static std::string getPrecedesMsg(std::string RegName, NonLoc Offset) { + SmallString<128> Buf; + llvm::raw_svector_ostream Out(Buf); steakhal wrote: Have you considered using `std::string` and `llvm::raw_string_ostream`, given that you convert it to `std::string` anyways. BTW sorry for not spotting this at the previous round. This applies to other similar cases as well. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and // non-symbolic regions (e.g. a field subregion of a symbolic region) in // unknown space. -auto [state_precedesLowerBound, state_withinLowerBound] = -compareValueToThreshold(state, ByteOffset, -svalBuilder.makeZeroArrayIndex(), svalBuilder); +auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold( +State, ByteOffset, SVB.makeZeroArrayIndex(), SVB); -if (state_precedesLowerBound && !state_withinLowerBound) { +if (PrecedesLowerBound && !WithinLowerBound) { // We know that the index definitely precedes the lower bound. - reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes); + std::string RegName = getRegionName(Reg); + std::string Msg = getPrecedesMsg(RegName, ByteOffset); + reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg); return; } -if (state_withinLowerBound) - state = state_withinLowerBound; +if (WithinLowerBound) + State = WithinLowerBound; } // CHECK UPPER BOUND - DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder); + DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -auto [state_withinUpperBound, state_exceedsUpperBound] = -compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder); +auto [WithinUpperBound, ExceedsUpperBound] = +compareValueToThreshold(State, ByteOffset, *KnownSize, SVB); -if (state_exceedsUpperBound) { - if (!state_withinUpperBound) { +if (ExceedsUpperBound) { + if (!WithinUpperBound) { // We know that the index definitely exceeds the upper bound. -reportOOB(checkerContext, state_exceedsUpperBound, OOB_Exceeds); +std::string RegName = getRegionName(Reg); +std::string Msg = getExceedsMsg(C.getASTContext(), RegName, ByteOffset, +*KnownSize, Location); +reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, RegName, Msg); return; } - if (isTainted(state, ByteOffset)) { + if (isTainted(State, ByteOffset)) { // Both cases are possible, but the index is tainted, so report. -reportOOB(checkerContext, state_exceedsUpperBound, OOB_Taint, - ByteOffset); +std::string RegName = getRegionName(Reg); +std::string Msg = getTaintMsg(RegName); +reportOOB(C, ExceedsUpperBound, OOB_Taint, ByteOffset, RegName, Msg); steakhal wrote: ```suggestion reportOOB(C, ExceedsUpperBound, OOB_Taint, ByteOffset, std::move(RegName), std::move(Msg)); ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and // non-symbolic regions (e.g. a field subregion of a symbolic region) in // unknown space. -auto [state_precedesLowerBound, state_withinLowerBound] = -compareValueToThreshold(state, ByteOffset, -svalBuilder.makeZeroArrayIndex(), svalBuilder); +auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold( +State, ByteOffset, SVB.makeZeroArrayIndex(), SVB); -if (state_precedesLowerBound && !state_withinLowerBound) { +if (PrecedesLowerBound && !WithinLowerBound) { // We know that the index definitely precedes the lower bound. - reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes); + std::string RegName = getRegionName(Reg); + std::string Msg = getPrecedesMsg(RegName, ByteOffset); + reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg); return; } -if (state_withinLowerBound) - state = state_withinLowerBound; +if (WithinLowerBound) + State = WithinLowerBound; } // CHECK UPPER BOUND - DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder); + DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -auto [state_withinUpperBound, state_exceedsUpperBound] = -compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder); +auto [WithinUpperBound, ExceedsUpperBound] = +compareValueToThreshold(State, ByteOffset, *KnownSize, SVB); -if (state_exceedsUpperBound) { - if (!state_withinUpperBound) { +if (ExceedsUpperBound) { + if (!WithinUpperBound) { // We know that the index definitely exceeds the upper bound. -reportOOB(checkerContext, state_exceedsUpperBound, OOB_Exceeds); +std::string RegName = getRegionName(Reg); +std::string Msg = getExceedsMsg(C.getASTContext(), RegName, ByteOffset, +*KnownSize, Location); +reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, RegName, Msg); steakhal wrote: ```suggestion reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, std::move(RegName), std::move(Msg)); ``` https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -0,0 +1,149 @@ +// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-output=text\ +// RUN: -analyzer-checker=core,alpha.security.ArrayBoundV2,unix.Malloc,alpha.security.taint -verify %s + +int array[10]; + +void arrayUnderflow(void) { + array[-3] = 5; + // expected-warning@-1 {{Out of bound access to memory preceding 'array'}} + // expected-note@-2 {{Access of 'array' at negative byte offset -12}} +} + +int scanf(const char *restrict fmt, ...); + +void taintedIndex(void) { + int index; + scanf("%d", &index); + // expected-note@-1 {{Taint originated here}} + // expected-note@-2 {{Taint propagated to the 2nd argument}} + array[index] = 5; + // expected-warning@-1 {{Potential out of bound access to 'array' with tainted offset}} + // expected-note@-2 {{Access of 'array' with a tainted offset that may be too large}} +} + +void arrayOverflow(void) { + array[12] = 5; + // expected-warning@-1 {{Out of bound access to memory after the end of 'array'}} + // expected-note@-2 {{Access of 'array' at index 12, while it holds only 10 'int' elements}} +} + +int scalar; +int scalarOverflow(void) { + return (&scalar)[1]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'scalar'}} + // expected-note@-2 {{Access of 'scalar' at index 1, while it holds only a single 'int' element}} +} + +int oneElementArray[1]; +int oneElementArrayOverflow(void) { + return oneElementArray[1]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'oneElementArray'}} + // expected-note@-2 {{Access of 'oneElementArray' at index 1, while it holds only a single 'int' element}} +} + +short convertedArray(void) { + return ((short*)array)[47]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'array'}} + // expected-note@-2 {{Access of 'array' at index 47, while it holds only 20 'short' elements}} +} + +struct vec { + int len; + double elems[64]; +} v; + +double arrayInStruct(void) { + return v.elems[64]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'v.elems'}} + // expected-note@-2 {{Access of 'v.elems' at index 64, while it holds only 64 'double' elements}} +} + +double arrayInStructPtr(struct vec *pv) { + return pv->elems[64]; + // expected-warning@-1 {{Out of bound access to memory after the end of the field 'elems'}} + // expected-note@-2 {{Access of the field 'elems' at index 64, while it holds only 64 'double' elements}} +} + +struct two_bytes { + char lo, hi; +}; + +struct two_bytes convertedArray2(void) { + // We report this with byte offsets because the offset is not divisible by the element size. + struct two_bytes a = {0, 0}; + char *p = (char*)&a; + return *((struct two_bytes*)(p + 7)); + // expected-warning@-1 {{Out of bound access to memory after the end of 'a'}} + // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 bytes}} +} + +int intFromString(void) { + // We report this with byte offsets because the extent is not divisible by the element size. + return ((const int*)"this is a string of 33 characters")[20]; + // expected-warning@-1 {{Out of bound access to memory after the end of the string literal}} + // expected-note@-2 {{Access of the string literal at byte offset 80, while it holds only 34 bytes}} +} + +int intFromStringDivisible(void) { + // However, this is reported with indices/elements, because the extent happens to be a multiple of 4. + return ((const int*)"abc")[20]; + // expected-warning@-1 {{Out of bound access to memory after the end of the string literal}} + // expected-note@-2 {{Access of the string literal at index 20, while it holds only a single 'int' element}} +} steakhal wrote: I couldn't wrap my head around the end of the note: `while it holds only a single 'int' element` Could you elaborate on why is `int` mentioned here, and if so, why only a `single`? https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -22,23 +22,25 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h" #include "llvm/ADT/SmallString.h" +#include "llvm/Support/FormatVariadic.h" #include "llvm/Support/raw_ostream.h" #include using namespace clang; using namespace ento; using namespace taint; +using llvm::formatv; namespace { +enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint }; steakhal wrote: FYI you can use any identifiers here because these are only exposed to this single TU. The ugly prefixes are only applied to enums exposed in headers etc. where there is an unbounded chance of having collisions. No action is expected. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
@@ -1222,6 +1222,15 @@ void ExprEngine::ProcessInitializer(const CFGInitializer CFGInit, PostInitializer PP(BMI, FieldLoc.getAsRegion(), stackFrame); evalBind(Tmp, Init, Pred, FieldLoc, InitVal, /*isInit=*/true, &PP); } + } else if (BMI->isBaseInitializer() && isa(Init)) { +// When the base class is initialized with an initialization list, there +// will not be a CXXConstructExpr to initialize the base region. Hence, we +// need to make the bind for it. +StoreManager &StoreMgr = State->getStateManager().getStoreManager(); +SVal BaseLoc = StoreMgr.evalDerivedToBase( +thisVal, QualType(BMI->getBaseClass(), 0), BMI->isBaseVirtual()); +SVal InitVal = State->getSVal(Init, stackFrame); +evalBind(Tmp, Init, Pred, BaseLoc, InitVal, true); steakhal wrote: Well, me neither but I think it should be fine with the default. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
https://github.com/steakhal approved this pull request. Looks great! FYI when submitting patches to GH try not to force-push to help the UI for following lines having comments. Otherwise, they will be marked as "outdated" and become hard to dig up and relate to new line locations. This was easy this time, as the size of the PR was minimal (as it should be). After the PR is approved and all conversations resolved; the author can either manually force-push to squash the commits or use the "squash-merge" to let GH do it. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal (PR #70837)
https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/70837 >From a7f64815f4986fad597b9cb2d1acce2de9ac20bf Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Mon, 23 Oct 2023 18:10:29 +0200 Subject: [PATCH 1/2] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal Workaround the case when the `this` pointer is actually a `NonLoc`, by returning `Unknown` instead. The solution isn't ideal, as `this` should be really a `Loc`, but due to how casts work, I feel this is our easiest and best option. I've considered using `SVB.evalCast(ThisVal, Base->getType(), QualType())`, but it doesn't work as `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates casts that happen from NonLoc to NonLocs. When I tried to actually implement that case, I figured: 1) Create a SymbolicRegion from that nonloc::SymbolVal; but SymbolRegion ctor expects a pointer type for the symbol. 2) Okay, just have a SymbolCast, getting us the pointer type; but SymbolRegion expects SymbolData symbols, not generic SymExprs, as stated: > // Because pointer arithmetic is represented by ElementRegion layers, > // the base symbol here should not contain any arithmetic. 3) We can't use ElementRegions to perform this cast because to have an ElementRegion, you already have to have a SubRegion that you want to cast, but the point is that we don't have that. At this point, I gave up, and just returned `Unknown` xD IMO this is still better than having a crash. Fixes #69922 --- clang/lib/StaticAnalyzer/Core/CallEvent.cpp | 5 ++--- clang/test/Analysis/builtin_bitcast.cpp | 21 + 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp index ad5bb66c4fff3c8..20bc64dc2631cec 100644 --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -715,10 +715,9 @@ void CXXInstanceCall::getExtraInvalidatedValues( SVal CXXInstanceCall::getCXXThisVal() const { const Expr *Base = getCXXThisExpr(); // FIXME: This doesn't handle an overloaded ->* operator. - if (!Base) + SVal ThisVal = Base ? getSVal(Base) : UnknownVal(); + if (isa(ThisVal)) return UnknownVal(); - - SVal ThisVal = getSVal(Base); assert(ThisVal.isUnknownOrUndef() || isa(ThisVal)); return ThisVal; } diff --git a/clang/test/Analysis/builtin_bitcast.cpp b/clang/test/Analysis/builtin_bitcast.cpp index 396e7caa45f6acd..13475694d287a22 100644 --- a/clang/test/Analysis/builtin_bitcast.cpp +++ b/clang/test/Analysis/builtin_bitcast.cpp @@ -30,3 +30,24 @@ void test(int i) { clang_analyzer_dump(g4); // expected-warning@-1 {{&i [as 64 bit integer]}} } + +struct A { + int n; + void set(int x) { +n = x; + } +}; +using ptr_size = decltype(sizeof(void *)); +void gh_69922(ptr_size p) { + // expected-warning-re@+1 {{(reg_${{[0-9]+}}) & 1U}} + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)); + + __builtin_bit_cast(A*, p & 1)->set(2); // no-crash + // However, since the `this` pointer is expected to be a Loc, but we have + // NonLoc there, we simply give up and resolve it as `Unknown`. + // Then, inside the `set()` member function call we can't evaluate the + // store to the member variable `n`. + + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)->n); // Ideally, this should print "2". + // expected-warning-re@-1 {{(reg_${{[0-9]+}}) & 1U}} +} >From 27d2b84e14d73dcc2385f202a204993f25235de4 Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Tue, 31 Oct 2023 19:20:16 +0100 Subject: [PATCH 2/2] Add fixme and add the ineffective evalCast --- clang/lib/StaticAnalyzer/Core/CallEvent.cpp | 9 +++-- clang/lib/StaticAnalyzer/Core/SValBuilder.cpp | 5 + 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp index 20bc64dc2631cec..76fb7481f65194b 100644 --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -716,8 +716,13 @@ SVal CXXInstanceCall::getCXXThisVal() const { const Expr *Base = getCXXThisExpr(); // FIXME: This doesn't handle an overloaded ->* operator. SVal ThisVal = Base ? getSVal(Base) : UnknownVal(); - if (isa(ThisVal)) -return UnknownVal(); + + if (isa(ThisVal)) { +SValBuilder &SVB = getState()->getStateManager().getSValBuilder(); +QualType OriginalTy = ThisVal.getType(SVB.getContext()); +return SVB.evalCast(ThisVal, Base->getType(), OriginalTy); + } + assert(ThisVal.isUnknownOrUndef() || isa(ThisVal)); return ThisVal; } diff --git a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp index d89d82626f72694..9375f39b2d71dd4 100644 --- a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp +++ b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp @@ -980,6 +980,11 @@ class EvalCastVisitor : public SValVisitor { re
[clang] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal (PR #70837)
steakhal wrote: > Hmm, I wonder if we should leave a FIXME comment, but it looks good to me. I was thinking where to put the FIXME, and as I explored that should be within the CastVisitor. After that, I argued, that then I should still have the (ineffective) `SVB.evalCast()` to actually exercise that path. It should be a fancy way of returning `Unknown` now, but arguably for the better reason xD @Xazax-hun WDYT? https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal (PR #70837)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
steakhal wrote: > > WDYT? > > I like this! I hope we do not add too much redundant work, but at least we > make it clear what is the plan to fix this in the future. Please approve the PR again, so that I could merge this after I give some time for others to look at this. https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Restrict 'fopen' & 'tmpfile' modeling to POSIX versions in StreamChecker (PR #70540)
https://github.com/steakhal approved this pull request. https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer]][NFC] Simplify method 'ensureStreamNonNull' of StreamChecker (PR #70927)
https://github.com/steakhal approved this pull request. You are right. I'm not sure it improves the code too much, and I wonder if you have further ideas refactoring the checker; if so we could probably bundle up similar changes into this one. https://github.com/llvm/llvm-project/pull/70927 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Restrict 'fopen' & 'tmpfile' modeling to POSIX versions in StreamChecker (PR #70540)
@@ -0,0 +1,16 @@ +// RUN: %clang_analyze_cc1 -fno-builtin -analyzer-checker=core,alpha.unix.Stream -verify %s +// expected-no-diagnostics + +typedef struct _FILE FILE; + +// These functions are not standard C library functions. +FILE *tmpfile(const char *restrict path); // Real 'tmpfile' should have exactly 0 formal parameters. +FILE *fopen(const char *restrict path); // Real 'fopen' should have exactly 2 formal parameters. + +void test_fopen_non_posix(void) { + FILE *fp = fopen("file"); // no-leak: this isn't the standard POSIX fopen, we don't the semantics of this call. +} + +void test_tmpfile_non_posix(void) { + FILE *fp = tmpfile("file"); // no-leak: this isn't the standard POSIX tmpfile, we don't the semantics of this call. steakhal wrote: I think the verb was missing. Probably the "we don't [know] the semantics..." https://github.com/llvm/llvm-project/pull/70540 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/71039 The goal of this patch is to refine how the `SVal` base and sub-kinds are represented by forming one unified enum describing the possible SVals. This means that the `unsigned SVal::Kind` and the attached bit-packing semantics would be replaced by a single unified enum. This is more conventional and leads to a better debugging experience by default. This eases the need of using debug pretty-printers, or the use of runtime functions doing the printing for us like we do today by calling `Val.dump()` whenever we inspect the values. Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest of the upper bits represented the sub-kind, where the value represented the index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of the upper bits depending on the base-kind. We don't need to pack these bits, as we have plenty even if we would use just a plan-old `unsigned char`. Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with the `MemRegions`. Note that in the unified `SVal::Kind` enum, to differentiate `nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` and `NonLoc` to resolve this ambiguity. This should not surface in general, because I'm replacing the `nonloc::Kind` enum items with `inline constexpr` global constants to mimic the original behavior - and offer nicer spelling to these enum values. Some `SVal` constructors were not marked explicit, which I now mark as such to follow best practices, and marked others as `/*implicit*/` to clarify the intent. During refactoring, I also found at least one function not marked `LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that. The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`. Previously, the `SValVisitor` expected visit handlers of `VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where I felt that envoding `NonLoc` and `Loc` in the name is not necessary as the type of the parameter would select the right overload anyways, so I simplified the naming of those visit functions. The rest of the diff is a lot of times just formatting, because `getKind()` by nature, frequently appears in switches, which means that the whole switch gets automatically reformatted. I could probably undo the formatting, but I didn't want to deviate from the rule unless explicitly requested. >From 3bc43ab005aa76a43644d4d93286215b490cc8fa Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Thu, 2 Nov 2023 10:21:03 +0100 Subject: [PATCH] [analyzer][NFC] Rework SVal kind representation The goal of this patch is to refine how the `SVal` base and sub-kinds are represented by forming one unified enum describing the possible SVals. This means that the `unsigned SVal::Kind` and the attached bit-packing semantics would be replaced by a single unified enum. This is more conventional and leads to a better debugging experience by default. This eases the need of using debug pretty-printers, or the use of runtime functions doing the printing for us like we do today by calling `Val.dump()` whenever we inspect the values. Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest of the upper bits represented the sub-kind, where the value represented the index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of the upper bits depending on the base-kind. We don't need to pack these bits, as we have plenty even if we would use just a plan-old `unsigned char`. Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with the `MemRegions`. Note that in the unified `SVal::Kind` enum, to differentiate `nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` and `NonLoc` to resolve this ambiguity. This should not surface in general, because I'm replacing the `nonloc::Kind` enum items with `inline constexpr` global constants to mimic the original behavior - and offer nicer spelling to these enum values. Some `SVal` constructors were not marked explicit, which I now mark as such to follow best practices, and marked others as `/*implicit*/` to clarify the intent. During refactoring, I also found at least one function not marked `LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that. The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: `VisitNonLocConcreteInt` and `VisitLocCon
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
steakhal wrote: This PR relates to #69835 ([comment](https://github.com/llvm/llvm-project/issues/69835#issuecomment-1775533393)). https://github.com/llvm/llvm-project/pull/71039 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
steakhal wrote: > But I have to point out that this patch doesn't address the fact that `const > void* Data` is not friendly to debuggers, especially with type information > encoded in another member. So even with this patch applied, someone would > still have to write (and maintain) a custom formatter on debugger side to > display `Data` correctly. That's correct. I'll think about how to resolve that as well. > If you refactor `const void* Data` to be a `llvm::PointerUnion`, then it will > be picked up automatically (`PointerUnion` is a popular type, so I've already > written a formatter for it.) Together with removing `BaseBits` from `Kind` > and making the latter non-packed, `SVal` will have a debugger-friendly layout. I've considered this but I found the number of alternatives too large to make it feasible. Consider that we have 11 possible SValKinds, which would require 4 bits to encode. Requiring all `Data` pointers to be aligned as such seems rough - although not impossible. > That said, debugger-friendliness is probably not among top priorities, so I'm > not going to block this patch, if this is the direction you and reviewers > want to take. I'd say it's one baby step in that direction, but not the end of the journey. How about looking at this like that? https://github.com/llvm/llvm-project/pull/71039 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
steakhal wrote: I wanted to highlight that this PR resolved a bunch of open issues, namely: #61919, #59493, #54533 Thank you! So I think we should mention this in the release notes for clang-18 in some way. I'll keep this in mind. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -0,0 +1,149 @@ +// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-output=text\ +// RUN: -analyzer-checker=core,alpha.security.ArrayBoundV2,unix.Malloc,alpha.security.taint -verify %s + +int array[10]; + +void arrayUnderflow(void) { + array[-3] = 5; + // expected-warning@-1 {{Out of bound access to memory preceding 'array'}} + // expected-note@-2 {{Access of 'array' at negative byte offset -12}} +} + +int scanf(const char *restrict fmt, ...); + +void taintedIndex(void) { + int index; + scanf("%d", &index); + // expected-note@-1 {{Taint originated here}} + // expected-note@-2 {{Taint propagated to the 2nd argument}} + array[index] = 5; + // expected-warning@-1 {{Potential out of bound access to 'array' with tainted offset}} + // expected-note@-2 {{Access of 'array' with a tainted offset that may be too large}} +} + +void arrayOverflow(void) { + array[12] = 5; + // expected-warning@-1 {{Out of bound access to memory after the end of 'array'}} + // expected-note@-2 {{Access of 'array' at index 12, while it holds only 10 'int' elements}} +} + +int scalar; +int scalarOverflow(void) { + return (&scalar)[1]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'scalar'}} + // expected-note@-2 {{Access of 'scalar' at index 1, while it holds only a single 'int' element}} +} + +int oneElementArray[1]; +int oneElementArrayOverflow(void) { + return oneElementArray[1]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'oneElementArray'}} + // expected-note@-2 {{Access of 'oneElementArray' at index 1, while it holds only a single 'int' element}} +} + +short convertedArray(void) { + return ((short*)array)[47]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'array'}} + // expected-note@-2 {{Access of 'array' at index 47, while it holds only 20 'short' elements}} +} + +struct vec { + int len; + double elems[64]; +} v; + +double arrayInStruct(void) { + return v.elems[64]; + // expected-warning@-1 {{Out of bound access to memory after the end of 'v.elems'}} + // expected-note@-2 {{Access of 'v.elems' at index 64, while it holds only 64 'double' elements}} +} + +double arrayInStructPtr(struct vec *pv) { + return pv->elems[64]; + // expected-warning@-1 {{Out of bound access to memory after the end of the field 'elems'}} + // expected-note@-2 {{Access of the field 'elems' at index 64, while it holds only 64 'double' elements}} +} + +struct two_bytes { + char lo, hi; +}; + +struct two_bytes convertedArray2(void) { + // We report this with byte offsets because the offset is not divisible by the element size. + struct two_bytes a = {0, 0}; + char *p = (char*)&a; + return *((struct two_bytes*)(p + 7)); + // expected-warning@-1 {{Out of bound access to memory after the end of 'a'}} + // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 bytes}} +} + +int intFromString(void) { + // We report this with byte offsets because the extent is not divisible by the element size. + return ((const int*)"this is a string of 33 characters")[20]; + // expected-warning@-1 {{Out of bound access to memory after the end of the string literal}} + // expected-note@-2 {{Access of the string literal at byte offset 80, while it holds only 34 bytes}} +} + +int intFromStringDivisible(void) { + // However, this is reported with indices/elements, because the extent happens to be a multiple of 4. + return ((const int*)"abc")[20]; + // expected-warning@-1 {{Out of bound access to memory after the end of the string literal}} + // expected-note@-2 {{Access of the string literal at index 20, while it holds only a single 'int' element}} +} steakhal wrote: I have no strong opinion. Thanks for the explanation. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad, // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and // non-symbolic regions (e.g. a field subregion of a symbolic region) in // unknown space. -auto [state_precedesLowerBound, state_withinLowerBound] = -compareValueToThreshold(state, ByteOffset, -svalBuilder.makeZeroArrayIndex(), svalBuilder); +auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold( +State, ByteOffset, SVB.makeZeroArrayIndex(), SVB); -if (state_precedesLowerBound && !state_withinLowerBound) { +if (PrecedesLowerBound && !WithinLowerBound) { // We know that the index definitely precedes the lower bound. - reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes); + std::string RegName = getRegionName(Reg); + std::string Msg = getPrecedesMsg(RegName, ByteOffset); + reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg); steakhal wrote: To me, `move` describes the intent: I produced something, and this is the only place that is supposed to consume it. Keep in mind that this code is cold, thus we can do whatever we want, including making unnecessary copies. BTW I didn't see that the result of the first is actually used for the second call. This way this code makes all sense. https://github.com/llvm/llvm-project/pull/70056 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
steakhal wrote: > As #59493 is an array, which is different from the test case I provided and > the ones in #61919 and #54533, although this pr can correctly handle the > array case, do I still need to add the array one to the test case? It would be really nice if you could. Thanks! https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class (#70464) (PR #70792)
steakhal wrote: > Do I need to > > * create a new PR? > > * push directly to this PR on the original branch `Snape3058:issue-70464`? > > * commit directly without revision? > > > Which operation is correct? > > (Sorry for not familiar with GitHub) -( I'd prefer a new PR, and mentioning the original PR (the fix) and the issue for which it adds the test. https://github.com/llvm/llvm-project/pull/70792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Add a test case to PR-70792 for Issue-59493 (PR #71073)
@@ -66,3 +66,23 @@ struct Derived : Base { void entry() { Derived test; } } // namespace delegate_ctor_call + +// Additional test case from issue #59493 +namespace init_list_array { + +struct Base { + int foox[1]; steakhal wrote: I'd suggest a more realistic array size for the test and also test some of the subsequent elements, including zero initialized elements. https://github.com/llvm/llvm-project/pull/71073 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add a test case to PR-70792 for Issue-59493 (PR #71073)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/71073 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Add a test case to PR-70792 for Issue-59493 (PR #71073)
https://github.com/steakhal requested changes to this pull request. https://github.com/llvm/llvm-project/pull/71073 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Add a test case to PR-70792 for Issue-59493 (PR #71073)
https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/71073 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add a test case to PR-70792 for Issue-59493 (PR #71073)
https://github.com/steakhal approved this pull request. Thanks https://github.com/llvm/llvm-project/pull/71073 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/71039 >From 3bc43ab005aa76a43644d4d93286215b490cc8fa Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Thu, 2 Nov 2023 10:21:03 +0100 Subject: [PATCH 1/2] [analyzer][NFC] Rework SVal kind representation The goal of this patch is to refine how the `SVal` base and sub-kinds are represented by forming one unified enum describing the possible SVals. This means that the `unsigned SVal::Kind` and the attached bit-packing semantics would be replaced by a single unified enum. This is more conventional and leads to a better debugging experience by default. This eases the need of using debug pretty-printers, or the use of runtime functions doing the printing for us like we do today by calling `Val.dump()` whenever we inspect the values. Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest of the upper bits represented the sub-kind, where the value represented the index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of the upper bits depending on the base-kind. We don't need to pack these bits, as we have plenty even if we would use just a plan-old `unsigned char`. Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with the `MemRegions`. Note that in the unified `SVal::Kind` enum, to differentiate `nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` and `NonLoc` to resolve this ambiguity. This should not surface in general, because I'm replacing the `nonloc::Kind` enum items with `inline constexpr` global constants to mimic the original behavior - and offer nicer spelling to these enum values. Some `SVal` constructors were not marked explicit, which I now mark as such to follow best practices, and marked others as `/*implicit*/` to clarify the intent. During refactoring, I also found at least one function not marked `LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that. The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`. Previously, the `SValVisitor` expected visit handlers of `VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where I felt that envoding `NonLoc` and `Loc` in the name is not necessary as the type of the parameter would select the right overload anyways, so I simplified the naming of those visit functions. The rest of the diff is a lot of times just formatting, because `getKind()` by nature, frequently appears in switches, which means that the whole switch gets automatically reformatted. I could probably undo the formatting, but I didn't want to deviate from the rule unless explicitly requested. --- .../StaticAnalyzer/Checkers/SValExplainer.h | 10 +- .../Core/PathSensitive/SValVisitor.h | 57 ++--- .../Core/PathSensitive/SVals.def | 38 ++- .../StaticAnalyzer/Core/PathSensitive/SVals.h | 225 ++ clang/lib/StaticAnalyzer/Core/SValBuilder.cpp | 28 +-- clang/lib/StaticAnalyzer/Core/SVals.cpp | 87 --- .../Core/SimpleConstraintManager.cpp | 4 +- .../StaticAnalyzer/Core/SimpleSValBuilder.cpp | 73 +++--- clang/lib/StaticAnalyzer/Core/Store.cpp | 2 +- 9 files changed, 220 insertions(+), 304 deletions(-) diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h index 3ae28c1dba3eb5a..43a70f596a4da28 100644 --- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h +++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h @@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor { return "undefined value"; } - std::string VisitLocMemRegionVal(loc::MemRegionVal V) { + std::string VisitMemRegionVal(loc::MemRegionVal V) { const MemRegion *R = V.getRegion(); // Avoid the weird "pointer to pointee of ...". if (auto SR = dyn_cast(R)) { @@ -76,7 +76,7 @@ class SValExplainer : public FullSValVisitor { return "pointer to " + Visit(R); } - std::string VisitLocConcreteInt(loc::ConcreteInt V) { + std::string VisitConcreteInt(loc::ConcreteInt V) { const llvm::APSInt &I = V.getValue(); std::string Str; llvm::raw_string_ostream OS(Str); @@ -84,11 +84,11 @@ class SValExplainer : public FullSValVisitor { return Str; } - std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) { + std::string VisitSymbolVal(nonloc::SymbolVal V) { return Visit(V.getSymbol()); } - std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) { + std::string VisitConcreteInt(nonloc::ConcreteInt V) { const llvm::APSInt &I = V.getValue(); std::string Str; llvm::raw_string_ostream OS(Str); @@ -97,7
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
steakhal wrote: > But I have to point out that this patch doesn't address the fact that `const > void* Data` is not friendly to debuggers, especially with type information > encoded in another member. So even with this patch applied, someone would > still have to write (and maintain) a custom formatter on debugger side to > display `Data` correctly. > > If you refactor `const void* Data` to be a `llvm::PointerUnion`, then it will > be picked up automatically (`PointerUnion` is a popular type, so I've already > written a formatter for it.) Together with removing `BaseBits` from `Kind` > and making the latter non-packed, `SVal` will have a debugger-friendly layout. What other ways are to make a `const void *` debugger-friendly - other than `llvm::PointerUnion`? I've considered using plain-old unions, but I found that approach pretty ugly, and I'm not even sure if it would help debugging experience. https://github.com/llvm/llvm-project/pull/71039 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
@@ -30,3 +30,24 @@ void test(int i) { clang_analyzer_dump(g4); // expected-warning@-1 {{&i [as 64 bit integer]}} } + +struct A { + int n; + void set(int x) { +n = x; + } +}; +using ptr_size = decltype(sizeof(void *)); +void gh_69922(ptr_size p) { + // expected-warning-re@+1 {{(reg_${{[0-9]+}}) & 1U}} + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)); steakhal wrote: Yes, that is to form a `SymExpr`, which is a `SymbolVal` (NonLoc) wrapping a `BO_And` `SymIntExpr` I think. https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
@@ -30,3 +30,24 @@ void test(int i) { clang_analyzer_dump(g4); // expected-warning@-1 {{&i [as 64 bit integer]}} } + +struct A { + int n; + void set(int x) { +n = x; + } +}; +using ptr_size = decltype(sizeof(void *)); steakhal wrote: Fixed. Now using `size_t`. https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/70837 >From 2de19fc8e14319674ce87c18771ba1b8ba22f79b Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Mon, 23 Oct 2023 18:10:29 +0200 Subject: [PATCH 1/3] [analyzer] Fix assertion failure in CXXInstanceCall::getCXXThisVal Workaround the case when the `this` pointer is actually a `NonLoc`, by returning `Unknown` instead. The solution isn't ideal, as `this` should be really a `Loc`, but due to how casts work, I feel this is our easiest and best option. I've considered using `SVB.evalCast(ThisVal, Base->getType(), QualType())`, but it doesn't work as `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates casts that happen from NonLoc to NonLocs. When I tried to actually implement that case, I figured: 1) Create a SymbolicRegion from that nonloc::SymbolVal; but SymbolRegion ctor expects a pointer type for the symbol. 2) Okay, just have a SymbolCast, getting us the pointer type; but SymbolRegion expects SymbolData symbols, not generic SymExprs, as stated: > // Because pointer arithmetic is represented by ElementRegion layers, > // the base symbol here should not contain any arithmetic. 3) We can't use ElementRegions to perform this cast because to have an ElementRegion, you already have to have a SubRegion that you want to cast, but the point is that we don't have that. At this point, I gave up, and just returned `Unknown` xD IMO this is still better than having a crash. Fixes #69922 --- clang/lib/StaticAnalyzer/Core/CallEvent.cpp | 5 ++--- clang/test/Analysis/builtin_bitcast.cpp | 21 + 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp index ad5bb66c4fff3c8..20bc64dc2631cec 100644 --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -715,10 +715,9 @@ void CXXInstanceCall::getExtraInvalidatedValues( SVal CXXInstanceCall::getCXXThisVal() const { const Expr *Base = getCXXThisExpr(); // FIXME: This doesn't handle an overloaded ->* operator. - if (!Base) + SVal ThisVal = Base ? getSVal(Base) : UnknownVal(); + if (isa(ThisVal)) return UnknownVal(); - - SVal ThisVal = getSVal(Base); assert(ThisVal.isUnknownOrUndef() || isa(ThisVal)); return ThisVal; } diff --git a/clang/test/Analysis/builtin_bitcast.cpp b/clang/test/Analysis/builtin_bitcast.cpp index 396e7caa45f6acd..13475694d287a22 100644 --- a/clang/test/Analysis/builtin_bitcast.cpp +++ b/clang/test/Analysis/builtin_bitcast.cpp @@ -30,3 +30,24 @@ void test(int i) { clang_analyzer_dump(g4); // expected-warning@-1 {{&i [as 64 bit integer]}} } + +struct A { + int n; + void set(int x) { +n = x; + } +}; +using ptr_size = decltype(sizeof(void *)); +void gh_69922(ptr_size p) { + // expected-warning-re@+1 {{(reg_${{[0-9]+}}) & 1U}} + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)); + + __builtin_bit_cast(A*, p & 1)->set(2); // no-crash + // However, since the `this` pointer is expected to be a Loc, but we have + // NonLoc there, we simply give up and resolve it as `Unknown`. + // Then, inside the `set()` member function call we can't evaluate the + // store to the member variable `n`. + + clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)->n); // Ideally, this should print "2". + // expected-warning-re@-1 {{(reg_${{[0-9]+}}) & 1U}} +} >From bcb048c09dcc7bb2728d46afc0ff9a09cf71f663 Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Tue, 31 Oct 2023 19:20:16 +0100 Subject: [PATCH 2/3] Add fixme and add the ineffective evalCast --- clang/lib/StaticAnalyzer/Core/CallEvent.cpp | 9 +++-- clang/lib/StaticAnalyzer/Core/SValBuilder.cpp | 5 + 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp index 20bc64dc2631cec..76fb7481f65194b 100644 --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -716,8 +716,13 @@ SVal CXXInstanceCall::getCXXThisVal() const { const Expr *Base = getCXXThisExpr(); // FIXME: This doesn't handle an overloaded ->* operator. SVal ThisVal = Base ? getSVal(Base) : UnknownVal(); - if (isa(ThisVal)) -return UnknownVal(); + + if (isa(ThisVal)) { +SValBuilder &SVB = getState()->getStateManager().getSValBuilder(); +QualType OriginalTy = ThisVal.getType(SVB.getContext()); +return SVB.evalCast(ThisVal, Base->getType(), OriginalTy); + } + assert(ThisVal.isUnknownOrUndef() || isa(ThisVal)); return ThisVal; } diff --git a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp index d89d82626f72694..9375f39b2d71dd4 100644 --- a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp +++ b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp @@ -980,6 +980,11 @@ class EvalCastVisitor : public SValVisitor { re
[clang] [analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (PR #70837)
https://github.com/steakhal closed https://github.com/llvm/llvm-project/pull/70837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/71284 The idea is that if we see a `X RELOP Y` being constrained to a RangeSet `S`, then check the eqclasses of X and Y respectively and for `X' RELOP Y'` SymSymExprs and try to infer their ranges. If there is no contradiction with any of the equivalent alternatives, then intersecting all these RangeSets should never be empty - aka. there should be a value satisfying the constraints we have. It costs around `|eqclass(X)| + |eqclass(y)|`. The approach has its limitations, as demonstrated by `gh_62215_contradicting_nested_right_equivalent`, where we would need to apply the same logic, but on a sub-expression of a direct operand. Before the patch, line 90, 100, and 112 would be reachable; and become unreachable after this. Line 127 will remain still reachable, but keep in mind that when cross-checking with Z3 (aka. Z3 refutation), then all 4 reports would be eliminated. The idea comes from https://github.com/llvm/llvm-project/issues/62215#issuecomment-1792878474 Fixes #62215 >From 92ece501b340c3a2a52b5a4614ddb70bb3e35c93 Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Sat, 4 Nov 2023 13:44:28 +0100 Subject: [PATCH] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions The idea is that if we see a `X RELOP Y` being constrained to a RangeSet `S`, then check the eqclasses of X and Y respectively and for `X' RELOP Y'` SymSymExprs and try to infer their ranges. If there is no contradiction with any of the equivalent alternatives, then intersecting all these RangeSets should never be empty - aka. there should be a value satisfying the constraints we have. It costs around `|eqclass(X)| + |eqclass(y)|`. The approach has its limitations, as demonstrated by `gh_62215_contradicting_nested_right_equivalent`, where we would need to apply the same logic, but on a sub-expression of a direct operand. Before the patch, line 90, 100, and 112 would be reachable; and become unreachable after this. Line 127 will remain still reachable, but keep in mind that when cross-checking with Z3 (aka. Z3 refutation), then all 4 reports would be eliminated. The idea comes from https://github.com/llvm/llvm-project/issues/62215#issuecomment-1792878474 Fixes #62215 --- .../Core/RangeConstraintManager.cpp | 53 +++ clang/test/Analysis/constraint-assignor.c | 48 + 2 files changed, 101 insertions(+) diff --git a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp index 5de99384449a4c8..d631369e895d3a9 100644 --- a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp +++ b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp @@ -2067,6 +2067,12 @@ class ConstraintAssignor : public ConstraintAssignorBase { return Assignor.assign(CoS, NewConstraint); } + /// Check if using an equivalent operand alternative would lead to + /// contradiction. + /// If a contradiction is witnessed, returns false; otherwise returns true. + bool handleEquivalentAlternativeSymOperands(const SymSymExpr *SymSym, + RangeSet Constraint); + /// Handle expressions like: a % b != 0. template bool handleRemainderOp(const SymT *Sym, RangeSet Constraint) { @@ -2218,11 +2224,58 @@ bool ConstraintAssignor::assignSymExprToConst(const SymExpr *Sym, return true; } +bool ConstraintAssignor::handleEquivalentAlternativeSymOperands( +const SymSymExpr *SymSym, RangeSet Constraint) { + SymbolRef LHS = SymSym->getLHS(); + SymbolRef RHS = SymSym->getRHS(); + EquivalenceClass LHSClass = EquivalenceClass::find(State, LHS); + EquivalenceClass RHSClass = EquivalenceClass::find(State, RHS); + SymbolSet SymbolsEqWithLHS = LHSClass.getClassMembers(State); + SymbolSet SymbolsEqWithRHS = RHSClass.getClassMembers(State); + llvm::SmallVector AlternativeSymSyms; + + // Gather left alternatives. + for (SymbolRef AlternativeLHS : SymbolsEqWithLHS) { +if (AlternativeLHS == LHS) + continue; +AlternativeSymSyms.emplace_back(AlternativeLHS, SymSym->getOpcode(), RHS, +SymSym->getType()); + } + + // Gather right alternatives. + for (SymbolRef AlternativeRHS : SymbolsEqWithRHS) { +if (AlternativeRHS == RHS) + continue; +AlternativeSymSyms.emplace_back(LHS, SymSym->getOpcode(), AlternativeRHS, +SymSym->getType()); + } + + // Crosscheck the inferred ranges. + for (SymSymExpr AltSymSym : AlternativeSymSyms) { +RangeSet AltSymSymConstrant = +SymbolicRangeInferrer::inferRange(RangeFactory, State, &AltSymSym); +Constraint = intersect(RangeFactory, Constraint, AltSymSymConstrant); + +// Check if we witnessed a contradiction with the equivalent alternative +// operand. +if (Constraint.isEmpty()) { + State = nullptr; + return false; +} + } + r
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
https://github.com/steakhal updated https://github.com/llvm/llvm-project/pull/71039 >From 8f16d3000a91df33d416dd09381175ddeb7e5ed3 Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Sat, 4 Nov 2023 15:25:42 +0100 Subject: [PATCH] [analyzer][NFC] Rework SVal kind representation The goal of this patch is to refine how the `SVal` base and sub-kinds are represented by forming one unified enum describing the possible SVals. This means that the `unsigned SVal::Kind` and the attached bit-packing semantics would be replaced by a single unified enum. This is more conventional and leads to a better debugging experience by default. This eases the need of using debug pretty-printers, or the use of runtime functions doing the printing for us like we do today by calling `Val.dump()` whenever we inspect the values. Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest of the upper bits represented the sub-kind, where the value represented the index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of the upper bits depending on the base-kind. We don't need to pack these bits, as we have plenty even if we would use just a plan-old `unsigned char`. Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with the `MemRegions`. Note that in the unified `SVal::Kind` enum, to differentiate `nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` and `NonLoc` to resolve this ambiguity. This should not surface in general, because I'm replacing the `nonloc::Kind` enum items with `inline constexpr` global constants to mimic the original behavior - and offer nicer spelling to these enum values. Some `SVal` constructors were not marked explicit, which I now mark as such to follow best practices, and marked others as `/*implicit*/` to clarify the intent. During refactoring, I also found at least one function not marked `LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that. The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`. Previously, the `SValVisitor` expected visit handlers of `VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where I felt that envoding `NonLoc` and `Loc` in the name is not necessary as the type of the parameter would select the right overload anyways, so I simplified the naming of those visit functions. The rest of the diff is a lot of times just formatting, because `getKind()` by nature, frequently appears in switches, which means that the whole switch gets automatically reformatted. I could probably undo the formatting, but I didn't want to deviate from the rule unless explicitly requested. --- .../StaticAnalyzer/Checkers/SValExplainer.h | 10 +- .../Core/PathSensitive/SValVisitor.h | 55 ++--- .../Core/PathSensitive/SVals.def | 38 ++- .../StaticAnalyzer/Core/PathSensitive/SVals.h | 226 ++ clang/lib/StaticAnalyzer/Core/SValBuilder.cpp | 28 +-- clang/lib/StaticAnalyzer/Core/SVals.cpp | 87 --- .../Core/SimpleConstraintManager.cpp | 4 +- .../StaticAnalyzer/Core/SimpleSValBuilder.cpp | 73 +++--- clang/lib/StaticAnalyzer/Core/Store.cpp | 2 +- 9 files changed, 219 insertions(+), 304 deletions(-) diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h index 3ae28c1dba3eb5a..43a70f596a4da28 100644 --- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h +++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h @@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor { return "undefined value"; } - std::string VisitLocMemRegionVal(loc::MemRegionVal V) { + std::string VisitMemRegionVal(loc::MemRegionVal V) { const MemRegion *R = V.getRegion(); // Avoid the weird "pointer to pointee of ...". if (auto SR = dyn_cast(R)) { @@ -76,7 +76,7 @@ class SValExplainer : public FullSValVisitor { return "pointer to " + Visit(R); } - std::string VisitLocConcreteInt(loc::ConcreteInt V) { + std::string VisitConcreteInt(loc::ConcreteInt V) { const llvm::APSInt &I = V.getValue(); std::string Str; llvm::raw_string_ostream OS(Str); @@ -84,11 +84,11 @@ class SValExplainer : public FullSValVisitor { return Str; } - std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) { + std::string VisitSymbolVal(nonloc::SymbolVal V) { return Visit(V.getSymbol()); } - std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) { + std::string VisitConcreteInt(nonloc::ConcreteInt V) { const llvm::APSInt &I = V.getValue(); std::string Str; llvm::raw_string_ostream OS(Str); @@ -97,7 +97
[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)
https://github.com/steakhal closed https://github.com/llvm/llvm-project/pull/71039 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
steakhal wrote: For crossreference: I raised some related questions around having void casts artificially keeping constraints and symbols alive at discuss: https://discourse.llvm.org/t/range-based-solver-and-eager-symbol-garbage-collection/74670 https://github.com/llvm/llvm-project/pull/71284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)
steakhal wrote: > I think every time we need to iterate over all member of an equivalence > class, we might do something wrong. The point of the equivalence class would > be to make sure those elements are equivalent. One way to avoid iteration > would be to always use the representative of the equivalence class. E.g., > each time we record a new constraint to a member of the class, we add this > information to the representative element. Every time we do a query, we first > look up the representative element which already should have all the info > from the class and use it instead of the original symbol. > > Would something like this work or am I missing something? I had to think about it for a little while, and here are my thoughts: In an example like here: ```c++ void gh_62215(int x, int y, int z) { if (x != y) return; // x == y if (z <= x) return; // z > x if (z >= y) return; // z < y clang_analyzer_warnIfReached(); // no-warning: This should be dead code. (void)(x + y + z); // keep the constraints alive. } ``` Here, after the first if, we would have an eqclass of `{x,y}`; with a single constraint `x != y: [0,0]`. After the second if, we would have notionally 3 eqclasses: `{x,y}`, and two trivial ones `x` and `y`. We would also have 3 constraints: `x != y: [0,0]`, `z <= x: [0,0]`, `z <= y: [0,0]`. Note that the keys in the constraints map (symbols) can be 'arbitrarily' complex and refer to already dead symbols. In this example, I keep these symbols artificially alive to make the case slightly simpler to discuss. So, if I understand you correctly, at the 3rd if statement, we should canonicalize the symbol we are constraining by walking every sub-symbol and substituting it with its equivalent counterpart (if any), by basically with its eqclass' representative symbol. In this example, it would mean that instead of adding a constraint `z >= y: [0,0]`, we would instead `z >= x: [0,0]` (assuming that `x` is the representative symbol of eqclass `{x,y}`. I believe this canonicalization could work (and would solve the other demonstrative limitation I added in this PR), but would also incur overhead. And from a design point of view, an eqclass would need to keep the representative symbol alive, because otherwise, we can't substitute a symbol with the representative symbol. (Note that for this reason, we don't actually have a representative symbol, but rather a unique integral value as an ID for the eqclass - which happens to be equal to the `SymbolRef` pointer value of the representative symbol if that's still alive. Here is an example where the representative symbol `x` of the eqclass `{x,y}` would die, but remain the ID of the eqclass: ```c++ // same example, but with: (void)(y + z); // X is not mentioned!, thus reclaimed after the 2nd if statement. ``` https://github.com/llvm/llvm-project/pull/71284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Trust base to derived casts for dynamic types (PR #69057)
steakhal wrote: Thanks for the replies. I'll come back to this PR once I have some time; maybe during the holidays. Both assertions directly relate to this PR for sure. > I looked into Tom's bug report and I hit the following assertion in a debug > build: > > ``` > clang: > /srv/repos/llvm-project/clang/lib/StaticAnalyzer/Core/DynamicType.cpp:134: > clang::ento::ProgramStateRef > clang::ento::setDynamicTypeAndCastInfo(clang::ento::ProgramStateRef, const > clang::ento::MemRegion*, clang::QualType, clang::QualType, bool): Assertion > `(CastToTy->isAnyPointerType() || CastToTy->isReferenceType()) && > "DynamicTypeInfo should always be a pointer."' failed. > ``` > [...] > What I don't understand is why these are not reference types. @steakhal do > you know more here? Its because prior we `unbox` them I believe, which here sort of means that we drop refs and pointers I think. https://github.com/llvm/llvm-project/pull/69057 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: steakhal wrote: I'm in favor of this change. I'll pull the patch downstream and report back how it performed. Coming back to the `&array[size]` example, actually I believe that it's well-formed in C, but UB in C++, but I'm not a language lawyer. @shafik probably knows this better :sweat_smile: However, `(array + size)` should be well-formed in both C and C++. https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Use AllocaRegion in MallocChecker (PR #72402)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal approved this pull request. Overall, I'm in favor of this change. On the other hand, I'd urge for not to regress on the diagnostics. To me, `alloca` is like a VLA; which is prone to misuses, thus the edge-cases count there (esp. if tainted). https://github.com/llvm/llvm-project/pull/72402 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: steakhal wrote: FYI I edited the PR summary so that I'm not tagged there directly because if someone is tagged in a commit message on GH, that person will be notified each time that commit is pushed in public. Consequently, the tagged person gets spammed by GH from all the public forks each time they rebase and push this commit :D Such a great feature, right? https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: steakhal wrote: > @steakhal thanks for the checking and sorry for the unintentional spamming. > > > Such a great feature, right? > > Just wonderful 😄 To clarify, you did not spam me. I'm worried about merging a commit directly mentioning people. That would be the point when forks start to pick up the commit (and my name e.g.) and start spamming. I just wanted to raise awareness of this being a thing, and why I did that edit. https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Let the checkers query upper and lower bounds on symbols (PR #74141)
https://github.com/steakhal approved this pull request. The patch makes sense to me. https://github.com/llvm/llvm-project/pull/74141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal approved this pull request. Sorry for reporting back only after a week or so. The analysis was done in like a day, but then we had some credential issues that blocked me actually doing the diffing. Now about the diffing: a lot of things changed because now we report earlier, so a different range gets highlighted. This caused me some problems, so I decided to not do anything fancy, but randomly pick a bunch of issues. In short, the picked issues looked better; but effectively remained the same. This is good. https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -413,6 +464,19 @@ bool ArrayBoundCheckerV2::isFromCtypeMacro(const Stmt *S, ASTContext &ACtx) { (MacroName == "isupper") || (MacroName == "isxdigit")); } +bool ArrayBoundCheckerV2::isInAddressOf(const Stmt *S, ASTContext &ACtx) { + ParentMapContext &ParentCtx = ACtx.getParentMapContext(); + do { +const DynTypedNodeList Parents = ParentCtx.getParents(*S); +if (Parents.empty()) + return false; +S = Parents[0].get(); + } while (isa_and_nonnull(S)); + if (const auto *UnaryOp = dyn_cast_or_null(S)) +return UnaryOp->getOpcode() == UO_AddrOf; + return false; steakhal wrote: ```suggestion const auto *UnaryOp = dyn_cast_or_null(S); return UnaryOp && UnaryOp->getOpcode() == UO_AddrOf; ``` https://github.com/llvm/llvm-project/pull/72107 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix "sprintf" parameter modeling in CStringChecker (PR #74345)
https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/74345 Review the commits one by one. I plan to merge them manually by pushing both of these at once. This PR intends to fix #74269. >From 1359a7ef528358cc7e10a751aa885c6bd8ac8d1c Mon Sep 17 00:00:00 2001 From: Balazs Benics Date: Mon, 4 Dec 2023 17:40:09 +0100 Subject: [PATCH 1/2] [analyzer][NFC] Prefer CallEvent over CallExpr in APIs This change only uplifts existing APIs, without any semantic changes. This is the continuation of 44820630dfa45bc47748a5abda7d4a9cb86da2c1. Benefits of using CallEvents over CallExprs: The callee decl is traced through function pointers if possible. This will be important to fix #74269 in a follow-up patch. --- .../Checkers/CStringChecker.cpp | 381 ++ 1 file changed, 203 insertions(+), 178 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp index 31f5b03dcdeba..f5dbf9d82beee 100644 --- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp @@ -121,7 +121,7 @@ class CStringChecker : public Checker< eval::Call, const CallEvent *Call) const; using FnCheck = std::function; + const CallEvent &)>; CallDescriptionMap Callbacks = { {{CDF_MaybeBuiltin, {"memcpy"}, 3}, @@ -173,56 +173,53 @@ class CStringChecker : public Checker< eval::Call, StdCopyBackward{{"std", "copy_backward"}, 3}; FnCheck identifyCall(const CallEvent &Call, CheckerContext &C) const; - void evalMemcpy(CheckerContext &C, const CallExpr *CE, CharKind CK) const; - void evalMempcpy(CheckerContext &C, const CallExpr *CE, CharKind CK) const; - void evalMemmove(CheckerContext &C, const CallExpr *CE, CharKind CK) const; - void evalBcopy(CheckerContext &C, const CallExpr *CE) const; - void evalCopyCommon(CheckerContext &C, const CallExpr *CE, + void evalMemcpy(CheckerContext &C, const CallEvent &Call, CharKind CK) const; + void evalMempcpy(CheckerContext &C, const CallEvent &Call, CharKind CK) const; + void evalMemmove(CheckerContext &C, const CallEvent &Call, CharKind CK) const; + void evalBcopy(CheckerContext &C, const CallEvent &Call) const; + void evalCopyCommon(CheckerContext &C, const CallEvent &Call, ProgramStateRef state, SizeArgExpr Size, DestinationArgExpr Dest, SourceArgExpr Source, bool Restricted, bool IsMempcpy, CharKind CK) const; - void evalMemcmp(CheckerContext &C, const CallExpr *CE, CharKind CK) const; + void evalMemcmp(CheckerContext &C, const CallEvent &Call, CharKind CK) const; - void evalstrLength(CheckerContext &C, const CallExpr *CE) const; - void evalstrnLength(CheckerContext &C, const CallExpr *CE) const; - void evalstrLengthCommon(CheckerContext &C, - const CallExpr *CE, + void evalstrLength(CheckerContext &C, const CallEvent &Call) const; + void evalstrnLength(CheckerContext &C, const CallEvent &Call) const; + void evalstrLengthCommon(CheckerContext &C, const CallEvent &Call, bool IsStrnlen = false) const; - void evalStrcpy(CheckerContext &C, const CallExpr *CE) const; - void evalStrncpy(CheckerContext &C, const CallExpr *CE) const; - void evalStpcpy(CheckerContext &C, const CallExpr *CE) const; - void evalStrlcpy(CheckerContext &C, const CallExpr *CE) const; - void evalStrcpyCommon(CheckerContext &C, const CallExpr *CE, bool ReturnEnd, -bool IsBounded, ConcatFnKind appendK, + void evalStrcpy(CheckerContext &C, const CallEvent &Call) const; + void evalStrncpy(CheckerContext &C, const CallEvent &Call) const; + void evalStpcpy(CheckerContext &C, const CallEvent &Call) const; + void evalStrlcpy(CheckerContext &C, const CallEvent &Call) const; + void evalStrcpyCommon(CheckerContext &C, const CallEvent &Call, +bool ReturnEnd, bool IsBounded, ConcatFnKind appendK, bool returnPtr = true) const; - void evalStrcat(CheckerContext &C, const CallExpr *CE) const; - void evalStrncat(CheckerContext &C, const CallExpr *CE) const; - void evalStrlcat(CheckerContext &C, const CallExpr *CE) const; + void evalStrcat(CheckerContext &C, const CallEvent &Call) const; + void evalStrncat(CheckerContext &C, const CallEvent &Call) const; + void evalStrlcat(CheckerContext &C, const CallEvent &Call) const; - void evalStrcmp(CheckerContext &C, const CallExpr *CE) const; - void evalStrncmp(CheckerContext &C, const CallExpr *CE) const; - void evalStrcasecmp(CheckerContext &C, const CallExpr *CE) const; - void evalStrncasecmp(CheckerContext &C, const CallExpr *CE) const; - void evalStrcmpCommon(CheckerContext &C, -const CallExpr *CE, -bool IsBounded = false, -bool IgnoreCa
[clang] [analyzer][docs] Update the release notes for llvm-18 (PR #76446)
https://github.com/steakhal milestoned https://github.com/llvm/llvm-project/pull/76446 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits