zaks.anna added a comment.

Hi,

Thank you for working on this!

Here are several comments from a **partial** review.

Can you talk a bit about how do you envision to generalize this to copy and 
paste detection, where the pasted code is only partially modified? Do you think 
this could be generalized in that way?


================
Comment at: include/clang/AST/CloneDetection.h:20
@@ +19,3 @@
+
+#include <map>
+#include <vector>
----------------
Please, check if there are better data structures you could be using: 
http://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc

================
Comment at: include/clang/AST/CloneDetection.h:131
@@ +130,3 @@
+  bool operator<(const StmtSequence &Other) const {
+    return std::tie(S, StartIndex, EndIndex) <
+           std::tie(Other.S, Other.StartIndex, Other.EndIndex);
----------------
We are comparing the Stmt pointers here. Does this make results 
non-deterministic?


================
Comment at: include/clang/AST/CloneDetection.h:145
@@ +144,3 @@
+///
+/// This class first needs to be provided with (possibly multiple) translation
+/// units by passing them to \p AnalyzeTranslationUnitDecl .
----------------
Let's not talk about multiple TU here since much more is needed to support 
them. Maybe we should just say that, first, given a TU, the class generates the 
clone identifiers (hashes?) of statements in it (by running 
AnalyzeTranslationUnitDecl)?

The description of the method already states that it can be called multiple 
times.

================
Comment at: include/clang/AST/CloneDetection.h:161
@@ +160,3 @@
+  /// Holds the generated data for a StmtSequence
+  struct StmtData {
+    /// The generated hash value.
----------------
StmtData is too generic of a name. Maybe it could be called something like code 
signature, clone signature, or code hash?

================
Comment at: include/clang/AST/CloneDetection.h:166
@@ +165,3 @@
+    ///
+    /// This scalar values serves as a simple way of filtering clones
+    /// that are too small to be reported. A greater value indicates that the
----------------
serves -> serve

================
Comment at: include/clang/AST/CloneDetection.h:186
@@ +185,3 @@
+  /// \param S The given StmtSequence
+  /// \param Error Output parameter that is set to true if no data was 
generated
+  ///        for the given StmtSequence. This can happen if the given
----------------
Could this return an llvm Optional instead of the error code?

================
Comment at: include/clang/AST/CloneDetection.h:201
@@ +200,3 @@
+  /// \brief Stores the StmtSequence with its associated data in the search
+  ///        database of this CloneDetector.
+  ///
----------------
database -> map?

I would just say "Stores the StmtSequence with its associated data to allow 
future querying."

================
Comment at: include/clang/AST/CloneDetection.h:209
@@ +208,3 @@
+
+  /// \brief Searches all translation units in this CloneDetector for clones.
+  ///
----------------
Please, do not talk about "all translation units" here. Maybe say something 
like previously seen/stored statements/code or something like that.

================
Comment at: include/clang/AST/CloneDetection.h:217
@@ +216,3 @@
+  void findClones(std::vector<CloneGroup> &Result,
+                  unsigned MinGroupComplexity = 10);
+
----------------
Make 10 into a constant. It probably should be a checker parameter. 

================
Comment at: include/clang/StaticAnalyzer/Checkers/Checkers.td:668
@@ +667,3 @@
+
+let ParentPackage = CloneDetection in {
+
----------------
This should be an "alpha" package until it's considered to be ready for end 
users.

================
Comment at: lib/AST/CMakeLists.txt:10
@@ -9,3 +9,3 @@
   ASTImporter.cpp
   ASTTypeTraits.cpp
   AttrImpl.cpp
----------------
Again, I do not think this should be in the AST. Who the users of this will be? 
If you envision users outside of Clang Static Analyzer, maybe we could add this 
to lib/Analysis/.

================
Comment at: lib/AST/CloneDetection.cpp:56
@@ +55,3 @@
+StmtSequence::iterator StmtSequence::begin() const {
+  if (holdsSequence()) {
+    auto CS = cast<CompoundStmt>(S);
----------------
http://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code

================
Comment at: lib/AST/CloneDetection.cpp:103
@@ +102,3 @@
+/// \endcode
+class HashValue {
+  unsigned Value = 0;
----------------
This should probably be a utility instead of something specific to this 
checker. Do we have something similar to this in LLVM already?

================
Comment at: lib/StaticAnalyzer/Checkers/CloneChecker.cpp:45
@@ +44,3 @@
+
+  DiagnosticsEngine &DiagEngine = Mgr.getDiagnostic();
+
----------------
Is it possible to report these through BugReporter?


http://reviews.llvm.org/D20795



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to