rtrieu added a comment.

In https://reviews.llvm.org/D21675#654659, @teemperor wrote:

> I feel like we have a really similar use case in the 
> Analysis/CloneDetection{.h/.cpp} with the hashing of statements. I tried 
> substituting the call to the statement hashing with a call to the 
> CloneDetection API and it seems that most tests pass and the remaining fails 
> are just minor fixable differences (like `::foo()` and `foo()` having 
> different hash codes).
>
> Would be nice if we could make Stmt::Profile, ODRHash and the CloneDetection 
> use the same backend. We improved a few things that we no longer have the 
> periodic `realloc`s from the vector inside NodeIDSet and that we use MD5. 
> Also there are are some future plans on how we can better prevent regressions 
> when we add/extend AST nodes. Thoughts?


It would help to understand your use case better.  Does CloneDetection care 
about ::foo versus foo?  How about different types with the same name?  ODR 
checking assumes identical token sequences, so it does care about extra "::".  
ODR checking also will process information about type while CloneDetection 
looks like it only uses the type name.

I see that CloneDetector uses both llvm::MD5 and llvm::FoldingSetNodeID, and 
only adds data via StringRef.  MD5 will add the bytes as data, while 
FoldingSetNodeID will add the size of the string, then the bytes as data.  This 
means MD5 may have collisions when two strings are added back to back while 
FoldingSetNodeID will store extra data for every integer processed.  
FoldingSetNodeID doesn't have this problem if AddInteger is used.  Were the 
reallocs a big problem for CloneDetection?

I don't thinking merging Stmt::Profile, ODRHash, and CloneDetection would be a 
good idea unless the hashes needed are very similar.  Stmt::Profile and ODRHash 
already share a base for Stmt processing, which might be extendable to 
CloneDetection as well, but a full merge may be difficult.

Sadly, we don't have a good story for when an AST node gets changed with new 
properties.  The Stmt profiler does declare all the visit methods in the class 
definition, so that will catch any new Stmt nodes without a new visit method.


https://reviews.llvm.org/D21675



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to