aaron.ballman added inline comments.
================
Comment at: Analysis/diagnostics/sarif-check.py:22
+passes = 0
+with open(testfile) as testfh:
+ lineno = 0
----------------
george.karpenkov wrote:
> Wow, this is super neat!
> Since you are quite active in LLVM community, would you think it's better to
> have this tool in `llvm/utils` next to FileCheck? The concept is very
> general, and I think a lot of people really feel FileCheck limitations.
The concept was pulled from test\TableGen\JSON-check.py, so it likely could be
generalized. However, I suspect that each JSON test may want to expose
different helper capabilities, so perhaps it won't be super general? I don't
know enough about good Python design to know.
================
Comment at: Analysis/diagnostics/sarif-diagnostics-taint-test.c:18
+// Test the basics for sanity.
+// CHECK: sarifLog['version'].startswith("2.0.0")
+// CHECK: sarifLog['runs'][0]['tool']['fullName'] == "clang static analyzer"
----------------
george.karpenkov wrote:
> Would it make more sense to just use `diff` + json pretty-formatter to write
> a test?
> With this test I can't even quite figure out how the output should look like.
I'm not super comfortable with that approach, but perhaps I'm thinking of
something different than what you're proposing. The reason I went with this
approach is because diff would be fragile (depends heavily on field ordering,
which the JSON support library doesn't give control over) and the physical
layout of the file isn't what needs to be tested anyway. SARIF has a fair
amount of optional data that can be provided as well, so using a purely textual
diff worried me that exporting additional optional data in the future would
require extensive unrelated changes to all SARIF diffs in the test directory.
The goal for this test was to demonstrate that we can validate that the
interesting bits of information are present in the output without worrying
about the details.
Also, the python approach allows us to express relationships between data items
more easily than a textual diff tool would. I've not used that here, but I
could imagine a test where we want to check that each code location has a
corresponding file entry in another list.
================
Comment at: StaticAnalyzer/Core/SarifDiagnostics.cpp:74
+ // Add the rest of the path components, encoding any reserved characters.
+ std::for_each(std::next(sys::path::begin(Filename)),
sys::path::end(Filename),
+ [&Ret](StringRef Component) {
----------------
george.karpenkov wrote:
> Nitpicking style, but I don't see why for-each loop, preferably with a range
> wrapping the iterators would not be more readable.
I tend to prefer using algorithms when the logic is simple -- it makes it more
clear that the loop is an unimportant detail. I don't have strong opinions on
this particular loop, however.
================
Comment at: StaticAnalyzer/Core/SarifDiagnostics.cpp:88
+ // URI encode the part.
+ for (char C : Component) {
+ // RFC 3986 claims alpha, numeric, and this handful of
----------------
ZaMaZaN4iK wrote:
> I don't know about const corectness policy in LLVM. But I prefer here const
> char C .
We typically do not put top-level `const` on locals, so I'd prefer to leave it
off here rather than be inconsistent.
================
Comment at: StaticAnalyzer/Core/SarifDiagnostics.cpp:182
+ case PathDiagnosticPiece::Kind::Note:
+ // FIXME: What should be reported here?
+ break;
----------------
george.karpenkov wrote:
> "Note" are notes which do not have to be attached to a particular path
> element.
Good to know!
================
Comment at: StaticAnalyzer/Core/SarifDiagnostics.cpp:243
+
+ llvm::for_each(Diags, [&](const PathDiagnostic *D) {
+ Results.push_back(createResult(*D, Files));
----------------
george.karpenkov wrote:
> I like closures, but what's wrong with just using a `for` loop here?
Same as above: clarity of exposition. This one I'd feel pretty strongly about
keeping as an algorithm given how trivial the loop body is.
================
Comment at: StaticAnalyzer/Core/SarifDiagnostics.cpp:254
+ std::vector<const PathDiagnostic *> &Diags, FilesMade *) {
+ // FIXME: if the file already exists, do we overwrite it with a single run,
+ // or do we append a run into the file if it's a valid SARIF log?
----------------
george.karpenkov wrote:
> Usually we overwrite the file and note that on stderr in such cases.
We took the decision internally to overwrite as well, but the SARIF format
allows for multiple runs within the same output file (so you can compare
analysis results for the same project over time). I think this will eventually
have to be user-controlled because I can see some users wanting to append to
the run and others wanting to overwrite. However, these log files can become
quite large in practice (GBs of data), so "read in the JSON and add to it" may
be implausible, hence why I punted for now.
I'll update the comment so it's not a FIXME.
https://reviews.llvm.org/D53814
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits