https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116613

--- Comment #16 from Kamil Dudka <kdudka at redhat dot com> ---
(In reply to David Malcolm from comment #15)
> (In reply to Kamil Dudka from comment #14)
> > 1. Will `file=` work with absolute paths?
> 
> Yes.  There might be some issues with expressing paths/filenames containing
> whitespace, '=', or ',' due to the way the option-parsing works, but
> hopefully that's acceptable.

Yes, OSH does not need such characters in the names of diagnostic output files.

> > 2. If a file with the specified name exists, will it be overwritten?
> 
> Short answer: yes
> 
> Longer answer: My plan is to do:
>   fopen (path, "w") 
> on any specified outputs as cc1 starts up, and to fail with a hard error if
> any of the fopen fail, telling you why.  Hence it could fail if a file with
> the specified name exists, but e.g. you don't have write permissions on it.

Sounds good.

> Perhaps we need a specific exit code for the case of "can't open diagnostic
> output stream"?

I do not think we need it but it may ease debugging in some corner cases.

> > 3. Will the file always be created, even if no diagnostics is produced by
> > gcc?
> 
> Yes, with the caveat that if cc1 can't fopen a diagnostic output file it
> will fail immediately (as above), and obviously the file won't be created in
> such a case.

Although the SARIF file may contain useful info even for runs with no warnings,
OSH only collects the warnings.  SARIF files that contain no warnings may
complicate the processing of results (e.g. by exceeding command line length
while expanding globs in our shell scripts).  On the other hand, it is not that
difficult to remove the SARIF files with no warnings after the build has
finished.  We already take a similar approach with ShellCheck:
https://github.com/csutils/csmock/blob/455de9cdadf5d828c3b380c87e3f6f24dddad6ba/scripts/run-shellcheck.sh#L95
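
A minimal sketch of such a post-build cleanup, assuming the SARIF files end up
in one flat directory with a `.sarif` suffix.  The function names are
hypothetical and this is not what csmock does today:

```python
import json
from pathlib import Path


def sarif_has_results(path: Path) -> bool:
    """Return True if any run in the SARIF file reports at least one result."""
    with path.open(encoding="utf-8") as f:
        doc = json.load(f)
    return any(run.get("results") for run in doc.get("runs", []))


def prune_empty_sarif(results_dir: Path) -> list:
    """Delete SARIF files that contain no results; return the removed paths."""
    removed = []
    for path in sorted(results_dir.glob("*.sarif")):
        try:
            if sarif_has_results(path):
                continue
        except (OSError, json.JSONDecodeError):
            # Keep unreadable or truncated files for manual inspection.
            continue
        path.unlink()
        removed.append(path)
    return removed
```

Iterating over the directory in the script itself avoids expanding a glob on a
command line, so the command line length limit is not a concern here.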

> > 4. Will the SARIF data contain absolute paths to source code files?  If not
> > will the working directory be recorded in each SARIF file?
> 
> GCC's current behavior is (as of GCC 13):
> 
> * for absolute paths, the GCC SARIF output for the "artifact" will have an
> absolute value in its "uri".
> 
>   "artifacts": [{"location": {"uri": "/tmp/test.c"},
> 
> * for relative paths, the GCC SARIF output for the "artifact" will have a
> relative uri, and a "uriBaseId" of "PWD":
> 
>   "artifacts": [{"location": {"uri":
> "../../src/gcc/testsuite/gcc.dg/analyzer/malloc-1.c",
>                               "uriBaseId": "PWD"},
> 
> and the "run" will have this giving the absolute path for "PWD":
>    "originalUriBaseIds": {"PWD": {"uri":
> "file:///home/david/coding-3/gcc-newgit-path-revamp/build/gcc/"}},
> 
> As of GCC 15 the "run"'s "invocation" also has the "workingDirectory"
> property:
> 
>   "workingDirectory": {"uri":
> "/home/david/coding-3/gcc-newgit-path-revamp/build/gcc"},

Thanks for the explanation!  So at least the full path can be reconstructed later
on.  I have created a csdiff issue for this:
https://github.com/csutils/csdiff/issues/209
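
A rough sketch of that reconstruction, based only on the fields quoted above
("uri", "uriBaseId", and "originalUriBaseIds").  The helper names are
hypothetical, and POSIX-style paths are assumed:

```python
import posixpath
from urllib.parse import unquote, urlparse


def _uri_to_path(uri: str) -> str:
    """Strip a file:// scheme if present and decode percent-escapes."""
    parsed = urlparse(uri)
    return unquote(parsed.path if parsed.scheme == "file" else uri)


def absolute_artifact_path(run: dict, location: dict) -> str:
    """Resolve an artifact "location" against the run's originalUriBaseIds."""
    path = _uri_to_path(location["uri"])
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    base_id = location.get("uriBaseId")
    base_uri = run.get("originalUriBaseIds", {}).get(base_id, {}).get("uri")
    if base_uri is None:
        raise ValueError(f"no base directory recorded for {location['uri']!r}")
    # normpath collapses ".." lexically, without consulting the file system.
    return posixpath.normpath(posixpath.join(_uri_to_path(base_uri), path))
```

Because ".." is collapsed lexically, the result may differ from the real path
if the build tree contains symlinked directories.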

> Re my idea of:
> 
> >  -fdiagnostics-add-output=sarif:file={output-filename}.sarif
> > 
> > where e.g. {output-filename} would be substituted with the output filename 
> > from the gcc invocation.
> 
> note that I don't have that working yet, or a precise set of "substitution"
> names and their semantics (I plan to work on it today).  What
> "substitutions" might you need for your use-cases?
> 
> Does the above support all your use-cases?

OSH does not care too much about the names of SARIF files because all the
important data are contained in the files themselves.  The ideal SARIF-based
workflow for OSH would be:
1. create an empty directory for scan results
2. run the (instrumented) build of a C/C++ project
3. each invocation of gcc (that produces warnings) during the build creates a
unique file with SARIF data in the pre-created directory with scan results
4. all the files created in the pre-created directory with scan results can be
processed after the build

Step 3 can be partially implemented in the compiler wrapper with flock/mktemp,
which can invoke gcc with an absolute path of an already created empty file to
write the SARIF data to.  If we take this approach, OSH will not need any such
substitutions in gcc.  If gcc provided a substitution to construct a unique
file name (such as %p and %n in valgrind), OSH would not need to implement this
part in the compiler wrapper.
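
For illustration, the wrapper side could look roughly like the sketch below.
The function name is hypothetical, and the option spelling is taken from the
proposal quoted above, so it may differ from what gcc finally accepts:

```python
import os
import tempfile


def sarif_wrapper_argv(gcc_argv: list, results_dir: str) -> list:
    """Return gcc_argv extended with a SARIF output written to a unique file."""
    # mkstemp creates the file atomically, so parallel gcc invocations
    # never end up sharing a file name.
    fd, sarif_path = tempfile.mkstemp(prefix="gcc-", suffix=".sarif",
                                      dir=os.path.abspath(results_dir))
    os.close(fd)  # gcc is expected to reopen the file by name
    return [*gcc_argv, f"-fdiagnostics-add-output=sarif:file={sarif_path}"]
```

The wrapper would then call something like `os.execvp(argv[0], argv)` on the
returned list.  The results directory itself must not contain whitespace, '=',
or ',' in its path, given the option-parsing caveat from comment #15.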
