On Thu, 2026-01-29 at 11:58 -0500, David Malcolm wrote:
> On Wed, 2026-01-28 at 17:53 -0500, Siddhesh Poyarekar wrote:
> > On 2026-01-28 10:41, Qing Zhao via Gcc wrote:
> > > Does GCC provide any option to record optimization information,
> > > such as inlining, loop transformation, profiling consistency,
> > > etc into specific sections of binary code?
> >
> > I may be misremembering this, but I think David had some ideas
> > about doing something like this in SARIF.
>
> Several thoughts here:
>
> (a) I've written a prototype that embeds SARIF as an ELF section in
> the generated object file, rather like debuginfo (my idea at the
> time being that a binary could contain within it its build flags
> and other metadata, and its diagnostics, etc). I don't think I
> posted it to the mailing list though.
>
> (b) A long time ago I prototyped a gcc implementation of llvm's idea
> of optimization remarks, to send info optimization through the
> diagnostics subsystem, but IIRC that work ended up as the revamp of
> optinfo (in GCC 9?; see my Cauldron 2018 talk on optimization
> records), which generalized some of the internals of how we track
> optimization info. The machine-readable output is a custom
> json-based format.
>
> (c) SARIF would probably be a good fit for optimization records;
> it's machine-readable, and has a rich vocabulary for source
> locations, code constructs, machine locations, etc; IDEs and other
> tooling understand it, so they'd get a source-level view of
> optimization info "for free". Note that currently our SARIF output
> captures the contents of every source file referred to by any
> diagnostics, but we could e.g. capture every source file/header
> used during the compile, and could capture e.g. SHA1 sums rather
> than file content.
>
> (d) I've added the ability to add custom info to diagnostic sinks;
> see e.g. capturing CFG information in
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e20eee3897ae8cd0f2212dad0710d64df8f1a956
>
> (e) I've added a new publish/subscribe framework to GCC for loosely
> coupled notifications that would probably help with the
> implementation (to avoid needing to have the diagnostics subsystem
> "know" too much about the optimizer).
>
> So possible GCC 17 material might be:
>
> (d) add a new sink to the optinfo subsystem that adds a new pub/sub
> channel about optimization info, and sends notifications about the
> optimization records there
>
> (e) add a new option to -fdiagnostics-add-output to capture optinfo,
> which when enabled subscribes the diagnostic sink to the optinfo
> notifications channel. Or we just skip (d) and work more directly
> with optinfo, but (d) allows some extra flexibility e.g. for
> plugins that listen for optimization decisions.
>
> (f) potentially add a new option to the SARIF sink to support
> embedding the data in an ELF section, rather than writing to a file
> (as per (a) above).
>
> Brainstorming, the user might be able to do something like:
>
>   -fdiagnostics-add-output=sarif:elf-section=optimizations,optinfo=inline
>
> or whatnot, and have an ELF section capturing the decisions made by
> the inliner.
>
> Or we could have an option to send optinfo as diagnostics, like
> LLVM's optimization records (and (b) above), and have the
> diagnostics sinks handle them that way (text, SARIF, HTML).
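
To make the shape of the pub/sub idea concrete, here is a rough sketch of an optimization-info channel with a SARIF-collecting subscriber. It is in Python rather than GCC's C++, purely for brevity. Every name in it (OptRecord, Channel, SarifSink, the "optinfo." rule-id prefix, the "optKind" property) is made up for illustration and is not GCC's pub/sub or optinfo API; only the SARIF 2.1.0 result layout is taken from the standard.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class OptRecord:
    """Hypothetical optimization record (not GCC's optinfo class)."""
    pass_name: str  # e.g. "inline" or "vect"
    kind: str       # e.g. "success" or "failure"
    message: str
    file: str
    line: int


class Channel:
    """Minimal pub/sub channel: the publisher never sees its subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[OptRecord], None]] = []

    def subscribe(self, callback: Callable[[OptRecord], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, record: OptRecord) -> None:
        for callback in self._subscribers:
            callback(record)


class SarifSink:
    """Subscriber that turns records into SARIF 2.1.0 result objects."""

    def __init__(self, pass_filter: Optional[str] = None) -> None:
        self.pass_filter = pass_filter  # None means "keep every pass"
        self.results: List[dict] = []

    def on_record(self, record: OptRecord) -> None:
        if self.pass_filter is not None and record.pass_name != self.pass_filter:
            return
        self.results.append({
            "ruleId": f"optinfo.{record.pass_name}",  # hypothetical naming
            "level": "note",
            "message": {"text": record.message},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": record.file},
                    "region": {"startLine": record.line},
                }
            }],
            "properties": {"optKind": record.kind},  # hypothetical property
        })

    def to_sarif(self) -> dict:
        return {
            "version": "2.1.0",
            "runs": [{
                "tool": {"driver": {"name": "example-compiler"}},
                "results": self.results,
            }],
        }


# The optimizer side only publishes; the sink decides what it keeps.
channel = Channel()
sink = SarifSink(pass_filter="inline")
channel.subscribe(sink.on_record)
channel.publish(OptRecord("inline", "success", "inlined foo into bar", "t.c", 12))
channel.publish(OptRecord("vect", "failure", "loop not vectorized", "t.c", 30))
log = sink.to_sarif()
```

The point of the indirection is that the publisher has no dependency on SARIF at all, so a plugin could subscribe to the same channel without the diagnostics subsystem being involved.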

Some more thoughts:

* raw JSON might be rather large for the SARIF, but it compresses
  well. I've experimented with other serializations (CBOR), but the
  savings I saw didn't justify adding a new binary format. That said,
  protobuf might be an interesting approach.

* the optinfo records have a nested structure (e.g. info about the
  logic within the vectorizer) that seems similar to that of the
  hierarchical C++ messages we now emit for template errors. So I'd
  want to explore reusing that framework; this could make vectorizer
  reports much easier for users to read.

Dave
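
P.S. A quick way to sanity-check the "compresses well" point on a given log is something like the sketch below. The log it builds is synthetic (the tool name, rule id, file names and messages are all invented), and zlib is only a stand-in for whatever compression one would actually pick. Since the synthetic results are far more repetitive than real output would be, the sizes it prints say nothing about real GCC logs.

```python
import json
import zlib
from typing import Tuple


def make_synthetic_sarif(n_results: int) -> dict:
    """Build a SARIF 2.1.0 log with n_results near-identical results."""
    results = [
        {
            "ruleId": "optinfo.inline",  # hypothetical rule id
            "level": "note",
            "message": {"text": f"inlined callee_{i} into caller_{i}"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f"src/file_{i % 10}.c"},
                    "region": {"startLine": i + 1},
                }
            }],
        }
        for i in range(n_results)
    ]
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "example-compiler"}},
            "results": results,
        }],
    }


def compress_sarif(log: dict) -> Tuple[bytes, bytes]:
    """Return (raw UTF-8 JSON, zlib-compressed JSON) for a SARIF log."""
    raw = json.dumps(log, separators=(",", ":")).encode("utf-8")
    return raw, zlib.compress(raw, 9)  # 9 = best compression


raw, packed = compress_sarif(make_synthetic_sarif(1000))
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```

Running the same two functions over a real .sarif file (json.load it, then call compress_sarif) would give a fairer number.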
