On Fri, Feb 12, 2021 at 5:40 PM John Bates <[email protected]> wrote:
>
>
>
> On Fri, Feb 12, 2021 at 4:34 PM Rob Clark <[email protected]> wrote:
>>
>> On Thu, Feb 11, 2021 at 5:40 PM John Bates <[email protected]> wrote:
>> >
>>
>> <snip>
>>
>> > Runtime Characteristics
>> >
>> > ~500KB additional binary size. Even with using only the basic features of
>> > perfetto, it will increase the binary size of mesa by about 500KB.
>>
>> IMHO, that size is negligible.. looking at freedreno, a mesa build
>> *only* enabling freedreno is already ~6MB.. distros typically use
>> "megadriver" (ie. all the drivers linked into a single .so with hard
>> links for the different ${driver}_dri.so), which on my fedora laptop
>> is ~21M. Maybe if anything is relevant it is how much of that
>> actually gets paged into RAM from disk, but I think 500K isn't a thing
>> to worry about too much.
>>
>> > Background thread. Perfetto uses a background thread for communication
>> > with the system tracing daemon (traced) to advertise trace data and get
>> > notification of trace start/stop.
>>
>> Mesa already tends to have plenty of threads.. some of that depends on
>> the driver, I think currently radeonsi is the threading king, but
>> there are several other drivers working on threaded_context and async
>> compile thread pool.
>>
>> It is worth mentioning that, AFAIU, perfetto can operate in
>> self-server mode, which seems like it would be useful for distros
>> which do not have the system daemon. I'm not sure if we lose that
>> with percetto?
>
>
> Easy to add, but want to avoid a runtime arg because it would add ~300KB to
> binary size. Okay if we have an alternate init function though.

I could imagine wanting mesa build params to control whether
we want self-server or system-server mode.. ie. if some distros add
system-server support they wouldn't need self-server mode, and vice
versa.

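Roughly what I have in mind, as an untested sketch. The define name
(MESA_TRACE_SELF_SERVER) and the enum are made up for illustration and
are stand-ins, not perfetto API; the real thing would map onto whatever
backend selection the SDK exposes at init time:

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical stand-ins for the SDK's backend flags, so this builds
 * without perfetto.h. */
enum trace_backend : uint32_t {
   TRACE_BACKEND_IN_PROCESS = 1u << 0, /* self-server: service lives in-process */
   TRACE_BACKEND_SYSTEM     = 1u << 1, /* system-server: talk to the traced daemon */
};

/* Hypothetical build-time define, e.g. set from a build option. */
#ifndef MESA_TRACE_SELF_SERVER
#define MESA_TRACE_SELF_SERVER 0
#endif

/* The mode is picked at compile time, so only one code path is built. */
constexpr uint32_t mesa_trace_backends()
{
#if MESA_TRACE_SELF_SERVER
   return TRACE_BACKEND_IN_PROCESS;
#else
   return TRACE_BACKEND_SYSTEM;
#endif
}

/* Human-readable name, handy for a debug print at init. */
const char *mesa_trace_backend_name(uint32_t backend)
{
   assert(backend == TRACE_BACKEND_IN_PROCESS || backend == TRACE_BACKEND_SYSTEM);
   return backend == TRACE_BACKEND_IN_PROCESS ? "self-server" : "system-server";
}
```

A distro with the system daemon would leave the define at 0, one
without it would build with it set to 1.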
>
>>
>>
>> > Runtime overhead when disabled is designed to be optimal with one
>> > predicted branch, typically a few CPU cycles per event. While enabled, the
>> > overhead can be around 1 us per event.
>> >
>> > Integration Challenges
>> >
>> > The perfetto SDK is C++ and designed around macros, lambdas, inline
>> > templates, etc. There are ongoing discussions on providing an official
>> > perfetto C API, but it is not yet clear when this will land on the
>> > perfetto roadmap.
>> > The perfetto SDK is an amalgamated .h and .cc that adds up to 100K lines
>> > of code.
>> > Anything that includes perfetto.h takes a long time to compile.
>> > The current Perfetto SDK design is incompatible with being a shared
>> > library behind a C API.
>>
>> So, C++ on its own isn't a showstopper, mesa has plenty of C++ code.
>> But maybe we should verify that MSVC is happy with it, otherwise we
>> need to take a bit more care in some parts of the codebase.
>>
>> As far as compile time, I wonder if we can regenerate the .cc/.h with
>> only the gpu trace parts? But I wouldn't expect the .h to be
>> something widely included. For example, for gpu timeline traces in
>> freedreno, I'm expecting it to look like a freedreno_perfetto.cc with
>> extern "C" {} around the callbacks that would hook into the
>> u_tracepoint tracepoints. That one file would pull in the perfetto
>> .h, and we'd just not build that file if perfetto was disabled.
>
>
> That works for GPU, but I'd like to see some slow CPU functions in traces as
> well to help reason about performance problems. This ends up peppering the
> trace header in lots of places.

My point was that we could strip out a whole lot of stuff that is
completely unrelated to mesa.. not sure it is worth bothering with
though, since I doubt we'd #include perfetto.h very widely.

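To illustrate the single-file idea, here is an untested sketch of what
such a file could look like. All names (fd_trace_render_stage,
fd_trace_event_count) are hypothetical, and the vector is only a
stand-in for where the real file would hand the event to the perfetto
SDK:

```cpp
/* Sketch of the one C++ file that would pull in the perfetto header.
 * Nothing else in the driver needs to see C++ or perfetto types. */
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace {
/* Stand-in for the trace sink; the real file would emit through the SDK. */
struct stage_event {
   std::string name;
   uint64_t start_ns;
   uint64_t end_ns;
};
std::vector<stage_event> recorded;
}

extern "C" {

/* Hypothetical callback with C linkage, callable from the C tracepoints. */
void fd_trace_render_stage(const char *name, uint64_t start_ns, uint64_t end_ns)
{
   assert(name != nullptr);
   recorded.push_back({name, start_ns, end_ns});
}

/* Hypothetical helper, only here so the sketch can be checked. */
unsigned fd_trace_event_count(void)
{
   return static_cast<unsigned>(recorded.size());
}

}
```

If perfetto is disabled at build time this file simply isn't compiled,
and the C side never references it.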
>> Overall having to add our own extern C wrappers in some places doesn't
>> seem like the *end* of the world.. a bit annoying, but we might end up
>> doing that regardless if other folks want the ability to hook in
>> something other than perfetto?
>
>
> It's more than extern C wrappers if we want to minimize overhead while
> tracing is enabled at compile time. Have a look at percetto.h/cc.

I'm not sure how many distros are not using LTO these days.. I assume
once you have LTO the wrapper overhead doesn't really matter anymore?

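For reference, the pattern I understand we are talking about is
something like the untested sketch below: a cheap inline check in a
header, with the expensive part out of line. The names are made up,
and __builtin_expect assumes gcc/clang, so MSVC would need something
else. AFAIU without LTO the check has to live in a header to get
inlined, while with LTO the compiler can inline across files either
way:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

/* Hypothetical flag, flipped when a trace session starts or stops. */
static std::atomic<bool> trace_enabled{false};

/* Counter only so the sketch can be checked. */
static uint64_t slow_path_calls = 0;

/* Out-of-line slow path; in a real build this would sit in the one
 * file that includes the perfetto header. */
void trace_event_slow(const char *name)
{
   assert(name != nullptr);
   ++slow_path_calls;
}

/* Header-side fast path: one relaxed load plus a branch that is
 * hinted as not taken, so the disabled case stays cheap. */
static inline void trace_event(const char *name)
{
   if (__builtin_expect(trace_enabled.load(std::memory_order_relaxed), 0))
      trace_event_slow(name);
}
```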
>>
>>
>> <snip>
>>
>> > Mesa Integration Alternatives
>>
>> I'm kind of leaning towards the "just slurp in the .cc/.h" approach..
>> that is mostly because I expect to initially just add some basic gpu
>> timeline tracepoints, but over time iterate on adding more.. it would
>> be nice to not have to depend on a newer version of an external
>> library at each step. That is ofc only my $0.02..
>
>
> It's a small initial setup tax, true, but I still think it depends on what
> perfetto features we plan to use -- for only a couple files doing GPU tracing
> I agree percetto is unnecessary, but for CPU tracing it gets more complicated.

Definitely the first thing I plan to use is getting render stages onto
a timeline, so I can better see where the GPU time is going.. second
step is probably adding more gpu perfcntr.. and I guess the third
thing is more CPU-oriented things like seeing where shader compiles
are happening. Although threaded_context might also be a thing where
having some more CPU tracing could be useful?

BR,
-R
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev