On Mon, Mar 21, 2022 at 01:22:22PM +0100, Martin Pieuchot wrote:
> On 20/03/22(Sun) 05:39, Visa Hankala wrote:
> > On Sat, Mar 19, 2022 at 12:10:11AM +0100, Alexander Bluhm wrote:
> > > On Thu, Mar 17, 2022 at 07:25:27AM +0000, Visa Hankala wrote:
> > > > On Thu, Mar 17, 2022 at 12:42:13AM +0100, Alexander Bluhm wrote:
> > > > > > I would like to use btrace to debug reference counting.  The idea
> > > > > > is to add a tracepoint for every type of refcnt we have.  When it
> > > > > changes, print the actual object, the current counter and the change
> > > > > value.
> > > > 
> > > > > Do we want that feature?
> > > > 
> > > > I am against this in its current form. The code would become more
> > > > complex, and the trace points can affect timing. There is a risk that
> > > > the kernel behaves slightly differently when dt has been compiled in.
> > > 
> > > On our main architectures dt(4) is in GENERIC.  I see your timing
> > > point for uvm structures.
> > 
> > In my opinion, having dt(4) enabled by default is another reason why
> > there should be no carte blanche for adding trace points. Each trace
> > point adds a tiny amount of bloat. Few users will use the tracing
> > facility.
> > 
> > Maybe high-rate trace points could be behind a build option...
> 
> The whole point of dt(4) is to be able to debug GENERIC kernel.  I doubt
> the cost of an additional if () block matters.

The idea of dt(4) is that a developer, or an end user following
instructions, can debug a running kernel without recompiling.  So we
have to put trace points at places where we gain much information.
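
To make the "additional if () block" concrete, here is a minimal
userland sketch of a reference count with a trace point.  It is not
the dt(4) code: struct probe, probe_record(), refobj_take() and
refobj_rele() are hypothetical names invented for this example, and
the counter is not atomic as it would have to be in the kernel.  I
also assume that "current counter" means the value after the change.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a trace point; not the dt(4) API. */
struct probe {
	int		enabled;	/* nonzero while a tracer is attached */
	unsigned long	fired;		/* number of events recorded */
};

/* Hypothetical reference counted object. */
struct refobj {
	int		refs;
};

/* Slow path, only reached while tracing is enabled. */
static void
probe_record(struct probe *p, void *obj, int count, int delta)
{
	p->fired++;
	/* Print the object, the counter after the change, and the change. */
	printf("obj %p refs %d change %+d\n", obj, count, delta);
}

/* Take a reference. */
static void
refobj_take(struct refobj *o, struct probe *p)
{
	o->refs++;
	if (p->enabled)		/* the only extra work when nobody traces */
		probe_record(p, o, o->refs, +1);
}

/* Release a reference, return 1 if it was the last one. */
static int
refobj_rele(struct refobj *o, struct probe *p)
{
	assert(o->refs > 0);
	o->refs--;
	if (p->enabled)
		probe_record(p, o, o->refs, -1);
	return (o->refs == 0);
}
```

With enabled set to 0, each call pays for one load and one branch.
The printf() stands in for whatever the tracer does with the event
and is only reached once tracing has been switched on.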

I did some measurements with and without dt.  Note that I configure
my test machines with sysctl kern.allowdt=1.  I had to disable it
in a kernel diff.

http://bluhm.genua.de/perform/results/2022-03-21T09%3A08%3A37Z/perform.html

I see differences from moving the kernel objects.  Even rebooting and
testing again gives more variance than dt(4) does.

The story is different when btrace(8) is actually running.  Look
at the numbers in the right column.

http://bluhm.genua.de/perform/results/2022-03-21T09%3A08%3A37Z/2022-03-21T00%3A00%3A00Z/perform.html

For the network tests it does not matter, as our IP stack uses only
one or maybe two cores.  On a 4-core machine, btrace userland can
use one core.  When compiling the kernel in the "make-bsd-j4" test
row, the build time goes up as btrace takes CPU time from the
compiler.

In my opinion tracepoints give insight at minimal cost.  It is worth
having dt(4) in GENERIC to make it easy to use.

bluhm
