On Mon, Apr 11, 2022 at 07:19:00PM +0200, Martin Pieuchot wrote: > On 08/04/22(Fri) 12:16, Alexander Bluhm wrote: > > On Fri, Apr 08, 2022 at 02:39:34AM +0000, Visa Hankala wrote: > > > On Thu, Apr 07, 2022 at 07:55:11PM +0200, Alexander Bluhm wrote: > > > > On Wed, Mar 23, 2022 at 06:13:27PM +0100, Alexander Bluhm wrote: > > > > > In my opinion tracepoints give insight at minimal cost. It is worth > > > > > it to have it in GENERIC to make it easy to use. > > > > > > > > After release I want to revive the btrace of refcounts discussion. > > > > > > > > As mpi@ mentioned the idea of dt(4) is to have these trace points > > > > in GENERIC kernel. If you want to hunt a bug, just turn it on. > > > > Refounting is a common place for bugs, leaks can be detected easily. > > > > > > > > The alternative are some defines that you can compile in and access > > > > from ddb. This is more work and you would have to implement it for > > > > every recount. > > > > https://marc.info/?l=openbsd-tech&m=163786435916039&w=2 > > > > > > > > There is no measuarable performance difference. dt(4) is written > > > > in a way that is is only one additional branch. At least my goal > > > > is to add trace points to useful places when we identify them. > > > > > > DT_INDEX_ENTER() still checks the index first, so it has two branches > > > in practice. > > > > > > I think dt_tracing should be checked first so that it serves as > > > a gateway to the trace code. Under normal operation, the check's > > > outcome is always the same, which is easy even for simple branch > > > predictors. > > > > Reordering the check is easy. Now dt_tracing is first. > > > > > I have a slight suspicion that dt(4) is now becoming a way to add code > > > that would be otherwise unacceptable. Also, how "durable" are trace > > > points perceived? Is an added trace point an achieved advantage that > > > is difficult to remove even when its utility has diminished? There is > > > a similarity to (ad hoc) debug printfs too. > > > > As I understand dt(4) it is a replacement for debug printfs. But > > it has advantages. I can be turnd on selectively from userland. > > It does not spam the console, but can be processed in userland. It > > is always there, you don't have to recompile. > > > > Of course you always have the printf or tracepoint at the worng > > place. I think people debugging the code should move them to > > the useful places. Then we may end with generally useful tool. > > A least that is my hope. > > > > There are obvious places to debug. We have syscall entry and return. > > And I think reference counting is also generally interesting. > > I'm happy if this can help debugging real reference counting issues. Do > you have a script that could be committed to /usr/share/btrace to show > how to track reference counting using these probes?
Script looks like this: #!/usr/sbin/btrace tracepoint:refcnt:inpcb{ printf("%s %x %u %+d\n", probe, arg0, arg1, arg2) } Note that output should be -1 instead of +4294967295, but that is a different problem. tracepoint:refcnt:inpcb fffffd80793885c0 2 +1 tracepoint:refcnt:inpcb fffffd80793885c0 3 +4294967295 tracepoint:refcnt:inpcb fffffd80793885c0 2 +1 tracepoint:refcnt:inpcb fffffd80793885c0 3 +4294967295 tracepoint:refcnt:inpcb fffffd80793885c0 2 +1 tracepoint:refcnt:inpcb fffffd80793885c0 3 +4294967295 Or with kernel stack: #!/usr/sbin/btrace tracepoint:refcnt:inpcb{ printf("%s %x %u %+d%s\n", probe, arg0, arg1, arg2, kstack) } tracepoint:refcnt:inpcb fffffd80793885c0 3 +4294967295 refcnt_rele+0x88 in_pcbunref+0x24 pf_find_state+0x2a6 pf_test_state+0x172 pf_test+0xd17 ip6_output+0xd14 tcp_output+0x164f tcp_usrreq+0x386 sosend+0x37c dofilewritev+0x14d sys_write+0x51 syscall+0x314 Xsyscall+0x128 kernel > > Index: dev/dt/dt_prov_static.c > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/dt/dt_prov_static.c,v > > retrieving revision 1.13 > > diff -u -p -r1.13 dt_prov_static.c > > --- dev/dt/dt_prov_static.c 17 Mar 2022 14:53:59 -0000 1.13 > > +++ dev/dt/dt_prov_static.c 8 Apr 2022 09:40:29 -0000 > > @@ -87,6 +87,12 @@ DT_STATIC_PROBE1(smr, barrier_exit, "int > > DT_STATIC_PROBE0(smr, wakeup); > > DT_STATIC_PROBE2(smr, thread, "uint64_t", "uint64_t"); > > > > +/* > > + * reference counting > > + */ > > +DT_STATIC_PROBE0(refcnt, none); > > +DT_STATIC_PROBE3(refcnt, inpcb, "void *", "int", "int"); > > +DT_STATIC_PROBE3(refcnt, tdb, "void *", "int", "int"); > > > > /* > > * List of all static probes > > @@ -127,15 +133,24 @@ struct dt_probe *const dtps_static[] = { > > &_DT_STATIC_P(smr, barrier_exit), > > &_DT_STATIC_P(smr, wakeup), > > &_DT_STATIC_P(smr, thread), > > + /* refcnt */ > > + &_DT_STATIC_P(refcnt, none), > > + &_DT_STATIC_P(refcnt, inpcb), > > + &_DT_STATIC_P(refcnt, tdb), > > }; > > > > +struct dt_probe *const *dtps_index_refcnt; > > + > > int > > dt_prov_static_init(void) > > { > > int i; > > > > - for (i = 0; i < nitems(dtps_static); i++) > > + for (i = 0; i < nitems(dtps_static); i++) { > > + if (dtps_static[i] == &_DT_STATIC_P(refcnt, none)) > > + dtps_index_refcnt = &dtps_static[i]; > > dt_dev_register_probe(dtps_static[i]); > > + } > > > > return i; > > } > > Index: dev/dt/dtvar.h > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/dt/dtvar.h,v > > retrieving revision 1.13 > > diff -u -p -r1.13 dtvar.h > > --- dev/dt/dtvar.h 27 Feb 2022 10:14:01 -0000 1.13 > > +++ dev/dt/dtvar.h 8 Apr 2022 09:42:19 -0000 > > @@ -313,11 +313,30 @@ extern volatile uint32_t dt_tracing; /* > > #define DT_STATIC_ENTER(func, name, args...) do { > > \ > > extern struct dt_probe _DT_STATIC_P(func, name); \ > > struct dt_probe *dtp = &_DT_STATIC_P(func, name); \ > > - struct dt_provider *dtpv = dtp->dtp_prov; \ > > \ > > if (__predict_false(dt_tracing) && \ > > __predict_false(dtp->dtp_recording)) { \ > > + struct dt_provider *dtpv = dtp->dtp_prov; \ > > + \ > > dtpv->dtpv_enter(dtpv, dtp, args); \ > > + } \ > > +} while (0) > > + > > +#define _DT_INDEX_P(func) (dtps_index_##func) > > + > > +#define DT_INDEX_ENTER(func, index, args...) do { \ > > + extern struct dt_probe **_DT_INDEX_P(func); \ > > + \ > > + if (__predict_false(dt_tracing) && \ > > + __predict_false(index > 0) && \ > > + __predict_true(_DT_INDEX_P(func) != NULL)) { \ > > + struct dt_probe *dtp = _DT_INDEX_P(func)[index]; \ > > + \ > > + if(__predict_false(dtp->dtp_recording)) { \ > > + struct dt_provider *dtpv = dtp->dtp_prov; \ > > + \ > > + dtpv->dtpv_enter(dtpv, dtp, args); \ > > + } \ > > } \ > > } while (0) > > > > Index: kern/kern_synch.c > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_synch.c,v > > retrieving revision 1.185 > > diff -u -p -r1.185 kern_synch.c > > --- kern/kern_synch.c 18 Mar 2022 15:32:06 -0000 1.185 > > +++ kern/kern_synch.c 8 Apr 2022 09:40:29 -0000 > > @@ -804,7 +804,15 @@ sys___thrwakeup(struct proc *p, void *v, > > void > > refcnt_init(struct refcnt *r) > > { > > + refcnt_init_trace(r, 0); > > +} > > + > > +void > > +refcnt_init_trace(struct refcnt *r, int idx) > > +{ > > + r->r_traceidx = idx; > > atomic_store_int(&r->r_refs, 1); > > + TRACEINDEX(refcnt, r->r_traceidx, r, 0, +1); > > } > > > > void > > @@ -814,6 +822,7 @@ refcnt_take(struct refcnt *r) > > > > refs = atomic_inc_int_nv(&r->r_refs); > > KASSERT(refs != 0); > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs - 1, +1); > > (void)refs; > > } > > > > @@ -824,6 +833,7 @@ refcnt_rele(struct refcnt *r) > > > > refs = atomic_dec_int_nv(&r->r_refs); > > KASSERT(refs != ~0); > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs + 1, -1); > > return (refs == 0); > > } > > > > @@ -842,11 +852,13 @@ refcnt_finalize(struct refcnt *r, const > > > > refs = atomic_dec_int_nv(&r->r_refs); > > KASSERT(refs != ~0); > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs + 1, -1); > > while (refs) { > > sleep_setup(&sls, r, PWAIT, wmesg, 0); > > refs = atomic_load_int(&r->r_refs); > > sleep_finish(&sls, refs); > > } > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs, 0); > > } > > > > int > > @@ -855,6 +867,7 @@ refcnt_shared(struct refcnt *r) > > u_int refs; > > > > refs = atomic_load_int(&r->r_refs); > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs, 0); > > return (refs > 1); > > } > > > > @@ -864,6 +877,7 @@ refcnt_read(struct refcnt *r) > > u_int refs; > > > > refs = atomic_load_int(&r->r_refs); > > + TRACEINDEX(refcnt, r->r_traceidx, r, refs, 0); > > return (refs); > > } > > > > Index: netinet/in_pcb.c > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/in_pcb.c,v > > retrieving revision 1.264 > > diff -u -p -r1.264 in_pcb.c > > --- netinet/in_pcb.c 22 Mar 2022 18:02:54 -0000 1.264 > > +++ netinet/in_pcb.c 8 Apr 2022 09:40:29 -0000 > > @@ -235,7 +235,7 @@ in_pcballoc(struct socket *so, struct in > > return (ENOBUFS); > > inp->inp_table = table; > > inp->inp_socket = so; > > - refcnt_init(&inp->inp_refcnt); > > + refcnt_init_trace(&inp->inp_refcnt, DT_REFCNT_IDX_INPCB); > > inp->inp_seclevel[SL_AUTH] = IPSEC_AUTH_LEVEL_DEFAULT; > > inp->inp_seclevel[SL_ESP_TRANS] = IPSEC_ESP_TRANS_LEVEL_DEFAULT; > > inp->inp_seclevel[SL_ESP_NETWORK] = IPSEC_ESP_NETWORK_LEVEL_DEFAULT; > > Index: netinet/ip_ipsp.c > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_ipsp.c,v > > retrieving revision 1.269 > > diff -u -p -r1.269 ip_ipsp.c > > --- netinet/ip_ipsp.c 10 Mar 2022 15:21:08 -0000 1.269 > > +++ netinet/ip_ipsp.c 8 Apr 2022 09:40:29 -0000 > > @@ -1048,7 +1048,7 @@ tdb_alloc(u_int rdomain) > > > > tdbp = pool_get(&tdb_pool, PR_WAITOK | PR_ZERO); > > > > - refcnt_init(&tdbp->tdb_refcnt); > > + refcnt_init_trace(&tdbp->tdb_refcnt, DT_REFCNT_IDX_TDB); > > mtx_init(&tdbp->tdb_mtx, IPL_SOFTNET); > > TAILQ_INIT(&tdbp->tdb_policy_head); > > > > Index: sys/refcnt.h > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/sys/refcnt.h,v > > retrieving revision 1.6 > > diff -u -p -r1.6 refcnt.h > > --- sys/refcnt.h 16 Mar 2022 14:13:01 -0000 1.6 > > +++ sys/refcnt.h 8 Apr 2022 09:40:29 -0000 > > @@ -21,24 +21,30 @@ > > > > /* > > * Locks used to protect struct members in this file: > > + * I immutable after creation > > * a atomic operations > > */ > > > > struct refcnt { > > unsigned int r_refs; /* [a] reference counter */ > > + int r_traceidx; /* [I] index for dt(4) tracing */ > > }; > > > > -#define REFCNT_INITIALIZER() { .r_refs = 1 } > > +#define REFCNT_INITIALIZER() { .r_refs = 1, .r_traceidx = 0 } > > > > #ifdef _KERNEL > > > > void refcnt_init(struct refcnt *); > > +void refcnt_init_trace(struct refcnt *, int id); > > void refcnt_take(struct refcnt *); > > int refcnt_rele(struct refcnt *); > > void refcnt_rele_wake(struct refcnt *); > > void refcnt_finalize(struct refcnt *, const char *); > > int refcnt_shared(struct refcnt *); > > unsigned int refcnt_read(struct refcnt *); > > + > > +#define DT_REFCNT_IDX_INPCB 1 > > +#define DT_REFCNT_IDX_TDB 2 > > > > #endif /* _KERNEL */ > > > > Index: sys/tracepoint.h > > =================================================================== > > RCS file: /data/mirror/openbsd/cvs/src/sys/sys/tracepoint.h,v > > retrieving revision 1.1 > > diff -u -p -r1.1 tracepoint.h > > --- sys/tracepoint.h 21 Jan 2020 16:16:23 -0000 1.1 > > +++ sys/tracepoint.h 8 Apr 2022 09:40:29 -0000 > > @@ -25,11 +25,13 @@ > > #if NDT > 0 > > #include <dev/dt/dtvar.h> > > > > -#define TRACEPOINT(func, name, args...) DT_STATIC_ENTER(func, name, > > args) > > +#define TRACEPOINT(func, name, args...) DT_STATIC_ENTER(func, name, > > args) > > +#define TRACEINDEX(func, index, args...) DT_INDEX_ENTER(func, index, args) > > > > #else /* NDT > 0 */ > > > > -#define TRACEPOINT(func, name, args...) > > +#define TRACEPOINT(func, name, args...) > > +#define TRACEINDEX(func, index, args...) > > > > #endif /* NDT > 0 */ > > #endif /* _KERNEL */