On Wed, Apr 21, 2021 at 01:15:53PM +0000, Visa Hankala wrote:
> On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote:
> > On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote:
> > > if you have a program that uses kq (or libevent) to wait for bytes to
> > > read off an idle network interface via /dev/bpf and that interface
> > > goes away, the program doesn't get woken up. this is because the kq
> > > read filter in bpf only checks if there are bytes available. because a
> > > detached interface never gets packets (how very zen), this condition
> > > never changes and the program will never know something happened.
> > > 
> > > this diff has the bpf kq read filter also check whether the interface
> > > is detached. with this change my test program wakes up, tries to read,
> > > and gets EIO, which is great.
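
roughly, the test program looks something like this. the interface name
(vether0), the BIOCIMMEDIATE use, and the other details here are just for
illustration, not the actual test:

#include <sys/types.h>
#include <sys/event.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/time.h>

#include <net/if.h>
#include <net/bpf.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct ifreq ifr;
	struct kevent kev;
	char buf[BUFSIZ];
	ssize_t n;
	int bpf, kq, on = 1;

	/* open a bpf descriptor and bind it to an idle interface */
	bpf = open("/dev/bpf", O_RDONLY);
	if (bpf == -1)
		err(1, "/dev/bpf");

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, "vether0", sizeof(ifr.ifr_name));
	if (ioctl(bpf, BIOCSETIF, &ifr) == -1)
		err(1, "BIOCSETIF %s", ifr.ifr_name);
	if (ioctl(bpf, BIOCIMMEDIATE, &on) == -1)
		err(1, "BIOCIMMEDIATE");

	/* register the bpf fd with kq and wait for it to become readable */
	kq = kqueue();
	if (kq == -1)
		err(1, "kqueue");
	EV_SET(&kev, bpf, EVFILT_READ, EV_ADD, 0, 0, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		err(1, "kevent register");

	/* without the fix, this never returns if vether0 is destroyed */
	if (kevent(kq, NULL, 0, &kev, 1, NULL) == -1)
		err(1, "kevent wait");

	/* with the fix, we get here and the read fails with EIO */
	n = read(bpf, buf, sizeof(buf));
	if (n == -1)
		warn("read");
	else
		printf("read %zd bytes\n", n);

	return (0);
}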
> > > 
> > > note that in the middle of this is the vdevgone machinery. when an
> > > interface is detached, bpfdetach gets called, which ends up calling
> > > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with
> > > some dead operations, part of which involves calling bpfclose() to try
> > > and clean up the existing state associated with the vdev. bpfclose tries
> > > to wake up any waiting listeners, which includes kq handlers. that's how
> > > the kernel goes from an interface being detached to the bpf kq filter
> > > being run. the bpf kq filter just has to check that the interface is
> > > still attached.
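
roughly, the check amounts to something like this in filt_bpfread(). this
is just to show the idea, it's not the exact diff:

int
filt_bpfread(struct knote *kn, long hint)
{
	struct bpf_d *d = kn->kn_hook;
	int ready;

	mtx_enter(&d->bd_mtx);
	kn->kn_data = d->bd_hlen;
	if (d->bd_immediate)
		kn->kn_data += d->bd_slen;
	ready = (kn->kn_data > 0);
	if (d->bd_bif == NULL) {
		/* illustrative: the interface went away, report EOF */
		kn->kn_flags |= EV_EOF;
		ready = 1;
	}
	mtx_leave(&d->bd_mtx);

	return (ready);
}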
> > 
> > I thought tun(4) had this same problem, but I wrote a test and couldn't
> > reproduce it. tun works because it addresses the problem in a different
> > way. Instead of having its own kq filter check if the device is dead or
> > not, it calls klist_invalidate, which switches things around like the
> > vdevgone/vop_revoke stuff does with the vdev.
> > 
> > So an alternative way to solve this problem in bpf(4) would be the
> > following:
> > 
> > Index: bpf.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/bpf.c,v
> > retrieving revision 1.203
> > diff -u -p -r1.203 bpf.c
> > --- bpf.c   21 Jan 2021 12:33:14 -0000      1.203
> > +++ bpf.c   21 Apr 2021 00:54:30 -0000
> > @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode, 
> >     bpf_wakeup(d);
> >     LIST_REMOVE(d, bd_list);
> >     mtx_leave(&d->bd_mtx);
> > +   klist_invalidate(&d->bd_sel.si_note);
> >     bpf_put(d);
> >  
> >     return (0);
> 
> I think bpf should call klist_invalidate() from the detach path.
> bpfsdetach() might be the right place. This would make the code pattern
> similar to the existing uses of klist_invalidate().
> 
> Calling klist_invalidate() from the close function twists the logic,
> at least in my opinion. When a file descriptor is closed, the file
> descriptor layer will remove the knotes automatically. This is why close
> functions usually do not have to deal with knotes. However, the
> automatic removal does not happen when a device is revoked, which is
> mended by klist_invalidate().

yep, makes sense to me. how's this look? it works as well as my previous
diff did in my tests.

Index: bpf.c
===================================================================
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.203
diff -u -p -r1.203 bpf.c
--- bpf.c       21 Jan 2021 12:33:14 -0000      1.203
+++ bpf.c       22 Apr 2021 03:09:27 -0000
@@ -1690,8 +1690,10 @@ bpfsdetach(void *p)
                if (cdevsw[maj].d_open == bpfopen)
                        break;
 
-       while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist)))
+       while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist))) {
                vdevgone(maj, bd->bd_unit, bd->bd_unit, VCHR);
+               klist_invalidate(&bd->bd_sel.si_note);
+       }
 
        for (tbp = bpf_iflist; tbp; tbp = tbp->bif_next) {
                if (tbp->bif_next == bp) {
