On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote:
> On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote:
> > if you have a program that uses kq (or libevent) to wait for bytes to
> > read off an idle network interface via /dev/bpf and that interface
> > goes away, the program doesnt get woken up. this is because the kq
> > read filter in bpf only checks if there ares bytes available. because a
> > detached interface never gets packets (how very zen), this condition
> > never changes and the program will never know something happened.
> > 
> > this has the bpf filter check if the interface is detached too. with
> > this change my test program wakes up, tries to read, and gets EIO. which
> > is great.
> > 
> > note that in the middle of this is the vdevgone machinery. when an
> > interface is detached, bpfdetach gets called, which ends up calling
> > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with
> > some dead operations, part of which involves calling bpfclose() to try
> > and clean up the existing state associated with the vdev. bpfclose tries
> > to wake up any waiting listeners, which includes kq handlers. that's how
> > the kernel goes from an interface being detached to the bpf kq filter
> > being run. the bpf kq filter just has to check that the interface is
> > still attached.
> 
> I thought tun(4) had this same problem, but I wrote a test and couldn't
> reproduce it. tun works because it addresses the problem in a different
> way. Instead of having its own kq filter check if the device is dead or
> not, it calls klist_invalidate, which switches things around like the
> vdevgone/vop_revoke stuff does with the vdev.
> 
> So an alternative way to solve this problem in bpf(4) would be the
> following:
> 
> Index: bpf.c
> ===================================================================
> RCS file: /cvs/src/sys/net/bpf.c,v
> retrieving revision 1.203
> diff -u -p -r1.203 bpf.c
> --- bpf.c     21 Jan 2021 12:33:14 -0000      1.203
> +++ bpf.c     21 Apr 2021 00:54:30 -0000
> @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode, 
>       bpf_wakeup(d);
>       LIST_REMOVE(d, bd_list);
>       mtx_leave(&d->bd_mtx);
> +     klist_invalidate(&d->bd_sel.si_note);
>       bpf_put(d);
>  
>       return (0);

I think bpf should call klist_invalidate() from the detach path.
bpfsdetach() might be the right place. This would make the code pattern
similar to the existing uses of klist_invalidate().

Calling klist_invalidate() from the close function twists the logic,
at least in my opinion. When a file descriptor is closed, the file
descriptor layer will remove the knotes automatically. This is why close
functions usually do not have to manage with knotes. However, the
automatic removal does not happen when a device is revoked, which is
mended by klist_invalidate().

Reply via email to