Hello,
On Sat, Dec 09, 2023 at 03:10:02AM +0300, Vitaliy Makkoveev wrote:
> On Sat, Dec 09, 2023 at 12:28:10AM +0100, Alexander Bluhm wrote:
> > On Sat, Dec 09, 2023 at 02:07:06AM +0300, Vitaliy Makkoveev wrote:
> > > > > SLIST_ENTRY(pflow_softc) sc_next;
> > > >
> > > > This list is protected by net lock. Can you add an [N] here?
> > > >
> > >
> > > This is not true. The netlock is not taken while export_pflow() called
> > > from pf_purge_states(). I privately shared the diff to fix this, but not
> > > committed it yet. I will update it and share after committing this diff.
> >
> > Why not hold the shared net lock while in pf?
> >
>
> Because the export_flow() should be called only if the PFSTATE_PFLOW
> flag is set? The netlock should be taken before PF_LOCK, so there is no
> reason to always take it in this path only to make optional
> export_pflow() call:
>
> pf_purge_expired_states(const unsigned int limit, const unsigned int collect)
> {
> /* skip */
>
> rw_enter_write(&pf_state_list.pfs_rwl);
> PF_LOCK();
> PF_STATE_ENTER_WRITE();
> SLIST_FOREACH(st, &gcl, gc_list) {
> if (st->timeout != PFTM_UNLINKED)
> pf_remove_state(st);
>
> pf_free_state(st);
> }
>
> pf_remove_state(struct pf_state *st)
> {
> PF_ASSERT_LOCKED();
>
> /* skip */
>
> RBT_REMOVE(pf_state_tree_id, &tree_id, st);
> #if NPFLOW > 0
> if (st->state_flags & PFSTATE_PFLOW)
> export_pflow(st);
>
> However, I'm not the pf hacker, I's better to ask sashan@ and dlg@.
>
dlg@ and I are basically trying to remove all NET_LOCK() operations from
pf(4), because we don't want pf(4) to be playing with global NET_LOCK().
all callers to pf(4) should either obtain NET_LOCK() in case they need it.
pf(4) should not care about NET_LOCK() at all. That's the ideal situation
where we are heading to.
I took a closer look at export_pflow() in current. It seems to me the
function assumes caller holds a shared NET_LOCK() at least, just to protect
consistency of `pflowif_list`. There should be an ASSERT() for this and
pf(4) indeed should make sure to call export_pflow() with all locks it
needs obtained in right order. Because the ASSERT() was missing the bugs
got discovered the hard way. I certainly don't mind to add NET_LOCK()
into pf_purge_expired_states() to keep export_pflow() happy.
on the other hand if there is a way to implement pflowif_list as lock-less
(or move it ouf of NET_LOCK() scope), then this is a preferred way
forward.
thanks and
regards
sashan