On Thu, May 21, 2015 at 21:28 +0200, Alexandr Nedvedicky wrote:
> Hello,
>
> On Thu, May 21, 2015 at 07:43:51PM +0200, Mike Belopuhov wrote:
> > On Thu, May 21, 2015 at 17:34 +0200, Alexandr Nedvedicky wrote:
> > > Hello,
> > >
> >
> > Hi,
> >
> > > snippet below comes from pf_create_state():
> > >
> > > 3559		if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
> > > 3560			pf_state_key_detach(s, PF_SK_STACK);
> > > 3561			pf_state_key_detach(s, PF_SK_WIRE);
> > > 3562			*sks = *skw = NULL;
> > > 3563			REASON_SET(&reason, PFRES_STATEINS);
> > > 3564			goto csfailed;
> > > 3565		} else
> > > 3566			*sm = s;
> > > 3567
> > > 3568		/* attach src nodes late, otherwise cleanup on error nontrivial */
> > > 3569		for (i = 0; i < PF_SN_MAX; i++)
> > > 3570			if (sns[i] != NULL) {
> > > 3571				struct pf_sn_item *sni;
> > > 3572
> > > 3573				sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
> > > 3574				if (sni == NULL) {
> > > 3575					REASON_SET(&reason, PFRES_MEMORY);
> > > 3576					pf_src_tree_remove_state(s);
> > > 3577					STATE_DEC_COUNTERS(s);
> > > 3578					pool_put(&pf_state_pl, s);
> > > 3579					return (PF_DROP);
> > > 3580				}
> > > 3581				sni->sn = sns[i];
> > > 3582				SLIST_INSERT_HEAD(&s->src_nodes, sni, next);
> > > 3583				sni->sn->states++;
> > > 3584			}
> > >
> > > at line 3559 PF inserts state to table. If insert operation succeeds, then
> > > state can no longer be killed using simple pool_put() as it currently
> > > happens at line 3578. I think PF should go for pf_unlink_state() instead.
> > >
> > > patch below should kill the bug.
> > >
> >
> > Indeed.  But I don't like the comment stating that we're attaching
> > src nodes late because the "cleanup on error nontrivial".  Perhaps
> > we should do a pf_state_insert afterwards?  This might simplify
> > locking later on.
>
> perhaps swapping the for loop block with pf_state_insert() will work.
> We can then bail out using goto csfailed then (see patch below...)
>
Yup.

> > we should do a pf_state_insert afterwards?  This might simplify
> > locking later on.
>
> speaking of locking... everything is more complicated, however
> in this particular case it makes things easier for sure. basically in
> our current SMP model, once we insert state to table, the only way
> to remove it is to leave the job on garbage collector thread...
>

Right.

> > > also one more off-topic question:
> > >
> > > would you be interested in SMP patch for PF?
> > > it basically introduces fine locking and reference counting
> > > on PF data objects, so firewall can handle more packets at
> > > single instance of time.
> > >
> >
> > We would definitely be interested in such a diff, but at the moment
> > there's no use for it as pf_test is called directly from the IP stack
> > and hence IP stack needs to support parallel execution first.  This
> > doesn't mean that we can't start integrating bits and pieces, esp.
> > if they're presented as separate patches.
>
> we use a compile time switch to enable SMP code (-D_PF_SMP_). this is
> something I'd like to keep around anyway. I'll start on syncing up our changes
> with CURRENT. It will take couple of weeks (ENOCYCLES as usually). Once
> I'll have something to show I'll share it and you'll see what can be done
> with it.
>

Cool.  Looking forward to it!

> > Thanks for your quality diffs btw, help is always much appreciated.
>
> I'm just trying to be useful. It's pleasure to work with PF sources.
>
> thanks and
> regards
>     sasha
>

Unfortunately this diff has two nasty bugs (one of them is not even yours).
> ===================================================================
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.913
> diff -u -r1.913 pf.c
> --- pf.c	11 May 2015 12:22:14 -0000	1.913
> +++ pf.c	21 May 2015 19:18:04 -0000
> @@ -3556,15 +3556,6 @@
> 		goto csfailed;
> 	}
>
> -	if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
> -		pf_state_key_detach(s, PF_SK_STACK);
> -		pf_state_key_detach(s, PF_SK_WIRE);

This bug is not yours, but doing two pf_state_key_detach calls is wrong
and results in all kinds of protection fault fireworks.  The right thing
is to do pf_detach_state, which handles the case where PF_SK_STACK ==
PF_SK_WIRE (i.e. we need only one invocation).

> -		*sks = *skw = NULL;
> -		REASON_SET(&reason, PFRES_STATEINS);
> -		goto csfailed;
> -	} else
> -		*sm = s;
> -
> 	/* attach src nodes late, otherwise cleanup on error nontrivial */
> 	for (i = 0; i < PF_SN_MAX; i++)
> 		if (sns[i] != NULL) {
> @@ -3572,16 +3563,21 @@
>
> 			sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
> 			if (sni == NULL) {
> -				REASON_SET(&reason, PFRES_MEMORY);

REASON_SET needs to stay.

> -				pf_src_tree_remove_state(s);
> -				STATE_DEC_COUNTERS(s);
> -				pool_put(&pf_state_pl, s);
> -				return (PF_DROP);
> +				goto csfailed;
> 			}
> 			sni->sn = sns[i];
> 			SLIST_INSERT_HEAD(&s->src_nodes, sni, next);
> 			sni->sn->states++;
> 		}
> +
> +	if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
> +		pf_state_key_detach(s, PF_SK_STACK);
> +		pf_state_key_detach(s, PF_SK_WIRE);
> +		*sks = *skw = NULL;
> +		REASON_SET(&reason, PFRES_STATEINS);
> +		goto csfailed;
> +	} else
> +		*sm = s;
>
> 	pf_set_rt_ifp(s, pd->src);	/* needs s->state_key set */
> 	if (tag > 0) {
>

In the csfailed case we do pf_remove_src_node and then call
pf_src_tree_remove_state.  With your diff this means that
pf_src_tree_remove_state would dereference sni->sn, which might already
have been pool_put by pf_remove_src_node.  Therefore their order needs
to be reversed.

I did some debugging with this version and it feels solid.  OKs are
welcome.
diff --git sys/net/pf.c sys/net/pf.c
index a5c902c..bc1c9e4 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -3555,37 +3555,32 @@ pf_create_state(struct pf_pdesc *pd, struct pf_rule *r, struct pf_rule *a,
 	if (pf_state_key_setup(pd, skw, sks, act->rtableid)) {
 		REASON_SET(&reason, PFRES_MEMORY);
 		goto csfailed;
 	}
 
-	if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
-		pf_state_key_detach(s, PF_SK_STACK);
-		pf_state_key_detach(s, PF_SK_WIRE);
-		*sks = *skw = NULL;
-		REASON_SET(&reason, PFRES_STATEINS);
-		goto csfailed;
-	} else
-		*sm = s;
-
-	/* attach src nodes late, otherwise cleanup on error nontrivial */
 	for (i = 0; i < PF_SN_MAX; i++)
 		if (sns[i] != NULL) {
 			struct pf_sn_item *sni;
 
 			sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
 			if (sni == NULL) {
 				REASON_SET(&reason, PFRES_MEMORY);
-				pf_src_tree_remove_state(s);
-				STATE_DEC_COUNTERS(s);
-				pool_put(&pf_state_pl, s);
-				return (PF_DROP);
+				goto csfailed;
 			}
 			sni->sn = sns[i];
 			SLIST_INSERT_HEAD(&s->src_nodes, sni, next);
 			sni->sn->states++;
 		}
 
+	if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
+		pf_detach_state(s);
+		*sks = *skw = NULL;
+		REASON_SET(&reason, PFRES_STATEINS);
+		goto csfailed;
+	} else
+		*sm = s;
+
 	pf_set_rt_ifp(s, pd->src);	/* needs s->state_key set */
 	if (tag > 0) {
 		pf_tag_ref(tag);
 		s->tag = tag;
 	}
@@ -3610,16 +3605,20 @@ pf_create_state(struct pf_pdesc *pd, struct pf_rule *r, struct pf_rule *a,
 	}
 
 	return (PF_PASS);
 
 csfailed:
+	if (s) {
+		pf_normalize_tcp_cleanup(s);	/* safe even w/o init */
+		pf_src_tree_remove_state(s);
+	}
+
 	for (i = 0; i < PF_SN_MAX; i++)
 		if (sns[i] != NULL)
 			pf_remove_src_node(sns[i]);
+
 	if (s) {
-		pf_normalize_tcp_cleanup(s);	/* safe even w/o init */
-		pf_src_tree_remove_state(s);
 		STATE_DEC_COUNTERS(s);
 		pool_put(&pf_state_pl, s);
 	}
 
 	return (PF_DROP);