On Thu, May 28, 2015 at 13:34 +0200, Alexandr Nedvedicky wrote: > On Thu, May 28, 2015 at 11:43:02AM +0200, Mike Belopuhov wrote: > > On Thu, May 28, 2015 at 01:17 +0200, Alexandr Nedvedicky wrote: > > > Hello, > > > > > > > > > On Wed, May 27, 2015 at 07:44:15PM +0200, Mike Belopuhov wrote: > > > > On Wed, May 27, 2015 at 10:39 +0200, Alexandr Nedvedicky wrote: > > > > > Hello, > > > > > > > > > > > > - if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, > > > > > > > s)) { > > > > > > > - pf_state_key_detach(s, PF_SK_STACK); > > > > > > > - pf_state_key_detach(s, PF_SK_WIRE); > > > > > > > > > > > > This bug is not yours, but doing two pf_state_key_detach is wrong > > > > > > and results in all kinds of protection fault fireworks. The right > > > > > > thing is to do pf_detach_state that would handle the case where > > > > > > PF_SK_STACK == PF_SK_WIRE (i.e. we need only one invocation). > > > > > good catch. > > > > > > > > > > > > > > > > > In csfailed case we do pf_remove_src_node and then call > > > > > > pf_src_tree_remove_state. With your diff this means that > > > > > > pf_src_tree_remove_state will be dereferencing sni->sn that > > > > > > might be pool_put by the pf_remove_src_node. Therefore their > > > > > > order needs to be reversed. > > > > > > > > > > I see. Another option to fix it would be to do: > > > > > > > > > > sni->sn = sns[i]; > > > > > sns[i] = NULL; > > > > > > > > > > > > > I'm not sure I follow. Where do you want to do it? > > > > If it's when we pool_get sni's, then this would mean > > > > we won't run pf_remove_src_node and cleanup source > > > > nodes that might have been created. > > > > > > > > > > sorry I was too brief. The snippet below comes from > > > pf_create_state() with your patch applied: > > > > > > 3560 for (i = 0; i < PF_SN_MAX; i++) > > > 3561 if (sns[i] != NULL) { > > > 3562 struct pf_sn_item *sni; > > > 3563 > > > 3564 sni = pool_get(&pf_sn_item_pl, PR_NOWAIT); > > > 3565 if (sni == NULL) { > > > 3566 REASON_SET(&reason, PFRES_MEMORY); > > > 3567 goto csfailed; > > > 3568 } > > > 3569 sni->sn = sns[i]; > > > 3570 sns[i] = NULL; > > > 3571 SLIST_INSERT_HEAD(&s->src_nodes, sni, > > > next); > > > 3572 sni->sn->states++; > > > 3573 } > > > 3574 > > > > > > the point of my suggestion is to transfer ownership of source node from > > > local variable sns[] to state. so the check performed later in csfailed > > > at line 3617, will find NULL pointer in sns[i] field and won't attempt > > > to call pf_remove_src_node(). > > > > > > 3610 csfailed: > > > 3611 if (s) { > > > 3612 pf_normalize_tcp_cleanup(s); /* safe even w/o > > > init */ > > > 3613 pf_src_tree_remove_state(s); > > > 3614 } > > > 3615 > > > 3616 for (i = 0; i < PF_SN_MAX; i++) > > > 3617 if (sns[i] != NULL) > > > 3618 pf_remove_src_node(sns[i]); > > > 3619 > > > > > > > > > your patch updated by my suggestion is further below. > > > > > > > But that introduces a potential leak, as I said in my mail: > > we won't run pf_remove_src_node and cleanup source nodes that > > might have been created. > > > > sns[] is an array of pointers to independently tracked objects > > (inserted into the pf_src_tree). > > > > yes you are right, I was completely wrong. Still my only concern is what > happens in case we fail to allocate source node item (sni) for let's say > PF_SN_NAT type, while there is PF_SN_ROUTE type in sns[] to be processed. > > I'm referring to line numbers above. So we fail to allocate PF_SN_NAT > source node and taking goto at 3567, arriving to 3610. > > We do the right thing for state clean up (lines 3612, 3613). And now > arriving to for loop 3616, which will remove all source nodes from > sns[], including PF_SN_ROUTE. I completely agree we should do > pf_remove_src_node(sns[PF_SN_NAT]) as we took reference at 3572 for it.
But we'll drop this "reference" in pf_src_tree_remove_state, then how will sns[PF_SN_NAT] and sns[PF_SN_ROUTE] be different? > I'm not sure if we should be calling pf_remove_src_node(sns[PF_SN_ROUTE]) > here since line 3572 was not executed for it, for loop at 3560 terminated > prematurely. Probably introducing new for-loop counter 'j' in csfailed: > will fix it: > > > 3616 for (j = 0; j < i; j++) > sns[] array was prepared for this state so if we can't insert the state, sns entries must be cleaned up. pf_remove_src_node checks the number of associated states and if source node should expire some time later. Speaking of PF_SN_ROUTE, pf_set_rt_ifp should be probably called before we insert the state for the very same reason, plus it should check the pf_map_addr return value and do the cleanup. > > regards > sasha >