On 12/21/2016 09:47 PM, Cong Wang wrote:
On Wed, Dec 21, 2016 at 12:02 PM, Daniel Borkmann <dan...@iogearbox.net> wrote:
On 12/21/2016 08:10 PM, Cong Wang wrote:
On Wed, Dec 21, 2016 at 10:51 AM, Cong Wang <xiyou.wangc...@gmail.com>
wrote:
On Wed, Dec 21, 2016 at 9:04 AM, Daniel Borkmann <dan...@iogearbox.net>
wrote:
What happens is that in tc_ctl_tfilter(), thread A allocates a new
tp, initializes it, sets tp_created to 1, and calls into
tp->ops->change()
with it. In that classifier callback we had to unlock/lock the rtnl
mutex and returned with -EAGAIN. One reason why we need to drop there
is, for example, that we need to request an action module to be loaded.
Excellent catch!
But why do we have to replay the request here? Shouldn't we just return
EAGAIN to user-space and let user-space decide to retry or not?
Replaying is the root of the evil here.
Answer myself: probably due to historical reasons, but still replaying
inside such a big function is just error-prone, we should promote it
out:
Have no strong opinion, I guess it could be done as a simplification
for net-next, why not, along with moving out the netlink_ns_capable()
check or possibly other things after careful analysis that don't need
to be redone in that circumstance.
It is only slightly bigger than your current one so could fit for -stable too.
Also, it could fix all potential problems like this one. Let compiler do the
work, not human. ;)
Ok, you mean for net. In that case I prefer the smaller sized fix to be
honest. It also covers everything from the point where we fetch the chain
via cops->tcf_chain() to the end of the function, which is where most of
the complexity resides, and only the two mentioned commits do the relock,
so as a fix I think it's fine as-is. As mentioned, if there's need to
refactor tc_ctl_tfilter() net-next would be better, imho.
Thanks,
Daniel