> -----Original Message-----
> From: David Ahern [mailto:dsah...@gmail.com]
> Sent: Thursday, January 11, 2018 3:21 AM
> To: Chris Mi <chr...@mellanox.com>; netdev@vger.kernel.org
> Cc: gerlitz...@gmail.com; step...@networkplumber.org;
> marcelo.leit...@gmail.com; p...@nwl.cc
> Subject: Re: [patch iproute2 v8 1/2] lib/libnetlink: Add functions
> rtnl_talk_msg and rtnl_talk_iov
> 
> On 1/9/18 8:27 PM, Chris Mi wrote:
> > rtnl_talk can only send a single message to kernel. Add two functions
> > rtnl_talk_msg and rtnl_talk_iov that can send multiple messages to kernel.
> > rtnl_talk_msg takes struct msghdr * as argument.
> > rtnl_talk_iov takes struct iovec * and iovlen as arguments.
> >
> > Signed-off-by: Chris Mi <chr...@mellanox.com>
> > ---
> >  include/libnetlink.h |  6 ++++
> >  lib/libnetlink.c     | 82 ++++++++++++++++++++++++++++++++++++++++------------
> >  2 files changed, 70 insertions(+), 18 deletions(-)
> >
> > diff --git a/include/libnetlink.h b/include/libnetlink.h
> > index a4d83b9e..e9a63dbc 100644
> > --- a/include/libnetlink.h
> > +++ b/include/libnetlink.h
> > @@ -96,6 +96,12 @@ int rtnl_dump_filter_nc(struct rtnl_handle *rth,
> > int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
> >           struct nlmsghdr **answer)
> >     __attribute__((warn_unused_result));
> > +int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
> > +             struct nlmsghdr **answer)
> > +   __attribute__((warn_unused_result));
> 
> As mentioned before rtnl_talk_msg is not needed; you only need to add
> rtnl_talk_iov. The attached fixup on top of your patch removes it and adjusts
> __rtnl_talk_iov. Please roll that change into your patch.
Done. I misunderstood your previous comment.
Thanks for your patch, David.
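
To illustrate the idea behind rtnl_talk_iov, here is a rough sketch of sending several netlink messages with one sendmsg() call. It is not the code from the patch: send_nlmsg_batch is a hypothetical helper, and it assumes the socket is already connected, so no destination address is filled in.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not the function from the patch): send every netlink
 * message referenced by iov[] with a single sendmsg() call.
 * Assumption: fd is already connected, so msg_name is left unset.
 */
static ssize_t send_nlmsg_batch(int fd, struct iovec *iov, size_t iovlen,
				unsigned int *seq)
{
	struct msghdr msg = { .msg_iov = iov, .msg_iovlen = iovlen };
	size_t i;

	for (i = 0; i < iovlen; i++) {
		struct nlmsghdr *h = iov[i].iov_base;

		h->nlmsg_seq = ++(*seq);	/* distinct sequence number per message */
		h->nlmsg_flags |= NLM_F_ACK;	/* ask for an ACK for each message */
	}

	/* One syscall carries all iovlen messages. */
	return sendmsg(fd, &msg, 0);
}
```

Each iov entry points at one complete message, so the caller can keep building messages in separate buffers and only pay for one send per batch.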
> 
> 
> While testing this I noticed 2 other oddities:
> 
> $ perf trace -s tc -b tc.batch
> (stddev column removed to shorten line width)
> 
>  Summary of events:
> 
>  tc (780), 1857 events, 97.9%
> 
>    syscall            calls    total       min       avg       max
>                                (msec)    (msec)    (msec)    (msec)
>    --------------- -------- --------- --------- --------- ---------
>    recvmsg              530     6.532     0.008     0.012     0.218
>    open                 269     5.429     0.012     0.020     0.117
>    sendmsg                4     3.518     0.092     0.879     1.647
> 
> 
> 
> 1. recvmsg is called twice - once to peek at message size, allocate a buffer
> and then really receive the message. That is overkill for ACKs.
> 
> 2. I am using a batch file with drop filters:
> 
> filter add dev eth2 ingress protocol ip pref 273 flower dst_ip
> 192.168.253.0/16 action drop
> 
> and for each command tc is trying to dlopen m_drop.so:
> 
> open("/usr/lib/tc//m_drop.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No
> such file or directory)
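
For reference, the two-call receive pattern from item 1 looks roughly like the sketch below, next to a single-call variant that reads into a caller-supplied buffer. Both function names are hypothetical and neither is the libnetlink code; the sketch only assumes a datagram-style socket where MSG_PEEK | MSG_TRUNC reports the full message length.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical illustration of the two-call pattern: peek to learn the
 * message size, allocate, then receive for real. Costs two syscalls.
 */
static ssize_t recv_two_step(int fd, char **out)
{
	struct iovec iov = { .iov_base = NULL, .iov_len = 0 };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t len;
	char *buf;

	/* First call: MSG_TRUNC makes recvmsg return the real length. */
	len = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);
	if (len <= 0)
		return len;

	buf = malloc(len);
	if (!buf)
		return -1;

	/* Second call: actually dequeue the message into the new buffer. */
	iov.iov_base = buf;
	iov.iov_len = len;
	len = recvmsg(fd, &msg, 0);
	if (len < 0) {
		free(buf);
		return -1;
	}

	*out = buf;
	return len;
}

/*
 * Hypothetical single-call variant: the caller passes a buffer (for
 * example on the stack) that is large enough for a small reply such
 * as an ACK.
 */
static ssize_t recv_one_step(int fd, char *buf, size_t buflen)
{
	struct iovec iov = { .iov_base = buf, .iov_len = buflen };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

	return recvmsg(fd, &msg, 0);
}
```

The single-call variant trades flexibility for speed: it only works when an upper bound on the reply size is known in advance.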
> 
> 
> With a patch to use a stack buffer for ACKs, the above perf summary
> becomes:
> 
> $ perf trace -s tc -b tc.batch
> 
>  Summary of events:
> 
>  tc (777), 1345 events, 97.1%
> 
>    syscall            calls    total       min       avg       max
>                                (msec)    (msec)    (msec)    (msec)
>    --------------- -------- --------- --------- --------- ---------
>    open                 269     5.510     0.013     0.020     0.160
>    recvmsg              274     3.758     0.009     0.014     0.396
>    sendmsg                4     3.531     0.098     0.883     1.672
> 
> 
> Making the open errors now the dominant overhead affecting performance.
> If tc had some smarts that it already tried that file it would avoid the
> subsequent open calls. The end result is a significant speed up compared to
> the current tc:
> 
>  Summary of events:
> 
>  tc (785), 2333 events, 98.3%
> 
>    syscall            calls    total       min       avg       max
>                                (msec)    (msec)    (msec)    (msec)
>    --------------- -------- --------- --------- --------- ---------
>    sendmsg              256     9.832     0.029     0.038     0.181
>    open                 269     5.819     0.013     0.022     0.353
>    recvmsg              530     5.592     0.009     0.011     0.285
> 
> 
> Can you look at a follow-on patch (not part of this set) to cache the
> status of dlopen attempts?
Sure, I will investigate this issue.
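
As a starting point, one possible shape for such a cache is sketched below. All names (failed_mod, cached_dlopen and so on) are hypothetical and do not come from the tc sources; the "m_%s.so" path layout is only taken from the open() trace above. It remembers module names whose dlopen() failed and skips the filesystem lookup on later attempts.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical negative cache: modules whose dlopen() already failed. */
struct failed_mod {
	struct failed_mod *next;
	char name[64];
};

static struct failed_mod *failed_mods;

static bool dlopen_known_failed(const char *name)
{
	const struct failed_mod *f;

	for (f = failed_mods; f; f = f->next)
		if (strcmp(f->name, name) == 0)
			return true;
	return false;
}

static void dlopen_remember_failure(const char *name)
{
	struct failed_mod *f = calloc(1, sizeof(*f));

	if (!f)
		return;	/* the cache is best effort; a miss only costs a retry */
	snprintf(f->name, sizeof(f->name), "%s", name);
	f->next = failed_mods;
	failed_mods = f;
}

/*
 * Hypothetical wrapper: try <libdir>/m_<name>.so at most once per name.
 * Returns the dlopen() handle, or NULL if the module is missing.
 */
static void *cached_dlopen(const char *libdir, const char *name)
{
	char path[256];
	void *dlh;

	if (dlopen_known_failed(name))
		return NULL;	/* skip the repeated open() */

	snprintf(path, sizeof(path), "%s/m_%s.so", libdir, name);
	dlh = dlopen(path, RTLD_LAZY);
	if (!dlh)
		dlopen_remember_failure(name);
	return dlh;
}
```

A linked list should be enough here, since a batch file normally references only a handful of distinct action names. On older glibc versions this needs -ldl at link time.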
