libnetlink: update rtnl_talk to support malloc buff at run time

Stephen Hemminger Tue, 10 Oct 2017 09:48:13 -0700

On Tue, 10 Oct 2017 08:41:17 +0200
Michal Kubecek <mkube...@suse.cz> wrote:


> On Mon, Oct 09, 2017 at 10:25:25PM +0200, Phil Sutter wrote:
> > Hi Stephen,
> > 
> > On Mon, Oct 02, 2017 at 10:37:08AM -0700, Stephen Hemminger wrote:  
> > > On Thu, 28 Sep 2017 21:33:46 +0800
> > > Hangbin Liu <ha...@redhat.com> wrote:
> > >   
> > > > From: Hangbin Liu <liuhang...@gmail.com>
> > > > 
> > > > > This is an update for 460c03f3f3cc ("iplink: double the buffer size
> > > > > also in iplink_get()"). After this update, we will not need to double
> > > > > the buffer size every time the number of VFs increases.
> > > > 
> > > > > With calls like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove
> > > > > the length parameter.
> > > > 
> > > > > With calls like rtnl_talk(&rth, nlh, nlh, sizeof(req)), I add a new
> > > > > variable answer to avoid overwriting data in nlh, because there may be
> > > > > more info after nlh. This also avoids the problem of the nlh buffer
> > > > > being too small.
> > > > 
> > > > > We need to free answer after use.
> > > > 
> > > > Signed-off-by: Hangbin Liu <liuhang...@gmail.com>
> > > > Signed-off-by: Phil Sutter <p...@nwl.cc>
> > > > ---  
> > > 
> > > > Most of the uses of rtnl_talk() don't need this peek and dynamic
> > > > sizing. Can only those places that need it be targeted?
> > 
> > We could probably do that, by having a buffer on stack in __rtnl_talk()
> > which will be used instead of the allocated one if 'answer' is NULL. Or
> > maybe even introduce a dedicated API call for the dynamically allocated
> > receive buffer. But I really doubt that's feasible: AFAICT, that stack
> > buffer still needs to be reasonably sized since the reply might be
> > larger than the request (reusing the request buffer would be the
> > simplest way to tackle this), also there is support for extack which may
> > bloat the response to arbitrary size. Hangbin has shown in his benchmark
> > that the overhead of the second syscall is negligible, so why care about
> > that and increase code complexity even further?
> > 
> > Not saying it's not possible, but I just doubt it's worth the effort.  
> 
> Agreed. Current code is based on the assumption that we can estimate the
> maximum reply length in advance and the reason for this series is that
> this assumption turned out to be wrong. I'm afraid that if we replace
> it by an assumption that we can estimate the maximum reply length for
> most requests with only a few exceptions, it's only a matter of time for us
> to be proven wrong again.
> 
> Michal Kubecek
> 

For query responses, yes, the response may be large. But for the common cases of
adding an address or adding a route, the response should just be an ack or an error.
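
For readers following the thread, the "peek and dynamic sizing" under
discussion means asking the kernel for the size of the pending reply first and
only then allocating a buffer for it, at the cost of a second syscall. A
minimal sketch in C, assuming Linux semantics for MSG_PEEK | MSG_TRUNC on
datagram-style sockets; recv_alloc() is a hypothetical helper for
illustration, not the code from the patch and not a libnetlink function:

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libnetlink): receive one datagram into a
 * buffer that is sized at run time. On success, returns the datagram length
 * and stores a malloc()ed buffer in *answer, which the caller must free().
 * On failure, returns -1 with errno set.
 */
static ssize_t recv_alloc(int fd, char **answer)
{
	ssize_t peeked, len;
	char *buf;

	/*
	 * First syscall: peek without consuming the datagram. With MSG_TRUNC,
	 * Linux reports the real datagram length even though the buffer passed
	 * in here has zero length.
	 */
	do {
		peeked = recv(fd, NULL, 0, MSG_PEEK | MSG_TRUNC);
	} while (peeked < 0 && errno == EINTR);
	if (peeked < 0)
		return -1;

	/* Allocate exactly what is needed (at least 1 byte for malloc). */
	buf = malloc(peeked ? (size_t)peeked : 1);
	if (!buf) {
		errno = ENOMEM;
		return -1;
	}

	/* Second syscall: consume the datagram into the right-sized buffer. */
	do {
		len = recv(fd, buf, (size_t)peeked, 0);
	} while (len < 0 && errno == EINTR);
	if (len < 0) {
		free(buf);
		return -1;
	}

	*answer = buf;
	return len;
}
```

A caller that only expects an ack or an error would still pay for the extra
recv() and the malloc()/free() pair with this approach, which is the
trade-off being weighed in this thread.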

Re: [PATCHv4 iproute2 2/2] lib/libnetlink: update rtnl_talk to support malloc buff at run time
