On Sat, 2015-06-06 at 18:38 +0200, Maciej Żenczykowski wrote:
> Hmm, I certainly like this.
> 
> So IMHO this is indeed much better than a sysctl to select a magic
> port to ignore during a bind call (previous internal patchset),
> although it does use up one more bit per socket (and one more syscall
> per connect).
> 
> ---
> 
> Thinking about this some more, I think it might be possible to make
> this behaviour automatic in certain cases.
> 
> The new socket bit has 2 different meanings, depending on whether a
> port is already allocated or not.
>   if a port is not yet allocated, it governs whether bind(port=0) will
> allocate a port.
>   if a port is already allocated, it flags whether it was autoallocated
> (obviously could also just use 2 bits instead of 1)
> 
> bind(with port=0)
>   if the flag is set, doesn't select a port [ie. this patch]
>   if the flag wasn't set, selects a port, sets the flag


But this is the problematic part, both for multi-threaded applications
and for servers where all ephemeral ports are already in use by at least
one socket.

Also think about cohabitation with applications that do not yet use this
knowledge (let's say they use bind(0), getsockname(), connect()).

My patch allows bind(0) to succeed even if all ports are in use.
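
A minimal userspace sketch of the bind(0)-then-connect() pattern this
is meant for. It assumes the option is exposed as
IP_BIND_ADDRESS_NO_PORT with value 24 (the name and number mainline
Linux uses; neither is stated in this mail), and connect_from() is a
hypothetical helper name:

```python
import socket

# Assumption: the socket option is IP_BIND_ADDRESS_NO_PORT (24 in
# linux/in.h). Fall back to the raw number if this Python build does
# not export the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(source_ip, dest):
    """Bind to source_ip with port 0, then connect to dest (host, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Ask the kernel not to reserve a source port at bind() time.
        s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
    except OSError:
        # Kernel without the option: bind(0) reserves a port as before.
        pass
    s.bind((source_ip, 0))
    # The source port is chosen here, when the full 4-tuple is known.
    s.connect(dest)
    return s
```

The point of the ordering is that the port is only picked once the
destination is known, so it only has to be unique for that destination.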

Then connect() is almost guaranteed to succeed, unless this host already
has ~32000 sessions with the exact same
(source_ip, destination_ip, destination_port) 3-tuple.

connect() can already return EADDRINUSE in this case.
(Some applications tried SO_REUSEADDR or SO_REUSEPORT to get rid of the
problem, without much success.)
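
As a rough illustration of where that limit comes from: with
source_ip, destination_ip and destination_port fixed, only the source
port can vary, so the ceiling is the size of the ephemeral port range.
The sketch below assumes the range is read from
/proc/sys/net/ipv4/ip_local_port_range and ignores reserved ports; both
function names are hypothetical:

```python
def max_sessions_per_3tuple(low, high):
    """Upper bound on sessions sharing one (src_ip, dst_ip, dst_port).

    Only the source port distinguishes such sessions, so the bound is
    the number of ports in the inclusive range [low, high].
    """
    return high - low + 1


def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) as configured on a Linux host."""
    with open(path) as f:
        low, high = f.read().split()
    return int(low), int(high)
```

For example, a range of 32768 to 60999 gives 28232 usable source ports,
the same order of magnitude as the figure above.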

So I am not sure what the 'automatic' stuff would provide anyway?

Selecting a port is quite expensive because of all the spinlocks and
lookups, so doing this twice automatically would add a significant cost.

Thanks.

