On 11/01/2018 10:58 AM, Leif Hedstrom wrote:
> 
> 
>> On Oct 31, 2018, at 6:53 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>>
>>
>>
>> On 10/31/2018 04:26 PM, Christoph Paasch wrote:
>>> Implementations of Quic might want to create a separate socket for each
>>> Quic-connection by creating a connected UDP-socket.
>>>
>>
>> Nice proposal, but I doubt a QUIC server can afford having one UDP socket 
>> per connection ?
> 
> First thing: This is an idea we’ve been floating, and it’s not completed yet, 
> so we don’t have any performance numbers etc. to share. The ideas for the 
> implementation came up after a discussion with Ian and Jana re: their 
> implementation of a QUIC server.
> 
> That much said, the general rationale for this is that having a socket for 
> each QUIC connection could simplify integrating QUIC into existing software 
> that already does epoll() over TCP sockets. This is how e.g. Apache Traffic 
> Server works, which is our target implementation for QUIC.
> 
> 
> 
>>
>> It would add a huge overhead in terms of memory usage in the kernel,
>> and lots of epoll events to manage (say a QUIC server with one million 
>> flows, receiving very few packets per second per flow)
> 
> Our use case is not millions of sockets but rather tens of thousands. There 
> would be one socket for each QUIC Connection, not per stream (obviously). At 
> ~80Gbps on a box, we definitely see much less than 100k TCP connections.
> 
> Question: is there additional memory overhead here for the UDP sockets vs a 
> normal TCP socket for e.g. HTTP or HTTP/2 ?

TCP sockets have a lot of state. We can understand spending 2 or 3 KB per 
socket.

UDP sockets really have no state. The receive queue anchor is only 24 bytes.
Still, the memory costs for one UDP socket are:

1344 bytes for the UDP socket
320 bytes for the struct file
192 bytes for the struct dentry
704 bytes for the inode
512 bytes for the two dst entries (connected socket)
200 bytes for the eventpoll structures
104 bytes for the fq flow

That is about 3.3KB per socket (3376 bytes), but you can probably round this 
up to 4KB due to kmalloc rounding.

One million sockets -> 4GB of memory.

This really does not scale.
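
The arithmetic above can be checked with a short sketch. It only sums the
per-structure sizes listed in this message and multiplies by a socket count.
The 4096-byte figure is the rounded per-socket estimate mentioned above, and
the names COSTS_BYTES, ROUNDED_BYTES, per_socket_bytes and total_bytes are
made up for illustration; actual sizes depend on kernel version and config.

```python
# Per-socket kernel memory components in bytes, as listed in this message.
# These are the figures quoted above, not values measured by this script.
COSTS_BYTES = {
    "UDP socket": 1344,
    "struct file": 320,
    "struct dentry": 192,
    "inode": 704,
    "two dst entries (connected socket)": 512,
    "eventpoll structures": 200,
    "fq flow": 104,
}

# Assumption: round the per-socket cost up to 4KB to allow for allocator
# rounding, as suggested in the text.
ROUNDED_BYTES = 4096


def per_socket_bytes(costs=COSTS_BYTES):
    """Sum of the listed components for one connected UDP socket."""
    return sum(costs.values())


def total_bytes(n_sockets, bytes_per_socket):
    """Estimated kernel memory for n_sockets sockets."""
    return n_sockets * bytes_per_socket


if __name__ == "__main__":
    print("raw per-socket bytes:", per_socket_bytes())
    for n in (100_000, 1_000_000):
        gb = total_bytes(n, ROUNDED_BYTES) / 1e9
        print(f"{n} sockets: about {gb:.2f} GB")
```

With the rounded 4KB figure, one million sockets come to roughly 4GB, and
100k sockets to roughly 0.4GB.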
