On Sat, Aug 08, 2020 at 10:07:51AM -0700, Andy Lutomirski wrote:
> 
> > On Aug 8, 2020, at 8:29 AM, George Spelvin <l...@sdf.org> wrote:
> > 
> > And apparently switching to the fastest secure PRNG currently
> > in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
> > would cause too much performance penalty.
> 
> Can someone explain *why* the slow path latency is particularly relevant
> here? What workload has the net code generating random numbers in a place
> where even a whole microsecond is a problem as long as the amortized cost is
> low? (I'm not saying I won't believe this matters, but it's not obvious to
> me that it matters.)
It really depends on what is being done and how we want that code to continue
to be used. Often it will be a matter of queue sizes, or of concurrency in
certain contexts under a lock.

For illustration, let's say you want to use random numbers to choose a
sequence number for a SYN cookie (it's not what is done today): a 40G NIC can
deliver 60 Mpps, or one packet every ~16.7 ns. If you periodically add 1 us,
you end up queuing 60 extra packets each time this happens. This might or
might not be acceptable. Regarding concurrency, you could imagine some code
picking a random number in a small function running under a spinlock;
frequently adding 1 us there can be expensive.

To be clear, I'm not *that* much worried about *some* extra latency. However,
I'm extremely worried about the risk that this cost increases further once
some security researcher considers the PRNG not secure enough again
(especially once it starts to be used for security purposes after having been
claimed secure) and we replace it with another one with a higher cost and a
longer tail over which to amortize that cost. And given that we're speaking
about replacing a non-secure PRNG with *something*, it doesn't seem really
wise to introduce new constraints on the data path when there are probably a
large enough number of possibilities which do not require these constraints.

Last point, we must not rule out the possibility that some clever researcher
will later come up with new time-based attacks based on observing
packet-processing latencies, explaining how they can count the number of
connections reaching a given host and reveal certain useful information, and
that we're then forced to slow all of them down to proceed in constant time.

Just my two cents,
Willy