On 04/13/2011 12:29 PM, Bob Briscoe wrote:
Jim,
By the end I think I had already addressed a lot of the concerns you
stated at the start of the mail:
- Yes, the name of this exercise is water under the bridge.
- Buffers still have to be reasonably sized (my footnote covered that
already)
However, three responses inline (prefixed "BB:")...
At 15:30 13/04/2011, Jim Gettys wrote:
On 04/13/2011 07:19 AM, Bob Briscoe wrote:
The problem is actually queuebloat, not bufferbloat. The buffer is
the memory set aside for the queue. The queue is how much of the
memory is used to store packets or frames.
I think you are picking nits on the naming, though if you'd had the
suggestion last fall, I might have gone for it.
BB: As I said, I'm picking nits on the naming, not suggesting it
should be changed at this stage.
But having a misleading name does make the nuancing harder - there are
a lot of practitioners out there who don't need or want to understand
anything - they have no idea about why they should do things - they
just put together strings of feature buzz-words. That's how most of
the industry works.
It only needs some researcher with only a partial grasp of the issue
to pick up the word bufferbloat as the new sexy research fashion, then
publish their research results showing that smaller buffers will make
things worse. Then we have to start explaining we didn't really mean
bufferbloat, yada yada, and it starts to make us look like we might
not have known what we were talking about. While our researcher friend
with half a brain starts running around crowing that his marvellous
new research has proved us wrong,... when all he's actually done is
proved that the word we chose as a name was not quite precise enough.
Heh. We didn't have any term for this at all. I went back and looked
at the discussion in end-to-end interest when Dave Reed reported 3G
bufferbloat, and the suggested alternatives were worse, and no consensus
was reached.
And there are buffers that hide in systems that are not packet
queues, that people also should be aware of (e.g. encryption buffers,
error correction buffers, buffers in applications used for pipelining,
etc).
BB: Good point. I guess your point again is that many of these buffers
are not anything like as parsimoniously sized as they should be.
Yes, often they are infinite and dynamically allocated (e.g. the event
queue inside of GUI applications and/or window systems themselves).
But these buffers are harder to cut down below a certain minimum,
because they actually serve a function. There's no magic like AQM that
can keep these buffers unoccupied most of the time.
Sometimes yes, sometimes no. Often, as in packet queues, the buffers
fill because the lower layers of buffering/queuing have filled and
exerted flow control, and the software is not designed to elide
unneeded operations when it can't keep up (again, causing
buffers/queues to form just before the bottleneck).
I'm happy to also use the term queuebloat in places where it is
applicable, where you have packet queues... But bufferbloat is a generic
phenomenon in communications programming, whether in network transports,
or in applications using them. I guess in this I'm an odd-ball, having
mostly been a programmer who designed network-based applications. Let me
give a concrete example:
Oh, and I forgot about socket buffers, which on modern OSes may also
automatically resize; these are not queues either. Even worse is that
they will resize based on the underlying confusion induced by other
bufferbloat/queuebloat underneath them. These can be controlled by
applications setting the socket buffer size, rather than taking default
behaviour. Again, at least for stream based protocols such as X, these
aren't yet queues (though we then parse the stream, and generate a queue
of X events).
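To make the point about applications setting the socket buffer size
concrete, here is a minimal sketch using the standard SO_SNDBUF and
SO_RCVBUF socket options. The 16 KB figure and the helper name are
arbitrary assumptions for illustration, not recommendations; the
effective size is whatever the OS reports back (Linux, for instance,
typically reports about double the requested value).

```python
import socket

def make_socket_with_fixed_buffers(sndbuf_bytes=16 * 1024, rcvbuf_bytes=16 * 1024):
    """Create a TCP socket with explicitly requested buffer sizes.

    The sizes are illustrative assumptions. The kernel may round,
    clamp, or double the requested values.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request explicit sizes instead of taking the default behaviour.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return s

sock = make_socket_with_fixed_buffers()
# Read back what the OS actually granted; do not assume it matches the request.
effective_sndbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
effective_rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
```

Setting these before connecting is the usual practice, since some
stacks only honour the receive size for window negotiation at
connection setup.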
A good (recent) example I've seen is in OpenOffice, which has had
terrible behaviour on its slide arranging operations on Linux for years,
not understanding it should discard unneeded mouse motion events (seems
to be one of the things the LibreOffice guys may have fixed, thankfully;
I talked to Michael Meeks about this a while back). Bufferbloat affects
applications just as much as network stacks.
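As a rough sketch of what "discard unneeded mouse motion events" can
mean in an event loop: when several motion events are queued back to
back, only the newest position matters, so the stale ones can be
dropped before any expensive redraw. The event tuples and the function
name below are hypothetical and not taken from X, OpenOffice, or
LibreOffice.

```python
def coalesce_motion(events):
    """Collapse each run of consecutive motion events into its last one.

    Each event is a hypothetical (kind, x, y) tuple. Non-motion events
    (press, release, ...) are kept, and their ordering is preserved.
    """
    out = []
    for ev in events:
        if out and out[-1][0] == "motion" and ev[0] == "motion":
            out[-1] = ev  # the newer position supersedes the stale one
        else:
            out.append(ev)
    return out

pending = [
    ("motion", 1, 1), ("motion", 2, 2), ("press", 2, 2),
    ("motion", 3, 3), ("motion", 4, 4), ("release", 4, 4),
]
compacted = coalesce_motion(pending)
```

Here the six pending events shrink to four, and the application does
work proportional to what the user can actually see rather than to how
far behind it has fallen.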
So I'm not convinced that queuebloat is a better term, as it is less
general than the phenomenon I was trying to describe. In any case, I
think it's water under the bridge at this date.
We don't want vendors to (necessarily*) reduce the size of the
buffer, we want them to reduce the size of the standing queue. They
can do that with active queue management (AQM) (if we only knew how
to code it robustly). Ideally with ECN too, but AQM would be a good
start.
Some of these buffers are truly bloated, and/or not sized even
approximately in relation to the bandwidth available (e.g. the 1.2
seconds of buffering I observed on my DOCSIS3 modem, similar horror
stories in DSL, or the 1000-packet transmit queue in Linux).
These buffers are often sized by all the memory that is available,
and the hardware vendors can't get small enough chips to "correctly"
size them (as though we knew what the bandwidth was, or the delay
was, one of the mythologies that got us into this mess).
One of the first steps (well short of the nirvana of AQM) is to at
least get the buffers sized to something sane, and related to the
bandwidth the hardware is being operated at. And as each generation
of new kit is built (and often as a market requirement has to plug
into downward compatible hardware), it's been getting worse.
This is what the cable folks are in the middle of doing; it's
obviously safe to at least have the buffer sizes approximately
proportional to the bandwidth at which the device is operating
(similarly for the Linux transmit queue; if you are at 100Mbps, you
can cut the size by a factor of 10 without any danger). With the
ability to go hundreds of megabits/second but most customers paying
for 10-20Mbps, it is pretty obvious the buffer size had better be
related to the bandwidth of operation, and never be a static buffer
sized for the worst case.
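A back-of-the-envelope sketch of the arithmetic behind this: the delay
a full queue adds is its size in bits divided by the link rate, so a
fixed packet count means ten times the delay at a tenth of the
bandwidth. The 1500-byte packet size, the 12 ms target, and the
function names are assumptions for illustration only.

```python
def full_queue_delay_ms(packets, link_bps, packet_bytes=1500):
    """Worst-case delay (ms) added by a full queue of `packets` packets."""
    return packets * packet_bytes * 8 * 1000 / link_bps

def packets_for_delay(target_delay_ms, link_bps, packet_bytes=1500):
    """Queue length (packets) that holds at most `target_delay_ms` of data."""
    return max(1, (target_delay_ms * link_bps) // (1000 * packet_bytes * 8))

# A 1000-packet queue of 1500-byte packets at three link rates.
delay_at_1g = full_queue_delay_ms(1000, 1_000_000_000)
delay_at_100m = full_queue_delay_ms(1000, 100_000_000)
delay_at_10m = full_queue_delay_ms(1000, 10_000_000)

# Holding the delay budget fixed instead scales the queue with bandwidth.
limit_at_1g = packets_for_delay(12, 1_000_000_000)
limit_at_100m = packets_for_delay(12, 100_000_000)
limit_at_10m = packets_for_delay(12, 10_000_000)
```

Under these assumptions the same 1000-packet queue holds 12 ms of data
at 1 Gbps, 120 ms at 100 Mbps, and 1200 ms at 10 Mbps, while a fixed
12 ms budget gives limits of 1000, 100, and 10 packets respectively,
which is the factor-of-10 cut at 100 Mbps.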
Let's not lose sight of immediate, safe mitigations that are at hand,
while working on AQM with or without ECN, though that is the only
real, long term solution.
BB: The two stage fix might work for some types of product, where
continual fixes are the norm. But in other types of product, each fix
involves an engineer visit and a box swap out, which you don't want to
be doing more than once if you can help it.
Yup. Would that we had AQMs that we knew worked in the face of highly
variable bandwidth and workloads, which we could just recommend
everyone go use: but we're not there yet.
At best, we have some not-yet-tested ideas and are still getting set up
to try to run even simple tests (e.g. SFB, RED light when we can get our
hands on it).
And we certainly *want* operators who could/should be running RED
already to turn it on in places where it can be used. My point is
primarily that the enemy of the good is the perfect, and steps we can
take to make the problem less severe while working on AQM that can
handle the current edge are well worth taking. Sometimes those steps
may make the problem 1/10th the size it is today. That doesn't get us
where we ought to go, but it will reduce suffering.
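For readers who have not met RED, here is a deliberately simplified
sketch of its core idea: track a moving average of the queue length
and drop (or mark) arriving packets with a probability that rises as
that average grows, so the standing queue stays short. It omits the
idle-time and drop-spacing refinements of the published algorithm, and
the thresholds, weight, and class name are illustrative assumptions,
not tuned or recommended values.

```python
import random

class RedSketch:
    """Simplified RED-style drop decision (illustrative only)."""

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002):
        self.min_th = min_th    # below this average, never drop
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # EWMA weight given to the newest sample
        self.avg = 0.0

    def update_avg(self, queue_len):
        # Exponentially weighted moving average of the queue length.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        # Linear ramp from 0 to max_p between the two thresholds.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len, rng=random.random):
        self.update_avg(queue_len)
        return rng() < self.drop_probability()
```

The hard part, as noted above, is not this logic but choosing
parameters that stay sane when bandwidth and workload vary widely.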
- jim