On Sat, Feb 06, 2010 at 12:58:21PM -0500, Ted Unangst wrote:

> On Sat, Feb 6, 2010 at 9:53 AM, Otto Moerbeek <o...@drijf.net> wrote:
> > The buffer cache should be smart enough to optimize reads and writes in
> > such large chunks so that the seeking gets reduced. The problem with
>
> How? cp reads 64k. How much extra should the kernel read ahead? How
> is this determined? What if you're only reading a small part of a
> file?
>
> cp writes 64k. How does the buffer cache optimize that into a larger
> write if cp hasn't even read the data yet? Does it hide the write on
> a secret "to be continued" queue? How long does it stay there? What
> if you're only writing to a part of the file?
>
> We're already seeing problems just making the buffer cache bigger.
> You think adding the complexity to optimize access patterns is going
> to make things better? cp at least has a very good idea of exactly
> what data it's going to need next. Putting heuristics in the kernel
> is exactly the wrong approach.
>
> > your big buffers is that you are claiming resources even if you have
> > no idea if they are available or not.
>
> What resources? It uses a small fraction of your RAM. If a 4 gig
> machine can't spare a couple megs for a file copy, you're in trouble.
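To keep the discussion concrete, the loop we are arguing about is essentially the sketch below. It is not the actual bin/cp source; the real code sizes its buffer with MAXBSIZE (which is where the 64k comes from) and has an mmap path and proper error reporting that I left out.

    /*
     * Rough sketch of the copy loop under discussion, with the 64k
     * chunk size spelled out.  Not the real bin/cp code.
     */
    #include <unistd.h>

    #define CHUNK   (64 * 1024)     /* the 64k chunk size being debated */

    int
    copy_fd(int from, int to)
    {
            static char buf[CHUNK];
            ssize_t r, w;

            /* read up to CHUNK bytes, write them out, repeat until EOF */
            while ((r = read(from, buf, sizeof(buf))) > 0) {
                    w = write(to, buf, r);
                    if (w != r)
                            return -1;      /* short or failed write */
            }
            return r == 0 ? 0 : -1;         /* 0 on EOF, -1 on read error */
    }

Making CHUNK bigger is the kind of change under discussion. The loop itself does not get any smarter; it just hands larger requests to the kernel and hopes the disk and the buffer cache like them.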
Add to that that doing I/O in big chunks also puts more pressure on the buffer cache. Now that I look at the diff, I see that the memory used is reasonably small. Still, if multiple copies of your cp run at the same time on a machine where available physical memory is low, you might end up with a machine that is swapping, and that is certainly not going to speed up cp.

My problem is more fundamental: cp has no way of knowing how many resources the machine has available. Doing I/O in big chunks might also be unfair to other processes doing I/O, unless you know the relative cost of seeks versus large sequential transfers, and that assumes you know all kinds of things about the hardware.

        -Otto
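P.S. To illustrate the point about resources: about the only thing a userland program can do is ask for a snapshot of free memory, along the lines of the sketch below, and that already assumes the non-standard _SC_AVPHYS_PAGES extension to sysconf(3) is available. The answer is stale the moment you get it, and it says nothing about other processes, the buffer cache, or what the disk prefers.

    /*
     * Sketch only: ask how much physical memory is free right now.
     * _SC_AVPHYS_PAGES is a non-standard extension and may not exist;
     * even where it does, the number is out of date as soon as you
     * read it.
     */
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
            long pagesize = sysconf(_SC_PAGESIZE);
            long freepages = sysconf(_SC_AVPHYS_PAGES);
            long long freebytes;

            if (pagesize == -1 || freepages == -1) {
                    fprintf(stderr, "cannot even ask\n");
                    return 1;
            }
            freebytes = (long long)freepages * pagesize;
            /* hypothetical policy: cap a "big" copy buffer at 1% of free memory */
            printf("free: %lld bytes, 1%% cap: %lld bytes\n",
                freebytes, freebytes / 100);
            return 0;
    }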