2011/9/13 Thomas Bushnell, BSG <t...@becket.net>:
> I think this could also create extra churn, by scheduling a lot of disk
> writes when a series of writes are done all at once to a single file.

This is mitigated by the competition for the pager's lock between data_requests and data_returns.

> Traditionally, this is addressed by the every-30-seconds update task; there
> is no expectation of an indefinitely postponed write.

Synchronizing only at a fixed interval (5 seconds is the default in current libdiskfs) means that writing a large file triggers a lot of data_return messages in a pretty short amount of time. This is even worse when memory is scarce and GNU Mach starts flushing out dirty pages.

> However, much could be improved in this area. Ideally, writes would happen
> at a smoothly increasing rate as page pressure increased; and ideally, the
> kernel would be correctly detecting sequential access patterns and paging out
> finished pages speedily on its own.

I agree with you in this regard, but I'm not sure Mach should be the one that throttles page outs to the servers. Mach doesn't know how fast the pagers are going to process each data_return, so it ends up flushing out a bunch of pages and waiting an arbitrary amount of time, hoping the servers will catch up. I think pagers should be the ones in control of the page outs, by calling m_o_lock_request at a reasonable rate.

Also, when Mach's vm pageout daemon is woken up, it should always try to free external clean pages first (currently it doesn't even keep track of which ones are external). If the desired number of pages can't be freed because most of them are dirty, a soft blockade (lifted for memory objects with certain attributes) should be imposed and all pagers notified, perhaps through a special exception raised at their page faults that would generate a data_request. When the target is met, the blockade is lifted and all pagers can behave normally again.
Ideally, this should be done on a per-task basis, to prevent a rogue translator from compromising the entire system. But even if it's implemented system-wide, an offender could be found heuristically and killed (as the OOM killer does on Linux).
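
To make the proposal a bit more concrete, here is a rough, system-wide sketch of the policy described above, written as a stand-alone simulation rather than as GNU Mach code. Everything in it is hypothetical: `struct page`, `struct vm_state`, `pageout_pass` and `pager_tick` are names made up for illustration, and the pager side only clears a flag where a real pager would call m_o_lock_request on its own pages at its own pace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor; this is NOT GNU Mach's vm_page. */
struct page {
    bool external;  /* backed by a user-space pager */
    bool dirty;     /* must be written back before it can be freed */
    bool resident;  /* still occupying physical memory */
};

/* Hypothetical system-wide state shared by the kernel and the pagers. */
struct vm_state {
    struct page *pages;
    size_t npages;
    bool blockade;  /* soft blockade: pagers are asked to clean pages */
};

/*
 * Kernel side: free clean external pages first. If fewer than `target`
 * pages could be freed, impose the soft blockade; otherwise lift it.
 * Returns the number of pages freed by this pass.
 */
size_t pageout_pass(struct vm_state *vm, size_t target)
{
    assert(vm != NULL);
    size_t freed = 0;
    for (size_t i = 0; i < vm->npages && freed < target; i++) {
        struct page *p = &vm->pages[i];
        if (p->resident && p->external && !p->dirty) {
            p->resident = false;
            freed++;
        }
    }
    vm->blockade = (freed < target);
    return freed;
}

/*
 * Pager side: while the blockade is in force, clean at most
 * `max_per_tick` dirty external pages. Clearing `dirty` stands in for
 * the pager writing the page back at a rate it chooses itself.
 * Returns the number of pages cleaned in this tick.
 */
size_t pager_tick(struct vm_state *vm, size_t max_per_tick)
{
    assert(vm != NULL);
    size_t cleaned = 0;
    if (!vm->blockade)
        return 0;
    for (size_t i = 0; i < vm->npages && cleaned < max_per_tick; i++) {
        struct page *p = &vm->pages[i];
        if (p->resident && p->external && p->dirty) {
            p->dirty = false;
            cleaned++;
        }
    }
    return cleaned;
}
```

The point of the split is that the kernel only decides whether there is pressure, while the write-back rate (`max_per_tick` here) stays in the pagers' hands. A per-task version would need one blockade flag and one target per task instead of the single global ones assumed here.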