On Tue, 2008-03-18 at 10:14 +0100, [EMAIL PROTECTED] wrote:
> Hi,
>
> On Mon, Mar 17, 2008 at 07:00:02PM -0400, Thomas Bushnell BSG wrote:
>
> > On Sun, 2008-03-16 at 08:25 +0100, [EMAIL PROTECTED] wrote:
>
> > > We could move the servers one by one -- starting with the disk
> > > filesystems, as this is where the issues are manifesting most...
> >
> > But this is still not relevant, because the central problem is paging
> > blocks; you have to work that one out first. That's the one that is a
> > major hassle.
>
> I must admit that I do not fully understand the relation between
> filesystems and paging yet... Probably this is what I really meant to
> say :-)

Here's what you need to know.

The virtual memory in the process is associated with a "memory object"
which is just a port to some server, normally a file server. This
association is set up by the vm_map call. In response, the kernel sets
up memory maps internally, and saves the memory object port provided
for future use.

When a page fault occurs, the thread enters the kernel. The kernel
recognizes the page fault, looks in the memory maps to find the memory
object in question, computes the right offsets, and then sends a
request to the memory object for the page in question. When the server
responds with the data, the kernel installs the page in core, adjusts
the memory maps, and returns from the page fault.

Now the basic idea behind using one kernel thread to handle several
user threads is that when a user thread *would* block, you don't let it
block, instead you just take it away and run some other user thread.
That works very nicely in Mach, in general, because almost all blocks
happen inside mach_msg, and mach_msg was carefully constructed to make
this work nicely.

But there is a wrinkle: page faults. When I say "almost all blocks
happen inside mach_msg" that's because one important category does not:
page faults. Or rather, the page fault also blocks in a message send,
but the message send is one that is done by the thread in kernel space,
rather than by the user space mach_msg, and so the user-space threads
library has no access to it.

It is hard to see how to fix this without one of the following:

1) Having the kernel know that user threads are multiplexing, and do
   some fancy callout stuff when page fault waits occur;
2) Having the user thread handle its own page faults, which would
   require some other deep kernel magic.

And throwing a big wrinkle into all that is that many architectures do
not make it *possible* for users to handle page faults. The processor
dumps a load of crap on the stack, and the kernel must preserve it
carefully and then return the fault. It is very hard to encapsulate
that so that it can be stored and restored by users without keeping the
whole stack around.

Thomas