On Thu, Sep 01, 2016 at 01:34:43PM -1000, Brent W. Baccala wrote:
> What happened with select? mach_msg() returned prematurely? I can handle
> that fine. In this case, mach_msg() isn't returning at all.

You're missing the point. The point is that vm_map has no timeout. In
addition, the fact that the receiver is the kernel certainly makes it
more complicated to predict. See ipc/mach_msg.c:mach_msg_trap() where
the time_out argument is ignored if it's a kernel send, i.e. when the
code branches to the kernel_send label.

> > In my opinion, your network server should do what all servers do,
> > i.e. dedicate a thread to the processing of a complete RPC and spawn
> > as many as necessary when receiving messages.
> >
>
> I've thought of that. It might be desirable at some point for performance,
> but do I really need it for correctness? I just have to accept that
> mach_msg can hang once in a while and make sure that I can burn its thread
> if it does?
>
> I don't like that idea at all! It's just ugly.

Well, not ugly. It might be useful if we want to relay scheduling
properties, but even that makes little sense across a network.

Of course, the implementation of mach_msg when handling messages sent
to kernel objects could be changed, but it makes little sense to
implement timeouts on those messages, because the kernel guarantees
immediate processing (the client thread services itself inside the
kernel).

Consider how this could be implemented. Should the kernel act as a
true server with its own threads? Should it queue messages? Since
there is currently no queueing, and instead the kernel part of the
client thread directly runs the server routine, does it make sense to
state that a send could fail because of a timeout?

So really, what's happening here is a misbehaving userspace pager
affecting a client that shouldn't have trusted it in the first place.
I'm really not sure how you can truly fix that on a system such as
the Hurd. Perhaps by using the server trusting mechanism I mentioned,
and somehow sharing it with the network server. But even then, trusting
doesn't mean it works, it only means that if it doesn't, it's on you.

So the other users of the network server shouldn't be affected because
one got it wrong. Therefore I'm going back to the idea of using a
thread pool. If a mach_msg call succeeds, good for you, grab the next
message. If not, at least you're not affecting other users, except
maybe through a denial of service, the only kind of security issue
inevitable with a design where servers allocate from their own
resources on behalf of their clients. That's when you get to things
like quotas and resource limits.

-- 
Richard Braun
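
The thread-pool idea above can be illustrated with a small sketch. It
is not taken from the Hurd or from any network server: it uses POSIX
threads instead of Mach primitives, a fixed-size pool instead of
spawning on demand, and a boolean flag to stand in for an RPC that
never returns. The names struct request, struct pool, worker() and
serve() are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS 8

/* Hypothetical stand-in for a received message; not a Mach type. */
struct request {
    bool hangs;              /* simulates an RPC that never returns */
};

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    const struct request *reqs;
    size_t count;            /* total number of requests */
    size_t next;             /* index of the next request to hand out */
    size_t done;             /* requests processed to completion */
    size_t stuck;            /* workers blocked in a hanging request */
    bool release;            /* demo only: lets stuck workers exit */
};

static void *worker(void *arg)
{
    struct pool *p = arg;

    pthread_mutex_lock(&p->lock);
    while (p->next < p->count) {
        const struct request *r = &p->reqs[p->next++];

        if (r->hangs) {
            /* This thread is lost; the other workers keep serving. */
            p->stuck++;
            pthread_cond_broadcast(&p->changed);
            while (!p->release)
                pthread_cond_wait(&p->changed, &p->lock);
            break;
        }

        /* A real server would drop the lock and process the RPC here. */
        p->done++;
        pthread_cond_broadcast(&p->changed);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Runs the pool over reqs and returns how many requests completed. */
static size_t serve(const struct request *reqs, size_t count, size_t nworkers)
{
    struct pool p = { .reqs = reqs, .count = count };
    pthread_t tids[MAX_WORKERS];
    size_t i, done;

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.changed, NULL);

    for (i = 0; i < nworkers; i++) {
        int err = pthread_create(&tids[i], NULL, worker, &p);
        assert(err == 0);
        (void)err;
    }

    /* Wait until every request is finished or stuck, or the pool is
     * exhausted because every worker is stuck (denial of service). */
    pthread_mutex_lock(&p.lock);
    while (p.done + p.stuck < count && p.stuck < nworkers)
        pthread_cond_wait(&p.changed, &p.lock);
    done = p.done;
    p.release = true;        /* demo only: allow a clean shutdown */
    pthread_cond_broadcast(&p.changed);
    pthread_mutex_unlock(&p.lock);

    for (i = 0; i < nworkers; i++)
        pthread_join(tids[i], NULL);
    pthread_cond_destroy(&p.changed);
    pthread_mutex_destroy(&p.lock);
    return done;
}
```

With five requests of which one hangs, and three workers, serve()
returns 4: the hanging request costs one thread and nothing else. If
there are at least as many hanging requests as workers, the pool is
exhausted and the remaining requests are never served, which is the
denial-of-service case where quotas and resource limits come in.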