Aloha -
I've run into a real kernel-level problem with 'netmsg'.
It's related to the libpager issue. The problem arises when a process gets
an unworkable memory object and tries to vm_map it. This causes the
mach_msg() that sent the vm_map to block indefinitely, even though I've
specified MACH_SEND_TIMEOUT with a zero timeout.
More specifically, the process in question is the exec server. It gets a
memory object from the file server to read a file, then uses it to map the
file into a remote task. This causes a vm_map to go across the network
connection. The kernel, upon receiving the vm_map, sends a
memory_object_init message, and then blocks waiting for the reply.
The block occurs in vm_object_copy_strategically(), which is labeled in its
comments "[t]his operation may block". Almost the first thing it does is
to wait for the memory object to become ready.
In our case, libpager already has a different client, so the memory object
never becomes ready.
The big problem, as I see it, is that mach_msg() is blocking, and that
hangs my entire thread. It seems to me that low-level RPC operations
like vm_map must not block; otherwise they defeat the purpose of
MACH_SEND_TIMEOUT. So vm_map() should record the mapping and then return,
putting the copy operation on some kind of queue. I guess.
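To make the "record it and return" idea a bit more concrete, here is a rough
sketch in plain C. It is not Mach kernel code: every name in it
(deferred_vm_map, mark_ready, drain_ready, struct map_queue) is made up for
illustration, the queue is a fixed-size array, and there is no locking. It
only shows the shape I have in mind: the request path never waits on the
pager, and the copy is finished later, once the object has become ready.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch only; none of these names exist in Mach. */

#define MAX_PENDING 8

struct pending_map {
    int object_id;  /* stands in for the memory object port */
    int ready;      /* set when the pager has made the object ready */
    int done;       /* set once the deferred copy has been carried out */
};

struct map_queue {
    struct pending_map slots[MAX_PENDING];
    size_t count;
};

/* Request path: record the mapping and return immediately.
   Returns 0 on success, -1 if the queue is full. Never waits. */
int deferred_vm_map(struct map_queue *q, int object_id)
{
    assert(q != NULL);
    if (q->count == MAX_PENDING)
        return -1;
    q->slots[q->count].object_id = object_id;
    q->slots[q->count].ready = 0;
    q->slots[q->count].done = 0;
    q->count++;
    return 0;
}

/* Simulates the pager answering: mark every queued entry for this
   object as ready. */
void mark_ready(struct map_queue *q, int object_id)
{
    assert(q != NULL);
    for (size_t i = 0; i < q->count; i++)
        if (q->slots[i].object_id == object_id)
            q->slots[i].ready = 1;
}

/* Deferred path: complete every queued mapping whose object is ready
   and not yet done. Returns how many were completed on this pass. */
size_t drain_ready(struct map_queue *q)
{
    size_t completed = 0;
    assert(q != NULL);
    for (size_t i = 0; i < q->count; i++) {
        struct pending_map *p = &q->slots[i];
        if (p->ready && !p->done) {
            p->done = 1;  /* the real copy operation would happen here */
            completed++;
        }
    }
    return completed;
}
```

With something like this, an object that never becomes ready (our libpager
case) just leaves its entry sitting in the queue; the caller's thread is not
held up by it.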
Any thoughts on how to resolve this?
agape
brent