Hi! First, in my other message I said that ``we're leaking port rights''. This is wrong; we're just handling user reference counts incorrectly.
On Thu, 8 Sep 2011 09:43:58 -0700 (PDT), Roland McGrath <rol...@hack.frob.com> wrote: > > Here, we've unconditionally used the value of refs, and didn't take into > > account that record_refs ought to have been used instead, and refs could > > be any value (uninitialized) -- or which detail am I missing here? > > Should refs have simply been initialized to zero (as the zero value is > > noneffective, and we'll set the ss->thread, etc. values later on)? > > I concur that this is clearly wrong, and I've checked in the trivial fix. > > > Can this possibly have caused the following: > > It could have, though it's by no means certain that this was what you saw > happen. Since an uninitialized local variable was used here in those > cases, we could very well have set the ref count of the task-self and/or > thread-self ports in the child to any value whatsoever. Yes, but this doesn't fix the problem. And I see what the specific problem here is, but let's visit the following topic first before returning to fork: > Does __mach_thread_self bump the ref count? I suppose it probably does, > in which case extra calls can indeed hurt. Yes, it does. Attached is a small test program. The task-self, thread-self, host-self ports are special in that they have specific mach_task_self, mach_thread_self, mach_host_self system calls for retrieving send rights (and increment the user reference count by one). These calls are different from usual mach_port_mod_refs(+1), as follows: enum kind { TASK, TASK_REAL, THREAD, HOST }; mach_port_t plus_one(enum kind kind) { mach_port_t port; switch (kind) { case TASK: port = mach_task_self(); break; case TASK_REAL: /* Avoid glibc's cache. */ port = (mach_task_self)(); break; case THREAD: port = mach_thread_self(); break; case HOST: port = mach_host_self(); break; $ ./special_ports 0 task = 1: 60 thread = 30: 3 host = 25: 15520 port = 1 + 1 = 60 + 1 = 60 This means task-self has the local port name 1 and its user reference count is 60. We're working on port name 1. Reference it once (invoke mach_task_self), user reference count is still 60. Reference it once, user reference count is still 60. Here it hangs. This is due to glibc caching the mach_task_self value (mach/mach_init.c et al.). It cannot change during a tasks lifetime, so this indeed seems like a valid thing to do. Next, notice the high host-self user reference count. And it does increase (typically by 1 or 2) with every invocation of other programs: $ ./special_ports 0 [...] host = 25: 15520 [...] $ ls > /dev/null $ ./special_ports 0 [...] host = 25: 15524 [...] $ ./special_ports 0 [...] host = 25: 15525 [...] I don't have an en explanation yet, but this doesn't seem right? So, avoid glibc's caching of the task-self port: $ ./special_ports 1 task = 1: 60 thread = 30: 3 host = 25: 15543 port = 1 + 1 = 62 + 1 = 63 + N = 65533 + 1 = 65534 + 1 = 65534 + 1 = 65534 - 1 = 65533 - 1 = 65532 - 1 = 65531 - M = 2 - 1 = 1 Segmentation fault The segfault followx to the mach_port_deallocate call that sets the task-self user reference count to zero. What happens is that the (current) task is dismantled, and Mach either faults right away (don't know), or after returning to user-space, and there is no address space anymore, or similar, I guess. Anyway, this is expected. However, before that: when the host-self user reference count reaches 65534 (which is the maximum), further mach_task_self invocations still succeed, but the count is no longer incremented. This is problematic, in my opinion: as one doesn't know with which count you start at a certain position in the code, a sequence of a number of mach_task_self invocations followed by the same number of mach_port_deallocate(task-self) invocations does not necessarily lead back to the original user reference count. Thus, you must not call mach_port_deallocate on the task-self port, as your call may be the one that closes the port. If we agree this far, doesn't it follow that the user reference count of the task-self port is of no importance? (As long as it is different from zero; but I don't see what that might ever be useful for.) (And thus, fork has no business in setting this value different from the original value of one.) Would it make sense to implement a check in GNU Mach for this, something like: ``{mach_port_deallocate,mach_port_get_refs,mach_port_mod_refs} must not be called on the task-self port''. (In line with Samuel's ``task %p deallocating an invalid port %u, most probably a bug''.) Analoguous behavior is seen with the thread-self and host-self ports: $ ./special_ports 2 task = 1: 60 thread = 30: 3 host = 25: 15546 port = 30 + 1 = 5 + 1 = 6 + N = 65533 + 1 = 65534 + 1 = 65534 + 1 = 65534 - 1 = 65533 - 1 = 65532 - 1 = 65531 - M = 2 - 1 = 1 ./special_ports: mach_port_get_refs: (os/kern) invalid name $ ./special_ports 3 task = 1: 60 thread = 30: 3 host = 25: 15549 port = 25 + 1 = 15551 + 1 = 15552 + N = 65533 + 1 = 65534 + 1 = 65534 + 1 = 65534 - 1 = 65533 - 1 = 65532 - 1 = 65531 - M = 2 - 1 = 1 ./special_ports: mach_port_get_refs: (os/kern) invalid name Grüße, Thomas
#define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <error.h> #include <mach.h> /* [GNU Mach]/ipc/port.h */ #define MACH_PORT_UREFS_MAX ((mach_port_urefs_t) ((1 << 16) - 1)) enum kind { TASK, TASK_REAL, THREAD, HOST }; mach_port_t plus_one(enum kind kind) { mach_port_t port; switch (kind) { case TASK: port = mach_task_self(); break; case TASK_REAL: /* Avoid glibc's cache. */ port = (mach_task_self)(); break; case THREAD: port = mach_thread_self(); break; case HOST: port = mach_host_self(); break; default: port = MACH_PORT_NULL; } if (port == MACH_PORT_NULL) error(1, 0, "plus one, %d: MACH_PORT_NULL", kind); else if (port == MACH_PORT_DEAD) error(1, 0, "plus one, %d: MACH_PORT_DEAD", kind); return port; } void minus_one(task_t task, mach_port_t port) { int err; err = mach_port_deallocate(task, port); if (err != KERN_SUCCESS) error(1, err, "mach_port_deallocate(%d, %d)", task, port); } mach_port_urefs_t get_urefs(task_t task, thread_t thread) { int err; mach_port_urefs_t refs; err = mach_port_get_refs(task, thread, MACH_PORT_RIGHT_SEND, &refs); if (err != KERN_SUCCESS) error(1, err, "mach_port_get_refs"); return refs; } int main(int argc, char * argv[]) { task_t task = mach_task_self(); printf("task = %d: %d\n", task, get_urefs(task, task)); thread_t thread = mach_thread_self(); printf("thread = %d: %d\n", thread, get_urefs(task, thread)); host_t host = mach_host_self(); printf("host = %d: %d\n", host, get_urefs(task, host)); mach_port_t port; mach_port_urefs_t urefs; enum kind kind = atoi(argv[1]); port = plus_one(kind); printf ("port = %d\n", port); plus_one(kind); urefs = get_urefs(task, port); printf("+ 1 = %d\n", urefs); plus_one(kind); urefs = get_urefs(task, port); printf("+ 1 = %d\n", urefs); while ((urefs = get_urefs(task, port)) < (MACH_PORT_UREFS_MAX - 2)) plus_one(kind); printf("+ N = %d\n", urefs); plus_one(kind); urefs = get_urefs(task, port); printf("+ 1 = %d\n", urefs); plus_one(kind); urefs = get_urefs(task, port); printf("+ 1 = %d\n", urefs); plus_one(kind); urefs = get_urefs(task, port); printf("+ 1 = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); while ((urefs = get_urefs(task, port)) > 2) minus_one(task, port); printf("- M = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); minus_one(task, port); urefs = get_urefs(task, port); printf("- 1 = %d\n", urefs); return 0; }
pgppb6QgBetfL.pgp
Description: PGP signature