On 2011-07-28 11.29, Robert Nagy wrote:
> On (2011-07-28 11:17), Mark Kettenis wrote:
>>> Date: Thu, 28 Jul 2011 10:16:00 +0200
>>> From: David Coppa <dco...@gmail.com>
>>> On Thu, 28 Jul 2011, Robert Nagy wrote:
>>>> It seems that SIGTERM is not enough for mountd, according to the code
>>>> SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
>>>> So I think what we should do in this case is to first send a SIGTERM to
>>>> mountd,
>>>> and then SIGKILL it in rc_stop().
>>> Something like this? the sleep is just paranoia, don't know if it's
>>> useful...
>>
>> Well, that sleep makes some sense at least; you want to give the
>> daemon some time to clean up.  The question is whether a single second
>> is enough for that...
>
> Well mountd actually dies about 1.5-2 minutes after sending it a SIGTERM...

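For reference, the stop sequence being proposed boils down to something
like the sketch below, written in C rather than rc shell just to make the
steps explicit. stop_with_grace() and process_gone() are hypothetical
names of mine, not anything from rc.subr or mountd, and the 100 ms polling
interval is an arbitrary assumption.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Return 1 if the process no longer exists, 0 if it is still around. */
static int
process_gone(pid_t pid)
{
	/*
	 * Reap it if it happens to be our own child; for an unrelated
	 * process this fails harmlessly with ECHILD.
	 */
	if (waitpid(pid, NULL, WNOHANG) == pid)
		return 1;
	return (kill(pid, 0) == -1 && errno == ESRCH);
}

/*
 * Hypothetical helper: send SIGTERM, poll for up to grace_secs seconds,
 * then fall back to SIGKILL.
 * Returns 0 if the process went away after SIGTERM, 1 if SIGKILL had to
 * be sent, -1 if a signal could not be delivered.
 */
int
stop_with_grace(pid_t pid, int grace_secs)
{
	struct timespec tick = { 0, 100 * 1000 * 1000 };	/* 100 ms */
	int i;

	if (kill(pid, SIGTERM) == -1)
		return -1;

	for (i = 0; i < grace_secs * 10; i++) {
		if (process_gone(pid))
			return 0;
		nanosleep(&tick, NULL);
	}
	if (process_gone(pid))
		return 0;

	/* Grace period is over and the process is still there. */
	if (kill(pid, SIGKILL) == -1)
		return (errno == ESRCH) ? 0 : -1;
	return 1;
}
```

With mountd taking 1.5-2 minutes to exit on its own, any grace period
shorter than that would end in the SIGKILL branch every single time.
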
Oh darn, how much I hate RPC...

The thing is, mountd executes this when it receives a SIGTERM:

	if (gotterm) {
		(void) clnt_broadcast(RPCPROG_MNT, RPCMNT_VER1,
		    RPCMNT_UMNTALL, xdr_void, (caddr_t)0, xdr_void,
		    (caddr_t)0, umntall_each);
		exit(0);
	}

Now, clnt_broadcast() sends broadcasts blindly to the local net, and waits
with a rather long, hardcoded timeout for answers that it may or may not
get.

If it gets at least one answer, the umntall_each() function returns 1,
which makes sure it doesn't wait for other answers, thus exiting quickly.

If however, and this is perhaps the most common use case, there are no
other mountd listeners on the local net, it waits until its (arbitrarily
chosen, it seems) time is up and then exits, at which time mountd itself
also promptly exits.

I question the need for the clnt_broadcast() call to be there at all. If
my (admittedly cursory) analysis is correct, it only reaches other mountd
daemons in the neighborhood, it causes minute-long exit delays in very
common usage scenarios, and mountd strangely makes no other effort to
contact the clients that may actually be associated with it.

To remove the call would certainly make mountd exit promptly, but someone
with more insight into the magic of RPC than me needs to weigh in on
potential regressions first...

In any case, my gut feeling is that to kludgily "solve" the problem with
an arbitrary sleep and then a SIGKILL in the rc script is wrong, wrong...


Regards,
/Benny

-- 
internetlabbet.se     / work:   +46 8 551 124 80      / "Words must
Benny Lofgren        /  mobile: +46 70 718 11 90     /   be weighed,
                    /   fax:    +46 8 551 124 89    /    not counted."
                   /    email:  benny -at- internetlabbet.se