On Mon, Feb 02, 2009 at 11:49:40AM +0100, Daniel Schreiber wrote: > Iustin Pop schrieb: >>> Twisted included a closed file descriptor in the select list. >> >> Ok, so it seems our cleaning up/closing filedescriptors is conflicting >> with twisted opening some pipes for itself. Why it only does in some >> cases and not always it's a mistery to me. >> >> In any case, could you try and run the node daemon in foreground >> (ganeti-noded -f)? In the non-daemon case, we don't cleanup the file >> descriptors, and this will confirm or infirm my theory. > > Works. call_version.py as well as gnt-commands.
Thanks for confirming. I have a patch that might or might not solve the problem, since I don't know how early twisted creates the pipes. Would you mind testing the attached patch? (Will be a little bit tricky to apply to installed sources though). If the patch doesn't solve the problem, then we will have to implement an option to not close file descriptors for Ganeti 1.2. regards, iustin
>From 904719b715a5d5d64f3ddd4a2d0ae6d393406ed5 Mon Sep 17 00:00:00 2001 From: Iustin Pop <iust...@gmail.com> Date: Fri, 9 Jan 2009 12:52:17 +0000 Subject: [PATCH] Rework the daemonization sequence This is a backport of commit 2258 on the 2.0 branch (actually trunk) to the 1.2 branch: The current fork+close fds sequence has deficiencies which are hard to work around: - logging can start logging before we fork (e.g. if we need to emit messages related to master checking), and thus use FDs which we can't track nicely - the queue locks the queue file, and again this fd needs to be kept open which is hard from the main loop (and this error is currently hidden by the fact that we don't log it) Given the above, it's much simpler, in case we will fork later, to close file descriptors right at the beginning of the program, and in Daemonize only close/reopen the stdin/out/err fds. In addition, we also close() the handlers we remove in SetupLogging so that the cleanup is more thorough. Reviewed-by: imsnah The reason for the backport is that in certain cases, Twisted 8.1 will open extra file descriptors at reactor instantiation, so our closing of the filedescriptors is too late. The patch attempts to close the filedescripts earlier (which might be or not early enough). --- daemons/ganeti-noded | 4 ++ daemons/ganeti-rapi | 1 + lib/utils.py | 78 ++++++++++++++++++++++++++++++++++++------------- 3 files changed, 62 insertions(+), 21 deletions(-) diff --git a/daemons/ganeti-noded b/daemons/ganeti-noded index 58c3804..49ed263 100755 --- a/daemons/ganeti-noded +++ b/daemons/ganeti-noded @@ -589,6 +589,10 @@ def main(): """ options, args = ParseOptions() utils.debug = options.debug + + if options.fork: + utils.CloseFDs() + for fname in (constants.SSL_CERT_FILE,): if not os.path.isfile(fname): print "config %s not there, will not run." % fname diff --git a/daemons/ganeti-rapi b/daemons/ganeti-rapi index fa1d41f..b379e11 100755 --- a/daemons/ganeti-rapi +++ b/daemons/ganeti-rapi @@ -86,6 +86,7 @@ def main(): """ options, args = ParseOptions() if options.fork: + utils.CloseFDs() utils.Daemonize(logfile=constants.LOG_RAPISERVER) RESTHTTPServer.start(options) sys.exit(0) diff --git a/lib/utils.py b/lib/utils.py index 083fd2d..7387781 100644 --- a/lib/utils.py +++ b/lib/utils.py @@ -1119,15 +1119,33 @@ def TestDelay(duration): return True -def Daemonize(logfile, noclose_fds=None): - """Daemonize the current process. +def _CloseFDNoErr(fd, retries=5): + """Close a file descriptor ignoring errors. - This detaches the current process from the controlling terminal and - runs it in the background as a daemon. + @type fd: int + @param fd: the file descriptor + @type retries: int + @param retries: how many retries to make, in case we get any + other error than EBADF + + """ + try: + os.close(fd) + except OSError, err: + if err.errno != errno.EBADF: + if retries > 0: + _CloseFDNoErr(fd, retries - 1) + # else either it's closed already or we're out of retries, so we + # ignore this and go on + + +def CloseFDs(noclose_fds=None): + """Close file descriptors. + + This closes all file descriptors above 2 (i.e. except + stdin/out/err). """ - UMASK = 077 - WORKDIR = "/" # Default maximum for the number of available file descriptors. if 'SC_OPEN_MAX' in os.sysconf_names: try: @@ -1138,6 +1156,31 @@ def Daemonize(logfile, noclose_fds=None): MAXFD = 1024 else: MAXFD = 1024 + maxfd = resource.getrlimit(resource.RLIMIT_NOFILE)[1] + if (maxfd == resource.RLIM_INFINITY): + maxfd = MAXFD + + # Iterate through and close all file descriptors (except the standard ones) + for fd in range(3, maxfd): + if noclose_fds and fd in noclose_fds: + continue + _CloseFDNoErr(fd) + + +def Daemonize(logfile): + """Daemonize the current process. + + This detaches the current process from the controlling terminal and + runs it in the background as a daemon. + + @type logfile: str + @param logfile: the logfile to which we should redirect stdout/stderr + @rtype: int + @returns: the value zero + + """ + UMASK = 077 + WORKDIR = "/" # this might fail pid = os.fork() @@ -1153,22 +1196,15 @@ def Daemonize(logfile, noclose_fds=None): os._exit(0) # Exit parent (the first child) of the second child. else: os._exit(0) # Exit parent of the first child. - maxfd = resource.getrlimit(resource.RLIMIT_NOFILE)[1] - if (maxfd == resource.RLIM_INFINITY): - maxfd = MAXFD - # Iterate through and close all file descriptors. - for fd in range(0, maxfd): - if noclose_fds and fd in noclose_fds: - continue - try: - os.close(fd) - except OSError: # ERROR, fd wasn't open to begin with (ignored) - pass - os.open(logfile, os.O_RDWR|os.O_CREAT|os.O_APPEND, 0600) - # Duplicate standard input to standard output and standard error. - os.dup2(0, 1) # standard output (1) - os.dup2(0, 2) # standard error (2) + for fd in range(3): + _CloseFDNoErr(fd) + i = os.open("/dev/null", os.O_RDONLY) # stdin + assert i == 0, "Can't close/reopen stdin" + i = os.open(logfile, os.O_WRONLY|os.O_CREAT|os.O_APPEND, 0600) # stdout + assert i == 1, "Can't close/reopen stdout" + # Duplicate standard output to standard error. + os.dup2(1, 2) return 0 -- 1.5.6.5