Sebastian, would you be willing to try the current patches in svn? 
This might solve the problem.

M

Sebastian Hetze wrote:
> Hi *,
> 
> at least one variant of the "MAXFD, check for defunct children" problem
> still exists in version 2.2.8 of cfengine.
> Here is what I found:
> 
> cfengine tries to limit the number of parallel pipes to a hardcoded
> number (MAXFD=20). To achieve this, the fileno of every new pipe is
> checked, and if it is higher than 20, the error message appears and
> the pipe is ignored (cfengine does not close the pipe properly
> in that case, which is a bug in its own right).
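
To make the check described above concrete, here is a minimal sketch in C. It is not cfengine's actual code: the names sketch_popen_checked and SKETCH_MAXFD are hypothetical, and the comparison simply follows the wording "higher than 20". The one deliberate difference from the reported behaviour is that a rejected pipe is closed with pclose() rather than left open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAXFD 20 /* hypothetical stand-in for the MAXFD=20 limit */

/* Open a read pipe to `command` and reject it if its descriptor
 * number is above `limit`. A rejected pipe is closed, so it does
 * not leak. Returns NULL on failure or rejection. */
FILE *sketch_popen_checked(const char *command, int limit)
{
    assert(command != NULL);

    FILE *pp = popen(command, "r");
    if (pp == NULL)
        return NULL;

    if (fileno(pp) > limit) {
        /* The descriptor number, not the number of open pipes,
         * is being tested here. */
        pclose(pp);
        return NULL;
    }
    return pp;
}
```

Note that this still tests the descriptor number, so descriptors inherited from the parent process count against the limit even though they are not pipes.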
> 
> Now look at this real-life example:
> [EMAIL PROTECTED]:~# ls -l /proc/13406/fd
> lr-x------ 1 root root 64 2008-08-24 20:39 0 -> /dev/null
> l-wx------ 1 root root 64 2008-08-24 20:39 1 -> /var/log/cfengine/cfrun.13400
> l-wx------ 1 root root 64 2008-08-24 20:39 2 -> /dev/null
> lr-x------ 1 root root 64 2008-08-24 20:39 3 -> /dev/urandom
> lr-x------ 1 root root 64 2008-08-24 20:39 4 -> /etc/cfengine/cfagent.conf
> lrwx------ 1 root root 64 2008-08-24 20:39 5 -> socket:[18203]
> lrwx------ 1 root root 64 2008-08-24 20:39 6 -> socket:[18204]
> lr-x------ 1 root root 64 2008-08-24 20:39 7 -> /proc/loadavg
> lrwx------ 1 root root 64 2008-08-24 20:39 8 -> socket:[18954]
> lrwx------ 1 root root 64 2008-08-24 20:39 9 -> socket:[18207]
> lrwx------ 1 root root 64 2008-08-24 20:39 10 -> socket:[18974]
> lrwx------ 1 root root 64 2008-08-24 20:39 11 -> socket:[18209]
> lrwx------ 1 root root 64 2008-08-24 20:39 12 -> socket:[18215]
> lrwx------ 1 root root 64 2008-08-24 20:39 13 -> socket:[18221]
> lrwx------ 1 root root 64 2008-08-24 20:39 14 -> socket:[18217]
> lrwx------ 1 root root 64 2008-08-24 20:39 15 -> socket:[18227]
> lrwx------ 1 root root 64 2008-08-24 20:39 16 -> socket:[18223]
> lrwx------ 1 root root 64 2008-08-24 20:39 17 -> socket:[18234]
> lrwx------ 1 root root 64 2008-08-24 20:39 18 -> socket:[18229]
> lr-x------ 1 root root 64 2008-08-24 20:39 19 -> pipe:[149711]
> lrwx------ 1 root root 64 2008-08-24 20:39 20 -> socket:[18236]
> lr-x------ 1 root root 64 2008-08-24 20:39 21 -> pipe:[150154]
> lr-x------ 1 root root 64 2008-08-24 20:39 22 -> pipe:[150163]
> lr-x------ 1 root root 64 2008-08-24 20:39 23 -> pipe:[150173]
> lr-x------ 1 root root 64 2008-08-24 20:39 24 -> pipe:[150184]
> lr-x------ 1 root root 64 2008-08-24 20:39 25 -> pipe:[150194]
> lr-x------ 1 root root 64 2008-08-24 20:39 26 -> pipe:[150205]
> lr-x------ 1 root root 64 2008-08-24 20:39 27 -> pipe:[150215]
> lr-x------ 1 root root 64 2008-08-24 20:39 28 -> pipe:[150225]
> lr-x------ 1 root root 64 2008-08-24 20:39 29 -> pipe:[150236]
> lr-x------ 1 root root 64 2008-08-24 20:39 30 -> pipe:[150249]
> lr-x------ 1 root root 64 2008-08-24 20:39 31 -> pipe:[150269]
> lr-x------ 1 root root 64 2008-08-24 20:39 32 -> pipe:[150283]
> lr-x------ 1 root root 64 2008-08-24 20:39 33 -> pipe:[150293]
> lr-x------ 1 root root 64 2008-08-24 20:39 34 -> pipe:[150302]
> lr-x------ 1 root root 64 2008-08-24 20:39 35 -> pipe:[150396]
> 
> 
> As you can see, there are 14 sockets open for cfagent. In this
> particular case, these sockets belong to heartbeat, which happens
> to have started this instance of cfagent. Maybe not the most
> common case, but definitely something cfagent should cope with.
> Since these sockets all occupy file descriptor numbers, there is
> simply no fileno below the limit left for popen. Or, even worse,
> there is only one fileno left and the bug hits only occasionally,
> when one pipe does not return fast enough.
> 
> As a workaround, I changed MAXFD to 40. But I think using a
> proper counter for open pipes would be more appropriate.
> It looks like you have already started to use the CHILD[] array
> to keep track of free pipe slots?
> If you point me in the direction you want to go, I will happily
> help with coding and testing.
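
For illustration, the counter idea could look roughly like the sketch below. This is an assumption about one possible shape, not the patch in svn and not the existing CHILD[] array: sketch_slots, sketch_open_pipes, sketch_pipe_open and sketch_pipe_close are all hypothetical names. The limit applies to the number of pipes opened through these functions, so inherited sockets no longer count against it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAXPIPES 20 /* hypothetical limit on concurrent pipes */

static FILE *sketch_slots[SKETCH_MAXPIPES]; /* NULL marks a free slot */
static int sketch_open_pipes = 0;           /* number of occupied slots */

/* Open a read pipe if fewer than SKETCH_MAXPIPES are in use.
 * The limit is checked before popen(), so nothing is opened
 * (and nothing can leak) when the limit has been reached. */
FILE *sketch_pipe_open(const char *command)
{
    assert(command != NULL);

    if (sketch_open_pipes >= SKETCH_MAXPIPES)
        return NULL;

    FILE *pp = popen(command, "r");
    if (pp == NULL)
        return NULL;

    for (int i = 0; i < SKETCH_MAXPIPES; i++) {
        if (sketch_slots[i] == NULL) {
            sketch_slots[i] = pp;
            sketch_open_pipes++;
            return pp;
        }
    }

    /* Only reached if the counter and the table disagree. */
    pclose(pp);
    return NULL;
}

/* Close a pipe opened by sketch_pipe_open() and free its slot.
 * Returns the pclose() status, or -1 if `pp` is not tracked. */
int sketch_pipe_close(FILE *pp)
{
    if (pp == NULL)
        return -1;

    for (int i = 0; i < SKETCH_MAXPIPES; i++) {
        if (sketch_slots[i] == pp) {
            sketch_slots[i] = NULL;
            sketch_open_pipes--;
            return pclose(pp);
        }
    }
    return -1;
}
```

A caller would pair every successful sketch_pipe_open() with one sketch_pipe_close(). The sketch is single-threaded and does not reap children that were started by other means.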
> 
> Best regards,
> 
>   Sebastian Hetze
> _______________________________________________
> Bug-cfengine mailing list
> [email protected]
> https://cfengine.org/mailman/listinfo/bug-cfengine

-- 


Mark Burgess

Web: http://www.iu.hio.no/~mark
Tlf: +47 22453272