Howdy Mark,

Thanks!  It seems much better.  I've had it running
on a few systems here for about 5 hours now and haven't
seen any "MAXFD" messages in the logs or any of the 
accompanying "zombie" processes (I didn't notice it was
creating a zombie for each fd until after I sent this report).

BTW/FYI, SVN is missing a few files needed to build the 
doc folder: in particular, scan.eps and n_loadavg.eps
(used by cfengine-Anomalies.texinfo) and the cfcomdoc.css
referenced in doc/Makefile.

jack/Slick

On Tue, 2008-04-22 at 08:43 +0200, Mark Burgess wrote:
> I found a file descriptor leak that would only occur on Linux, introduced by the 
> latest change that added CPU utilization monitoring. I am testing it 
> now. Please check too and tell me if things get better.
> 
> SiliconSlick wrote:
> > Howdy all,
> > 
> > I rebuilt an RPM using the latest SVN code
> > late last week and installed it here.  I'm
> > now getting a lot of cfenvd failures.  It
> > appears to be leaking file descriptors.
> > 
> > A sample from the log gives the following:
> > 
> > Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1021 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1022 of child
> > 13520 higher than MAXFD, check for defunct children
> > Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1022 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:37:22 gemenon cfenvd[22964]:  File descriptor 1022 of child
> > 13533 higher than MAXFD, check for defunct children
> > Apr 19 22:37:22 gemenon cfenvd[22964]:  File descriptor 1022 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:40:05 gemenon cfenvd[22964]:  Couldn't open average
> > database /var/cfengine/state/cf_observations.db
> > Apr 19 22:40:05 gemenon cfenvd[22964]:  db_open: Too many open files
> > Apr 19 22:40:05 gemenon cfenvd[22964]:  Error reading average database
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Executing
> > shell command: /etc/rc.d/init.d/cfenvd restart
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> > Stopping cfengine anomaly detection service (cfenvd): [FAILED]
> > Apr 19 23:01:43 gemenon cfenvd[14285]: Lock
> > lock.db.localhost.cfenvd.daemon_2743 expired (after 2575/1 minutes)
> > Apr 19 23:01:43 gemenon cfenvd[14283]: cfenvd: starting
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> > Starting cfengine anomaly detection service (cfenvd): [  OK  ]
> > Apr 19 23:01:44 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: (Done
> > with /etc/rc.d/init.d/cfenvd restart)
> > Apr 19 23:24:21 gemenon cfenvd[14285]:  LDT Buffer full at 10
> > Apr 19 23:39:22 gemenon cfenvd[14285]:  File descriptor 20 of child
> > 14584 higher than MAXFD, check for defunct children
> > Apr 19 23:39:22 gemenon cfenvd[14285]:  File descriptor 20 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 23:41:52 gemenon cfenvd[14285]:  File descriptor 20 of child
> > 14602 higher than MAXFD, check for defunct children
> > Apr 19 23:41:52 gemenon cfenvd[14285]:  File descriptor 20 of child
> > higher than MAXFD, check for defunct children
> > 
> > It appears that after about 40 minutes it has reached MAXFD==20.  It
> > carries on for a while and then eventually dies about a day and a half
> > later (at ~30 fds/hour and with 1024 fds available, about 34 hours).
> > 
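The estimate in the paragraph above can be checked with a short sketch. The function names are made up for illustration and are not part of cfengine; the inputs (20 descriptors in about 40 minutes, a 1024-descriptor limit) are the figures quoted in the report.

```c
#include <assert.h>

/* Hypothetical helper: leak rate in descriptors per hour, given how many
   descriptors were consumed over an observed number of minutes. */
double leak_rate_per_hour(double fds, double minutes)
{
    assert(minutes > 0.0);
    return fds * 60.0 / minutes;
}

/* Hypothetical helper: hours until a process leaking `per_hour`
   descriptors reaches its per-process limit. */
double hours_to_exhaustion(double limit, double per_hour)
{
    assert(per_hour > 0.0);
    return limit / per_hour;
}
```

Calling hours_to_exhaustion(1024.0, leak_rate_per_hour(20.0, 40.0)) reproduces the "about 34 hours" figure, assuming the leak rate stays constant.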
> > Given Mark's recent changes and request for help with cfenvd,
> > I thought it might be related.  Looking at the diff between
> > revision 550 and 553 of cfenvd.c[*], I'm thinking the culprit
> > might be a return without "fclose(fp)" on line 1404[**].   I haven't
> > tested a fix yet since I'm not sure what the fix is (close
> > the file first?... don't return?).
> > 
> > Does this seem like it could be the cause of the problem
> > I'm seeing above?  Anyone else having similar problems?
> > 
> > jack/SiliconSlick
> > 
> > [*]
> > http://svn.iu.hio.no/viewvc/trunk/src/cfenvd.c?root=Cfengine-2&r1=550&r2=553
> > 
> > [**] this bit:
> > 
> >    else
> >       {
> >       Verbose("Found nothing (%s)\n",cpuname);
> >       index = ob_spare;
> >       return;
> >       }
> > 
> > 
> > _______________________________________________
> > Bug-cfengine mailing list
> > [email protected]
> > https://cfengine.org/mailman/listinfo/bug-cfengine
> 
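For reference, the early return flagged under [**] in the quoted report follows a common leak pattern: a stream is opened, and one branch returns before fclose() is reached. Below is a minimal sketch of one way to avoid that, assuming a single exit point that always closes the stream. The function find_cpu_line() and its arguments are hypothetical and only illustrate the pattern; this is not the cfenvd.c code, and not necessarily the fix that was committed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from cfenvd.c: scan a /proc/stat-style
   file for a line that starts with `wanted`.
   Returns 1 if found, 0 if not found, -1 if the file cannot be opened. */
int find_cpu_line(const char *path, const char *wanted)
{
    char line[256];
    int found = 0;
    FILE *fp;

    assert(path != NULL && wanted != NULL);

    if ((fp = fopen(path, "r")) == NULL)
    {
        return -1;   /* nothing was opened, so nothing to close */
    }

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (strncmp(line, wanted, strlen(wanted)) == 0)
        {
            found = 1;
            break;
        }
    }

    /* Single exit point: the stream is closed on both the "found" and
       the "found nothing" paths, so no descriptor is leaked per call. */
    fclose(fp);
    return found;
}
```

The design choice here is to record the result in a flag and fall through to one fclose(), rather than returning from inside the branch. Closing the file immediately before each return would work as well, but every new early exit would then need its own fclose().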
