I take care of a server for a client that handles DNS, mail, and RADIUS
authentication.  Twice in as many months, the server has failed.  Prior to
that it had performed without incident for a year or so.  It is running
RH6.0 + updates.

By fail I mean it:
1. Will not accept telnet connections (from allowed IP addresses or
otherwise).
2. Will not allow logon at the console.  You get a userid and password
prompt, and after entering them you get a pause and then the login prompt
again.   I even tried pulling the network cable to be sure I was not getting
hammered by some sort of DoS attack.
3. Radius was not authenticating, but its accounting logs continued to
generate stop records for users attempting but failing to authenticate via
radius (correct behavior), and for users disconnecting who were on prior to
the server failure.
4. DNS was not dns'ing <g>. (rpm bind-8.2.3-0.6.x)
5. httpd was serving up web pages addressed by IP, but would only serve
pages partially - I'm assuming the cause of the partial pages was it trying
to resolve links while DNS wasn't working.
6. Not sure about sendmail.  Logs say it was working some, but apparently at
a reduced level of activity. (sendmail 8.9.3).
7. This box also has the latest ucd-snmp installed and does some monitoring
via mrtg.  Both a script that uses ucd-snmp (snmpwalk), and mrtg running via
cron, returned a message at about the time of the suspected failure of:
fork: Resource temporarily unavailable

I captured several logs and can tell about when things stopped working but
see no apparent reason why.

Since I was unable to log in as root on the console, I could not see whether
some process had run away.  I was forced to power-cycle the box.  After the
unclean shutdown and reboot, everything is functioning properly.

Of course I need to get this fixed.  Can someone suggest/recommend what I
can do to:

1. determine what the cause may be.  Are these symptoms of any known crack?
2. find something(s) to monitor to catch it before it happens
3. try other things if it happens again, short of an ungraceful power
cycle?
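
For item 2, one candidate to log from cron is the process count, since the
fork error in symptom 7 suggests (but does not prove) that the box was running
out of processes or memory.  Below is a minimal sketch in Python, assuming a
Linux /proc filesystem.  The function names and the threshold of 200 are
hypothetical, and it needs a much newer Python than the one RH6.0 ships.

```python
#!/usr/bin/env python3
"""Count running processes per command name by reading /proc (Linux only)."""
import os
from collections import Counter


def count_processes(proc_root="/proc"):
    """Return a Counter mapping command name to number of processes."""
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:                    # no /proc here (not Linux?)
        return counts
    for entry in entries:
        if not entry.isdigit():        # only numeric directories are PIDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
        except OSError:                # process exited while we were looking
            continue
        # The command name is the text between the first "(" and the last ")".
        name = stat[stat.find("(") + 1:stat.rfind(")")]
        counts[name] += 1
    return counts


def report(counts, threshold=200, top=5):
    """Build a one-line summary, prefixed with WARNING above the threshold."""
    total = sum(counts.values())
    busiest = ", ".join("%s=%d" % (n, c) for n, c in counts.most_common(top))
    prefix = "WARNING " if total > threshold else ""
    return "%sprocs=%d top: %s" % (prefix, total, busiest)


if __name__ == "__main__":
    # threshold=200 is an arbitrary placeholder; tune it to the box's normal load.
    print(report(count_processes()))
```

If the output were appended to a log every few minutes, the last lines before
a hang would show whether one command was multiplying.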

I have all the applicable latest packages installed and I've checked
/var/log/messages
/var/log/maillog
/var/log/secure
my radius logs

I did not see anything unusual in any of the above, other than being able to
tell when (some) things stopped working from the gaps in the time stamps.

Thanks very much for any help.
Scott

_______________________________________________
Redhat-list mailing list
[EMAIL PROTECTED]
https://listman.redhat.com/mailman/listinfo/redhat-list
