Good afternoon,

Our remotely hosted web server goes down about once a month and has to be rebooted manually. Because the machine is remote, I haven't been able to check the console to determine whether the system is overloaded or has crashed.
The server is a 1RU Pentium 2GHz with 1GB RAM and a 40GB disk, running RH 7.3 with all the latest up2date releases (except for custom-compiled Apache/mod_perl/php).

So, the first part of my question is: how can I tell whether the server has crashed or is just unresponsive (overloaded)? The server is in a data warehouse behind a NAT firewall which denies ping packets. I have asked internal staff to ping the server, and they get zero packets back. Is there some other check I can do to test whether the server has crashed, or is there some sort of logging I can set up that will show the problem after the server is restarted?

I am currently doing a fair amount of logging with MRTG and I don't see any gradual growth before the 'crash'. Maybe there is a sudden load increase which prevents MRTG from running its checks. All graphs are always zeroed during the time of the 'crash'. Some of the targets for MRTG:

- user/system cpu
- load
- free/used memory
- number of processes
- number of open files (lsof)
- disk usage for all partitions (except swap)
- network throughput
- plus other hardware checks (cpu temp, fan speed, etc)

None of the above show any noticeable change before a 'crash'.

And the second part of my question: what are some of the culprits I should be looking for to determine what is either crashing the server or causing it to be unresponsive? My instinct tells me the problem is too much load. On one or two occasions, I was unable to use any existing ssh sessions (or create new ones), but I was able to get a few http responses from some of the apache processes (before the server ground to a complete halt).

Any feedback or suggestions would be greatly appreciated. I have been trying to solve this for about 4 months now, and I don't know what else to try.
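For the logging part of the question, this is roughly the kind of thing I have in mind: a small heartbeat script that appends the load average and free memory to a file every few seconds and forces each line to disk, so that the last entries before a hang survive the reboot. It is only a sketch, assuming a Python interpreter recent enough to provide `os.getloadavg` (which the stock RH 7.3 Python may not be); the function names, the 10-second interval and the log path in the note below are my own placeholders, not an existing tool.

```python
import os
import time


def parse_meminfo(text):
    """Return {field: first numeric value} for 'Name: value ...' lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header lines and anything without a leading number.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse /proc/meminfo; return {} if it cannot be read."""
    try:
        with open(path) as handle:
            return parse_meminfo(handle.read())
    except OSError:
        return {}


def format_snapshot(timestamp, loadavg, meminfo):
    """Build one log line from a timestamp, a load triple and meminfo."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return "%s load=%.2f/%.2f/%.2f memfree_kb=%s swapfree_kb=%s" % (
        stamp, loadavg[0], loadavg[1], loadavg[2],
        meminfo.get("MemFree", "?"), meminfo.get("SwapFree", "?"))


def append_snapshot(path, line):
    """Append a line and fsync so it is on disk before the next sample."""
    with open(path, "a") as handle:
        handle.write(line + "\n")
        handle.flush()
        os.fsync(handle.fileno())


def run(log_path, interval=10, iterations=None):
    """Log a snapshot every `interval` seconds (forever if iterations is None)."""
    count = 0
    while iterations is None or count < iterations:
        line = format_snapshot(time.time(), os.getloadavg(), read_meminfo())
        append_snapshot(log_path, line)
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)
```

Started in the background with something like `run("/var/log/heartbeat.log")` (a hypothetical path), the timestamp of the last line would show when the machine stopped scheduling the script, and the final few lines would show whether load or memory spiked first. If the entries simply stop with normal values, that would point more towards a hard crash than an overload.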
Thanks,
Charlie

--
Charlie Garrison [EMAIL PROTECTED]
PO Box 141, Windsor, NSW 2756, Australia