Sun might have its own way to monitor server health, but more than anything I use check_logfiles (http://www.nagiosexchange.org/Misc.54.0.html?&tx_netnagext_pi1%5Bp_view%5D=538) to monitor the logs for errors. I have checks set up for power supplies, fans, SAN mounts, memory, CPU, SCSI errors, etc. to monitor server health. I also use check_disksuite (software RAID for Solaris) and Check Veritas Volume Manager for Veritas volumes.
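For anyone who has not written a log check before, here is a rough sketch of the general idea in Python. It is not how check_logfiles itself works, and the function names and the example patterns are made up for illustration. The only part taken as given is the usual Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). Unlike a real log plugin, this sketch rescans the whole file on every run.

```python
import re

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def scan_log(lines, warning_patterns, critical_patterns):
    """Classify log lines and return (status, message).

    The patterns are plain regular expressions; the ones you pass in
    are site-specific assumptions, not anything defined by Nagios.
    """
    warn = [re.compile(p, re.IGNORECASE) for p in warning_patterns]
    crit = [re.compile(p, re.IGNORECASE) for p in critical_patterns]
    warn_hits, crit_hits = [], []
    for line in lines:
        text = line.rstrip("\n")
        # A line that matches a critical pattern is never counted as a warning.
        if any(rx.search(text) for rx in crit):
            crit_hits.append(text)
        elif any(rx.search(text) for rx in warn):
            warn_hits.append(text)
    if crit_hits:
        return CRITICAL, (
            f"CRITICAL - {len(crit_hits)} critical, {len(warn_hits)} warning "
            f"line(s); last: {crit_hits[-1]}"
        )
    if warn_hits:
        return WARNING, f"WARNING - {len(warn_hits)} warning line(s); last: {warn_hits[-1]}"
    return OK, "OK - no matching lines"


def check_file(path, warning_patterns, critical_patterns):
    """Scan one log file; an unreadable file is reported as UNKNOWN."""
    try:
        with open(path, errors="replace") as handle:
            return scan_log(handle, warning_patterns, critical_patterns)
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN - cannot read {path}: {exc}"
```

A wrapper script would print the message on one line and exit with the status, e.g. `status, message = check_file(path, [r"fan \d+ speed low"], [r"transport failed"])` followed by `print(message)` and `sys.exit(status)`. The path and both patterns there are placeholders.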
I also use active checks to monitor the services provided, of course, almost exclusively with check_by_ssh or check_http.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Tuesday, January 29, 2008 4:09 PM
> To: 'nagios Users Mailinglist'
> Subject: Re: [Nagios-users] Sun Monitoring
>
> Hugo van der Kooij wrote:
> > Lars Stavholm wrote:
> >> Edwin Zoeller wrote:
> >>> We are monitoring ~75 Sun Servers via Nagios. What information
> >>> would you like to know.
> >>
> >> Top down...
> >> 1. method used (nsca, nrpe, snmp) and why?
> >> 2. what specific checks are you using?
> >
> > Servers in themselves are useless pieces of junk. What you care about
> > are the services they provide. So that is what you need to monitor. If
> > you think about it that way you can choose whatever method allows you
> > to monitor these services reliably.
>
> I'd suggest that you do also need to monitor the servers themselves.
> While they may just be useless pieces of junk on their own, you do still
> need them to run your services. Strictly monitoring the services they
> provide will give you a good up/down view of things but you probably
> want to know when things are starting to go sideways so you can take
> corrective action before you actually reach a down state. I don't work
> with Sun hardware, but I'm sure Sun provides some kind of framework for
> monitoring some of the physical components of the system, and I'd bet
> dollars to doughnuts there are Nagios plugins that can take advantage of
> that framework. Temperature, fans, HD/RAID status, power supply status,
> etc. etc. are a very good thing to keep an eye on if you intend to catch
> problems early enough to resolve before actually causing downtime.
>
> Andrew
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
