I am toying with the idea of monitoring some key stats from my compute-nodes using SNMP (e.g. load factors, local disk usage, health of my pbs_moms etc.), especially since the Nagios docs seem to recommend SNMP as the way to monitor private resources (as opposed to SSH or NRPE plugins).
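To make the idea concrete, here is a rough sketch of the kind of Nagios-style check I have in mind. It is untested against a real agent and rests on several assumptions: that net-snmp's `snmpget` is installed on the monitoring host, that the nodes run an agent exposing the UCD-SNMP-MIB 1-minute load average (`laLoad.1`, OID `.1.3.6.1.4.1.2021.10.1.3.1`), and that the community string is `public`. The function names and the warning/critical thresholds are made up for illustration.

```python
import subprocess

# Assumption: UCD-SNMP-MIB laLoad.1 (1-minute load average) is exposed by the agent.
LA_LOAD_1MIN = ".1.3.6.1.4.1.2021.10.1.3.1"


def parse_load(raw):
    """Turn value-only snmpget output (e.g. '0.15' or '"0.15"') into a float."""
    return float(raw.strip().strip('"'))


def classify(load, warn, crit):
    """Map a load value to a Nagios exit code and label (0 OK, 1 WARNING, 2 CRITICAL)."""
    if load >= crit:
        return 2, "CRITICAL"
    if load >= warn:
        return 1, "WARNING"
    return 0, "OK"


def check_load(host, community="public", warn=4.0, crit=8.0):
    """Poll one node with snmpget and return (exit_code, message).

    Thresholds and community string are hypothetical defaults.
    Any failure to run or parse the query is reported as UNKNOWN (3).
    """
    cmd = ["snmpget", "-v2c", "-c", community, "-Oqv", host, LA_LOAD_1MIN]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=10, check=True
        )
        load = parse_load(result.stdout)
    except (OSError, subprocess.SubprocessError, ValueError) as exc:
        return 3, f"UNKNOWN - {exc}"
    code, label = classify(load, warn, crit)
    return code, f"{label} - load1={load:.2f}"
```

Something like `check_load("node01")` (hypothetical host name) would then return an exit code and a one-line message for Nagios to consume. Disk usage and a pbs_mom process check would presumably follow the same pattern with different OIDs.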
I have never used SNMP before (aside from noticing that my Dell switches have an option to export stats via SNMP, which I never enabled!). What do the wise Beowulf sysadmins have to say? Any caveats?

I checked with "/etc/init.d/snmpd status", which reports "/etc/init.d/snmpd: Command not found." So I guess I first need to install "net-snmp".

My compute-nodes are already behind a firewall, so I guess running this additional service on them should not be a security issue. Perhaps performance takes a tiny hit, but I doubt it!

Does SNMP make for a sound monitoring philosophy, and are others using it on their clusters?

--
Rahul

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf