Hi, I'm building a monitoring system for Hadoop and want to set up default alerts (threshold or anomaly) on the 2-3 key metrics that everyone who uses Hadoop would typically want to alert on. I don't yet have production-grade experience with Hadoop, though.
The alert rules have to be generally useful, so they can't be based on metrics whose values vary wildly with the size of the deployment. In other words, which metrics are the most significant indicators that something has gone wrong with a Hadoop cluster? Thanks very much, Mark
