dlmarion commented on PR #5012: URL: https://github.com/apache/accumulo/pull/5012#issuecomment-2436303567
> I think the fundamental difference in our perspective on this is that you seem to see the monitor as a way to get some metrics information about the system.

Not really, I think that the Monitor can display state from the metrics that we collect.

> And, I see it as a way to get an overview of the system's deployed state and current activity. Some of the deployed state and current activity can be viewed as metrics, but some of it simply can't be. There isn't a metric for instanceName or instanceId, for example, or list of tables, or list of resource groups, or list of zookeepers the system is configured with, or whether a table is currently "offline" or "disabled". These kinds of things are status things that don't make sense as metrics, but are instead retrieved from the public API. So, I absolutely think that the monitor does have a need of getting data that can't be retrieved or calculated by just getting the metrics via some MeterRegistry.

The MetricsResponse object that is returned from the Metrics Thrift API contains a small amount of information that is not metrics (server type, address, resource group, and timestamp) in addition to a list of Metrics. I think we should try to convey as much of that information as possible as metrics, and then add the remaining information to the MetricsResponse as non-metrics. Certainly the Manager could emit a metric for the number of tablets per table, from which we could create a list of tables. The nice thing about capturing most things as metrics is that we don't need two different code paths to capture the information about the system. If the monitor can run mostly off the metrics, then the monitor is optional as a user could decide to create their own monitor using some enterprise level monitoring system.
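As a rough illustration of the "list of tables from a tablets-per-table metric" idea, here is a minimal Python sketch. It is not Accumulo code: the `Metric` and `MetricsResponse` classes only mirror the fields named above (server type, address, resource group, timestamp, list of metrics), and the metric name `hypothetical.manager.tablets.per.table` and its `table` tag are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    # Hypothetical shape: a name, a set of tags, and a numeric value.
    name: str
    tags: dict
    value: float


@dataclass
class MetricsResponse:
    # Non-metric fields described in the comment above, plus the metrics list.
    # Field names are illustrative, not the actual Thrift definition.
    server_type: str
    address: str
    resource_group: str
    timestamp_ms: int
    metrics: list = field(default_factory=list)


# Assumed metric name; no such metric is confirmed to exist.
TABLETS_PER_TABLE = "hypothetical.manager.tablets.per.table"


def tables_from_metrics(responses):
    """Collect the distinct table names seen in tablets-per-table metrics."""
    tables = set()
    for resp in responses:
        for metric in resp.metrics:
            if metric.name == TABLETS_PER_TABLE and "table" in metric.tags:
                tables.add(metric.tags["table"])
    return sorted(tables)


if __name__ == "__main__":
    manager = MetricsResponse(
        server_type="MANAGER",
        address="host1:9999",
        resource_group="default",
        timestamp_ms=0,
        metrics=[
            Metric(TABLETS_PER_TABLE, {"table": "t2"}, 4),
            Metric(TABLETS_PER_TABLE, {"table": "t1"}, 12),
            Metric("some.other.metric", {}, 1),
        ],
    )
    print(tables_from_metrics([manager]))
```

Under these assumptions the table list falls out of the same code path that collects every other metric; anything that truly cannot be expressed this way would travel as an extra non-metric field on the response.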
> To the extent that the monitor is showing stuff that is specifically metrics, or can be metrics, I think the monitor should do very little of that, and instead rely on the user's use of a dedicated metrics collection/aggregation/alerting/visualization thing. If that's the kind of thing you're trying to prototype here, I think that's fine, but perhaps it shouldn't have anything to do with the monitor at all? Or, to the extent that it is, it's a very specific page on the monitor, separate from the non-metrics stuff. So, I think we should discuss what the actual goal is here, because it currently feels like "metrics" is being used as a hammer to hit every nail, when a lot of the monitor isn't actually a nail... some of it is, and for that stuff, it's probably useful to prototype that like you're doing in this PR... but not all of it is.

My thought for the new monitor is based upon the image I put in https://github.com/apache/accumulo/issues/4973#issuecomment-2417690084. It's information about the system as a whole, most of which can be gleaned from the metrics. I realize that not everything is a metric, but if your idea is that the Monitor should show non-metrics, what is it showing? A description would be useful, I think, as there are certainly a lot of numbers being shown on the current Monitor pages, most of which are captured as metrics in 4.0.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
