Timothy Potter created SOLR-15059:
-------------------------------------

Summary: Default Grafana dashboard needs to expose graphs for monitoring query performance
Key: SOLR-15059
URL: https://issues.apache.org/jira/browse/SOLR-15059
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: Grafana Dashboard, metrics
Reporter: Timothy Potter
Assignee: Timothy Potter

The default Grafana dashboard doesn't expose graphs for monitoring query performance. For instance, QPS for a collection is not shown in the default dashboard, and neither are quantiles such as p95 query latency. After some digging, I found that these metrics are available in the output of {{/admin/metrics}} but are not exported by the exporter.

This PR proposes to enhance the default dashboard with a new Query Metrics section containing the following metrics:

* Distributed QPS per Collection (aggregated across all cores)
* Distributed QPS per Solr Node (aggregated across all base_url)
* QPS 1-min rate per core
* QPS 5-min rate per core
* Top-level Query latency p99, p95, p75
* Local (non-distrib) query count per core (this is important for determining if there is unbalanced load)
* Local (non-distrib) query rate per core (1-min)
* Local (non-distrib) p95 per core

Also, {{solr-exporter-config.xml}} uses {{jq}} queries to pull metrics from the output of {{/admin/metrics}}. This file is huge and contains a lot of {{jq}} boilerplate. Moreover, this PR introduces another 15-20 metrics, which would make the file even more verbose. Thus, I'm also introducing support for jq templates to reduce boilerplate, reduce syntax errors, and improve readability.

For instance, the query metrics I'm adding to the config look like this, rather than duplicating the complicated {{jq}} query for each metric:

{code}
<str>
  $jq:core-query(1minRate, endswith(".distrib.requestTimes"))
</str>
<str>
  $jq:core-query(5minRate, endswith(".distrib.requestTimes"))
</str>
{code}

The templates are optional and should only be used if a given jq structure is repeated 3 or more times. Otherwise, inlining the jq query is still supported.
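
To illustrate the idea behind jq templates, here is a minimal Python sketch of how a {{$jq:name(args)}} reference might be expanded into a full jq query. This is not the exporter's implementation: the template body, the {{<METRIC>}} and {{<SELECTOR>}} placeholders, and the function name {{expand_jq_template}} are all assumptions made for illustration. Only the template name {{core-query}} and its two arguments come from the description above.

```python
import re

# ASSUMPTION: this template body and the <METRIC>/<SELECTOR> placeholders are
# illustrative only; just the name "core-query" appears in the issue text.
TEMPLATES = {
    "core-query": (
        '.metrics | to_entries | .[] '
        '| select(.key | startswith("solr.core.")) as $core '
        '| $core.value | to_entries | .[] '
        '| select(.key | <SELECTOR>) '
        '| {core: $core.key, value: .value["<METRIC>"]}'
    ),
}

# Matches a whole-string reference such as: $jq:core-query(arg1, arg2)
_REF = re.compile(r'^\s*\$jq:([\w-]+)\((.*)\)\s*$', re.DOTALL)


def expand_jq_template(text):
    """Expand a $jq:name(metric, selector) reference into a jq query.

    Text that is not a template reference is returned unchanged, which
    mirrors the statement that inlined jq queries remain supported.
    """
    match = _REF.match(text)
    if match is None:
        return text
    name, args = match.group(1), match.group(2)
    # Split on the first comma only, so commas inside the selector survive.
    # A reference with fewer than two arguments raises ValueError here.
    metric, selector = (part.strip() for part in args.split(",", 1))
    template = TEMPLATES[name]  # KeyError for an unknown template name
    return template.replace("<METRIC>", metric).replace("<SELECTOR>", selector)


if __name__ == "__main__":
    print(expand_jq_template(
        '$jq:core-query(1minRate, endswith(".distrib.requestTimes"))'
    ))
```

With this approach, each additional metric costs one short line in the config instead of a full copy of the query, and a fix to the shared query structure only has to be made in one place.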