size-on-disk of cores, size of tlogs, DIH stats over time, last modified date of cores
The most important alert-type things are: collections in recovery or down
state, SolrCloud election events, and various error rates.

It's also important to be able to tie these back to aliases so you are only
monitoring cores you care about, even if their backing collection name
changes every so often.

On Tue, Apr 28, 2020 at 7:57 AM Radu Gheorghe <radu.gheor...@sematext.com> wrote:
>
> Hi fellow Solr users,
>
> I'm looking into improving our Solr monitoring
> <https://sematext.com/docs/integration/solr/> and I was curious on which
> metrics you consider relevant.
>
> From what we currently have, I'm only really missing fieldCache. Which we
> collect, but not show in the UI yet (unless you add a custom chart - we'll
> add it to default soon).
>
> You can click on a demo account <https://apps.sematext.com/demo> (there's a
> Solr app there called PH.Prod.Solr7) to see what we already collect, but
> I'll write it here in short:
> - query rate and latency (you can group per handler, per core, per
> collection if it's SolrCloud)
> - index size (number of segments, files...)
> - indexing: added/deleted docs, commits
> - caches (size, hit ratio, warmup...)
> - OS- and JVM-level metrics (from CPU iowait to GC latency and everything
> in between)
>
> Anything that we should add?
>
> I went through the Metrics API output, and the only significant thing I can
> think of is the transaction log. But to be honest I never checked those
> metrics in practice.
>
> Or maybe there's something outside the Metrics API that would be useful? I
> thought about the breakdown of shards that are up/down/recovering... as
> well as replica types. We plan on adding those, but there's a challenge in
> de-duplicating metrics. Because one would install one agent per node, and
> I'm not aware of a way to show only local shards in the Collections API ->
> CLUSTERSTATUS.
>
> Thanks in advance for any feedback that you may have!
> Radu
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
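
On the alias and de-duplication points, here is a rough sketch of one way an agent could do both on the client side. It assumes the agent has already fetched CLUSTERSTATUS and holds the "cluster" object of the response as a Python dict. The field names used (aliases, collections, shards, replicas, node_name, state) are how I remember that response being laid out, so check them against your Solr version. The function names, node names and sample data are made up for illustration.

```python
from collections import Counter


def collections_for_aliases(cluster, aliases):
    """Resolve alias names to the collections currently backing them."""
    alias_map = cluster.get("aliases", {})
    wanted = set()
    for alias in aliases:
        # Assumption: an alias value is a comma-separated list of collections.
        for name in alias_map.get(alias, "").split(","):
            if name.strip():
                wanted.add(name.strip())
    return wanted


def local_replica_states(cluster, node_name, collections=None):
    """Count replicas hosted on node_name, keyed by (collection, state).

    Each agent reports only the replicas on its own node, so summing the
    counts from all agents does not count any replica twice.
    """
    counts = Counter()
    for coll_name, coll in cluster.get("collections", {}).items():
        if collections is not None and coll_name not in collections:
            continue
        for shard in coll.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                if replica.get("node_name") == node_name:
                    counts[(coll_name, replica.get("state"))] += 1
    return counts


# Hypothetical sample shaped like the "cluster" part of CLUSTERSTATUS.
sample_cluster = {
    "aliases": {"products": "products_v2"},
    "collections": {
        "products_v1": {"shards": {"shard1": {"replicas": {
            "core_node1": {"node_name": "host1:8983_solr", "state": "down"},
        }}}},
        "products_v2": {"shards": {"shard1": {"replicas": {
            "core_node3": {"node_name": "host1:8983_solr", "state": "active"},
            "core_node4": {"node_name": "host2:8983_solr", "state": "recovering"},
        }}}},
    },
}

monitored = collections_for_aliases(sample_cluster, ["products"])
host1_counts = local_replica_states(sample_cluster, "host1:8983_solr", monitored)
host2_counts = local_replica_states(sample_cluster, "host2:8983_solr", monitored)
```

With this, an alert on "recovering" or "down" only fires for collections that an alias currently points at, and the retired products_v1 collection is ignored. This filters after the fact; it is not a server-side option for CLUSTERSTATUS.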