We keep all essential user documentation (and some dev docs) in the Ref Guide.
The source for the Ref Guide is checked in under solr/solr-ref-guide. It uses a simple ASCII markup (AsciiDoc), so adding some content should be easy. You should follow the same workflow as with the code (create a JIRA, and then either add a patch or create a PR).

> On 15 Oct 2019, at 17:33, Richard Goodman <richa...@brandwatch.com> wrote:
>
> Many thanks both for your responses, they've been helpful.
>
> @Andrzej - Sorry I wasn't clear on the "A latency of 1mil" as I wasn't
> aware the image wouldn't come through. But following your bullet points
> helped me present a better unit for measurement in the axis.
>
> In regards to contributing, would absolutely love to help there, just not
> sure what the correct direction is? I wasn't sure if the web page source
> code / contributions are in the apache-lucene repository?
>
> Thanks,
>
>
> On Tue, 8 Oct 2019 at 11:04, Andrzej Białecki <a...@getopt.org> wrote:
>
>> Hi,
>>
>> Starting with Solr 7.0 all JMX metrics are actually internally driven by
>> the metrics API - JMX (or Prometheus) is just a way of exposing them.
>>
>> I agree that we need more documentation on metrics - contributions are
>> welcome :)
>>
>> Regarding your specific examples (btw. our mailing lists aggressively
>> strip all attachments - your graphs didn't make it):
>>
>> * time units in time-based counters are in nanoseconds. This is just a
>>   unit of value, not necessarily precision. In this specific example
>>   `ADMIN./admin/collections.totalTime` (and similarly named metrics for
>>   all other request handlers) represents the total elapsed time spent
>>   processing requests.
>> * time-based histograms are expressed in milliseconds, where it is
>>   indicated by the "_ms" suffix.
>> * 1-, 5- and 15-min rates represent an exponentially weighted moving
>>   average over that time window, expressed in events/second.
>> * handlerStart is initialised with System.currentTimeMillis() when this
>>   instance of request handler is first created.
>> * details on GC, memory buffer pools, and similar JVM metrics are
>>   documented in JDK documentation on Management Beans. For example:
>>   https://docs.oracle.com/javase/7/docs/api/java/lang/management/GarbageCollectorMXBean.html?is-external=true
>> * "A latency of 1mil" - no idea what that is, I don't think Solr API uses
>>   this abbreviation anywhere.
>>
>> Hope this helps.
>>
>> --
>>
>> Andrzej Białecki
>>
>>> On 7 Oct 2019, at 13:41, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
>>>
>>> Hi Richard,
>>> We do not use the API to collect metrics but JMX, but I believe that
>>> those are the same (did not verify it in code). You can see how we
>>> handled those metrics into reports/charts or even use our agent to send
>>> data to Prometheus:
>>> https://github.com/sematext/sematext-agent-integrations/tree/master/solr
>>>
>>> You can also see some links to Solr metric related blog posts in this
>>> repo. If you find out that managing your own monitoring stack is
>>> overwhelming, you can try our Solr integration.
>>>
>>> HTH,
>>> Emir
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>> On 7 Oct 2019, at 12:40, Richard Goodman <richa...@brandwatch.com> wrote:
>>>>
>>>> Hi there,
>>>>
>>>> I'm currently working on using the prometheus exporter to provide some
>>>> detailed insights for our Solr Cloud clusters.
>>>>
>>>> Using the provided template killed our prometheus server, as well as
>>>> the exporter due to the size of our clusters (each cluster is around 96
>>>> nodes, ~300 collections with 3way replication and 16 shards), so you
>>>> can imagine the amount of data that comes through /admin/metrics and
>>>> not filtering it down first.
>>>>
>>>> I've begun working on writing my own template to reduce the amount of
>>>> data being requested and it's working fine, and I'm starting to build
>>>> some nice graphs in Grafana.
>>>>
>>>> The only difficulty I'm having with this, is I'm struggling to find
>>>> decent documentation on the metrics themselves. I was using the
>>>> resources metrics reporting - metrics-api
>>>> <https://lucene.apache.org/solr/guide/7_7/metrics-reporting.html#metrics-api>
>>>> and monitoring solr with prometheus and grafana
>>>> <https://lucene.apache.org/solr/guide/7_7/monitoring-solr-with-prometheus-and-grafana.html>
>>>> but there is a lack of information on most metrics.
>>>>
>>>> For example:
>>>> "ADMIN./admin/collections.totalTime":6715327903,
>>>> I understand this is a counter, however, I'm not sure what unit this
>>>> would be represented when displaying it, for example:
>>>>
>>>> A latency of 1mil, not sure if this means milliseconds, million, etc.,
>>>> Another example would be the GC metrics:
>>>> "gc.ConcurrentMarkSweep.count":7,
>>>> "gc.ConcurrentMarkSweep.time":1247,
>>>> "gc.ParNew.count":16759,
>>>> "gc.ParNew.time":884173,
>>>> Which when displayed, doesn't give the clearest insight as to what the
>>>> unit is:
>>>>
>>>> If anyone has any advice / guidance, that would be greatly appreciated.
>>>> If there isn't documentation for the API, then this would also be
>>>> something I'll look into help contributing with too.
>>>>
>>>> Thanks,
>>>> --
>>>> Richard Goodman
>
> --
>
> Richard Goodman | Data Infrastructure engineer
>
> richa...@brandwatch.com
>
>
> NEW YORK | BOSTON | BRIGHTON | LONDON | BERLIN | STUTTGART |
> PARIS | SINGAPORE | SYDNEY
>
> <https://www.brandwatch.com/blog/digital-consumer-intelligence/>
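
P.S. For anyone charting the sample values from the quoted thread, here is a rough sketch of the unit conversions described there. It is only an illustration: the helper names are made up for this example, and it assumes the gc.*.time values are cumulative milliseconds (which is what the GarbageCollectorMXBean docs describe for collection time), so please double-check against your own setup.

```python
# Sketch only: helper names are hypothetical, not part of Solr or the exporter.
NANOS_PER_SECOND = 1_000_000_000


def nanos_to_seconds(nanos: int) -> float:
    """Convert a time-based counter (reported in nanoseconds) to seconds."""
    return nanos / NANOS_PER_SECOND


def mean_gc_pause_ms(total_time_ms: int, count: int) -> float:
    """Average time per collection, assuming total_time_ms is cumulative ms."""
    return total_time_ms / count if count else 0.0


# Values copied from the /admin/metrics sample in the quoted thread.
total_time_ns = 6715327903   # ADMIN./admin/collections.totalTime
par_new_time_ms = 884173     # gc.ParNew.time (assumed milliseconds)
par_new_count = 16759        # gc.ParNew.count

print(f"collections totalTime: {nanos_to_seconds(total_time_ns):.3f} s")
print(f"ParNew mean pause: {mean_gc_pause_ms(par_new_time_ms, par_new_count):.2f} ms")
```

In Grafana the same effect can usually be had by setting the axis unit on the panel rather than converting the values yourself.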