Hi all,

I've been trying to get a metric trigger set up in SolrCloud 8.4.1, but it's not working, and I was hoping for some help.
I've created a metric trigger using this:

```
POST /solr/admin/autoscaling
{
  "set-trigger": {
    "name": "metric_trigger",
    "event": "metric",
    "waitFor": "10s",
    "metric": "metrics:solr.jvm:os.systemCpuLoad",
    "above": 0.7,
    "preferredOperation": "MOVEREPLICA",
    "enabled": true
  }
}
```

I get a successful response, and I can see the new trigger in `files -> tree -> autoscaling.json`. However, I don't see any difference in the logs (I had the autoscaling logging set to debug), and it's definitely not moving any replicas around under load, even though the node is consistently above 85% overall systemCpuLoad. (I can see this as well when I use the `/metrics` endpoint with the above key.)

I then restarted all the nodes and saw the error below on startup. It says the trigger state could not be set during a restore, and the worrying part is that it is discarding the trigger event.

I'd really like some help with this. We've been seeing that, out of the 3 nodes, there's always one, seemingly at random, that is massively utilised on CPU (maxed out 8 cores, and it's not always the one with the overseer), so we were hoping that we could let the metric trigger sort it out in the short term.

```
2020-10-22 23:03:19.905 ERROR (ScheduledTrigger-7-thread-3) [ ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state jvm_cpu_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-22 23:03:19.912 ERROR (ScheduledTrigger-7-thread-1) [ ] o.a.s.c.a.ScheduledTriggers Failed to re-play event, discarding: {
  "id":"dd2ebf3d56bTboddkoovyjxdvy1hauq2zskpt",
  "source":"metric_trigger",
  "eventTime":15199552918891,
  "eventType":"METRIC",
  "properties":{
    "node":{"mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr":0.7322834645669292},
    "_dequeue_time_":261690991035,
    "metric":"metrics:solr.jvm:os.systemCpuLoad",
    "preferredOperation":"MOVEREPLICA",
    "_enqueue_time_":15479182216601,
    "requestedOps":[{
        "action":"MOVEREPLICA",
        "hints":{"SRC_NODE":["mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr"]}}],
    "replaying":true}}
2020-10-22 23:03:19.913 INFO (OverseerStateUpdate-144115201265369088-mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr-n_0000000199) [ ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"addreplica",
  "collection":"mycoll-2",
  "shard":"shard5",
  "core":"mycoll-2_shard5_replica_n122",
  "state":"down",
  "base_url":"http://mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983/solr",
  "node_name":"mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr",
  "type":"NRT"}
2020-10-22 23:03:19.921 ERROR (ScheduledTrigger-7-thread-1) [ ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state metric_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
```

Any help, please?

Thank you,
Jonathan
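
P.S. In case it helps anyone reproduce the `/metrics` check mentioned above, here is a minimal Python sketch of the comparison against the trigger's `above` value. It assumes the metrics endpoint returns the value under a top-level `metrics` object keyed by `solr.jvm:os.systemCpuLoad`; that response shape, the base URL, and the helper names `metrics_url` and `exceeds_threshold` are illustrative assumptions, not something taken from the cluster or from Solr's code.

```python
import json
from urllib.parse import urlencode

# Metric key from the trigger definition, without the "metrics:" prefix.
METRIC_KEY = "solr.jvm:os.systemCpuLoad"


def metrics_url(base_url, key=METRIC_KEY):
    """Build a metrics request URL for one key (base_url is a placeholder)."""
    query = urlencode({"key": key, "wt": "json"})
    return f"{base_url}/admin/metrics?{query}"


def exceeds_threshold(response_body, key=METRIC_KEY, above=0.7):
    """Return True if the metric value in the (assumed) JSON body is above the threshold."""
    value = json.loads(response_body)["metrics"][key]
    return float(value) > above


if __name__ == "__main__":
    # Sample body using the value reported in the discarded event in the log.
    sample = '{"metrics": {"solr.jvm:os.systemCpuLoad": 0.7322834645669292}}'
    print(metrics_url("http://localhost:8983/solr"))
    print(exceeds_threshold(sample))
```

Fetching the URL from each node and passing the body to `exceeds_threshold` gives a quick way to confirm by hand whether a node is over the configured 0.7 threshold, independently of whether the trigger itself fires.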