sigram commented on a change in pull request #2040: URL: https://github.com/apache/lucene-solr/pull/2040#discussion_r513469553
########## File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java ########## @@ -173,6 +175,9 @@ private final Stats zkStats; + private final SolrMetricManager metricManager = new SolrMetricManager(); Review comment: This is wrong - you should never create instances of SolrMetricManager in your components. Always use the single instance constructed in `CoreContainer.getMetricManager()`. ########## File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java ########## @@ -185,6 +190,9 @@ public ClusterStateUpdater(final ZkStateReader reader, final String myId, Stats this.completedMap = getCompletedMap(zkClient); this.myId = myId; this.reader = reader; + + registryName = SolrMetricManager.getRegistryName(SolrInfoBean.Group.overseer); + metricManager.registerGauge(null, registryName, () -> stateUpdateQueue.getZkStats().getQueueLength(), "overseer", true, "stateUpdateQueueSize", "queue"); Review comment: You should create a `SolrMetricsContext` in Overseer, and create a child context here, and then pass it as the first argument to `registerGauge`. This helps to properly unregister old gauges from the registry (because a new instance of `ClusterStateUpdater` is created each time this Overseer instance becomes the leader, and then it registers a new instance of gauge with the same name). Using a static string "overseer" defeats this mechanism, because it falsely indicates that the life-cycle of `ClusterStateUpdater` is the same as its parent object Overseer. Then in the `ClusterStateUpdater.close()` method you should unregister this gauge using the value of `solrMetricsContext.getTag()`. This prevents obscure reference leaks in the metrics registry. ########## File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java ########## @@ -154,6 +158,9 @@ public OverseerTaskProcessor(ZkStateReader zkStateReader, String myId, this.runningTasks = ConcurrentHashMap.newKeySet(); this.completedTasks = new ConcurrentHashMap<>(); thisNode = Utils.getMDCNode(); + + registryName = SolrMetricManager.getRegistryName(SolrInfoBean.Group.overseer); + metricManager.registerGauge(null, registryName, () -> workQueue.getZkStats().getQueueLength(), "overseer", true, "collectionWorkQueueSize", "queue"); Review comment: See above comments - you should create a child context here (from the one created in Overseer) and use it in this call, and then unregister this gauge using this context's tag value. ########## File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java ########## @@ -423,13 +429,13 @@ public boolean hasRegistry(String name) { /** * Check for predefined shared registry names. This compares the input name * with normalized names of predefined shared registries - - * {@link #JVM_REGISTRY} and {@link #JETTY_REGISTRY}. + * {@link #JVM_REGISTRY}, {@link #JETTY_REGISTRY}, and {@link #OVERSEER_REGISTRY} Review comment: This is wrong - Overseer registry CANNOT be a shared registry because it's possible to have multiple Overseers in a single JVM, especially in unit tests. This should be a regular per-CoreContainer registry, which is created within the scope of the instance available from CC.getMetricManager(). ########## File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java ########## @@ -423,13 +429,13 @@ public boolean hasRegistry(String name) { /** * Check for predefined shared registry names. This compares the input name * with normalized names of predefined shared registries - - * {@link #JVM_REGISTRY} and {@link #JETTY_REGISTRY}. + * {@link #JVM_REGISTRY}, {@link #JETTY_REGISTRY}, and {@link #OVERSEER_REGISTRY} * * @param registry already normalized name * @return true if the name matches one of shared registries */ private static boolean isSharedRegistry(String registry) { - return JETTY_REGISTRY.equals(registry) || JVM_REGISTRY.equals(registry); + return JETTY_REGISTRY.equals(registry) || JVM_REGISTRY.equals(registry) || OVERSEER_REGISTRY.equals(registry); Review comment: See above - this can't be a shared registry. ########## File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java ########## @@ -544,6 +552,7 @@ private LeaderStatus amILeader() { @Override public void close() { this.isClosed = true; + metricManager.unregisterGauges(registryName, "overseer"); Review comment: Please use the context tag value here. ########## File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java ########## @@ -386,6 +393,7 @@ private void cleanUpWorkQueue() throws KeeperException, InterruptedException { public void close() { isClosed = true; + metricManager.unregisterGauges(registryName, "overseer"); Review comment: See above - use the tag value from the context. ########## File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java ########## @@ -94,6 +96,8 @@ private boolean isClosed; private volatile Stats stats; + private final SolrMetricManager metricManager = new SolrMetricManager(); Review comment: See above - this should never be done. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org