sigram commented on a change in pull request #2040:
URL: https://github.com/apache/lucene-solr/pull/2040#discussion_r513469553



##########
File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java
##########
@@ -173,6 +175,9 @@
 
     private final Stats zkStats;
 
+    private final SolrMetricManager metricManager = new SolrMetricManager();

Review comment:
       This is wrong - you should never create instances of SolrMetricManager 
in your components. Always use the single instance constructed in 
`CoreContainer.getMetricManager()`.

##########
File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java
##########
@@ -185,6 +190,9 @@ public ClusterStateUpdater(final ZkStateReader reader, 
final String myId, Stats
       this.completedMap = getCompletedMap(zkClient);
       this.myId = myId;
       this.reader = reader;
+
+      registryName = 
SolrMetricManager.getRegistryName(SolrInfoBean.Group.overseer);
+      metricManager.registerGauge(null, registryName, () -> 
stateUpdateQueue.getZkStats().getQueueLength(), "overseer", true, 
"stateUpdateQueueSize", "queue");

Review comment:
       You should create a `SolrMetricsContext` in Overseer, and create a child 
context here, and then pass it as the first argument to `registerGauge`. This 
helps to properly unregister old gauges from the registry (because a new 
instance of `ClusterStateUpdater` is created each time this Overseer instance 
becomes the leader, and then it registers a new instance of gauge with the same 
name). Using a static string "overseer" defeats this mechanism, because it 
falsely indicates that the life-cycle of `ClusterStateUpdater` is the same as 
its parent object Overseer.
   
   Then in the `ClusterStateUpdater.close()` method you should unregister this 
gauge using the value of `solrMetricsContext.getTag()`. This prevents obscure 
reference leaks in the metrics registry.

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -154,6 +158,9 @@ public OverseerTaskProcessor(ZkStateReader zkStateReader, 
String myId,
     this.runningTasks = ConcurrentHashMap.newKeySet();
     this.completedTasks = new ConcurrentHashMap<>();
     thisNode = Utils.getMDCNode();
+
+    registryName = 
SolrMetricManager.getRegistryName(SolrInfoBean.Group.overseer);
+    metricManager.registerGauge(null, registryName, () -> 
workQueue.getZkStats().getQueueLength(), "overseer", true, 
"collectionWorkQueueSize", "queue");

Review comment:
       See above comments - you should create a child context here (from the 
one created in Overseer) and use it in this call, and then unregister this 
gauge using this context's tag value.

##########
File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java
##########
@@ -423,13 +429,13 @@ public boolean hasRegistry(String name) {
   /**
    * Check for predefined shared registry names. This compares the input name
    * with normalized names of predefined shared registries -
-   * {@link #JVM_REGISTRY} and {@link #JETTY_REGISTRY}.
+   * {@link #JVM_REGISTRY}, {@link #JETTY_REGISTRY}, and {@link 
#OVERSEER_REGISTRY}

Review comment:
       This is wrong - Overseer registry CANNOT be a shared registry because 
it's possible to have multiple Overseers in a single JVM, especially in unit 
tests. This should be a regular per-CoreContainer registry, which is created 
within the scope of the instance available from CC.getMetricManager().

##########
File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java
##########
@@ -423,13 +429,13 @@ public boolean hasRegistry(String name) {
   /**
    * Check for predefined shared registry names. This compares the input name
    * with normalized names of predefined shared registries -
-   * {@link #JVM_REGISTRY} and {@link #JETTY_REGISTRY}.
+   * {@link #JVM_REGISTRY}, {@link #JETTY_REGISTRY}, and {@link 
#OVERSEER_REGISTRY}
    *
    * @param registry already normalized name
    * @return true if the name matches one of shared registries
    */
   private static boolean isSharedRegistry(String registry) {
-    return JETTY_REGISTRY.equals(registry) || JVM_REGISTRY.equals(registry);
+    return JETTY_REGISTRY.equals(registry) || JVM_REGISTRY.equals(registry) || 
OVERSEER_REGISTRY.equals(registry);

Review comment:
       See above - this can't be a shared registry.

##########
File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java
##########
@@ -544,6 +552,7 @@ private LeaderStatus amILeader() {
     @Override
       public void close() {
         this.isClosed = true;
+        metricManager.unregisterGauges(registryName, "overseer");

Review comment:
       Please use the context tag value here.

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -386,6 +393,7 @@ private void cleanUpWorkQueue() throws KeeperException, 
InterruptedException {
 
   public void close() {
     isClosed = true;
+    metricManager.unregisterGauges(registryName, "overseer");

Review comment:
       See above - use the tag value from the context.

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -94,6 +96,8 @@
   private boolean isClosed;
 
   private volatile Stats stats;
+  private final SolrMetricManager metricManager = new SolrMetricManager();

Review comment:
       See above - this should never be done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to