jackjlli commented on a change in pull request #7122:
URL: https://github.com/apache/incubator-pinot/pull/7122#discussion_r663542258



##########
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
##########
@@ -461,6 +462,34 @@ public synchronized PinotResourceManagerResponse 
updateInstanceTags(String insta
     return PinotResourceManagerResponse.SUCCESS;
   }
 
+  /**
+   * Validates whether an instance is offline.
+   * Since ZNodes under "/LIVEINSTANCES" are ephemeral, if there is a ZK 
session expire (e.g. due to network issue),
+   * the ZNode under "/LIVEINSTANCES" will be deleted. Thus, such race 
condition can happen when this task is running.
+   * In order to double confirm the live status of an instance, the field 
"LAST_OFFLINE_TIME" in ZNode under
+   * "/INSTANCES/<instance_id>/HISTORY" needs to be checked. If the value is 
"-1", that means the instance is ONLINE;
+   * if the value is a timestamp, that means the instance starts to be OFFLINE 
since that time.
+   * @param instanceId instance id
+   * @param offlineTimeRangeMs the time range that it's valid for an instance 
to be offline
+   */
+  public boolean isInstanceOffline(String instanceId, long offlineTimeRangeMs) 
{

Review comment:
       I think it's okay to keep the method name as it is, since it's actually 
checking whether the instance is truly offline. And this method can be reused 
for other purposes in the future. Dropping instance is only one of the use 
cases.

##########
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/ControllerConf.java
##########
@@ -146,6 +149,7 @@ private static long getRandomInitialDelayInSeconds() {
     private static final int 
DEFAULT_STATUS_CONTROLLER_WAIT_FOR_PUSH_TIME_IN_SECONDS = 10 * 60; // 10 minutes
     private static final int DEFAULT_TASK_MANAGER_FREQUENCY_IN_SECONDS = -1; 
// Disabled
     private static final int 
DEFAULT_MINION_INSTANCES_CLEANUP_TASK_FREQUENCY_IN_SECONDS = 60 * 60; // 1 Hour.
+    private static final int 
DEFAULT_MINION_INSTANCES_CLEANUP_TASK_MAX_OFFLINE_TIME_RANGE_IN_SECONDS = 10 * 
60; // 10 minutes

Review comment:
       I think if we want to make this config human readable, we should do the 
same for all the other configs of periodic task as well, which are already 
using time in seconds format, since they are all under 
`ControllerPeriodicTasksConf` class. So if we want to clean this up, we should 
do it all together. For now using the same format should be sufficient.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to