shounakmk219 commented on code in PR #16886:
URL: https://github.com/apache/pinot/pull/16886#discussion_r2382602659


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTenantRestletResource.java:
##########
@@ -701,6 +705,46 @@ public SuccessResponse deleteTenant(
         Response.Status.INTERNAL_SERVER_ERROR);
   }
 
+  @DELETE
+  @Produces(MediaType.APPLICATION_JSON)
+  @Authenticate(AccessType.DELETE)
+  @Authorize(targetType = TargetType.CLUSTER, action = Actions.Cluster.REBALANCE_TENANT_TABLES)
+  @Path("/tenants/rebalance/{jobId}")
+  @ApiOperation(value = "Cancels a running tenant rebalance job")
+  @ApiResponses(value = {
+      @ApiResponse(code = 200, message = "Success", response = SuccessResponse.class),
+      @ApiResponse(code = 404, message = "Tenant rebalance job not found"),
+      @ApiResponse(code = 500, message = "Internal server error during cancelling the rebalance job")
+  })
+  public SuccessResponse cancelRebalance(
+      @ApiParam(value = "Tenant rebalance job id", required = true) @PathParam("jobId") String jobId) {
+    Map<String, String> jobMetadata =
+        _pinotHelixResourceManager.getControllerJobZKMetadata(jobId, ControllerJobTypes.TENANT_REBALANCE);
+    if (jobMetadata == null) {
+      throw new ControllerApplicationException(LOGGER, "Failed to cancel tenant rebalance job: " + jobId,
+          Response.Status.NOT_FOUND);
+    }
+    try {
+      TenantRebalanceContext originalContext = TenantRebalanceContext.fromTenantRebalanceJobMetadata(jobMetadata);
+      TenantRebalanceProgressStats progressStats =
+          JsonUtils.stringToObject(jobMetadata.get(RebalanceJobConstants.JOB_METADATA_KEY_REBALANCE_PROGRESS_STATS),
+              TenantRebalanceProgressStats.class);
+      originalContext.getParallelQueue().clear();
+      originalContext.getSequentialQueue().clear();
+      TenantRebalancer.TenantTableRebalanceJobContext ctx;
+      while ((ctx = originalContext.getOngoingJobsQueue().poll()) != null) {

Review Comment:
   What happens if a new table is picked up after we fetch the `originalContext`?
   If its rebalance is not cancelled here, will marking the tenant job as cancelled
   right away prevent the job from picking up any further tables?
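
   To make the concern concrete, here is a minimal, self-contained sketch of
   the kind of guard that would close the window: the worker re-checks a
   cancellation flag before every pick. All names (`CancelGuardSketch`,
   `pickUntilCancelled`, the `AtomicBoolean` flag) are hypothetical and not
   Pinot APIs. Whether the real tenant rebalancer re-reads the job status
   before polling its queues is exactly the open question, so this only
   illustrates the behaviour being asked about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelGuardSketch {

  /**
   * Hypothetical worker loop: picks tables from the pending queue, but checks
   * the cancellation flag before each pick. The cancel request is simulated
   * by setting the flag once cancelAfterPicks tables have been picked.
   */
  static List<String> pickUntilCancelled(Queue<String> pending, AtomicBoolean cancelled,
      int cancelAfterPicks) {
    List<String> picked = new ArrayList<>();
    while (!cancelled.get()) {
      String table = pending.poll();
      if (table == null) {
        break; // nothing left to pick
      }
      picked.add(table);
      if (picked.size() == cancelAfterPicks) {
        cancelled.set(true); // simulated cancel request arriving mid-run
      }
    }
    return picked;
  }
}
```

   With such a guard, tables still in the queue when the flag flips are never
   picked. Without it, a table polled between reading the context and writing
   the cancelled status could keep rebalancing.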



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/tenant/TenantRebalanceProgressStats.java:
##########
@@ -121,6 +131,30 @@ public Map<String, String> getTableRebalanceJobIdMap() {
   }
 
   public enum TableStatus {

Review Comment:
   I could not find any issues, but please make sure the updated values won't cause
   problems after an upgrade if there are ongoing tenant rebalance jobs started by
   older code.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTenantRestletResource.java:
##########
@@ -701,6 +705,46 @@ public SuccessResponse deleteTenant(
         Response.Status.INTERNAL_SERVER_ERROR);
   }
 
+  @DELETE
+  @Produces(MediaType.APPLICATION_JSON)
+  @Authenticate(AccessType.DELETE)
+  @Authorize(targetType = TargetType.CLUSTER, action = Actions.Cluster.REBALANCE_TENANT_TABLES)
+  @Path("/tenants/rebalance/{jobId}")
+  @ApiOperation(value = "Cancels a running tenant rebalance job")
+  @ApiResponses(value = {
+      @ApiResponse(code = 200, message = "Success", response = SuccessResponse.class),
+      @ApiResponse(code = 404, message = "Tenant rebalance job not found"),
+      @ApiResponse(code = 500, message = "Internal server error during cancelling the rebalance job")
+  })
+  public SuccessResponse cancelRebalance(
+      @ApiParam(value = "Tenant rebalance job id", required = true) @PathParam("jobId") String jobId) {
+    Map<String, String> jobMetadata =
+        _pinotHelixResourceManager.getControllerJobZKMetadata(jobId, ControllerJobTypes.TENANT_REBALANCE);
+    if (jobMetadata == null) {
+      throw new ControllerApplicationException(LOGGER, "Failed to cancel tenant rebalance job: " + jobId,
+          Response.Status.NOT_FOUND);
+    }
+    try {
+      TenantRebalanceContext originalContext = TenantRebalanceContext.fromTenantRebalanceJobMetadata(jobMetadata);
+      TenantRebalanceProgressStats progressStats =
+          JsonUtils.stringToObject(jobMetadata.get(RebalanceJobConstants.JOB_METADATA_KEY_REBALANCE_PROGRESS_STATS),
+              TenantRebalanceProgressStats.class);
+      originalContext.getParallelQueue().clear();
+      originalContext.getSequentialQueue().clear();
+      TenantRebalancer.TenantTableRebalanceJobContext ctx;
+      while ((ctx = originalContext.getOngoingJobsQueue().poll()) != null) {
+        TableRebalanceManager.cancelRebalance(ctx.getTableName(), _pinotHelixResourceManager,
+            RebalanceResult.Status.CANCELLED);
+      }
+      TenantRebalanceChecker.markTenantRebalanceJobAsCancelled(jobId, jobMetadata, originalContext, progressStats,
+          _pinotHelixResourceManager, true);
+      return new SuccessResponse("Successfully cancelled tenant rebalance job: " + jobId);

Review Comment:
   Does it make sense to also add some stats on how many table rebalance
   jobs were cancelled and how many were still in the queue?
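
   As a rough illustration of the suggestion, the counts could be captured
   while the queues are drained. The sketch below is self-contained and uses
   plain `String` queues. `CancelSummarySketch`, `CancelSummary` and
   `cancelAll` are hypothetical names, and the real code would call
   `TableRebalanceManager.cancelRebalance(...)` where the comment indicates.

```java
import java.util.Queue;

public class CancelSummarySketch {

  /** Hypothetical summary of what a cancel request did; not part of Pinot. */
  static final class CancelSummary {
    final int cancelledOngoing;
    final int droppedQueued;

    CancelSummary(int cancelledOngoing, int droppedQueued) {
      this.cancelledOngoing = cancelledOngoing;
      this.droppedQueued = droppedQueued;
    }

    @Override
    public String toString() {
      return cancelledOngoing + " ongoing table rebalance job(s) cancelled, "
          + droppedQueued + " queued table(s) dropped";
    }
  }

  /** Clears the queued tables and drains the ongoing queue, counting both. */
  static CancelSummary cancelAll(Queue<String> parallelQueue, Queue<String> sequentialQueue,
      Queue<String> ongoingQueue) {
    // Count queued tables before clearing, since clear() discards them.
    int droppedQueued = parallelQueue.size() + sequentialQueue.size();
    parallelQueue.clear();
    sequentialQueue.clear();

    int cancelledOngoing = 0;
    while (ongoingQueue.poll() != null) {
      // Real code would cancel the table rebalance job here.
      cancelledOngoing++;
    }
    return new CancelSummary(cancelledOngoing, droppedQueued);
  }
}
```

   The summary string could then be appended to the `SuccessResponse` message.
   Note that this counts cancel attempts. If the underlying cancel call reports
   which jobs were actually cancelled, that result would be the better number.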




