wirybeaver commented on code in PR #17521:
URL: https://github.com/apache/pinot/pull/17521#discussion_r2927627134


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java:
##########
@@ -372,6 +388,9 @@ public CopyTableResponse copyTable(
         response.setTableConfig(realtimeTableConfig);
         response.setWatermarkInductionResult(watermarkInductionResult);
       }
+      String jobID = UUID.randomUUID().toString();
+      _tableReplicator.replicateTable(jobID, realtimeTableConfig.getTableName(), copyTablePayload,

Review Comment:
   Yeah, the design doc mentions this failure scenario. For now, users need to 
delete the target table and restart the copy from the beginning.
   
   Mid-term solution: provide a resume API that users can call if they notice 
that progress has been stuck for a long while. The API first cancels the job on 
the old controller, then works out which segments still need to be backfilled 
(source table segments - source segments whose sequence >= watermark - target 
segments), and finally re-triggers the copy.
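   
   A rough sketch of that set difference, only to make the idea concrete. The 
class, the method name and the single integer watermark are all hypothetical 
and not part of this PR; the real watermark is likely tracked per partition.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the PR's code: computes which source segments a
// resume call would still need to backfill.
class ResumeBackfillSketch {

  // sourceSegmentSequences: source segment name -> sequence number (assumed shape).
  // watermark: segments with sequence >= watermark are excluded from the backfill.
  // targetSegments: segments already present in the target table.
  static Set<String> segmentsToBackfill(Map<String, Integer> sourceSegmentSequences,
      int watermark, Set<String> targetSegments) {
    Set<String> remaining = new LinkedHashSet<>();
    for (Map.Entry<String, Integer> entry : sourceSegmentSequences.entrySet()) {
      boolean belowWatermark = entry.getValue() < watermark;
      boolean alreadyCopied = targetSegments.contains(entry.getKey());
      if (belowWatermark && !alreadyCopied) {
        remaining.add(entry.getKey());
      }
    }
    return remaining;
  }
}
```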
   
   Long-term solution: the cluster's lead controller runs a background check 
thread that (1) scans the table replication controller jobs; (2) if the 
controller behind a job UUID is dead, selects a new controller and hits the 
/resume endpoint mentioned above; (3) if the controller behind the job UUID is 
not dead, sends a HelixMessage containing the ID of the controller assigned to 
the job. If a controller sees that the ID equals its own and its local 
executor service has no job with that UUID, it recreates the copy tasks and 
submits them to the executor.
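   
   The two decisions in that check could look roughly like this. All names 
here are hypothetical, and the sketch leaves out Helix messaging and the actual 
/resume call; it only shows the branching.

```java
import java.util.Set;

// Hypothetical sketch, not the PR's code: the decision logic only.
class ReplicationJobCheckSketch {

  enum LeadAction { RESUME_ON_NEW_CONTROLLER, NOTIFY_OWNER }

  // Lead controller side: pick an action for one job based on whether the
  // controller that owns the job is still alive.
  static LeadAction leadControllerCheck(String ownerControllerId, Set<String> liveControllerIds) {
    return liveControllerIds.contains(ownerControllerId)
        ? LeadAction.NOTIFY_OWNER
        : LeadAction.RESUME_ON_NEW_CONTROLLER;
  }

  // Receiving controller side: recreate the copy tasks only if the message
  // names this controller and the local executor does not know the job.
  static boolean shouldRecreateTasks(String selfId, String assignedControllerId,
      String jobId, Set<String> localJobIds) {
    return selfId.equals(assignedControllerId) && !localJobIds.contains(jobId);
  }
}
```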
   
   Given that this PR already has 2000+ lines, I prefer to defer these 
improvements to a follow-up PR and design. This PR provides the skeleton and 
the basic feature.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

