mcvsubbu commented on a change in pull request #6778:
URL: https://github.com/apache/incubator-pinot/pull/6778#discussion_r618833361



##########
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
##########
@@ -1214,4 +1239,86 @@ private int 
getMaxNumPartitionsPerInstance(InstancePartitions instancePartitions
       return (numPartitions + numInstancesPerReplicaGroup - 1) / 
numInstancesPerReplicaGroup;
     }
   }
+
+  /**
+   * Validate the committed low level consumer segments to see if its segment 
store copy is available. Fix the missing segment store copy by asking servers 
to upload to segment store.
+   * Since uploading to segment store involves expensive compression step 
(first tar up the segment and then upload), we don't want to retry the 
uploading. Segment without segment store copy can still be downloaded from peer 
servers.
+   * @see <a 
href="https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion#BypassingdeepstorerequirementforRealtimesegmentcompletion-Failurecasesandhandling";>By-passing
 deep-store requirement for Realtime segment completion:Failure cases and 
handling</a>
+   */
+  public void uploadToSegmentStoreIfMissing(TableConfig tableConfig) {
+    Preconditions.checkState(!_isStopping, "Segment manager is stopping");
+
+    String realtimeTableName = tableConfig.getTableName();
+    // Get all the LLC segment ZK metadata for this table
+    List<LLCRealtimeSegmentZKMetadata> segmentZKMetadataList = 
ZKMetadataProvider.getLLCRealtimeSegmentZKMetadataListForTable(_propertyStore, 
realtimeTableName);
+
+    // Iterate through llc segments and upload missing segment store copy by 
following steps:
+    //  1. Ask servers which have online segment replica to upload to segment 
store. Servers return segment store download url after successful uploading.
+    //  2. Update the llc segment ZK metadata by adding segment store download 
url.
+    for (LLCRealtimeSegmentZKMetadata segmentZKMetadata : 
segmentZKMetadataList) {
+      String segmentName = segmentZKMetadata.getSegmentName();
+      // Only fix the committed llc segment without segment store copy
+      if (segmentZKMetadata.getStatus() == Status.DONE && 
(segmentZKMetadata.getDownloadUrl() == null || 
segmentZKMetadata.getDownloadUrl().isEmpty() || 
segmentZKMetadata.getDownloadUrl().equals(CommonConstants.Segment.METADATA_URI_FOR_PEER_DOWNLOAD)))
 {
+        try {
+          if (!isExceededMaxSegmentCompletionTime(realtimeTableName, 
segmentName, getCurrentTimeMs())) {

Review comment:
       The update of URL and the status to DONE will be atomic. So, if the URL 
says "peer", then "peer" it is. So, we don't need this check for correctness 
sake.
   
   That said, it is good to elapse some time after the commit (not the same 
time as maxSegmentCompletionTime), because it is possible that the segment just 
completed (as you say), and the servers are still in the process of loading the 
segment. It is not yet ONLINE in externalview. So, it is good to have the check 
for performance sake.
   
   I would say, declare a new time limit (even if it is the same value as the 
other one), and use that. The other one is needed for correction, so please 
don't re-use it for this purpose.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to