mcvsubbu commented on a change in pull request #6778: URL: https://github.com/apache/incubator-pinot/pull/6778#discussion_r618835210
##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
##########
@@ -1214,4 +1243,86 @@ private int getMaxNumPartitionsPerInstance(InstancePartitions instancePartitions
       return (numPartitions + numInstancesPerReplicaGroup - 1) / numInstancesPerReplicaGroup;
     }
   }
+
+  /**
+   * Validate the committed low level consumer segments to see if its segment store copy is available. Fix the missing segment store copy by asking servers to upload to segment store.
+   * Since uploading to segment store involves expensive compression step (first tar up the segment and then upload), we don't want to retry the uploading. Segment without segment store copy can still be downloaded from peer servers.
+   * @see <a href="https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion#BypassingdeepstorerequirementforRealtimesegmentcompletion-Failurecasesandhandling">By-passing deep-store requirement for Realtime segment completion:Failure cases and handling</a>
+   */
+  public void uploadToSegmentStoreIfMissing(TableConfig tableConfig) {
+    Preconditions.checkState(!_isStopping, "Segment manager is stopping");
+
+    String realtimeTableName = tableConfig.getTableName();
+    // Get all the LLC segment ZK metadata for this table
+    List<LLCRealtimeSegmentZKMetadata> segmentZKMetadataList = ZKMetadataProvider.getLLCRealtimeSegmentZKMetadataListForTable(_propertyStore, realtimeTableName);
+
+    // Iterate through llc segments and upload missing segment store copy by following steps:
+    //  1. Ask servers which have online segment replica to upload to segment store. Servers return segment store download url after successful uploading.
+    //  2. Update the llc segment ZK metadata by adding segment store download url.
+    for (LLCRealtimeSegmentZKMetadata segmentZKMetadata : segmentZKMetadataList) {

Review comment:
I thought of that, but what if the fix is unsuccessful after multiple attempts? (We can always give up, of course.)

Another way is to keep a separate queue in ZK of all the segments that are in the peer state. This is trickier than it sounds (e.g. we may complete the segment with a "peer" URL and then fail to add it to this queue), so I am not certain.

Another way is to maintain an in-memory queue in the controller. Update the queue each time a new segment enters this state. When the controller starts up, fetch all segments (or recent segments) and populate the queue.
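
To make the in-memory queue idea concrete, here is a rough sketch of what it might look like. Everything in it is hypothetical: the class name `PeerSegmentFixQueue`, its methods, and the `maxAttempts` give-up limit are illustrative only and are not part of Pinot or of this PR. Segments are identified by plain name strings, and the upload call is stubbed as a `Predicate` that returns true on success.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Predicate;

/**
 * Hypothetical sketch (not Pinot code): an in-memory queue of segments that
 * were committed with a "peer" download URL and still need a segment store copy.
 */
class PeerSegmentFixQueue {
  // Assumed give-up limit; the review comment only says "we can always give up".
  private final int maxAttempts;
  private final Queue<String> pending = new ArrayDeque<>();
  // Failed attempts so far, keyed by segment name; also used for de-duplication.
  private final Map<String, Integer> attempts = new HashMap<>();

  PeerSegmentFixQueue(int maxAttempts) {
    this.maxAttempts = maxAttempts;
  }

  /** On controller start-up: seed the queue with all (or recent) peer-state segments. */
  synchronized void populateOnStartup(List<String> peerStateSegments) {
    for (String segmentName : peerStateSegments) {
      add(segmentName);
    }
  }

  /** Each time a segment completes with a peer URL: enqueue it unless already tracked. */
  synchronized void add(String segmentName) {
    if (attempts.putIfAbsent(segmentName, 0) == null) {
      pending.add(segmentName);
    }
  }

  /**
   * One periodic pass: try every queued segment once. The uploader stands in for
   * "ask a server to upload, then update ZK metadata" and returns true on success.
   * Returns the number of segments fixed in this pass.
   */
  synchronized int runOnce(Predicate<String> uploader) {
    int fixed = 0;
    int toProcess = pending.size();
    for (int i = 0; i < toProcess; i++) {
      String segmentName = pending.poll();
      if (uploader.test(segmentName)) {
        attempts.remove(segmentName);
        fixed++;
        continue;
      }
      int failures = attempts.merge(segmentName, 1, Integer::sum);
      if (failures < maxAttempts) {
        pending.add(segmentName); // retry on a later pass
      } else {
        attempts.remove(segmentName); // give up; the segment stays peer-downloadable
      }
    }
    return fixed;
  }

  synchronized int size() {
    return pending.size();
  }
}
```

Because the queue lives only in memory, a controller restart loses it, which is why the start-up scan is needed. A real version would also need to decide which controller owns the queue for a table and would probably not hold the lock across the upload call.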