chenboat commented on code in PR #12317:
URL: https://github.com/apache/pinot/pull/12317#discussion_r1476843904
##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/BaseSegmentFetcher.java:
##########
@@ -109,6 +112,38 @@ public File fetchUntarSegmentToLocalStreamed(URI uri, File dest, long rateLimit,
     throw new UnsupportedOperationException();
   }
+  // Download segment to a local location with retries.
+  @Override
+  public boolean fetchSegmentToLocal(String segmentName, File dest, HelixManager helixManager, String downloadScheme)
+      throws Exception {
+    try {
+      int attempt =
+          RetryPolicies.exponentialBackoffRetryPolicy(_retryCount, _retryWaitMs, _retryDelayScaleFactor).attempt(() -> {
+            // First find servers hosting the segment in ONLINE state.
+            List<URI> peerSegmentURIs =
+                PeerServerSegmentFinder.getPeerServerURIs(segmentName, downloadScheme, helixManager);
+            // Shuffle the list of URIs.
+            Collections.shuffle(peerSegmentURIs);
+            // Next get through the list of URIs to fetch the segment until success.
+            for (URI uri : peerSegmentURIs) {
+              try {
+                fetchSegmentToLocalWithoutRetry(uri, dest);

Review Comment:
   3 is only the default retry_count; it can be changed. And yes, this PR intentionally tries all of the peer servers, in random order, until one download succeeds. The goal is to maximize the download success rate and thereby improve system availability.
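
   For illustration, here is a minimal, self-contained sketch of the behavior described in this comment: each attempt looks up the peers again, shuffles them, and tries every peer until one download succeeds. It is not the code in this PR. `PeerFetchSketch`, `Downloader`, and `fetchFromPeers` are hypothetical names, the peer lookup is a plain `Supplier` rather than `PeerServerSegmentFinder`, and the backoff is a simplified loop rather than Pinot's `RetryPolicies`.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class PeerFetchSketch {
  /** Hypothetical stand-in for a single-URI download: returns normally on success, throws on failure. */
  public interface Downloader {
    void download(URI uri) throws Exception;
  }

  /**
   * Tries every peer in random order on each attempt and returns the zero-based
   * index of the attempt that succeeded. Throws if all attempts fail.
   */
  public static int fetchFromPeers(Supplier<List<URI>> peerFinder, Downloader downloader,
      int retryCount, long initialWaitMs, double scaleFactor) throws Exception {
    long waitMs = initialWaitMs;
    for (int attempt = 0; attempt < retryCount; attempt++) {
      // Look up the peers again on every attempt, since the set of hosts can change.
      List<URI> peers = new ArrayList<>(peerFinder.get());
      // Randomize the order so that load is spread across the peers.
      Collections.shuffle(peers);
      for (URI uri : peers) {
        try {
          downloader.download(uri);
          return attempt; // First successful download ends the whole operation.
        } catch (Exception e) {
          // This peer failed; fall through to the next one in the shuffled list.
        }
      }
      // Every peer failed in this attempt: back off before retrying (assumed simple scaling).
      if (attempt < retryCount - 1) {
        Thread.sleep(waitMs);
        waitMs = (long) (waitMs * scaleFactor);
      }
    }
    throw new Exception("Failed to fetch from any peer after " + retryCount + " attempts");
  }

  public static void main(String[] args) throws Exception {
    // Made-up peer URIs; only the peer with host "b" is treated as healthy.
    List<URI> peers = Arrays.asList(
        URI.create("http://a/seg"), URI.create("http://b/seg"), URI.create("http://c/seg"));
    int attempt = fetchFromPeers(() -> peers, uri -> {
      if (!"b".equals(uri.getHost())) {
        throw new Exception("peer unavailable: " + uri);
      }
    }, 3, 100L, 2.0);
    System.out.println("Succeeded on attempt index " + attempt);
  }
}
```

   With this shape, a single healthy peer is enough for an attempt to succeed, and the retry count only matters when every peer fails within the same attempt.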