mayankshriv commented on a change in pull request #5764:
URL: https://github.com/apache/incubator-pinot/pull/5764#discussion_r461970668



##########
File path: 
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentFetcherAndLoader.java
##########
@@ -184,39 +186,48 @@ private boolean isNewSegmentMetadata(String 
tableNameWithType, OfflineSegmentZKM
 
   private String downloadSegmentToLocal(String uri, PinotCrypter crypter, 
String tableName, String segmentName)
       throws Exception {
-    File tempDir = new File(new 
File(_instanceDataManager.getSegmentFileDirectory(), tableName),
-        "tmp-" + segmentName + "-" + UUID.randomUUID());
-    FileUtils.forceMkdir(tempDir);
-    File tempDownloadFile = new File(tempDir, segmentName + ENCODED_SUFFIX);
-    File tempTarFile = new File(tempDir, segmentName + TAR_GZ_SUFFIX);
-    File tempSegmentDir = new File(tempDir, segmentName);
-    try {
-      SegmentFetcherFactory.fetchSegmentToLocal(uri, tempDownloadFile);
-      if (crypter != null) {
-        crypter.decrypt(tempDownloadFile, tempTarFile);
-      } else {
-        tempTarFile = tempDownloadFile;
-      }
-
-      LOGGER
-          .info("Downloaded tarred segment: {} for table: {} from: {} to: {}, 
file length: {}", segmentName, tableName,
-              uri, tempTarFile, tempTarFile.length());
+    // Even if the tar file has been downloaded successfully, the file itself 
could be corrupted during the transmission.

Review comment:
       Agree, however, we did observe in production that the file downloaded 
was corrupt (likely truncated), and restarting the server downloaded the 
segment again and was able to untar it.
   
   Perhaps, we can start off by first emitting a metric and alerting on it to 
see how often we run into the issue. If this was a one-off due to manual 
intervention, then perhaps retry is not warranted. But if there's a systematic 
problem, we should try to understand what's causing the corruption, and then 
decide whether retry is warranted.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to