mcvsubbu commented on a change in pull request #7969: URL: https://github.com/apache/pinot/pull/7969#discussion_r778468032
##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/LoaderUtils.java
##########
@@ -161,4 +160,36 @@ public static void reloadFailureRecovery(File indexDir)
       FileUtils.forceDelete(segmentTempDir);
     }
   }
+
+  public static void createBackup(File indexDir)
+      throws IOException {
+    if (!indexDir.exists()) {
+      return;
+    }
+    File parentDir = indexDir.getParentFile();
+    File segmentBackupDir = new File(parentDir, indexDir.getName() + CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX);
+    // Rename index directory to segment backup directory (atomic)
+    Preconditions.checkState(indexDir.renameTo(segmentBackupDir),
+        "Failed to rename index directory: %s to segment backup directory: %s", indexDir, segmentBackupDir);
+    // Copy the backup dir back to proceed.
+    FileUtils.copyDirectory(segmentBackupDir, indexDir);

Review comment:
Agree that the rename happens the same way, but prior to this PR, we just downloaded the segment. Looks like we are now copying after the rename (and then perhaps overwriting what we copied?)
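
For reference, the rename-then-copy sequence questioned in the comment above can be sketched in isolation. This is only a minimal illustration using java.nio.file, not the PR's code: the class name BackupSketch, the helper copyTree, and the BACKUP_SUFFIX value are hypothetical stand-ins (the PR uses CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX and commons-io FileUtils.copyDirectory). It shows that after createBackup returns, the index directory holds a full copy of the backup, so a later download into the same directory would replace what was just copied.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BackupSketch {
  // Hypothetical placeholder; the PR reads the suffix from
  // CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX.
  static final String BACKUP_SUFFIX = ".backup";

  /**
   * Renames indexDir to a sibling backup directory, then copies the backup
   * back into indexDir. Returns the backup path, or null if indexDir is absent.
   */
  static Path createBackup(Path indexDir) throws IOException {
    if (!Files.exists(indexDir)) {
      return null;
    }
    Path backupDir = indexDir.resolveSibling(indexDir.getFileName() + BACKUP_SUFFIX);
    // Step 1: rename the index directory to the backup directory.
    Files.move(indexDir, backupDir, StandardCopyOption.ATOMIC_MOVE);
    // Step 2: copy the backup back so processing can continue in indexDir.
    copyTree(backupDir, indexDir);
    return backupDir;
  }

  /** Recursively copies the directory tree rooted at 'from' into 'to'. */
  static void copyTree(Path from, Path to) throws IOException {
    List<Path> sources;
    try (Stream<Path> paths = Files.walk(from)) {
      // Files.walk visits a directory before its entries.
      sources = paths.collect(Collectors.toList());
    }
    for (Path src : sources) {
      Path dst = to.resolve(from.relativize(src).toString());
      if (Files.isDirectory(src)) {
        Files.createDirectories(dst);
      } else {
        Files.copy(src, dst);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("backup-sketch");
    Path indexDir = Files.createDirectory(root.resolve("segment_0"));
    Files.write(indexDir.resolve("metadata.properties"), "crc=1".getBytes());
    Path backupDir = createBackup(indexDir);
    System.out.println("backup dir: " + backupDir);
    System.out.println("index dir restored: " + Files.isDirectory(indexDir));
  }
}
```

If the CRC check then triggers a download into the same index directory, the copy made in step 2 is work that gets thrown away, which appears to be the cost the comment is asking about.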

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
##########
@@ -275,53 +278,23 @@ public void addSegmentError(String segmentName, SegmentErrorInfo segmentErrorInf
   @Override
   public void reloadSegment(String segmentName, IndexLoadingConfig indexLoadingConfig, SegmentZKMetadata zkMetadata,
-      SegmentMetadata localMetadata, @Nullable Schema schema, boolean forceDownload)
+      SegmentMetadata segmentMetadata, @Nullable Schema schema, boolean forceDownload)
       throws Exception {
-    File indexDir = localMetadata.getIndexDir();
-    Preconditions.checkState(indexDir.isDirectory(), "Index directory: %s is not a directory", indexDir);
-
-    File parentFile = indexDir.getParentFile();
-    File segmentBackupDir =
-        new File(parentFile, indexDir.getName() + CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX);
-
+    File indexDir = getSegmentDataDir(segmentName);
+    // Create backup dir to make segment reloading atomic for local tier backend.
+    LoaderUtils.createBackup(indexDir);
     try {
-      // First rename index directory to segment backup directory so that original segment have all file descriptors
-      // point to the segment backup directory to ensure original segment serves queries properly
-
-      // Rename index directory to segment backup directory (atomic)
-      Preconditions.checkState(indexDir.renameTo(segmentBackupDir),
-          "Failed to rename index directory: %s to segment backup directory: %s", indexDir, segmentBackupDir);
-
-      // Download from remote or copy from local backup directory into index directory,
-      // and then continue to load the segment from index directory.
-      boolean shouldDownload = forceDownload || !hasSameCRC(zkMetadata, localMetadata);
-      if (shouldDownload && allowDownload(segmentName, zkMetadata)) {
-        if (forceDownload) {
-          LOGGER.info("Segment: {} of table: {} is forced to download", segmentName, _tableNameWithType);
-        } else {
-          LOGGER.info("Download segment:{} of table: {} as local crc: {} mismatches remote crc: {}", segmentName,
-              _tableNameWithType, localMetadata.getCrc(), zkMetadata.getCrc());
-        }
-        indexDir = downloadSegment(segmentName, zkMetadata);
+      boolean shouldDownloadRawSegment = forceDownload || !hasSameCRC(zkMetadata, segmentMetadata);
+      if (shouldDownloadRawSegment && allowDownloadRawSegment(segmentName, zkMetadata)) {
+        downloadRawSegmentAndProcess(segmentName, indexLoadingConfig, zkMetadata, schema);

Review comment:
What does it mean to have a "local" vs "non-local" tieredBackend? How is a segment expected to be different in a tierBackend? (Or, should we have this discussion in the design doc?) I am fine abstracting things as long as the comments in the code are clear about what the interfaces are supposed to do, and what layout of the cluster we are coding for.