klsince commented on code in PR #15142:
URL: https://github.com/apache/pinot/pull/15142#discussion_r1982294881


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -77,6 +88,7 @@ protected void processTable(String tableNameWithType) {
       return;
     }
 
+    // Did not understand

Review Comment:
   remove?



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/strategy/TimeRetentionStrategy.java:
##########
@@ -50,4 +50,17 @@ public boolean isPurgeable(String tableNameWithType, 
SegmentZKMetadata segmentZK
 
     return System.currentTimeMillis() - endTimeMs > _retentionMs;
   }
+
+  @Override
+  public boolean isPurgeable(String segmentName, String tableNameWithType, 
long endTimeMs) {

Review Comment:
   1. better move tableNameWithType as the first param for consistency
   2. can reuse this method to implement the method above
   3. perhaps call endTimeMs segmentTimeMs to be generic? and leave a comment 
that segmentTimeMs can be endTime or mtime etc. to be decided by the caller



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/SegmentDeletionManager.java:
##########
@@ -221,6 +221,7 @@ private void deleteSegmentMetadataFromStore(PinotFS 
pinotFS, URI segmentFileUri,
       URI segmentMetadataUri = 
SegmentPushUtils.generateSegmentMetadataURI(segmentFileUri.toString(), 
segmentId);
       if (pinotFS.exists(segmentMetadataUri)) {
         LOGGER.info("Deleting segment metadata {} from {}", segmentId, 
segmentMetadataUri);
+        // TODO: check if the deletion was successful and add a warning here.

Review Comment:
   is this TODO supposed to check the return of the delete() method and log 
warning if false? if so, maybe we can just add it in this PR. 



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -136,11 +164,86 @@ private void manageRetentionForOfflineTable(String 
offlineTableName, RetentionSt
     }
   }
 
+  /**
+   * Identifies segments in deepstore that are ready for deletion based on the 
retention strategy.
+   *
+   * This method finds segments that are beyond the retention period and are 
ready to be purged.
+   * It only considers segments that do not have entries in ZooKeeper metadata.
+   * The lastModified time of the file in deepstore is used to determine 
whether the segment
+   * should be retained or purged.
+   *
+   * @param tableNameWithType   Name of the offline table
+   * @param retentionStrategy  Strategy to determine if a segment should be 
purged
+   * @param segmentsToExclude  List of segment names that should be excluded 
from deletion
+   * @return List of segment names that should be deleted from deepstore
+   * @throws IOException If there's an error accessing the filesystem
+   */
+  private List<String> getSegmentsToDeleteFromDeepstore(String 
tableNameWithType, RetentionStrategy retentionStrategy,
+      List<String> segmentsToExclude)
+      throws IOException {
+
+    List<String> segmentsToDelete = new ArrayList<>();
+    String rawTableName = 
TableNameBuilder.extractRawTableName(tableNameWithType);
+    URI tableDataUri = 
URIUtils.getUri(_pinotHelixResourceManager.getDataDir(), rawTableName);
+    PinotFS pinotFS = PinotFSFactory.create(tableDataUri.getScheme());
+
+    List<FileMetadata> deepstoreFiles = 
pinotFS.listFilesWithMetadata(tableDataUri, false);
+

Review Comment:
   maybe print a INFO log here about how many segments found, and the timestamp 
in the log, with another INFO log after the for-loop, can be used to calculate 
how long it'd take to figure out the files to delete.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -124,8 +138,19 @@ private void manageRetentionForTable(TableConfig 
tableConfig) {
   }
 
   private void manageRetentionForOfflineTable(String offlineTableName, 
RetentionStrategy retentionStrategy) {
+    List<SegmentZKMetadata> segmentZKMetadataList = 
_pinotHelixResourceManager.getSegmentsZKMetadata(offlineTableName);

Review Comment:
   Would it happen that segment's ZKMetadata was still there in ZK but segment 
was not in IS any more? 



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -136,11 +164,86 @@ private void manageRetentionForOfflineTable(String 
offlineTableName, RetentionSt
     }
   }
 
+  /**
+   * Identifies segments in deepstore that are ready for deletion based on the 
retention strategy.
+   *
+   * This method finds segments that are beyond the retention period and are 
ready to be purged.
+   * It only considers segments that do not have entries in ZooKeeper metadata.
+   * The lastModified time of the file in deepstore is used to determine 
whether the segment
+   * should be retained or purged.
+   *
+   * @param tableNameWithType   Name of the offline table
+   * @param retentionStrategy  Strategy to determine if a segment should be 
purged
+   * @param segmentsToExclude  List of segment names that should be excluded 
from deletion
+   * @return List of segment names that should be deleted from deepstore
+   * @throws IOException If there's an error accessing the filesystem
+   */
+  private List<String> getSegmentsToDeleteFromDeepstore(String 
tableNameWithType, RetentionStrategy retentionStrategy,
+      List<String> segmentsToExclude)
+      throws IOException {
+
+    List<String> segmentsToDelete = new ArrayList<>();
+    String rawTableName = 
TableNameBuilder.extractRawTableName(tableNameWithType);
+    URI tableDataUri = 
URIUtils.getUri(_pinotHelixResourceManager.getDataDir(), rawTableName);
+    PinotFS pinotFS = PinotFSFactory.create(tableDataUri.getScheme());
+
+    List<FileMetadata> deepstoreFiles = 
pinotFS.listFilesWithMetadata(tableDataUri, false);
+
+    for (FileMetadata fileMetadata : deepstoreFiles) {
+      if (fileMetadata.isDirectory()) {
+        continue;
+      }
+
+      String segmentName = extractSegmentName(fileMetadata.getFilePath());
+      if (Strings.isEmpty(segmentName) || 
segmentsToExclude.contains(segmentName)) {
+        continue;
+      }
+
+      // determine whether the segment should be perged or not based on the 
last modified time of the file
+      long lastModifiedTime = fileMetadata.getLastModifiedTime();
+
+      if (retentionStrategy.isPurgeable(segmentName, tableNameWithType, 
lastModifiedTime)) {
+        segmentsToDelete.add(segmentName);
+      }
+    }
+

Review Comment:
   add an INFO log about how many segments to delete



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to