Jackie-Jiang commented on code in PR #10915:
URL: https://github.com/apache/pinot/pull/10915#discussion_r1262144652


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -247,6 +267,17 @@ public void addSegment(ImmutableSegmentImpl segment, 
@Nullable ThreadSafeMutable
     Lock segmentLock = SegmentLocks.getSegmentLock(_tableNameWithType, 
segmentName);
     segmentLock.lock();
     try {
+      // Skip adding segments that has segment EndTime in the comparison cols 
earlier than (largestSeenTimestamp - TTL).

Review Comment:
   (MAJOR) Just realized a problem. If we return here, the valid doc ids and 
queryable doc ids won't be set in the segment. The correct solution should be:
   - If `delete` is not enabled, set valid doc ids snapshot into the segment
   - If `delete` is enabled, since we don't have snapshot for queryable doc 
ids, we need to loop over valid doc ids to create the queryable doc ids (find 
non-delete records)
   
   Can you add a test to catch this problem? With current solution, after 
restarting all the invalid docs will show up again



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -556,6 +659,32 @@ protected void finishOperation() {
     }
   }
 
+  /**
+   * When TTL is enabled for upsert, this function is used to remove expired 
keys from the primary key indexes.
+   * Primary keys that has comparison value earlier than 
largestSeenComparisonValue - TTL value will be removed.
+   */
+  @Override
+  public void removeExpiredPrimaryKeys() {
+    // If upsertTTL is enabled, we will remove expired primary keys from 
upsertMetadata after taking snapshot.
+    if (_metadataTTL == 0) {

Review Comment:
   To be more robust
   ```suggestion
       if (_metadataTTL <= 0) {
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -367,6 +398,17 @@ public void replaceSegment(ImmutableSegment segment, 
@Nullable ThreadSafeMutable
     Lock segmentLock = SegmentLocks.getSegmentLock(_tableNameWithType, 
segmentName);
     segmentLock.lock();
     try {
+      // Skip adding segments that has segment EndTime in the comparison cols 
earlier than (largestSeenTimestamp - TTL).
+      // Note: We only update largestSeenComparisonValue when addRecord, and 
access the value when addOrReplaceSegments.
+      // We only support single comparison column for TTL-enabled upsert 
tables.
+      if (_largestSeenComparisonValue > 0) {

Review Comment:
   We cannot skip replacing segment because the new segment doesn't have 
snapshot, thus we don't know which docs are valid. Unfortunately I don't see a 
good way to handle it if user is replacing a segment that is out of TTL. Or we 
only allow user to upload segment without any invalid doc. We need to add more 
comments explaining the logic.
   Also, we can move this check outside of the segment lock



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to