klsince commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1635324871


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -874,19 +879,32 @@ protected void doTakeSnapshot() {
         numConsumingSegments++;
         continue;
       }
-      ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
-      if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
-        segmentsWithoutSnapshot.add(immutableSegment);
+      if (!_updatedSegmentsSinceLastSnapshot.contains(segment)) {
+        // if no updates since last snapshot then skip
         continue;
       }
-      immutableSegment.persistValidDocIdsSnapshot();
-      numImmutableSegments++;
-      numPrimaryKeysInSnapshot += 
immutableSegment.getValidDocIds().getMutableRoaringBitmap().getCardinality();
+      try {
+        ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
+        if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
+          segmentsWithoutSnapshot.add(immutableSegment);
+          continue;
+        }
+        immutableSegment.persistValidDocIdsSnapshot();
+        _updatedSegmentsSinceLastSnapshot.remove(segment);
+        numImmutableSegments++;
+        numPrimaryKeysInSnapshot += 
immutableSegment.getValidDocIds().getMutableRoaringBitmap().getCardinality();
+      } catch (Exception e) {
+        _logger.warn("Caught exception while taking snapshot for segment: {}, 
skipping", segment.getSegmentName(), e);

Review Comment:
   There could be a potential issue to skip failed segments. If we skipped some 
segments due to failure but have successfully taken snapshot for the latest 
committed segment, the bitmaps persisted on disk might have valid docs with 
duplicate PKs. With duplicate PKs, the preloading logic is not ensured to be 
correct, as it assumes that bitmaps on disk represent a set of valid docs all 
with unique PKs so that it can simply do Map.put() instead of more costly 
read-check-write.
   
   If the try-catch here was meant to avoid blocking ingestion from snapshot 
failure, perhaps better to return early upon failure, instead of skipping. 
thoughts?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to