tibrewalpratik17 commented on code in PR #13285: URL: https://github.com/apache/pinot/pull/13285#discussion_r1635375547
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -874,19 +879,32 @@ protected void doTakeSnapshot() {
           numConsumingSegments++;
           continue;
         }
-        ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
-        if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
-          segmentsWithoutSnapshot.add(immutableSegment);
+        if (!_updatedSegmentsSinceLastSnapshot.contains(segment)) {
+          // if no updates since last snapshot then skip
           continue;
         }
-        immutableSegment.persistValidDocIdsSnapshot();
-        numImmutableSegments++;
-        numPrimaryKeysInSnapshot += immutableSegment.getValidDocIds().getMutableRoaringBitmap().getCardinality();
+        try {
+          ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
+          if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
+            segmentsWithoutSnapshot.add(immutableSegment);
+            continue;
+          }
+          immutableSegment.persistValidDocIdsSnapshot();
+          _updatedSegmentsSinceLastSnapshot.remove(segment);
+          numImmutableSegments++;
+          numPrimaryKeysInSnapshot += immutableSegment.getValidDocIds().getMutableRoaringBitmap().getCardinality();
+        } catch (Exception e) {
+          _logger.warn("Caught exception while taking snapshot for segment: {}, skipping", segment.getSegmentName(), e);

Review Comment:
Very good point! Ah, I see: `addSegmentWithoutUpsert` is used in the `preloadSegment` flow and calls the `put` method directly.
But it seems iteration order is not guaranteed with `ConcurrentHashMap.newKeySet()` ([link](https://javadoc.io/static/net.sf.ehcache/ehcache/2.10.6/net/sf/ehcache/util/concurrent/ConcurrentHashMap.html#newKeySet--:~:text=Because%20the%20elements%20of%20a%20ConcurrentHashMap%20are%20not%20ordered%20in%20any%20particular%20way%2C%20and%20may%20be%20processed%20in%20different%20orders%20in%20different%20parallel%20executions%2C%20the%20correctness%20of%20supplied%20functions%20should%20not%20depend%20on%20any%20ordering)), so we may end up in this situation even with the present code, or when skipping early. Right?
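
To make the ordering concern concrete, here is a minimal standalone sketch. It is not Pinot code: the class `KeySetOrderSketch`, its methods, and the `seg_*` names are hypothetical and used only for illustration. It shows that a set created with `ConcurrentHashMap.newKeySet()` holds the same elements that were inserted but makes no promise about the order in which they are iterated, and that a caller needing a deterministic order would have to impose one itself (sorting a copy is just one possible option, assumed here for the example).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetOrderSketch {

  // Copies the names into a concurrent key set and returns them in whatever
  // order the set happens to iterate; this order is not guaranteed to match
  // the insertion order.
  public static List<String> iterationOrder(List<String> insertionOrder) {
    Set<String> set = ConcurrentHashMap.newKeySet();
    set.addAll(insertionOrder);
    return new ArrayList<>(set);
  }

  // Hypothetical workaround: sort a copy when a deterministic order is needed.
  public static List<String> sortedOrder(List<String> insertionOrder) {
    List<String> copy = iterationOrder(insertionOrder);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("seg_3", "seg_1", "seg_2");
    System.out.println("inserted: " + names);
    System.out.println("iterated: " + iterationOrder(names));
    System.out.println("sorted:   " + sortedOrder(names));
  }
}
```

Whether ordering actually matters for the snapshot loop depends on the surrounding logic in `doTakeSnapshot()`; the sketch only illustrates the set's behavior, not a proposed fix.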