tarun11Mavani opened a new pull request, #17356: URL: https://github.com/apache/pinot/pull/17356
### Problem

During upsert compact merge operations, the merged segment is created with a creation time equal to the maximum creation time of the segments being merged. If the segments being merged are themselves UPLOADED segments (merged earlier), they can share the same creation time as the new merged segment. This means that records in the segment with the highest creation time are not replaced, because of the tie-breaking logic in [shouldReplaceOnComparisonTie](https://github.com/apache/pinot/blob/52db36c816f91ef8887fddd0beade5d169824296/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java#L518). Not replacing records from this segment could lead to data loss, as discussed in #17337.

### Solution

We set the creation time of the merged segment to max(creation time of all merging segments) + 1. This ensures that the merged segment takes priority and that all records in the existing segments are replaced with records in the new merged segment.

### Test

Tested in a test cluster. Verified that the new merged segment has the expected creation time. Validated that all records from the merging segments were replaced by records in the merged segment. All compacted segments were deleted successfully in the next task iteration.
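The creation-time rule from the Solution section can be illustrated with a minimal, standalone Java sketch. It is not the code in this PR: the class `MergedSegmentCreationTime`, the helper `mergedCreationTimeMs`, and the method `replacesOnTie` are hypothetical names, and `replacesOnTie` is only a simplified stand-in that assumes a tie is resolved in favor of the segment with the strictly newer creation time. The real logic lives in `shouldReplaceOnComparisonTie` and may consider more than this.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; none of these names come from Pinot.
final class MergedSegmentCreationTime {

    // Creation time for the merged segment: max over the source segments, plus one.
    static long mergedCreationTimeMs(Collection<Long> sourceCreationTimesMs) {
        if (sourceCreationTimesMs.isEmpty()) {
            throw new IllegalArgumentException("at least one source segment is required");
        }
        long max = Long.MIN_VALUE;
        for (long creationTimeMs : sourceCreationTimesMs) {
            max = Math.max(max, creationTimeMs);
        }
        // addExact throws ArithmeticException instead of silently overflowing.
        return Math.addExact(max, 1L);
    }

    // Simplified stand-in for the tie-break (an assumption, not the real method):
    // on a comparison tie, replace only if the new segment is strictly newer.
    static boolean replacesOnTie(long newSegmentCreationTimeMs, long existingSegmentCreationTimeMs) {
        return newSegmentCreationTimeMs > existingSegmentCreationTimeMs;
    }

    public static void main(String[] args) {
        List<Long> sources = List.of(1_000L, 2_500L, 1_750L);

        // Old behavior: merged segment reuses the max creation time, so it ties.
        System.out.println(replacesOnTie(2_500L, 2_500L)); // prints "false"

        // New behavior: max + 1 makes the merged segment strictly newer.
        long merged = mergedCreationTimeMs(sources);
        System.out.println(merged);                        // prints "2501"
        System.out.println(replacesOnTie(merged, 2_500L)); // prints "true"
    }
}
```

Under these assumptions, the `+ 1` is what turns an exact tie with the newest source segment into a strict win for the merged segment, so records from every source segment become eligible for replacement.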
