rohityadav1993 commented on code in PR #17249:
URL: https://github.com/apache/pinot/pull/17249#discussion_r2581651770


##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -438,10 +442,10 @@ protected void 
replaceSegmentIfCrcMismatch(SegmentDataManager segmentDataManager
       _logger.info("Segment: {} has CRC: {} same as before, not replacing it", 
segmentName, localMetadata.getCrc());
       return;
     }
-    if (!_instanceDataManagerConfig.shouldCheckCRCOnSegmentLoad()) {
+    if (!_instanceDataManagerConfig.shouldCheckCRCOnSegmentLoad() || 
_skipCrcCheckForThisTable) {

Review Comment:
   We lost all replicas of a segment.
   Set up:
   2 server replica + 1 deepstore copy
   
   Failure scenario happens when segment commit + CRC mismatch in replicas + 
leader servers's host replacement happens. Following is the sequence:
   ```
   1. Segment commit in progress
   2. Leader server tries to upload segment to deepstore and fails
   4. Follower server builds segment but invalidates its copy due to CRC 
mismatch
   5. Follower server tries deepstore download but fails due to missing copy
   6. leader server host starts getting replaced due to regular infra activity
   7. Follower segment fails to download after retries from the only peer which 
is unavailable
   8. Follower server segment goes in error state and when leader server comes 
up it goes in error state.
   ```
   
   While we have internal infra improvement items we are pursuing two tracks:
   
   Short term (Faster to adopt): 
   Make ignore CRC config introduced in 
https://github.com/apache/pinot/pull/16432 to a table level config so we can 
selectively apply it to tables unaffected by the side effect.
   
   Long term:
   Data CRC (#17264) is the ideal way and when this becomes de-facto for CRC 
comparison,  move away from the previous config.
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to