9aman opened a new pull request, #15087:
URL: https://github.com/apache/pinot/pull/15087

   Issue: https://github.com/apache/pinot/issues/15059
   
   - The setup() function only waits until all the documents have been loaded for the pauseless table.
   - This has led to a race condition between the commit protocol and the validation manager run, which causes test failures.
   - Checking the completion of the commit protocol, i.e. that the segment is marked COMMITTING, before triggering the validation manager in the tests prevents this.
   
   ## Logs from failed tests
   
   Commit start calls:
   
   ```
   12:29:58.856 WARN [RealtimeSegmentDataManager_mytable__1__0__20250219T0659Z] 
[mytable__1__0__20250219T0659Z] CommitStart failed  with response 
{"streamPartitionMsgOffset":null,"buildTimeSec":-1,"isSplitCommitType":true,"status":"FAILED"}
   
   12:29:59.954 WARN [RealtimeSegmentDataManager_mytable__0__0__20250219T0659Z] 
[mytable__0__0__20250219T0659Z] CommitStart failed  with response 
{"streamPartitionMsgOffset":null,"buildTimeSec":-1,"isSplitCommitType":true,"status":"FAILED"}
   ```
   
   The cluster is marked ready for testing even before this commit start call is made. The call changes the segment ZK metadata and therefore updates the mTime of the ZK node.
   
   The validation manager uses mTime to decide whether a segment should be fixed.
   
   ```
   12:30:09.677 ERROR [FailureInjectingPinotLLCRealtimeSegmentManager] [main] 
Segment: mytable__0__0__20250219T0659Z does not exceed the max completion time: 
10000ms, metadata update time: 1739948399951, current time: 1739948409676
   12:30:09.677 ERROR [FailureInjectingPinotLLCRealtimeSegmentManager] [main] 
Segment: mytable__1__0__20250219T0659Z exceeds the max completion time: 
10000ms, metadata update time: 1739948398851, current time: 1739948409676
   ```
   
   Thus `mytable__1__0__20250219T0659Z`, whose commit start call came first, is fixed by the validation manager, while `mytable__0__0__20250219T0659Z` is not.
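
The outcome can be reproduced from the timestamps in the log lines above. The following is a minimal sketch, assuming the check is a plain comparison of elapsed time since the metadata update against the max completion time; `exceedsMaxCompletionTime` is a hypothetical helper written for illustration, not Pinot's actual method, and the strictness of the comparison is an assumption.

```java
public class CompletionTimeCheck {
    // Hypothetical helper: assumes the validation manager compares the time elapsed
    // since the ZK metadata update (mTime) against the max completion time.
    static boolean exceedsMaxCompletionTime(long metadataUpdateTimeMs, long currentTimeMs,
                                            long maxCompletionTimeMs) {
        return currentTimeMs - metadataUpdateTimeMs > maxCompletionTimeMs;
    }

    public static void main(String[] args) {
        long maxCompletionTimeMs = 10_000L;   // "max completion time: 10000ms" from the logs
        long currentTimeMs = 1739948409676L;  // "current time" from the logs

        // mytable__0__0: 1739948409676 - 1739948399951 = 9725 ms, below the threshold
        System.out.println(exceedsMaxCompletionTime(1739948399951L, currentTimeMs, maxCompletionTimeMs));

        // mytable__1__0: 1739948409676 - 1739948398851 = 10825 ms, above the threshold
        System.out.println(exceedsMaxCompletionTime(1739948398851L, currentTimeMs, maxCompletionTimeMs));
    }
}
```

The two commit start calls were only about 1.1 s apart, which was enough to place the segments on opposite sides of the 10000 ms threshold.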
   
   
   ## Changes:
   Ensure that the commit start completes before the test begins.
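
The idea can be sketched as a polling wait that returns only once every expected segment is marked COMMITTING. This is an illustrative, self-contained sketch and not the code in this PR: `waitFor`, `allCommitting`, and the in-memory status map are hypothetical stand-ins for reading segment ZK metadata, and the `IN_PROGRESS` starting value is used only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

public class WaitForCommitStart {
    // Polls the condition until it holds or the timeout elapses; returns the final result.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        }
        return condition.getAsBoolean();
    }

    // Hypothetical stand-in for inspecting segment ZK metadata: segment name -> status.
    static boolean allCommitting(Map<String, String> statuses) {
        return !statuses.isEmpty()
            && statuses.values().stream().allMatch("COMMITTING"::equals);
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> statuses = new ConcurrentHashMap<>();
        statuses.put("mytable__0__0__20250219T0659Z", "IN_PROGRESS");
        statuses.put("mytable__1__0__20250219T0659Z", "IN_PROGRESS");

        // Simulates the commit start calls landing some time after the documents are loaded.
        Thread committer = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            statuses.replaceAll((segment, status) -> "COMMITTING");
        });
        committer.start();

        // Only after this returns true would the test go on to trigger the validation manager.
        boolean ready = waitFor(() -> allCommitting(statuses), 5_000L, 20L);
        committer.join();
        System.out.println("ready for validation manager: " + ready);
    }
}
```

Waiting on the status rather than sleeping for a fixed interval keeps the test independent of how long the commit start call takes on a given machine.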
   
   ### Testing these changes
   The test was failing locally before the changes were made. With the changes, the test was run 10 times using the following command:
   
   `TEST_CMD='mvn test -pl pinot-integration-tests -Dtest=org.apache.pinot.integration.tests.PauselessRealtimeIngestionNewSegmentMetadataCreationFailureTest'`
   
   The following is the output of those runs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

