vvivekiyer opened a new pull request #8110:
URL: https://github.com/apache/pinot/pull/8110


   The current implementation doesn't protect against concurrent
   UploadSegment calls for a non-existing segment.
   
   The current implementation of ProcessNewSegment is as follows:
   1. Move segment to permanent directory
   2. Set SegmentZkMetadata
   3. AssignTableSegments
   4. Update customZkMetadata map
   
   Assume there are two concurrent uploadSegment calls for the same
   non-existing segment:
   1. Calls C1 and C2 will try to move to permanent directory. There
      could be a race here and either C1 or C2 can win. Additionally,
      move() implementation in PinotFS is not atomic.
   2. Now, only one of C1 and C2 can set SegmentZkMetadata. This is
      because updates to Zk are made atomic with expectedVersion. But
      there is no guarantee that the call that succeeded to move the
      segment to permanent location sets the Zk metadata.
   
   To fix this issue, enableParallelPushProtection can be extended
   to protect against concurrent calls when uploading non-existent
   segments as well.
   
   Made the following changes:
   1. Added locking to ProcessNewSegment codepath if
      enableParallelPushProtection is enabled.
   2. Reordered code so that "MoveSegmentToFinalLocation",
      AssignTableSegments, customZkMetadata map update is done within
      the lock.
   3. Enhanced exception handling to clean up segmentZkMetadata and stale
      segments in permanent directory.
   
   Testing:
   1. Unit test verifying that lock is released after upload when
      enableParallelPushProtection=true for non-existing segments.
   2. Unit test verifying that segmentZkMetadata is cleaned up if
      exception is raised.
   3. Integration test uploading concurrent segments with the same
      name.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to