[ 
https://issues.apache.org/jira/browse/GEODE-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188638#comment-17188638
 ] 

Xiaojian Zhou commented on GEODE-8475:
--------------------------------------

This fix is actually a patch to GEODE-5748, which introduced acquiring the 
write lock of the failed-initial-image lock (lock-B) through 
lockFailedInitialImageWriteLock(). At that time, we did not realize that 
this could cause a deadlock, and our tests did not find the issue either. 

This patch can be backported to older releases, as far back as 1.8, if 
necessary. 

> Resolve a potential dead lock in ParallelGatewaySenderQueue 
> ------------------------------------------------------------
>
>                 Key: GEODE-8475
>                 URL: https://issues.apache.org/jira/browse/GEODE-8475
>             Project: Geode
>          Issue Type: Improvement
>            Reporter: Xiaojian Zhou
>            Assignee: Xiaojian Zhou
>            Priority: Major
>              Labels: GeodeOperationAPI, pull-request-available
>
> When a brq is created but its GII fails, enqueuing to it can deadlock:
> Thread-1:
> ParallelGatewaySenderQueue.put() acquires 
> brq.getInitializationLock().readLock().lock() (lock-A’s read lock). Then, 
> during the put operation, it calls lockWhenRegionIsInitializing() to 
> acquire failedInitialImageLock.readLock().lock() (lock-B’s read lock).
> Thread-2: 
> PRDS.createBucketRegion() triggers a GII, which fails. It then calls 
> cleanUpAfterFailedGII(), which first calls 
> lockFailedInitialImageWriteLock() to acquire lock-B’s write lock and then 
> calls BucketRegionQueue.clearEntries(). That method in turn calls 
> getInitializationLock().writeLock().lock() (lock-A’s write lock).
> As a result, thread-1 holds lock-A while waiting for lock-B, and thread-2 
> holds lock-B while waiting for lock-A.
> To fix it, thread-1 must acquire failedInitialImageLock.readLock() 
> (lock-B) before requesting lock-A. 
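
The lock ordering described in the issue can be illustrated with a small, 
self-contained sketch. This is not the Geode code: the class 
LockOrderingSketch, the fields lockA and lockB, and the method runDemo are 
hypothetical stand-ins, and the two methods only mimic the order in which 
the real put() and cleanUpAfterFailedGII() paths would take the locks after 
the fix (lock-B first, then lock-A, on both threads).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderingSketch {
  // Hypothetical stand-in for brq.getInitializationLock() (lock-A).
  static final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock();
  // Hypothetical stand-in for the failed-initial-image lock (lock-B).
  static final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock();

  // Thread-1 path after the fix: lock-B's read lock first, then lock-A's.
  static void put(AtomicInteger completed) {
    lockB.readLock().lock();
    try {
      lockA.readLock().lock();
      try {
        completed.incrementAndGet(); // placeholder for the enqueue work
      } finally {
        lockA.readLock().unlock();
      }
    } finally {
      lockB.readLock().unlock();
    }
  }

  // Thread-2 path: lock-B's write lock first, then lock-A's write lock.
  static void cleanUpAfterFailedGII(AtomicInteger completed) {
    lockB.writeLock().lock();
    try {
      lockA.writeLock().lock();
      try {
        completed.incrementAndGet(); // placeholder for clearEntries()
      } finally {
        lockA.writeLock().unlock();
      }
    } finally {
      lockB.writeLock().unlock();
    }
  }

  // Runs both paths concurrently and returns how many operations finished.
  static int runDemo(int iterations) {
    AtomicInteger completed = new AtomicInteger();
    Thread t1 = new Thread(() -> {
      for (int i = 0; i < iterations; i++) put(completed);
    });
    Thread t2 = new Thread(() -> {
      for (int i = 0; i < iterations; i++) cleanUpAfterFailedGII(completed);
    });
    t1.setDaemon(true); // daemon threads so a stall cannot hang the JVM
    t2.setDaemon(true);
    t1.start();
    t2.start();
    try {
      t1.join(TimeUnit.SECONDS.toMillis(10));
      t2.join(TimeUnit.SECONDS.toMillis(10));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return completed.get();
  }

  public static void main(String[] args) {
    System.out.println("completed operations: " + runDemo(1000));
  }
}
```

Because both paths request lock-B before lock-A, neither thread can hold 
lock-A while waiting for lock-B, so the circular wait cannot form and 
runDemo(n) returns 2 * n. Swapping the two acquisitions in put() would 
recreate the ordering reported in the issue.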



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
