[ 
https://issues.apache.org/jira/browse/GEODE-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746621#comment-16746621
 ] 

Kirk Lund commented on GEODE-6255:
----------------------------------

Attempts to fix the deadlock by changing ManagementListener to use a dedicated 
thread, or to stop using its ReadWriteLock, cause many other 
NullPointerExceptions in various places. Fixing those failures would probably 
require extensive changes to the code invoked by ManagementListener.

The best way to fix this might be to move the creation of the default disk 
store before the creation of the persistent region that requires it.
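
To illustrate why the ordering matters, here is a minimal Java sketch. It is 
not Geode code: the class, field, and method names are hypothetical stand-ins. 
It assumes, based on the thread dump, that region creation holds the monitor 
of the root regions HashMap while the disk store's resource event waits for 
the ManagementListener read lock. It also assumes that cache close holds the 
listener's write lock while removeRoot waits for that same HashMap; the dump 
does not show the write lock owner explicitly, so that part is an inference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the two lock orderings; not actual Geode classes.
class DiskStoreOrderingSketch {
  // Stand-in for the ReadWriteLock used in ManagementListener.handleEvent.
  private final ReentrantReadWriteLock listenerLock = new ReentrantReadWriteLock();
  // Stand-in for the HashMap locked by createVMRegion and removeRoot.
  private final Map<String, Object> rootRegions = new HashMap<>();
  private Object defaultDiskStore;

  // Creating the disk store notifies the listener, which takes its read lock.
  private synchronized Object getOrCreateDefaultDiskStore() {
    if (defaultDiskStore == null) {
      listenerLock.readLock().lock();
      try {
        defaultDiskStore = new Object();
      } finally {
        listenerLock.readLock().unlock();
      }
    }
    return defaultDiskStore;
  }

  // Ordering seen in the dump: the read lock is requested while the
  // rootRegions monitor is held. Combined with closeCache() this can deadlock.
  void createRegionAsInDump(String name) {
    synchronized (rootRegions) {
      rootRegions.put(name, getOrCreateDefaultDiskStore());
    }
  }

  // Proposed ordering: create the disk store first, then take rootRegions.
  // The read lock is never requested while rootRegions is held.
  void createRegionProposed(String name) {
    Object diskStore = getOrCreateDefaultDiskStore();
    synchronized (rootRegions) {
      rootRegions.put(name, diskStore);
    }
  }

  // Cache close (assumed): listener write lock first, then rootRegions.
  void closeCache() {
    listenerLock.writeLock().lock();
    try {
      synchronized (rootRegions) {
        rootRegions.clear();
      }
    } finally {
      listenerLock.writeLock().unlock();
    }
  }

  int regionCount() {
    synchronized (rootRegions) {
      return rootRegions.size();
    }
  }
}
```

In this model, createRegionProposed and closeCache never wait for each other 
in a cycle, because the only nesting of these two locks is listener lock 
first, then rootRegions. Whether the real code can be reordered this simply 
is not something the sketch can show.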

> ManagementListener may deadlock with Cache close
> ------------------------------------------------
>
>                 Key: GEODE-6255
>                 URL: https://issues.apache.org/jira/browse/GEODE-6255
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Kirk Lund
>            Assignee: Kirk Lund
>            Priority: Major
>
> This is a product deadlock that was discovered by analyzing a dunit hang 
> (GEODE-6232).
> {noformat}
> Java stack information for the threads listed above:
> ===================================================
> "Distributed system shutdown hook":
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1348)
>       - waiting to lock <0x00000006c010d508> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.lambda$static$0(InternalDistributedSystem.java:2328)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$$Lambda$6/1645228084.run(Unknown Source)
>       at java.lang.Thread.run(Thread.java:748)
> "pool-1-thread-2":
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577)
>       - waiting to lock <0x0000000773583c28> (a java.util.HashMap)
>       at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333)
>       at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755)
>       at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255)
>       at 
> org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242)
>       at 
> org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430)
>       at 
> org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73)
>       at 
> org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260)
>       at 
> org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388)
>       at 
> org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239)
>       - locked <0x000000077361b900> (a java.util.HashMap)
>       at 
> org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737)
>       at 
> org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2127)
>       - locked <0x00000006c010d508> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1966)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1956)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.closeCache(CreateDestroyRegionRegressionTest.java:119)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$1(CreateDestroyRegionRegressionTest.java:93)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$3/1456208737.run(Unknown Source)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> "pool-1-thread-1":
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000007735ff8e0> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>       at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>       at 
> org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:110)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
>       at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
>       at 
> org.apache.geode.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:144)
>       - locked <0x0000000773583ac8> (a 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getOrCreateDefaultDiskStore(GemFireCacheImpl.java:2566)
>       - locked <0x0000000773583ac8> (a 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>       at 
> org.apache.geode.internal.cache.LocalRegion.findDiskStore(LocalRegion.java:7600)
>       at 
> org.apache.geode.internal.cache.LocalRegion.<init>(LocalRegion.java:647)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3023)
>       - locked <0x0000000773583c28> (a java.util.HashMap)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2956)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2944)
>       at org.apache.geode.cache.RegionFactory.create(RegionFactory.java:755)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.createRegionWithDefaultDiskStore(CreateDestroyRegionRegressionTest.java:105)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$0(CreateDestroyRegionRegressionTest.java:92)
>       at 
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$2/901506536.run(Unknown Source)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Found 1 deadlock.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)