[ https://issues.apache.org/jira/browse/GEODE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119814#comment-17119814 ]
Eric Shu commented on GEODE-8204: --------------------------------- Test run artifacts are: =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0090/test-results/upgradeTest/1590765052/ =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test report artifacts from this job are available at: http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0090/test-artifacts/1590765052/upgradetestfiles-OpenJDK8-1.14.0-build.0090.tgz > CI Failure: > RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion.luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion > hang > -------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: GEODE-8204 > URL: https://issues.apache.org/jira/browse/GEODE-8204 > Project: Geode > Issue Type: Bug > Reporter: Eric Shu > Priority: Major > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UpgradeTestOpenJDK8/builds/217#A > run timed out. > The dunit-hangs.txt shows two tests not finished, and I think the > ClientServerMiscBCDUnitTest giiEventQueueFromOldToCurrentMemberShouldSucceed > is not really hang. It is just a little hard to read the stack dumps as they > are for both test runs. > Started @ 2020-05-29 14:11:25.687 +0000 > 2020-05-29 14:59:03.911 +0000 > org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest > giiEventQueueFromOldToCurrentMemberShouldSucceed[12] > Ended @ 2020-05-29 14:59:03.911 +0000 > Started @ 2020-05-29 14:11:25.635 +0000 > 2020-05-29 14:46:47.615 +0000 > org.apache.geode.cache.lucene.RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion > > luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion[from_v1.9.2, > Ended @ 2020-05-29 14:56:25.635 +0000 > There are multiple DLock requesters for LockGrantor during > PartitionedRepositoryManager.computeRepository(). > {noformat} > "Pooled Waiting Message Processor 40" #163 daemon prio=5 os_prio=0 > tid=0x00007fc94c043000 nid=0x955c waiting on condition [0x00007fc9d07ad000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00000000e351c5c0> (a > java.util.concurrent.CountDownLatch$Sync) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > at > org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) > at > org.apache.geode.internal.util.concurrent.FutureResult.get(FutureResult.java:69) > at > org.apache.geode.distributed.internal.locks.DLockService.waitForLockGrantorFutureResult(DLockService.java:750) > at > org.apache.geode.distributed.internal.locks.DLockService.getLockGrantorId(DLockService.java:301) > at > org.apache.geode.distributed.internal.locks.DLockService.lockInterruptibly(DLockService.java:1445) > at > org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1241) > at > org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1232) > at > org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1227) > at > org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1223) > at > org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:105) > at > org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:65) > at > org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:151) > at > org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.lambda$computeRepository$1(PartitionedRepositoryManager.java:170) > at > org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager$$Lambda$308/437574105.apply(Unknown > Source) > at > java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) > - locked <0x00000000e35149f0> (a > java.util.concurrent.ConcurrentHashMap$ReservationNode) > at > org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:162) > at > org.apache.geode.cache.lucene.internal.LuceneBucketListener.lambda$afterPrimary$0(LuceneBucketListener.java:40) > at > org.apache.geode.cache.lucene.internal.LuceneBucketListener$$Lambda$307/1978619753.run(Unknown > Source) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doWaitingThread(ClusterOperationExecutors.java:411) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$86/1114445410.invoke(Unknown > Source) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$82/1488044985.run(Unknown > Source) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - <0x00000000e3596b58> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > {noformat} > The remote process that should processing the requests are being closed and > is wait for GrantorRequestProcessor.clearGrantor to finish: > {noformat} > "RMI TCP Connection(1)-172.17.0.30" #32 daemon prio=5 os_prio=0 > tid=0x00007f8178001800 nid=0x939b waiting on condition [0x00007f826f269000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00000000e1982990> (a > java.util.concurrent.locks.ReentrantLock$NonfairSync) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247) > at > java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:442) > at > org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lockInterruptibly(StoppableReentrantLock.java:85) > at > org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lock(StoppableReentrantLock.java:71) > at > org.apache.geode.distributed.internal.ClusterElderManager.initializeElderState(ClusterElderManager.java:102) > at > org.apache.geode.distributed.internal.ClusterElderManager.getElderState(ClusterElderManager.java:97) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.getElderState(ClusterDistributionManager.java:2091) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.startElderCall(GrantorRequestProcessor.java:227) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.basicOp(GrantorRequestProcessor.java:336) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.basicOp(GrantorRequestProcessor.java:319) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.clearGrantor(GrantorRequestProcessor.java:310) > at > org.apache.geode.distributed.internal.locks.DLockService.clearGrantor(DLockService.java:874) > at > org.apache.geode.distributed.internal.locks.DLockGrantor.destroy(DLockGrantor.java:1249) > - locked <0x00000000fbd65f58> (a > org.apache.geode.distributed.internal.locks.DLockGrantor) > at > org.apache.geode.distributed.internal.locks.DLockService.nullLockGrantorId(DLockService.java:649) > at > org.apache.geode.distributed.internal.locks.DLockService.basicDestroy(DLockService.java:2336) > at > org.apache.geode.distributed.internal.locks.DLockService.destroyAndRemove(DLockService.java:2256) > - locked <0x00000000fbc85130> (a java.lang.Object) > at > org.apache.geode.distributed.internal.locks.DLockService.destroyServiceNamed(DLockService.java:2195) > at > org.apache.geode.distributed.DistributedLockService.destroy(DistributedLockService.java:84) > at > org.apache.geode.internal.cache.GemFireCacheImpl.destroyPartitionedRegionLockService(GemFireCacheImpl.java:2042) > at > org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2296) > - locked <0x00000000e2282030> (a java.lang.Class for > org.apache.geode.internal.cache.GemFireCacheImpl) > at > org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2117) > at > org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1985) > at > org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1975) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeTestBase.closeCache(LuceneSearchWithRollingUpgradeTestBase.java:755) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeTestBase$12.run2(LuceneSearchWithRollingUpgradeTestBase.java:568) > at > org.apache.geode.cache30.CacheSerializableRunnable.run(CacheSerializableRunnable.java:55) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123) > at > org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78) > at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) > at sun.rmi.transport.Transport$1.run(Transport.java:200) > at sun.rmi.transport.Transport$1.run(Transport.java:197) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:196) > at > sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$15/1032579838.run(Unknown > Source) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - <0x00000000fb3d0db0> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > - <0x00000000fbec63a8> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > {noformat} > The thread holding the lock is still in ElderInitProcessor.init. I can not > tell why it is not able to finish it. > {noformat} > "Pooled Message Processor 2" #124 daemon prio=10 os_prio=0 > tid=0x00007f8100034000 nid=0x9401 waiting on condition [0x00007f82648cc000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00000000fb403e98> (a > java.util.concurrent.CountDownLatch$Sync) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > at > org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) > at > org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:731) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:802) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:779) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:865) > at > org.apache.geode.distributed.internal.locks.ElderInitProcessor.init(ElderInitProcessor.java:71) > at > org.apache.geode.distributed.internal.locks.ElderState.<init>(ElderState.java:53) > at > org.apache.geode.distributed.internal.ClusterElderManager.lambda$new$0(ClusterElderManager.java:41) > at > org.apache.geode.distributed.internal.ClusterElderManager$$Lambda$76/1497552839.get(Unknown > Source) > at > org.apache.geode.distributed.internal.ClusterElderManager.initializeElderState(ClusterElderManager.java:106) > at > org.apache.geode.distributed.internal.ClusterElderManager.getElderState(ClusterElderManager.java:97) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.getElderState(ClusterDistributionManager.java:2091) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor$GrantorRequestMessage.basicProcess(GrantorRequestProcessor.java:504) > at > org.apache.geode.distributed.internal.locks.GrantorRequestProcessor$GrantorRequestMessage.process(GrantorRequestProcessor.java:497) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:389) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$81/548873202.invoke(Unknown > Source) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$82/1455748764.run(Unknown > Source) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - <0x00000000fb437ee0> (a > java.util.concurrent.ThreadPoolExecutor$Worker) > - <0x00000000e1982990> (a > java.util.concurrent.locks.ReentrantLock$NonfairSync) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)