[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390123#comment-17390123 ]
ASF subversion and git services commented on GEODE-9141:
--------------------------------------------------------

Commit 4ef2bcd4962bc9506c2b7eec2fdab8467759488f in geode's branch refs/heads/master from Bill Burcham
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=4ef2bcd ]

GEODE-9141: (2 of 2) Handle in-buffer concurrency
* Connection uses a ByteBufferVendor to mediate access to inputBuffer
* Prevent return to pool before socket closer is finished

(cherry picked from commit 9d0d4d1d33794d0f6a21c3bcae71e965cbbd7fbd)
(cherry picked from commit 9e8b3972fcf449eed4d41c254cf3f553e517eaa1)

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
> Issue Type: Bug
> Components: membership, messaging
> Affects Versions: 1.13.2, 1.14.0, 1.15.0
> Reporter: Bruce J Schuchardt
> Assignee: Bill Burcham
> Priority: Major
> Labels: blocks-1.14.0, blocks-1.15.0, pull-request-available
> Fix For: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>
>
> We have a test that fails once in 5000 runs with a corrupted
> DestroyRegionMessage. It is always during CacheServer teardown when
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x00007fec70058800 nid=0x1d28 waiting on condition [0x00007fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000000f4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0x00000000f8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0x00000000f5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0x00000000f7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0x00000000f5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0x00000000f5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1257)
>     at hydra.RemoteTestModule$2.run(RemoteTestModule.java:388)
> {noformat}
> Another server logs this corrupted message. It is almost always the same
> corruption. When it's not we see the message header messed up, not a bad
> DSFID.
> {noformat}
> [fatal 2021/03/06 09:45:02.796 PST bridgegemfire_1_3_host1_582 <P2P message reader for rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920)<ec><v100>:41007 unshared ordered sender uid=42 dom #1 local port=58695 remote port=52758> tid=0xcd] Error deserializing message
> java.lang.IllegalStateException: unexpected byte: HASH_TABLE while reading dsfid
>     at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2397)
>     at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2403)
>     at org.apache.geode.internal.tcp.Connection.readMessage(Connection.java:2979)
>     at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2797)
>     at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1651)
>     at org.apache.geode.internal.tcp.Connection.run(Connection.java:1482)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
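
The two bullets in the commit message describe mediating access to a shared input buffer and delaying its return to the pool until the socket closer has finished. A minimal sketch of that general idea follows, assuming a reference-counted wrapper around a pooled buffer. This is not Geode's actual ByteBufferVendor code: the names BufferVendorSketch, BufferPool, BufferVendor, Lease, open and destruct are hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

public class BufferVendorSketch {

  /** Hypothetical pool: a stack of free buffers. */
  static final class BufferPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    synchronized ByteBuffer acquire(int size) {
      ByteBuffer b = free.poll();
      return b != null ? b : ByteBuffer.allocate(size);
    }

    synchronized void release(ByteBuffer b) {
      b.clear();
      free.push(b);
    }

    synchronized int freeCount() {
      return free.size();
    }
  }

  /** Hypothetical vendor: grants exclusive access and counts outstanding users. */
  static final class BufferVendor {
    private final ReentrantLock lock = new ReentrantLock();
    private final BufferPool pool;
    private ByteBuffer buffer;
    private int refs = 1;            // the owner's reference; guarded by 'this'
    private boolean destructed;      // guarded by 'this'

    BufferVendor(BufferPool pool, int size) {
      this.pool = pool;
      this.buffer = pool.acquire(size);
    }

    /** Takes a reference, then the lock. The caller must close the lease. */
    Lease open() {
      synchronized (this) {
        if (refs == 0) {
          throw new IllegalStateException("buffer already returned to pool");
        }
        refs++;
      }
      lock.lock();
      return new Lease();
    }

    /** The owner (for example a closer thread) gives up its reference once. */
    void destruct() {
      synchronized (this) {
        if (destructed) {
          return;
        }
        destructed = true;
      }
      dropReference();
    }

    /** The buffer goes back to the pool only when the last reference is dropped. */
    private synchronized void dropReference() {
      if (--refs == 0) {
        pool.release(buffer);
        buffer = null;
      }
    }

    /** Handle for one exclusive use of the buffer. */
    final class Lease implements AutoCloseable {
      private boolean closed;

      ByteBuffer buffer() {
        return BufferVendor.this.buffer;
      }

      @Override
      public void close() {
        if (closed) {
          return;
        }
        closed = true;
        lock.unlock();
        dropReference();
      }
    }
  }

  public static void main(String[] args) {
    BufferPool pool = new BufferPool();
    BufferVendor vendor = new BufferVendor(pool, 64);
    try (BufferVendor.Lease lease = vendor.open()) {
      lease.buffer().putInt(42);
      vendor.destruct();                     // closer runs while a reader holds a lease
      System.out.println(pool.freeCount());  // prints 0: buffer is still in use
    }
    System.out.println(pool.freeCount());    // prints 1: returned after the lease closed
  }
}
```

In this sketch a reader that is still processing the buffer keeps it out of the pool even if the closer has already run, so another connection cannot be handed the same buffer and overwrite bytes that are mid-deserialization.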