Kirk, do you recall which of the tests in that class hit this problem? It looks pretty serious.

On 2/19/20, 9:24 AM, "Kirk Lund" <kl...@apache.org> wrote:

While running PersistentColocatedPartitionedRegionDistributedTest a thousand times to verify that I've fixed a flaky issue in the test, it hit an interesting failure trying to send a RequestImageMessage. This generated a stack trace which caused the test to fail grep for suspect strings.

I can easily suppress this failure, BUT it looks like a bug in message distribution which may have been introduced by the recent membership changes (modularization).

Here's the stack trace for anyone who wants it (I'm not working on this):

[fatal 2020/02/19 02:50:04.862 GMT <Pooled Waiting Message Processor 1> tid=8410] While pushing message <InitialImageOperation$RequestImageMessage(region path='/__PR/_B__region2_1'; sender=172.17.0.4(185)<v758>:41003; keysOnly=false; processorId=40462; waitForInit=false; checkTombstoneVersions=true; versionVector=RegionVersionVector[2ab5849689d446bd-a7da0400b0e718f7={rv0 gc0 localVersion=0 local exceptions=[]} others={}, gc={}]; unfinished keys=[])> to recipients: <172.17.0.4(179)<v757>:41002>
java.lang.IllegalArgumentException: newPosition > limit: (32768 > 90)
    at java.base/java.nio.Buffer.createPositionException(Buffer.java:318)
    at java.base/java.nio.Buffer.position(Buffer.java:293)
    at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1086)
    at java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:226)
    at java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:67)
    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:116)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:58)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:50)
    at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:463)
    at org.apache.geode.internal.tcp.Connection.writeFully(Connection.java:2587)
    at org.apache.geode.internal.tcp.Connection.sendPreserialized(Connection.java:1867)
    at org.apache.geode.internal.tcp.MsgStreamer.realFlush(MsgStreamer.java:324)
    at org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:249)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:393)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:248)
    at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:604)
    at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:348)
    at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:293)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2060)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1987)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2024)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1084)
    at org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:514)
    at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1222)
    at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1082)
    at org.apache.geode.internal.cache.BucketRegion.initialize(BucketRegion.java:259)
    at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:983)
    at org.apache.geode.internal.cache.PartitionedRegionDataStore.createBucketRegion(PartitionedRegionDataStore.java:785)
    at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucket(PartitionedRegionDataStore.java:460)
    at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucketRecursively(PartitionedRegionDataStore.java:319)
    at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabBucket(PartitionedRegionDataStore.java:2896)
    at org.apache.geode.internal.cache.partitioned.ManageBackupBucketMessage.operateOnPartitionedRegion(ManageBackupBucketMessage.java:159)
    at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:333)
    at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:394)
    at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:458)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:449)
    at org.apache.geode.distributed.internal.ClusterOperationExecutors.doWaitingThread(ClusterOperationExecutors.java:416)
    at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
    at java.base/java.lang.Thread.run(Thread.java:834)