Yep, the test method was testReplaceOfflineMemberAndRestart_WithMultipleDiskStores. I can probably reproduce with log files if that helps.
On Wed, Feb 19, 2020 at 10:04 AM Bruce Schuchardt <bschucha...@pivotal.io> wrote:

> Kirk, do you recall which of the tests in that class hit this problem? It
> looks pretty serious.
>
> On 2/19/20, 9:24 AM, "Kirk Lund" <kl...@apache.org> wrote:
>
>     While running PersistentColocatedPartitionedRegionDistributedTest a
>     thousand times to verify that I've fixed a flaky issue in the test,
>     it hit an interesting failure trying to send a RequestImageMessage.
>     This generated a stack trace which caused the test to fail grep for
>     suspect strings. I can easily suppress this failure, BUT it looks
>     like a bug in message distribution which may have been introduced by
>     the recent membership changes (modularization).
>
>     Here's the stack trace for anyone who wants it (I'm not working on
>     this):
>
>     [fatal 2020/02/19 02:50:04.862 GMT <Pooled Waiting Message Processor 1> tid=8410] While pushing message <InitialImageOperation$RequestImageMessage(region path='/__PR/_B__region2_1'; sender=172.17.0.4(185)<v758>:41003; keysOnly=false; processorId=40462; waitForInit=false; checkTombstoneVersions=true; versionVector=RegionVersionVector[2ab5849689d446bd-a7da0400b0e718f7={rv0 gc0 localVersion=0 local exceptions=[]} others={}, gc={}]; unfinished keys=[])> to recipients: <172.17.0.4(179)<v757>:41002>
>     java.lang.IllegalArgumentException: newPosition > limit: (32768 > 90)
>         at java.base/java.nio.Buffer.createPositionException(Buffer.java:318)
>         at java.base/java.nio.Buffer.position(Buffer.java:293)
>         at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1086)
>         at java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:226)
>         at java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:67)
>         at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:116)
>         at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:58)
>         at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:50)
>         at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:463)
>         at org.apache.geode.internal.tcp.Connection.writeFully(Connection.java:2587)
>         at org.apache.geode.internal.tcp.Connection.sendPreserialized(Connection.java:1867)
>         at org.apache.geode.internal.tcp.MsgStreamer.realFlush(MsgStreamer.java:324)
>         at org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:249)
>         at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:393)
>         at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:248)
>         at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:604)
>         at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:348)
>         at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:293)
>         at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2060)
>         at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1987)
>         at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2024)
>         at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1084)
>         at org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:514)
>         at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1222)
>         at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1082)
>         at org.apache.geode.internal.cache.BucketRegion.initialize(BucketRegion.java:259)
>         at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:983)
>         at org.apache.geode.internal.cache.PartitionedRegionDataStore.createBucketRegion(PartitionedRegionDataStore.java:785)
>         at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucket(PartitionedRegionDataStore.java:460)
>         at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucketRecursively(PartitionedRegionDataStore.java:319)
>         at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabBucket(PartitionedRegionDataStore.java:2896)
>         at org.apache.geode.internal.cache.partitioned.ManageBackupBucketMessage.operateOnPartitionedRegion(ManageBackupBucketMessage.java:159)
>         at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:333)
>         at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:394)
>         at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:458)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:449)
>         at org.apache.geode.distributed.internal.ClusterOperationExecutors.doWaitingThread(ClusterOperationExecutors.java:416)
>         at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
>         at java.base/java.lang.Thread.run(Thread.java:834)