[ 
https://issues.apache.org/jira/browse/GEODE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904880#comment-17904880
 ] 

Leon Finker commented on GEODE-10453:
-------------------------------------

It's a bug specific to the compact index logic. After switching to an
asynchronous (non-compact) index, the issue doesn't happen. It also happens on
server cache startup and on index creation over existing data that is received
as the initial snapshot from another peer. And there is no real workaround when
using overflow-to-disk regions, because those do not support non-compact
indexes.
{noformat}
[warn <ThreadsMonitor> tid=55] Thread <77> (0x4d) that was executed at <07 Dec 2024 12:47:28 EST> has been stuck for <994.887 seconds> and number of thread monitor iteration <17>
Thread Name <Pooled High Priority Message Processor 3> state <RUNNABLE>
Executor Group <PooledExecutorWithDMStats>
Monitored metric <ResourceManagerStats.numThreadsStuck>
Thread stack for "Pooled High Priority Message Processor 3" (0x4d):
java.lang.ThreadState: RUNNABLE
  at java.base@17.0.6/java.lang.Throwable.fillInStackTrace(Native Method)
  at java.base@17.0.6/java.lang.Throwable.fillInStackTrace(Throwable.java:798)
  at java.base@17.0.6/java.lang.Throwable.<init>(Throwable.java:271)
  at java.base@17.0.6/java.lang.Exception.<init>(Exception.java:67)
  at java.base@17.0.6/java.lang.RuntimeException.<init>(RuntimeException.java:63)
  at java.base@17.0.6/java.lang.ClassCastException.<init>(ClassCastException.java:57)
  at java.base@17.0.6/java.lang.String.compareTo(String.java:140)
  at app//org.apache.geode.cache.query.internal.types.TypeUtils$ComparisonStrategy$4.execute(TypeUtils.java:90)
  at app//org.apache.geode.cache.query.internal.types.TypeUtils.compare(TypeUtils.java:499)
  at app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.getOldKey(MemoryIndexStore.java:275)
  at app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.basicRemoveMapping(MemoryIndexStore.java:399)
  at app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.removeMapping(MemoryIndexStore.java:298)
  at app//org.apache.geode.cache.query.internal.index.CompactRangeIndex.removeMapping(CompactRangeIndex.java:173)
  at app//org.apache.geode.cache.query.internal.index.AbstractIndex.removeIndexMapping(AbstractIndex.java:508)
  at app//org.apache.geode.cache.query.internal.index.IndexManager.removeIndexMapping(IndexManager.java:1156)
  at app//org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1121)
  at app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
  at app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
  at app//org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:836)
  at app//org.apache.geode.internal.cache.InitialImageOperation.processChunk(InitialImageOperation.java:980)
  at app//org.apache.geode.internal.cache.InitialImageOperation$ImageProcessor.process(InitialImageOperation.java:1306)
  at app//org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:215)
  at app//org.apache.geode.internal.cache.InitialImageOperation$ImageReplyMessage.process(InitialImageOperation.java:2829)
  at app//org.apache.geode.distributed.internal.ReplyMessage.dmProcess(ReplyMessage.java:198)
  at app//org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:191)
  at app//org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:380)
  at app//org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:445)
  at java.base@17.0.6/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  at java.base@17.0.6/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  at app//org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:449)
  at app//org.apache.geode.distributed.internal.ClusterOperationExecutors.doHighPriorityThread(ClusterOperationExecutors.java:407)
  at app//org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$312/0x00000008018c1e50.invoke(Unknown Source)
  at app//org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
  at app//org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$310/0x00000008018c1780.run(Unknown Source)
  at java.base@17.0.6/java.lang.Thread.run(Thread.java:833)
Locked ownable synchronizers:
  - None
{noformat}
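
For illustration only: the frames above (getOldKey -> TypeUtils.compare ->
String.compareTo -> ClassCastException.<init> -> fillInStackTrace) are
consistent with a lookup that walks stored keys and pays for a thrown
ClassCastException whenever the two key types do not match. The following is a
minimal plain-Java sketch of that pattern under that assumption. It is not
Geode code; MixedKeyScanSketch, compareRaw and findKey are hypothetical names,
and the real getOldKey logic may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Geode code: a linear key scan where every
// type-mismatched comparison throws (and is swallowed).
class MixedKeyScanSketch {

    /** Outcome of one scan: position of the match (or -1) and how many comparisons threw. */
    static final class Result {
        final int index;
        final int failedComparisons;

        Result(int index, int failedComparisons) {
            this.index = index;
            this.failedComparisons = failedComparisons;
        }
    }

    // Raw Comparable call: String.compareTo(Object) is a bridge method that
    // casts its argument to String, so a non-String argument throws
    // ClassCastException.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compareRaw(Object a, Object b) {
        return ((Comparable) a).compareTo(b);
    }

    // Walk all keys until one compares equal to the target. Each mismatch
    // costs a full exception construction, including fillInStackTrace.
    static Result findKey(List<Object> keys, Object target) {
        int failed = 0;
        for (int i = 0; i < keys.size(); i++) {
            try {
                if (compareRaw(keys.get(i), target) == 0) {
                    return new Result(i, failed);
                }
            } catch (ClassCastException e) {
                failed++;
            }
        }
        return new Result(-1, failed);
    }

    public static void main(String[] args) {
        List<Object> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("k" + i); // String keys
        }
        keys.add(42); // one Integer key at the end
        Result r = findKey(keys, 42);
        System.out.println("index=" + r.index + " failedComparisons=" + r.failedComparisons);
    }
}
```

If a scan like this ran once per entry during an initial image or register
interest replay, the total work would grow roughly with the square of the entry
count, and each step would be an exception throw rather than a cheap compare.
That would match a thread that stays RUNNABLE for a very long time, but again,
this is only a guess from the stack trace.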

> Infinite/slow indexing on reconnect and register interest replay
> ----------------------------------------------------------------
>
>                 Key: GEODE-10453
>                 URL: https://issues.apache.org/jira/browse/GEODE-10453
>             Project: Geode
>          Issue Type: Bug
>    Affects Versions: 1.15.1
>            Reporter: Leon Finker
>            Priority: Major
>
> Cache server was restarted. Client side upon reconnect went into 
> infinite/slow indexing loop. This has not recovered even after multiple days. 
> The thread stack for thread taking 100% CPU was:
> {code}
> Thread Name <poolTimer-Server-21659> state <BLOCKED>
> Waiting on <org.apache.geode.cache.client.internal.ConnectionImpl@293d172>
> Owned By <queueTimer-Server1> with ID <140>
> Executor Group <ScheduledThreadPoolExecutorWithKeepAlive>
> Monitored metric <ResourceManagerStats.numThreadsStuck>
> Thread stack for "poolTimer-Server-21659" (0x128275):
> java.lang.ThreadState: BLOCKED
>   at 
> app//org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:283)
>   at 
> app//org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:343)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:312)
>   at 
> app//org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:848)
>   at 
> app//org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:40)
>   at 
> app//org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:128)
>   at 
> app//org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
>   at 
> java.base@17.0.6/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at 
> java.base@17.0.6/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>   at 
> app//org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285)
>   at 
> java.base@17.0.6/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base@17.0.6/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at java.base@17.0.6/java.lang.Thread.run(Thread.java:833)
> Locked ownable synchronizers:
>   - None
> Lock owner thread stack for "queueTimer-Server1" (0x6a):
> java.lang.ThreadState: RUNNABLE
>   at 
> app//org.apache.geode.cache.query.internal.types.TypeUtils$ComparisonStrategy$4.execute(TypeUtils.java:90)
>   at 
> app//org.apache.geode.cache.query.internal.types.TypeUtils.compare(TypeUtils.java:499)
>   at 
> app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.getOldKey(MemoryIndexStore.java:275)
>   at 
> app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.updateMapping(MemoryIndexStore.java:122)
>   at 
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.applyProjection(CompactRangeIndex.java:1563)
>   at 
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.doNestedIterations(CompactRangeIndex.java:1519)
>   at 
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.evaluate(CompactRangeIndex.java:1372)
>   at 
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex.addMapping(CompactRangeIndex.java:143)
>   at 
> app//org.apache.geode.cache.query.internal.index.AbstractIndex.addIndexMapping(AbstractIndex.java:488)
>   at 
> app//org.apache.geode.cache.query.internal.index.IndexManager.addIndexMapping(IndexManager.java:1143)
>   at 
> app//org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1089)
>   at 
> app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
>   at 
> app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
>   at 
> app//org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:839)
>   at 
> app//org.apache.geode.internal.cache.LocalRegion.refreshEntriesFromServerKeys(LocalRegion.java:4348)
>   at 
> app//org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:217)
>   at 
> app//org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:121)
>   at 
> app//org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209)
>   at 
> app//org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394)
>   at 
> app//org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
>   at 
> app//org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:475)
>   at 
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:488)
>   at 
> app//org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:861)
>   at 
> app//org.apache.geode.cache.client.internal.RegisterInterestOp.executeOn(RegisterInterestOp.java:113)
>   at 
> app//org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestOn(ServerRegionProxy.java:506)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleKey(QueueManagerImpl.java:1236)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleRegion(QueueManagerImpl.java:1183)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleList(QueueManagerImpl.java:1129)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterestList(QueueManagerImpl.java:1250)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverAllInterestTypes(QueueManagerImpl.java:1264)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterest(QueueManagerImpl.java:1094)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:938)
>   at 
> app//org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1475)
>   at 
> app//org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
>   at 
> java.base@17.0.6/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base@17.0.6/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base@17.0.6/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.doNestedIterations(CompactRangeIndex.java:1509)
> {code}
> After client stop attempt and cache close, the following stack trace was 
> logged:
> {code}
>  The index is corrupted and
> marked as invalid.
> org.apache.geode.cache.CacheClosedException: The cache is closed.
>         at 
> org.apache.geode.internal.cache.GemFireCacheImpl$Stopper.generateCancelledException(GemFireCacheImpl.java:5207)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.internal.cache.LocalRegion.checkRegionDestroyed(LocalRegion.java:7382)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.internal.cache.LocalRegion.checkReadiness(LocalRegion.java:2788)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.internal.cache.LocalRegion.values(LocalRegion.java:1970) 
> ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.QRegion.<init>(QRegion.java:81) 
> ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.DummyQRegion.<init>(DummyQRegion.java:52)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.evaluate(CompactRangeIndex.java:1342)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.CompactRangeIndex.addMapping(CompactRangeIndex.java:143)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.AbstractIndex.addIndexMapping(AbstractIndex.java:488)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.IndexManager.addIndexMapping(IndexManager.java:1143)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1089)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:839)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.internal.cache.LocalRegion.refreshEntriesFromServerKeys(LocalRegion.java:4348)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:217)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:121)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:475)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:488)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:861) 
> ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.RegisterInterestOp.executeOn(RegisterInterestOp.java:113)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestOn(ServerRegionProxy.java:506)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleKey(QueueManagerImpl.java:1236)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleRegion(QueueManagerImpl.java:1183)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleList(QueueManagerImpl.java:1129)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterestList(QueueManagerImpl.java:1250)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverAllInterestTypes(QueueManagerImpl.java:1264)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterest(QueueManagerImpl.java:1094)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:938)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1475)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
>  ~[geode-core-1.15.1.jar:?]
>         at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>  ~[?:?]
>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) 
> ~[?:?]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
