[
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220704#comment-17220704
]
Anver Sotnikov commented on SOLR-14940:
---------------------------------------
Mike, you were right. We instrumented Solr with extra logging on registerHook
and shutdown in ReplicationHandler and confirmed that the leak was due to a
flaky ZK connection. We bumped timeouts (SOLR-10471) and fine-tuned GC as well.
Replicas going into recovery now happen far less often than before.
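For reference, the ZK client timeout we bumped lives in solr.xml; a typical change looks like the snippet below (the 30000 ms value is illustrative, not a recommendation, and assumes SOLR-10471 concerns zkClientTimeout):

```xml
<solr>
  <solrcloud>
    <!-- illustrative value only; pick a timeout suited to your ZK latency -->
    <int name="zkClientTimeout">30000</int>
  </solrcloud>
</solr>
```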
Stack trace from the registerHook logging:
{code}
java.lang.RuntimeException: ReplicationHandler.registerCloseHooks
	at org.apache.solr.handler.ReplicationHandler.registerCloseHook(ReplicationHandler.java:1397) ~[?:?]
	at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:1239) ~[?:?]
	at org.apache.solr.cloud.ReplicateFromLeader.startReplication(ReplicateFromLeader.java:109) ~[?:?]
	at org.apache.solr.cloud.ZkController.startReplicationFromLeader(ZkController.java:1327) ~[?:?]
	at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:713) ~[?:?]
	at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:334) ~[?:?]
	at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:317) ~[?:?]
	at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
	at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
	at java.lang.Thread.run(Unknown Source)
{code}
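For context, the leak mechanism behind these registerCloseHook calls can be sketched with hypothetical stand-ins (CloseHookLeak, CloseHook, Handler, and closeHooks are illustrative names, not Solr's actual API): an anonymous inner class registered as a close hook carries a synthetic reference to its enclosing handler, so the handler stays reachable from the hook list even after shutdown.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch, not Solr code: why an anonymous close hook pins its handler.
public class CloseHookLeak {
    interface CloseHook { void preClose(); }

    // Stands in for SolrCore.closeHooks: entries are added but never removed.
    static final List<CloseHook> closeHooks = new ArrayList<>();

    static class Handler {
        final byte[] state = new byte[1 << 20]; // handler-owned memory

        void registerCloseHook() {
            // The anonymous class references `state`, so javac generates a
            // synthetic field (typically this$0) pointing at this Handler.
            // The hook therefore keeps the whole handler reachable.
            closeHooks.add(new CloseHook() {
                @Override
                public void preClose() {
                    System.out.println("closing handler with " + state.length + " bytes");
                }
            });
        }
    }

    public static void main(String[] args) {
        // Simulate replication being stopped and restarted on one core:
        // each restart registers a fresh hook; old handlers stay pinned.
        for (int i = 0; i < 3; i++) {
            new Handler().registerCloseHook();
        }
        for (Field f : closeHooks.get(0).getClass().getDeclaredFields()) {
            System.out.println("captured field: " + f.getName());
        }
        System.out.println("hooks retained: " + closeHooks.size());
    }
}
```

Running this shows the hook's synthetic field referencing the handler, which is the same shape as the references visible in the attached heap-dump screenshots.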
> ReplicationHandler memory leak through SolrCore.closeHooks
> ----------------------------------------------------------
>
> Key: SOLR-14940
> URL: https://issues.apache.org/jira/browse/SOLR-14940
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: replication (java)
> Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes
> with 2 cores in each JVM.
>
> Reporter: Anver Sotnikov
> Priority: Major
> Attachments: Actual references to hooks that in turn hold references
> to ReplicationHandlers.png, Memory Analyzer SolrCore.closeHooks .png
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> We are experiencing a memory leak in a Solr Cloud cluster configured as 3
> TLOG nodes.
> The leader does not seem to be affected, while the followers are.
>
> Looking at a memory dump, we noticed that SolrCore.closeHooks holds many
> references to ReplicationHandler instances through anonymous inner classes,
> which in turn keep those handlers alive.
> ReplicationHandler registers these hooks as anonymous inner classes in
> SolrCore.closeHooks via ReplicationHandler.inform() ->
> ReplicationHandler.registerCloseHook().
>
> Whenever ZkController.stopReplicationFromLeader is called, it shuts down the
> ReplicationHandler (ReplicationHandler.shutdown()), BUT the reference to the
> ReplicationHandler stays in SolrCore.closeHooks. Once replication is started
> again on the same SolrCore, a new ReplicationHandler is created and
> registered in closeHooks.
>
> There are a few scenarios in which replication is stopped and restarted on
> the same core, and in our TLOG setup this happens quite often.
>
> Potential solutions:
> # Allow close hooks to be unregistered from SolrCore.closeHooks so that
> ReplicationHandler.shutdown() can remove its own hooks
> # Hacky but easier: break the link between the ReplicationHandler close
> hooks and the full ReplicationHandler object so the ReplicationHandler can
> be GCed even while its hooks remain registered in SolrCore.closeHooks
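The second option can be sketched as follows (hypothetical names, not the actual patch): the registered hook holds only a WeakReference to its handler, so the handler can be garbage-collected even while the hook stays in the never-pruned close-hook list.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Sketch only: break the strong link from a registered close hook
// back to its handler by going through a WeakReference.
public class WeakCloseHookSketch {
    interface CloseHook { void preClose(); }

    // Stands in for SolrCore.closeHooks.
    static final List<CloseHook> closeHooks = new ArrayList<>();

    static class Handler {
        boolean shutdownCalled = false;
        void shutdown() { shutdownCalled = true; }
    }

    static class WeakCloseHook implements CloseHook {
        private final WeakReference<Handler> ref;

        WeakCloseHook(Handler h) {
            this.ref = new WeakReference<>(h); // no strong ref to the handler
        }

        @Override
        public void preClose() {
            Handler h = ref.get();
            if (h != null) {
                h.shutdown(); // handler still alive: shut it down
            }
            // handler already collected: nothing to do
        }
    }

    public static void main(String[] args) {
        Handler h = new Handler();
        closeHooks.add(new WeakCloseHook(h));
        for (CloseHook hook : closeHooks) {
            hook.preClose();
        }
        System.out.println("shutdown called: " + h.shutdownCalled);
    }
}
```

With this shape, a stale hook left behind in closeHooks no longer pins the old handler, so repeated stop/start of replication on the same core leaks only the small hook objects rather than whole handlers.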
--
This message was sent by Atlassian Jira
(v8.3.4#803005)