[
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220810#comment-17220810
]
Mike Drob commented on SOLR-14940:
----------------------------------
Thanks for confirming the ZK changes helped!
I pushed a commit to your PR with a test that passes now and fails without
your change, so that's a good sign. I'm a little confused about why there are
5 hooks at the end of each reconnect instead of the 3 I would expect. Are we
removing the close hooks too late in the process somewhere? I need to keep
investigating.
> ReplicationHandler memory leak through SolrCore.closeHooks
> ----------------------------------------------------------
>
> Key: SOLR-14940
> URL: https://issues.apache.org/jira/browse/SOLR-14940
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: replication (java)
> Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes
> with 2 cores in each JVM.
>
> Reporter: Anver Sotnikov
> Priority: Major
> Attachments: Actual references to hooks that in turn hold references
> to ReplicationHandlers.png, Memory Analyzer SolrCore.closeHooks .png
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> We are experiencing a memory leak in a Solr Cloud cluster configured as 3
> TLOG nodes.
> The leader does not seem to be affected, while the followers are.
>
> Looking at a memory dump, we noticed that SolrCore holds many references to
> anonymous inner classes in SolrCore.closeHooks, which in turn hold references
> to ReplicationHandler instances.
> ReplicationHandler registers these hooks as anonymous inner classes in
> SolrCore.closeHooks through ReplicationHandler.inform() ->
> ReplicationHandler.registerCloseHook().
>
> Whenever ZkController.stopReplicationFromLeader is called, it shuts down the
> ReplicationHandler (ReplicationHandler.shutdown()), BUT the reference to that
> ReplicationHandler stays in SolrCore.closeHooks. Once replication is started
> again on the same SolrCore, a new ReplicationHandler is created and
> registered in closeHooks.
>
> It looks like there are a few scenarios in which replication is stopped and
> restarted on the same core, and in our TLOG setup this happens quite often.
>
> Potential solutions:
> # Allow unregistering SolrCore.closeHooks so it can be used from
> ReplicationHandler.shutdown
> # A hack, but easier: break the link between the ReplicationHandler close
> hooks and the full ReplicationHandler object so that the ReplicationHandler
> can be GCed even while its hooks are still registered in SolrCore.closeHooks
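
For illustration, the leak described above and the first proposed fix can be
sketched in plain Java. This is a simplified model under stated assumptions,
not Solr code: CoreSketch, HandlerSketch, removeCloseHook and
hooksAfterReconnects are hypothetical names, and the real SolrCore and
ReplicationHandler APIs differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SolrCore: it only models the close-hook list.
class CoreSketch {
    interface CloseHook { void preClose(); }

    private final List<CloseHook> closeHooks = new ArrayList<>();

    void addCloseHook(CloseHook hook) { closeHooks.add(hook); }

    // Assumed addition (potential solution 1): allow a hook to be unregistered.
    boolean removeCloseHook(CloseHook hook) { return closeHooks.remove(hook); }

    int hookCount() { return closeHooks.size(); }
}

// Hypothetical stand-in for ReplicationHandler.
class HandlerSketch {
    private final CoreSketch core;
    private final boolean unregisterOnShutdown;
    private CoreSketch.CloseHook hook;

    HandlerSketch(CoreSketch core, boolean unregisterOnShutdown) {
        this.core = core;
        this.unregisterOnShutdown = unregisterOnShutdown;
    }

    // Mirrors the inform() -> registerCloseHook() path from the description.
    void inform() {
        // The anonymous inner class captures `this`, so for as long as the
        // hook sits in the core's list, this handler stays reachable.
        hook = new CoreSketch.CloseHook() {
            @Override public void preClose() { shutdown(); }
        };
        core.addCloseHook(hook);
    }

    void shutdown() {
        // Without this step the hook (and the handler it captures) is left
        // behind in the core after every shutdown.
        if (unregisterOnShutdown && hook != null) {
            core.removeCloseHook(hook);
            hook = null;
        }
    }
}

class CloseHookLeakDemo {
    // Simulates `cycles` stop/start replication cycles on a single core and
    // returns how many hooks remain registered afterwards.
    static int hooksAfterReconnects(int cycles, boolean unregisterOnShutdown) {
        CoreSketch core = new CoreSketch();
        for (int i = 0; i < cycles; i++) {
            HandlerSketch handler = new HandlerSketch(core, unregisterOnShutdown);
            handler.inform();
            handler.shutdown();
        }
        return core.hookCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + hooksAfterReconnects(5, false)); // prints "leaky: 5"
        System.out.println("fixed: " + hooksAfterReconnects(5, true));  // prints "fixed: 0"
    }
}
```

In this model the hook count grows by one per stop/start cycle unless
shutdown() removes its own hook, which is the behaviour the first proposed
solution aims for. The second proposal would instead leave the hooks in the
list but keep them from referencing the handler.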