I think this is pretty bad. I created https://issues.apache.org/jira/browse/SOLR-12743. Feel free to add any more details you have there.
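
In case it helps anyone following the thread below who asked where to see the IndexSearcher instance count: one option is a class histogram of the running JVM rather than a full heap dump. The following is only a rough sketch, assuming the usual `jmap -histo` row layout (rank, instance count, bytes, class name). The function name `count_instances` and the sample rows are made up for illustration and are not output from any real node.

```python
import re

# Data rows of `jmap -histo` output look like:
#    "   2:            91           8736  org.apache.solr.search.SolrIndexSearcher"
# Header and separator lines do not match this pattern and are skipped.
HISTO_ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")

SEARCHER = "org.apache.solr.search.SolrIndexSearcher"


def count_instances(histo_text, class_name):
    """Return (instances, shallow_bytes) for class_name, or (0, 0) if absent."""
    for line in histo_text.splitlines():
        match = HISTO_ROW.match(line)
        if match and match.group(3) == class_name:
            return int(match.group(1)), int(match.group(2))
    return 0, 0


# Illustrative sample only: these numbers are invented, not real output.
SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        250000       12000000  [B
   2:             7            672  org.apache.solr.search.SolrIndexSearcher
   3:            14           1120  org.apache.solr.util.ConcurrentLRUCache
"""

instances, shallow_bytes = count_instances(SAMPLE, SEARCHER)
print(instances, shallow_bytes)
```

To use it against a live node, save the output of `jmap -histo:live <pid>` to a file after each commit and pass the file contents to `count_instances`. Note that `:live` forces a full GC first, and that the byte column is shallow size, so it will be far smaller than the retained sizes Eclipse MAT reports. If the count keeps growing commit after commit and never drops, that would match the behaviour described in this thread.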

On Mon, Sep 3, 2018 at 1:50 PM Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hello Björn,
>
> Take great care, 7.2.1 cannot read an index written by 7.4.0, so you
> cannot roll back but need to reindex!
>
> Andrey Kudryavtsev made a good suggestion in the thread on how to find the
> culprit, but it will be a tedious task. I have not yet had the time or
> courage to venture there.
>
> Hope it helps,
> Markus
>
> > -----Original message-----
> > From: Björn Häuser <bjoernhaeu...@gmail.com>
> > Sent: Monday 3rd September 2018 22:28
> > To: solr-user@lucene.apache.org
> > Subject: Re: Heap Memory Problem after Upgrading to 7.4.0
> >
> > Hi Markus,
> >
> > this reads exactly like what we have. Were you able to figure out
> > anything? Currently thinking about rolling back to 7.2.1.
> >
> > > On 3. Sep 2018, at 21:54, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > >
> > > Hello,
> > >
> > > Getting an OOM plus the fact you are having a lot of IndexSearcher
> > > instances rings a familiar bell. One of our collections has the same
> > > issue [1] when we attempted an upgrade from 7.2.1 to 7.3.0. I managed
> > > to rule out all our custom Solr code but had to keep our Lucene filters
> > > in the schema, and the problem persisted.
> > >
> > > The odd thing, however, is that you appear to have the same problem,
> > > but not with 7.3.0? Since you upgraded to 7.4.0 shortly after 7.3.0,
> > > can you confirm the problem is not also in 7.3.0?
> >
> > We had very similar problems with 7.3.0 but never analyzed them and just
> > updated to 7.4.0 because I thought that's the bug we hit:
> > https://issues.apache.org/jira/browse/SOLR-11882
> >
> > > You should see the instance count for IndexSearcher increase by one
> > > for each replica on each commit.
> >
> > Sorry, where can I find this? ;) Sorry, did not find anything.
> >
> > Thanks
> > Björn
> >
> > > Regards,
> > > Markus
> > >
> > > [1] http://lucene.472066.n3.nabble.com/RE-7-3-appears-to-leak-td4396232.html
> > >
> > > -----Original message-----
> > >> From: Erick Erickson <erickerick...@gmail.com>
> > >> Sent: Monday 3rd September 2018 20:49
> > >> To: solr-user <solr-user@lucene.apache.org>
> > >> Subject: Re: Heap Memory Problem after Upgrading to 7.4.0
> > >>
> > >> I would expect at least 1 IndexSearcher per replica. How many total
> > >> replicas are hosted in your JVM?
> > >>
> > >> Plus, if you're actively indexing, there may temporarily be 2
> > >> IndexSearchers open while the new searcher warms.
> > >>
> > >> And there may be quite a few caches, at least queryResultCache and
> > >> filterCache and documentCache, one of each per replica and maybe two
> > >> (for queryResultCache and filterCache) if you have a background
> > >> searcher autowarming.
> > >>
> > >> At a glance, your autowarm counts are very high, so it may take some
> > >> time to autowarm, leading to multiple IndexSearchers and caches open
> > >> per replica when you happen to hit a commit point. I usually start
> > >> with 16-20 as an autowarm count; the benefit decreases rapidly as you
> > >> increase the count.
> > >>
> > >> I'm not quite sure why it would be different in 7x vs. 6x. How much
> > >> heap do you allocate to the JVM? And do you see similar heap dumps in
> > >> 6.6?
> > >>
> > >> Best,
> > >> Erick
> > >> On Mon, Sep 3, 2018 at 10:33 AM Björn Häuser <bjoernhaeu...@gmail.com> wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> we recently upgraded our solrcloud (5 nodes, 25 collections, 1 shard
> > >>> each, 4 replicas each) from 6.6.0 to 7.3.0 and shortly after to 7.4.0.
> > >>> We are running Zookeeper 4.1.13.
> > >>>
> > >>> Since the upgrade to 7.3.0 and also 7.4.0 we are encountering heap
> > >>> space exhaustion. After obtaining a heap dump it looks like we have a
> > >>> lot of IndexSearchers open for our largest collection.
> > >>>
> > >>> The dump contains around 60 IndexSearchers, each containing around
> > >>> 40 MB of heap. Another 500 MB of heap is the fieldcache, which is
> > >>> expected in my opinion.
> > >>>
> > >>> The current config can be found here:
> > >>> https://gist.github.com/bjoernhaeuser/327a65291ac9793e744b87f0a561e844
> > >>>
> > >>> Analyzing the heap dump, Eclipse MAT says this:
> > >>>
> > >>> Problem Suspect 1
> > >>>
> > >>> 91 instances of "org.apache.solr.search.SolrIndexSearcher", loaded
> > >>> by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy
> > >>> 1.981.148.336 (38,26%) bytes.
> > >>>
> > >>> Biggest instances:
> > >>>
> > >>> • org.apache.solr.search.SolrIndexSearcher @ 0x6ffd47ea8 - 70.087.272 (1,35%) bytes.
> > >>> • org.apache.solr.search.SolrIndexSearcher @ 0x79ea9c040 - 65.678.264 (1,27%) bytes.
> > >>> • org.apache.solr.search.SolrIndexSearcher @ 0x6855ad680 - 63.050.600 (1,22%) bytes.
> > >>>
> > >>>
> > >>> Problem Suspect 2
> > >>>
> > >>> 223 instances of "org.apache.solr.util.ConcurrentLRUCache", loaded
> > >>> by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy
> > >>> 1.373.110.208 (26,52%) bytes.
> > >>>
> > >>>
> > >>> Any help is appreciated. Thank you very much!
> > >>> Björn
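
For reference, the numbers quoted above can be put side by side with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes the replicas are spread evenly over the nodes (the thread does not say so), and it uses the rule of thumb from the quoted reply of one searcher per replica, or two while a new searcher warms. All variable names are made up for illustration.

```python
# Cluster layout as described in the original report.
nodes = 5
collections = 25
shards_per_collection = 1
replicas_per_shard = 4

total_replicas = collections * shards_per_collection * replicas_per_shard

# ASSUMPTION: replicas are evenly distributed across nodes.
replicas_per_node = total_replicas // nodes

# Rule of thumb from the thread: 1 searcher per replica in steady state,
# up to 2 per replica while a new searcher is warming after a commit.
expected_steady = replicas_per_node * 1
expected_warming = replicas_per_node * 2

# Instance count Eclipse MAT reported for SolrIndexSearcher in one JVM.
observed = 91

print("replicas per node:", replicas_per_node)
print("expected searchers:", expected_steady, "to", expected_warming)
print("observed searchers:", observed)
print("excess over warming upper bound:", observed - expected_warming)
```

Under those assumptions a node should hold somewhere between 20 and 40 searchers, so 91 is well above even the warming case. The figure of roughly 60 searchers for the largest collection alone is harder still to explain by warming, since a single-shard collection with 4 replicas on 5 nodes would normally put at most one replica on a node.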