This is truly puzzling then, I'm clueless. It's hard to imagine this is lurking out there and nobody else notices, but you've eliminated the custom code. And this is also very peculiar:
* it occurs only in our main text search collection, all other collections are unaffected;
* despite what I said earlier, it is so far unreproducible outside production, even when mimicking production as well as we can;

Here's a tedious idea. Restart Solr with the -v option, I _think_ that
shows you each and every jar file Solr loads. Is it "somehow" possible
that your main collection is loading some jar from somewhere that's
different than you expect?

'cause silly ideas like this are all I can come up with.

Erick

On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> Hello Erick,
>
> The custom search handler doesn't interact with SolrIndexSearcher, this is
> really all it does:
>
>   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
>       throws Exception {
>     super.handleRequestBody(req, rsp);
>
>     if (rsp.getToLog().get("hits") instanceof Integer) {
>       rsp.addHttpHeader("X-Solr-Hits",
>           String.valueOf((Integer)rsp.getToLog().get("hits")));
>     }
>     if (rsp.getToLog().get("hits") instanceof Long) {
>       rsp.addHttpHeader("X-Solr-Hits",
>           String.valueOf((Long)rsp.getToLog().get("hits")));
>     }
>   }
>
> I am not sure this qualifies as one more to go.
>
> Re: compiler warnings on resources, yes! This and tests failing due to
> resource leaks have always warned me when I forgot to release something or
> decrement a reference. But the above method (and the token filters, which
> I really can't disable) are all that is left.
>
> I am quite desperate about this problem so although I am unwilling to
> disable stuff, I can do it if I must. But I see no reason, yet, to remove
> the search handler or the token filter stuff, I mean, how could those leak
> a SolrIndexSearcher?
>
> Let me know :)
>
> Many thanks!
> Markus
>
> -----Original message-----
>> From:Erick Erickson <erickerick...@gmail.com>
>> Sent: Friday 29th June 2018 18:46
>> To: solr-user <solr-user@lucene.apache.org>
>> Subject: Re: 7.3 appears to leak
>>
>> bq. The only custom stuff left is an extension of SearchHandler that
>> only writes numFound to the response headers.
>>
>> Well, one more to go ;). It's incredibly easy to overlook
>> innocent-seeming calls that increment the underlying reference count
>> of some objects but don't decrement them, usually through a close
>> call. Which isn't necessarily a close if the underlying reference
>> count is still > 0.
>>
>> You may infer that I've been there and done that ;). Sometimes the
>> compiler warnings about "resource leak" can help pinpoint those too.
>>
>> Best,
>> Erick
>>
>> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma
>> <markus.jel...@openindex.io> wrote:
>> > Hello Yonik,
>> >
>> > I took one node of the 7.2.1 cluster out of the load balancer so it
>> > would only receive shard queries, this way I could kind of 'safely'
>> > disable our custom components one by one, while keeping functionality
>> > in place by letting the other 7.2.1 nodes continue on with the full
>> > configuration.
>> >
>> > I am now at a point where literally all custom components are deleted
>> > or commented out in the config for the node running 7.4. The only
>> > custom stuff left is an extension of SearchHandler that only writes
>> > numFound to the response headers, and all the token filters in our
>> > schema.
>> >
>> > You were right, it was leaking exactly one SolrIndexSearcher instance
>> > on each commit. But, with all our stuff gone, the leak is still there!
>> > I triple checked it! Of course, the bastard is locally still not
>> > reproducible.
>> >
>> > So, what is next? I have no clues left.
>> >
>> > Many, many thanks,
>> > Markus
>> >
>> > -----Original message-----
>> >> From:Markus Jelsma <markus.jel...@openindex.io>
>> >> Sent: Thursday 28th June 2018 23:52
>> >> To: solr-user@lucene.apache.org
>> >> Subject: RE: 7.3 appears to leak
>> >>
>> >> Hello Yonik,
>> >>
>> >> If leaking a whole SolrIndexSearcher would cause this problem, then
>> >> the only custom component left, our copy/paste-and-enhance version of
>> >> the elevator component, would be the root of all problems. It is a
>> >> direct copy of the 7.2 source where only things like getAnalyzedQuery,
>> >> the ElevationObj and the loop over the map entries are changed.
>> >>
>> >> There are no changes to code related to the searcher. Other components
>> >> where we get a RefCounted searcher are used without issues, we always
>> >> decrement the reference after using it. But those components are not
>> >> in use in this collection.
>> >>
>> >> The source has changed a lot with 7.4 but we still use the old code. I
>> >> will investigate the component thoroughly, even revert to the old 7.2
>> >> vanilla component for a brief period in production for one machine. It
>> >> may not be a problem if I don't let our load balancer access it
>> >> directly, so it only serves shard queries.
>> >>
>> >> I will get back to this topic tomorrow!
>> >>
>> >> Many thanks,
>> >> Markus
>> >>
>> >> -----Original message-----
>> >> > From:Yonik Seeley <ysee...@gmail.com>
>> >> > Sent: Thursday 28th June 2018 23:30
>> >> > To: solr-user@lucene.apache.org
>> >> > Subject: Re: 7.3 appears to leak
>> >> >
>> >> > > * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry
>> >> > > instances are both leaked on commit;
>> >> >
>> >> > If these are actually filterCache entries being leaked, it stands to
>> >> > reason that a whole searcher is being leaked somewhere.
>> >> >
>> >> > -Yonik
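
For anyone reading this thread later: the pitfall discussed above (a reference that is incremented but never decremented, so the final "close" never really closes anything) can be illustrated with a small, self-contained Java sketch. Everything in it is an assumption made for illustration: `Counted`, `leakyLength` and `safeLength` are hypothetical names, and `Counted` is only a simplified stand-in for the idea behind a reference-counted searcher holder, not Solr's actual class and not a diagnosis of the leak reported here.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; not Solr code. */
public class RefCountSketch {

    /** Hypothetical minimal reference-counted holder (assumption, not Solr's RefCounted). */
    public static final class Counted<T> {
        private final T resource;
        // Starts at 1: the owner (think "the core") holds the first reference.
        private final AtomicInteger refs = new AtomicInteger(1);
        private volatile boolean closed = false;

        public Counted(T resource) {
            this.resource = resource;
        }

        /** Borrow the resource: bumps the count. */
        public Counted<T> incref() {
            refs.incrementAndGet();
            return this;
        }

        /** Give the reference back; the resource is only closed when the count hits 0. */
        public void decref() {
            if (refs.decrementAndGet() == 0) {
                closed = true;
            }
        }

        public T get() {
            return resource;
        }

        public int refCount() {
            return refs.get();
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Innocent-looking but leaky: borrows a reference and never releases it. */
    public static int leakyLength(Counted<String> holder) {
        Counted<String> ref = holder.incref();
        return ref.get().length();
    }

    /** Same work, but the reference is always released in a finally block. */
    public static int safeLength(Counted<String> holder) {
        Counted<String> ref = holder.incref();
        try {
            return ref.get().length();
        } finally {
            ref.decref();
        }
    }

    public static void main(String[] args) {
        Counted<String> safe = new Counted<>("searcher");
        safeLength(safe);
        safe.decref(); // owner lets go, e.g. when a new searcher is opened
        System.out.println("safe closed: " + safe.isClosed());

        Counted<String> leaky = new Counted<>("searcher");
        leakyLength(leaky);
        leaky.decref(); // owner lets go, but one borrowed reference is still out
        System.out.println("leaky closed: " + leaky.isClosed()
                + ", refs left: " + leaky.refCount());
    }
}
```

In the leaky case the owner's release leaves the count at 1, so the resource is never closed. Repeated once per commit, that is how one instance per commit could stay reachable, which would match the symptom described in the thread without saying anything about where the real leak is.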