Hello Andrey,

I didn't think of that! I will try it when I have the courage again, probably 
next week or so.

Many thanks,
Markus
 
 
-----Original message-----
> From:Kydryavtsev Andrey <werde...@yandex.ru>
> Sent: Wednesday 4th July 2018 14:48
> To: solr-user@lucene.apache.org
> Subject: Re: 7.3 appears to leak
> 
> If it is not possible to find the resource leak by code analysis and there 
> are no better ideas, I can suggest a brute-force approach:
> - Clone Solr's sources from the appropriate branch 
> https://github.com/apache/lucene-solr/tree/branch_7_3
> - Log every increment/decrement operation on the searcher's holder in a way 
> that captures the caller (use Thread.currentThread().getStackTrace() or something) 
> https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
> - Build custom artefacts and deploy them to prod
> - After the memory leak has happened, analyse the logs to see which part of 
> the functionality doesn't decrement the searcher's counter after incrementing 
> it. If searchers are leaked, there should be such code, I guess.
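
A minimal sketch of the logging step above, for illustration only. It assumes a hypothetical stand-in class, TracedRefCount, rather than Solr's actual RefCounted, and the names RefCountTraceDemo, balancedUse and leakyUse are made up for the demo. It records the first caller outside the holder on every increment and decrement, so an unmatched increment can be traced back to its origin:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted holder. This is NOT Solr's
// RefCounted; it only illustrates logging the caller of every incref/decref.
class TracedRefCount {
  private final AtomicInteger refcount = new AtomicInteger();
  private final List<String> events = Collections.synchronizedList(new ArrayList<>());

  TracedRefCount incref() {
    record("incref", refcount.incrementAndGet());
    return this;
  }

  void decref() {
    record("decref", refcount.decrementAndGet());
  }

  int getRefcount() {
    return refcount.get();
  }

  // Snapshot of the recorded events, oldest first.
  List<String> events() {
    synchronized (events) {
      return new ArrayList<>(events);
    }
  }

  private void record(String op, int countAfter) {
    events.add(op + " -> " + countAfter + " by " + findCaller());
  }

  // Returns the first stack frame outside this class and java.lang.Thread.
  private static String findCaller() {
    for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
      String cls = frame.getClassName();
      if (!cls.equals(Thread.class.getName())
          && !cls.equals(TracedRefCount.class.getName())) {
        return cls + "." + frame.getMethodName();
      }
    }
    return "unknown";
  }
}

class RefCountTraceDemo {
  // Increments and always decrements, even if the work throws.
  static void balancedUse(TracedRefCount holder) {
    holder.incref();
    try {
      // ... use the resource ...
    } finally {
      holder.decref();
    }
  }

  // Increments but never decrements: this caller stays unmatched in the log.
  static void leakyUse(TracedRefCount holder) {
    holder.incref();
  }

  public static void main(String[] args) {
    TracedRefCount holder = new TracedRefCount();
    balancedUse(holder);
    leakyUse(holder);
    holder.events().forEach(System.out::println);
    System.out.println("final refcount = " + holder.getRefcount());
  }
}
```

In a real build the events would go to the regular log rather than an in-memory list, and a full stack trace is probably more useful than a single frame, since the leaking caller may sit several frames up.
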
> 
> This is not something anyone would like to do, but it is what it is.
> 
> 
> 
> Thank you,
> 
> Andrey Kudryavtsev
> 
> 
> 03.07.2018, 14:26, "Markus Jelsma" <markus.jel...@openindex.io>:
> > Hello Erick,
> >
> > Even the silliest ideas may help us, but unfortunately this is not the 
> > case. All our Solr nodes run binaries from the same source from our central 
> > build server, with the same libraries thanks to provisioning. Only schema 
> > and config are different, but the <lib/> directive is the same all over.
> >
> > Are there any other ideas, speculations, whatever, on why only our main 
> > text collection leaks a SolrIndexSearcher instance on commit in 7.3.0 
> > and every later version?
> >
> > Many thanks,
> > Markus
> >
> > -----Original message-----
> >>  From:Erick Erickson <erickerick...@gmail.com>
> >>  Sent: Friday 29th June 2018 19:34
> >>  To: solr-user <solr-user@lucene.apache.org>
> >>  Subject: Re: 7.3 appears to leak
> >>
> >>  This is truly puzzling then, I'm clueless. It's hard to imagine this
> >>  is lurking out there and nobody else notices, but you've eliminated
> >>  the custom code. And this is also very peculiar:
> >>
> >>  * it occurs only in our main text search collection, all other
> >>  collections are unaffected;
> >>  * despite what I said earlier, it is so far unreproducible outside
> >>  production, even when mimicking production as well as we can;
> >>
> >>  Here's a tedious idea. Restart Solr with the -v option, I _think_ that
> >>  shows you each and every jar file Solr loads. Is it "somehow" possible
> >>  that your main collection is loading some jar from somewhere that's
> >>  different than you expect? 'cause silly ideas like this are all I can
> >>  come up with.
> >>
> >>  Erick
> >>
> >>  On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
> >>  <markus.jel...@openindex.io> wrote:
> >>  > Hello Erick,
> >>  >
> >>  > The custom search handler doesn't interact with SolrIndexSearcher, this 
> >> is really all it does:
> >>  >
> >>  >   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
> >>  >     super.handleRequestBody(req, rsp);
> >>  >
> >>  >     if (rsp.getToLog().get("hits") instanceof Integer) {
> >>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Integer)rsp.getToLog().get("hits")));
> >>  >     }
> >>  >     if (rsp.getToLog().get("hits") instanceof Long) {
> >>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Long)rsp.getToLog().get("hits")));
> >>  >     }
> >>  >   }
> >>  >
> >>  > I am not sure this qualifies as one more to go.
> >>  >
> >>  > Re: compiler warnings on resources, yes! These, and tests failing due to 
> >> resource leaks, have always warned me when I forgot to release something 
> >> or decrement a reference. But the above method (and the token filters, 
> >> which I really can't disable) is all that is left.
> >>  >
> >>  > I am quite desperate about this problem, so although I am unwilling to 
> >> disable stuff, I can do it if I must. But I see no reason, yet, to remove the 
> >> search handler or the token filter stuff. I mean, how could those leak a 
> >> SolrIndexSearcher?
> >>  >
> >>  > Let me know :)
> >>  >
> >>  > Many thanks!
> >>  > Markus
> >>  >
> >>  > -----Original message-----
> >>  >> From:Erick Erickson <erickerick...@gmail.com>
> >>  >> Sent: Friday 29th June 2018 18:46
> >>  >> To: solr-user <solr-user@lucene.apache.org>
> >>  >> Subject: Re: 7.3 appears to leak
> >>  >>
> >>  >> bq. The only custom stuff left is an extension of SearchHandler that
> >>  >> only writes numFound to the response headers.
> >>  >>
> >>  >> Well, one more to go ;). It's incredibly easy to overlook
> >>  >> innocent-seeming calls that increment the underlying reference count
> >>  >> of some objects but don't decrement them, usually through a close
> >>  >> call. Which isn't necessarily a close if the underlying reference
> >>  >> count is still > 0.
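
A minimal sketch of the pattern described above, assuming a hypothetical CountedResource class (not Solr's actual API; DecrefInFinallyDemo and useSafely are likewise made-up names). It shows two things: pairing every increment with a decrement in a finally block, and that a decrement only closes the resource once the count reaches zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted resource; it mirrors the idea only and is
// not Solr's actual class.
class CountedResource {
  private final AtomicInteger refcount = new AtomicInteger();
  private volatile boolean closed = false;

  CountedResource incref() {
    refcount.incrementAndGet();
    return this;
  }

  // Releases one reference; the resource closes only when none are left.
  void decref() {
    if (refcount.decrementAndGet() == 0) {
      closed = true;
    }
  }

  int getRefcount() {
    return refcount.get();
  }

  boolean isClosed() {
    return closed;
  }
}

class DecrefInFinallyDemo {
  // Takes a reference for the duration of the work and always gives it back,
  // even when the work throws.
  static void useSafely(CountedResource res, Runnable work) {
    res.incref();
    try {
      work.run();
    } finally {
      res.decref();
    }
  }

  public static void main(String[] args) {
    CountedResource res = new CountedResource().incref(); // owner's reference
    useSafely(res, () -> { /* ... use the resource ... */ });
    System.out.println("closed after use: " + res.isClosed());
    res.decref(); // owner releases the last reference
    System.out.println("closed after owner release: " + res.isClosed());
  }
}
```

A caller that skips the decrement leaves the count above zero forever, so the owner's final release never closes anything.
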
> >>  >>
> >>  >> You may infer that I've been there and done that ;). Sometimes the
> >>  >> compiler warnings about "resource leak" can help pinpoint those too.
> >>  >>
> >>  >> Best,
> >>  >> Erick
> >>  >>
> >>  >> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma
> >>  >> <markus.jel...@openindex.io> wrote:
> >>  >> > Hello Yonik,
> >>  >> >
> >>  >> > I took one node of the 7.2.1 cluster out of the load balancer so it 
> >> would only receive shard queries, this way i could kind of 'safely' 
> >> disable our custom components one by one, while keeping functionality in 
> >> place by letting the other 7.2.1 nodes continue on with the full 
> >> configuration.
> >>  >> >
> >>  >> > I am now at a point where literally all custom components are 
> >> deleted or commented out in the config for the node running 7.4. The only 
> >> custom stuff left is an extension of SearchHandler that only writes 
> >> numFound to the response headers, and all the token filters in our schema.
> >>  >> >
> >>  >> > You were right, it was leaking exactly one SolrIndexSearcher 
> >> instance on each commit. But, with all our stuff gone, the leak is still 
> >> there! I triple checked it! Of course, the bastard is locally still not 
> >> reproducible.
> >>  >> >
> >>  >> > So, what is next? I have no clues left.
> >>  >> >
> >>  >> > Many, many thanks,
> >>  >> > Markus
> >>  >> >
> >>  >> > -----Original message-----
> >>  >> >> From:Markus Jelsma <markus.jel...@openindex.io>
> >>  >> >> Sent: Thursday 28th June 2018 23:52
> >>  >> >> To: solr-user@lucene.apache.org
> >>  >> >> Subject: RE: 7.3 appears to leak
> >>  >> >>
> >>  >> >> Hello Yonik,
> >>  >> >>
> >>  >> >> If leaking a whole SolrIndexSearcher would cause this problem, then 
> >> the only custom component that could be the root of all problems is our 
> >> copy/paste-and-enhance version of the elevator component. It is a direct 
> >> copy of the 7.2 source where only things like getAnalyzedQuery, the 
> >> ElevationObj and the loop over the map entries are changed.
> >>  >> >>
> >>  >> >> There are no changes to code related to the searcher. Other 
> >> components where we get a RefCounted searcher work without issues; we 
> >> always decrement the reference after using it. But those components are 
> >> not in use in this collection.
> >>  >> >>
> >>  >> >> The source has changed a lot with 7.4 but we still use the old 
> >> code. I will investigate the component thoroughly, even revert to the old 
> >> 7.2 vanilla component for a brief period in production for one machine. It 
> >> may not be a problem if i don't let our load balancer access it directly, 
> >> so it only serves shard queries.
> >>  >> >>
> >>  >> >> I will get back to this topic tomorrow!
> >>  >> >>
> >>  >> >> Many thanks,
> >>  >> >> Markus
> >>  >> >>
> >>  >> >>
> >>  >> >>
> >>  >> >> -----Original message-----
> >>  >> >> > From:Yonik Seeley <ysee...@gmail.com>
> >>  >> >> > Sent: Thursday 28th June 2018 23:30
> >>  >> >> > To: solr-user@lucene.apache.org
> >>  >> >> > Subject: Re: 7.3 appears to leak
> >>  >> >> >
> >>  >> >> > > * SortedIntDocSet instances ánd ConcurrentLRUCache$CacheEntry 
> >> instances are both leaked on commit;
> >>  >> >> >
> >>  >> >> > If these are actually filterCache entries being leaked, it stands 
> >> to
> >>  >> >> > reason that a whole searcher is being leaked somewhere.
> >>  >> >> >
> >>  >> >> > -Yonik
> >>  >> >> >
> >>  >> >>
> >>  >>
> 
