Hello Thomas,

To be absolutely sure you suffer from the same problem as one of our 
collections, can you confirm that your Solr cores are leaking a 
SolrIndexSearcher instance on each commit? If not, there may be a second 
problem.

Also, do you run any custom plugins or apply patches to your Solr instances? Or 
is your Solr a 100 % official build?

Thanks,
Markus

 
 
-----Original message-----
> From:Thomas Scheffler <thomas.scheff...@uni-jena.de>
> Sent: Monday 16th July 2018 13:39
> To: solr-user@lucene.apache.org
> Subject: Re: 7.3 appears to leak
> 
> Hi,
> 
> we noticed the same problems here in a rather small setup: 40,000 metadata 
> documents and nearly as many files that come with „literal.*“ fields. 
> While 7.2.1 brought some Tika issues, the real problems started to appear 
> with version 7.3.0 and are still unresolved in 7.4.0. Memory 
> consumption is through the roof. Where previously a 512 MB heap was enough, 
> now 6 GB isn't enough to index all files.
> 
> kind regards,
> 
> Thomas
> 
> > Am 04.07.2018 um 15:03 schrieb Markus Jelsma <markus.jel...@openindex.io>:
> > 
> > Hello Andrey,
> > 
> > I didn't think of that! I will try it when i have the courage again, 
> > probably next week or so.
> > 
> > Many thanks,
> > Markus
> > 
> > 
> > -----Original message-----
> >> From:Kydryavtsev Andrey <werde...@yandex.ru>
> >> Sent: Wednesday 4th July 2018 14:48
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: 7.3 appears to leak
> >> 
> >> If it is not possible to find the resource leak by code analysis and there 
> >> are no better ideas, I can suggest a brute-force approach:
> >> - Clone Solr's sources from the appropriate branch 
> >> https://github.com/apache/lucene-solr/tree/branch_7_3
> >> - Log every increment/decrement of the searcher holder's reference count in 
> >> a way that captures every caller's name (use 
> >> Thread.currentThread().getStackTrace() or something similar) 
> >> https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
> >> - Build custom artifacts and deploy them to production
> >> - After the memory leak has happened, analyse the logs to see which part of 
> >> the functionality doesn't decrement the searcher's counter after 
> >> incrementing it. If searchers are leaked, there should be such code, I guess.
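
The logging step in the list above could be sketched roughly as follows. This is a simplified, hypothetical stand-in (TracingRefCounted is not Solr's RefCounted class, and its method names are only modelled on it); it records the immediate caller of every increment and decrement so that unbalanced callers can be found afterwards. It also assumes the usual HotSpot stack layout, where index 3 of the trace is the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted holder; NOT the real Solr class.
abstract class TracingRefCounted<T> {
  private final T resource;
  private final AtomicInteger refcount = new AtomicInteger();
  // Events are kept in memory here; a real patch would write them to the log.
  private final List<String> events = new ArrayList<>();

  TracingRefCounted(T resource) {
    this.resource = resource;
  }

  T get() {
    return resource;
  }

  TracingRefCounted<T> incref() {
    record("incref", refcount.incrementAndGet());
    return this;
  }

  void decref() {
    int count = refcount.decrementAndGet();
    record("decref", count);
    if (count == 0) {
      close(); // the resource is only released when nobody holds it anymore
    }
  }

  int getRefcount() {
    return refcount.get();
  }

  synchronized List<String> getEvents() {
    return new ArrayList<>(events);
  }

  protected abstract void close();

  private synchronized void record(String op, int count) {
    StackTraceElement[] stack = Thread.currentThread().getStackTrace();
    // Assumed layout: [0] getStackTrace, [1] record, [2] incref/decref, [3] caller.
    String caller = stack.length > 3 ? stack[3].toString() : "unknown";
    events.add(op + " -> " + count + " by " + caller);
  }
}
```

Grouping the recorded events by caller and looking for callers with more increments than decrements would point at the suspect code path.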
> >> 
> >> This is not something someone would like to do, but it is what it is.
> >> 
> >> 
> >> 
> >> Thank you,
> >> 
> >> Andrey Kudryavtsev
> >> 
> >> 
> >> 03.07.2018, 14:26, "Markus Jelsma" <markus.jel...@openindex.io>:
> >>> Hello Erick,
> >>> 
> >>> Even the silliest ideas may help us, but unfortunately this is not the 
> >>> case. All our Solr nodes run binaries from the same source from our 
> >>> central build server, with the same libraries thanks to provisioning. 
> >>> Only schema and config are different, but the <lib/> directive is the 
> >>> same all over.
> >>> 
> >>> Are there any other ideas, speculations, whatever, on why only our main 
> >>> text collection leaks a SolrIndexSearcher instance on commit since 7.3.0 
> >>> and every version up?
> >>> 
> >>> Many thanks,
> >>> Markus
> >>> 
> >>> -----Original message-----
> >>>>  From:Erick Erickson <erickerick...@gmail.com>
> >>>>  Sent: Friday 29th June 2018 19:34
> >>>>  To: solr-user <solr-user@lucene.apache.org>
> >>>>  Subject: Re: 7.3 appears to leak
> >>>> 
> >>>>  This is truly puzzling then, I'm clueless. It's hard to imagine this
> >>>>  is lurking out there and nobody else notices, but you've eliminated
> >>>>  the custom code. And this is also very peculiar:
> >>>> 
> >>>>  * it occurs only in our main text search collection, all other
> >>>>  collections are unaffected;
> >>>>  * despite what i said earlier, it is so far unreproducible outside
> >>>>  production, even when mimicking production as well as we can;
> >>>> 
> >>>>  Here's a tedious idea. Restart Solr with the -v option, I _think_ that
> >>>>  shows you each and every jar file Solr loads. Is it "somehow" possible
> >>>>  that your main collection is loading some jar from somewhere that's
> >>>>  different than you expect? 'cause silly ideas like this are all I can
> >>>>  come up with.
> >>>> 
> >>>>  Erick
> >>>> 
> >>>>  On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
> >>>>  <markus.jel...@openindex.io> wrote:
> >>>>  > Hello Erick,
> >>>>  >
> >>>>  > The custom search handler doesn't interact with SolrIndexSearcher, 
> >>>> this is really all it does:
> >>>>  >
> >>>>  >   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
> >>>>  >     super.handleRequestBody(req, rsp);
> >>>>  >
> >>>>  >     if (rsp.getToLog().get("hits") instanceof Integer) {
> >>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Integer)rsp.getToLog().get("hits")));
> >>>>  >     }
> >>>>  >     if (rsp.getToLog().get("hits") instanceof Long) {
> >>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Long)rsp.getToLog().get("hits")));
> >>>>  >     }
> >>>>  >   }
> >>>>  >
> >>>>  > I am not sure this qualifies as one more to go.
> >>>>  >
> >>>>  > Re: compiler warnings on resources, yes! These, and tests failing due
> >>>>  > to resource leaks, have always warned me when I forgot to release
> >>>>  > something or decrement a reference. But the above method (and the
> >>>>  > token filters, which I really can't disable) is all that is left.
> >>>>  >
> >>>>  > I am quite desperate about this problem, so although I am unwilling to
> >>>>  > disable stuff, I can do it if I must. But I see no reason, yet, to
> >>>>  > remove the search handler or the token filter stuff; I mean, how could
> >>>>  > those leak a SolrIndexSearcher?
> >>>>  >
> >>>>  > Let me know :)
> >>>>  >
> >>>>  > Many thanks!
> >>>>  > Markus
> >>>>  >
> >>>>  > -----Original message-----
> >>>>  >> From:Erick Erickson <erickerick...@gmail.com>
> >>>>  >> Sent: Friday 29th June 2018 18:46
> >>>>  >> To: solr-user <solr-user@lucene.apache.org>
> >>>>  >> Subject: Re: 7.3 appears to leak
> >>>>  >>
> >>>>  >> bq. The only custom stuff left is an extension of SearchHandler that
> >>>>  >> only writes numFound to the response headers.
> >>>>  >>
> >>>>  >> Well, one more to go ;). It's incredibly easy to overlook
> >>>>  >> innocent-seeming calls that increment the underlying reference count
> >>>>  >> of some objects but don't decrement them, usually through a close
> >>>>  >> call. Which isn't necessarily a close if the underlying reference
> >>>>  >> count is still > 0.
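
To illustrate the point above, here is a minimal sketch of how a reference-counted resource stays open when one caller forgets to give its reference back. CountedSearcher and its methods are hypothetical and only mimic the pattern; they are not Solr classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted holder; it only mimics the pattern discussed above.
final class CountedSearcher {
  // The owner (for example, the core) holds the first reference.
  private final AtomicInteger refcount = new AtomicInteger(1);
  private volatile boolean closed = false;

  CountedSearcher acquire() {
    refcount.incrementAndGet();
    return this;
  }

  // Releasing only drops one reference; the resource is freed at zero.
  void release() {
    if (refcount.decrementAndGet() == 0) {
      closed = true;
    }
  }

  boolean isClosed() {
    return closed;
  }

  int refcount() {
    return refcount.get();
  }

  // Correct usage: every acquire is paired with a release in finally.
  static int balancedUse(CountedSearcher searcher) {
    CountedSearcher ref = searcher.acquire();
    try {
      return ref.refcount();
    } finally {
      ref.release();
    }
  }

  // Leaky usage: the reference is taken but never given back.
  static int leakyUse(CountedSearcher searcher) {
    return searcher.acquire().refcount();
  }
}
```

After a single call to leakyUse, the owner's own release leaves the count at one, so the holder is never closed and everything it references stays reachable.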
> >>>>  >>
> >>>>  >> You may infer that I've been there and done that ;). Sometimes the
> >>>>  >> compiler warnings about "resource leak" can help pinpoint those too.
> >>>>  >>
> >>>>  >> Best,
> >>>>  >> Erick
> >>>>  >>
> >>>>  >> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma
> >>>>  >> <markus.jel...@openindex.io> wrote:
> >>>>  >> > Hello Yonik,
> >>>>  >> >
> >>>>  >> > I took one node of the 7.2.1 cluster out of the load balancer so 
> >>>> it would only receive shard queries, this way i could kind of 'safely' 
> >>>> disable our custom components one by one, while keeping functionality in 
> >>>> place by letting the other 7.2.1 nodes continue on with the full 
> >>>> configuration.
> >>>>  >> >
> >>>>  >> > I am now at a point where literally all custom components are 
> >>>> deleted or commented out in the config for the node running 7.4. The 
> >>>> only custom stuff left is an extension of SearchHandler that only writes 
> >>>> numFound to the response headers, and all the token filters in our 
> >>>> schema.
> >>>>  >> >
> >>>>  >> > You were right, it was leaking exactly one SolrIndexSearcher 
> >>>> instance on each commit. But, with all our stuff gone, the leak is still 
> >>>> there! I triple checked it! Of course, the bastard is locally still not 
> >>>> reproducible.
> >>>>  >> >
> >>>>  >> > So, what is next? I have no clues left.
> >>>>  >> >
> >>>>  >> > Many, many thanks,
> >>>>  >> > Markus
> >>>>  >> >
> >>>>  >> > -----Original message-----
> >>>>  >> >> From:Markus Jelsma <markus.jel...@openindex.io>
> >>>>  >> >> Sent: Thursday 28th June 2018 23:52
> >>>>  >> >> To: solr-user@lucene.apache.org
> >>>>  >> >> Subject: RE: 7.3 appears to leak
> >>>>  >> >>
> >>>>  >> >> Hello Yonik,
> >>>>  >> >>
> >>>>  >> >> If leaking a whole SolrIndexSearcher would cause this problem,
> >>>>  >> >> then the only custom component that could be the root of all
> >>>>  >> >> problems is our copy/paste-and-enhance version of the elevator
> >>>>  >> >> component. It is a direct copy of the 7.2 source where only things
> >>>>  >> >> like getAnalyzedQuery, the ElevationObj and the loop over the map
> >>>>  >> >> entries are changed.
> >>>>  >> >>
> >>>>  >> >> There are no changes to code related to the searcher. Other
> >>>>  >> >> components where we get a RefCount of the searcher are used
> >>>>  >> >> without issues; we always decrement the reference after using it.
> >>>>  >> >> But those components are not in use in this collection.
> >>>>  >> >>
> >>>>  >> >> The source has changed a lot with 7.4 but we still use the old 
> >>>> code. I will investigate the component thoroughly, even revert to the 
> >>>> old 7.2 vanilla component for a brief period in production for one 
> >>>> machine. It may not be a problem if i don't let our load balancer access 
> >>>> it directly, so it only serves shard queries.
> >>>>  >> >>
> >>>>  >> >> I will get back to this topic tomorrow!
> >>>>  >> >>
> >>>>  >> >> Many thanks,
> >>>>  >> >> Markus
> >>>>  >> >>
> >>>>  >> >>
> >>>>  >> >>
> >>>>  >> >> -----Original message-----
> >>>>  >> >> > From:Yonik Seeley <ysee...@gmail.com>
> >>>>  >> >> > Sent: Thursday 28th June 2018 23:30
> >>>>  >> >> > To: solr-user@lucene.apache.org
> >>>>  >> >> > Subject: Re: 7.3 appears to leak
> >>>>  >> >> >
> >>>>  >> >> > > * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry
> >>>>  >> >> > > instances are both leaked on commit;
> >>>>  >> >> >
> >>>>  >> >> > If these are actually filterCache entries being leaked, it 
> >>>> stands to
> >>>>  >> >> > reason that a whole searcher is being leaked somewhere.
> >>>>  >> >> >
> >>>>  >> >> > -Yonik
> >>>>  >> >> >
> >>>>  >> >>
> >>>>  >>
> >> 
> 
> 
> 
