Hi, we noticed the same problems here in a rather small setup: 40,000 metadata documents with nearly as many files that have "literal.*" fields attached. While 7.2.1 brought some Tika issues, the real problems started to appear with version 7.3.0 and are still unresolved in 7.4.0. Memory consumption is through the roof. Where previously a 512 MB heap was enough, now 6 GB isn't enough to index all files.

Kind regards,
Thomas

> On 04.07.2018 at 15:03, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> Hello Andrey,
>
> I didn't think of that! I will try it when i have the courage again, probably next week or so.
>
> Many thanks,
> Markus
>
> -----Original message-----
>> From:Kydryavtsev Andrey <werde...@yandex.ru>
>> Sent: Wednesday 4th July 2018 14:48
>> To: solr-user@lucene.apache.org
>> Subject: Re: 7.3 appears to leak
>>
>> If it is not possible to find a resource leak by code analysis and there are no better ideas, I can suggest a brute force approach:
>> - Clone Solr's sources from the appropriate branch
>>   https://github.com/apache/lucene-solr/tree/branch_7_3
>> - Log every increment/decrement operation on the searcher's holder in a way that captures every caller name (use Thread.currentThread().getStackTrace() or something)
>>   https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
>> - Build custom artefacts and upload them to prod
>> - After the memory leak has happened, analyse the logs to see which part of the functionality doesn't decrement the searcher after the counter was incremented. If searchers are leaked, there should be such code, I guess.
>>
>> This is not something someone would like to do, but it is what it is.
>>
>> Thank you,
>>
>> Andrey Kudryavtsev
>>
>> 03.07.2018, 14:26, "Markus Jelsma" <markus.jel...@openindex.io>:
>>> Hello Erick,
>>>
>>> Even the silliest ideas may help us, but unfortunately this is not the case. All our Solr nodes run binaries from the same source from our central build server, with the same libraries thanks to provisioning. Only schema and config are different, but the <lib/> directive is the same all over.
>>>
>>> Are there any other ideas, speculations, whatever, on why only our main text collection leaks a SolrIndexSearcher instance on commit since 7.3.0 and every version up?
>>>
>>> Many thanks?
>>> Markus
>>>
>>> -----Original message-----
>>>> From:Erick Erickson <erickerick...@gmail.com>
>>>> Sent: Friday 29th June 2018 19:34
>>>> To: solr-user <solr-user@lucene.apache.org>
>>>> Subject: Re: 7.3 appears to leak
>>>>
>>>> This is truly puzzling then, I'm clueless. It's hard to imagine this is lurking out there and nobody else notices, but you've eliminated the custom code. And this is also very peculiar:
>>>>
>>>> * it occurs only in our main text search collection, all other collections are unaffected;
>>>> * despite what i said earlier, it is so far unreproducible outside production, even when mimicking production as well as we can;
>>>>
>>>> Here's a tedious idea. Restart Solr with the -v option, I _think_ that shows you each and every jar file Solr loads. Is it "somehow" possible that your main collection is loading some jar from somewhere that's different than you expect? 'cause silly ideas like this are all I can come up with.
>>>>
>>>> Erick
>>>>
>>>> On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>> > Hello Erick,
>>>> >
>>>> > The custom search handler doesn't interact with SolrIndexSearcher, this is really all it does:
>>>> >
>>>> > public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
>>>> >   super.handleRequestBody(req, rsp);
>>>> >
>>>> >   if (rsp.getToLog().get("hits") instanceof Integer) {
>>>> >     rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Integer)rsp.getToLog().get("hits")));
>>>> >   }
>>>> >   if (rsp.getToLog().get("hits") instanceof Long) {
>>>> >     rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Long)rsp.getToLog().get("hits")));
>>>> >   }
>>>> > }
>>>> >
>>>> > I am not sure this qualifies as one more to go.
>>>> >
>>>> > Re: compiler warnings on resources, yes! This and tests failing due to resource leaks have always warned me when i forgot to release something or decrement a reference. But the above method (and the token filters, which i really can't disable) are all that is left.
>>>> >
>>>> > I am quite desperate about this problem so although i am unwilling to disable stuff, i can do it if i must. But i see no reason, yet, to remove the search handler or the token filter stuff, i mean, how could those leak a SolrIndexSearcher?
>>>> >
>>>> > Let me know :)
>>>> >
>>>> > Many thanks!
>>>> > Markus
>>>> >
>>>> > -----Original message-----
>>>> >> From:Erick Erickson <erickerick...@gmail.com>
>>>> >> Sent: Friday 29th June 2018 18:46
>>>> >> To: solr-user <solr-user@lucene.apache.org>
>>>> >> Subject: Re: 7.3 appears to leak
>>>> >>
>>>> >> bq. The only custom stuff left is an extension of SearchHandler that only writes numFound to the response headers.
>>>> >>
>>>> >> Well, one more to go ;). It's incredibly easy to overlook innocent-seeming calls that increment the underlying reference count of some objects but don't decrement them, usually through a close call. Which isn't necessarily a close if the underlying reference count is still > 0.
>>>> >>
>>>> >> You may infer that I've been there and done that ;). Sometimes the compiler warnings about "resource leak" can help pinpoint those too.
>>>> >>
>>>> >> Best,
>>>> >> Erick
>>>> >>
>>>> >> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>> >> > Hello Yonik,
>>>> >> >
>>>> >> > I took one node of the 7.2.1 cluster out of the load balancer so it would only receive shard queries, this way i could kind of 'safely' disable our custom components one by one, while keeping functionality in place by letting the other 7.2.1 nodes continue on with the full configuration.
>>>> >> >
>>>> >> > I am now at a point where literally all custom components are deleted or commented out in the config for the node running 7.4. The only custom stuff left is an extension of SearchHandler that only writes numFound to the response headers, and all the token filters in our schema.
>>>> >> >
>>>> >> > You were right, it was leaking exactly one SolrIndexSearcher instance on each commit. But, with all our stuff gone, the leak is still there! I triple checked it! Of course, the bastard is locally still not reproducible.
>>>> >> >
>>>> >> > So, what is next? I have no clues left.
>>>> >> >
>>>> >> > Many, many thanks,
>>>> >> > Markus
>>>> >> >
>>>> >> > -----Original message-----
>>>> >> >> From:Markus Jelsma <markus.jel...@openindex.io>
>>>> >> >> Sent: Thursday 28th June 2018 23:52
>>>> >> >> To: solr-user@lucene.apache.org
>>>> >> >> Subject: RE: 7.3 appears to leak
>>>> >> >>
>>>> >> >> Hello Yonik,
>>>> >> >>
>>>> >> >> If leaking a whole SolrIndexSearcher would cause this problem, then the only custom component that could be the root of all problems would be our copy/paste-and-enhance version of the elevator component. It is a direct copy of the 7.2 source where only things like getAnalyzedQuery, the ElevationObj and the loop over the map entries are changed.
>>>> >> >>
>>>> >> >> There are no changes to code related to the searcher. Other components where we get a RefCounted of the searcher are used without issues, we always decrement the reference after using it. But those components are not in use in this collection.
>>>> >> >>
>>>> >> >> The source has changed a lot with 7.4 but we still use the old code. I will investigate the component thoroughly, even revert to the old 7.2 vanilla component for a brief period in production for one machine. It may not be a problem if i don't let our load balancer access it directly, so it only serves shard queries.
>>>> >> >>
>>>> >> >> I will get back to this topic tomorrow!
>>>> >> >>
>>>> >> >> Many thanks,
>>>> >> >> Markus
>>>> >> >>
>>>> >> >> -----Original message-----
>>>> >> >> > From:Yonik Seeley <ysee...@gmail.com>
>>>> >> >> > Sent: Thursday 28th June 2018 23:30
>>>> >> >> > To: solr-user@lucene.apache.org
>>>> >> >> > Subject: Re: 7.3 appears to leak
>>>> >> >> >
>>>> >> >> > > * SortedIntDocSet instances ánd ConcurrentLRUCache$CacheEntry instances are both leaked on commit;
>>>> >> >> >
>>>> >> >> > If these are actually filterCache entries being leaked, it stands to reason that a whole searcher is being leaked somewhere.
>>>> >> >> >
>>>> >> >> > -Yonik
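
For anyone wanting to try the brute-force approach Andrey describes in this thread (logging every increment/decrement together with its caller), here is a minimal sketch of the idea. It is not Solr's RefCounted class and not a patch against it: TracingRefCounted, getEvents and the event format are hypothetical names made up for illustration, and the stack-frame index used to find the caller is an assumption that holds on common JVMs but is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted holder; NOT Solr's RefCounted.
final class TracingRefCounted<T> {
    private final T resource;
    private final AtomicInteger refcount = new AtomicInteger();
    // One entry per operation: the operation, the resulting count and the caller frame.
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    TracingRefCounted(T resource) {
        this.resource = resource;
    }

    T get() {
        return resource;
    }

    TracingRefCounted<T> incref() {
        record("incref", refcount.incrementAndGet());
        return this;
    }

    void decref() {
        record("decref", refcount.decrementAndGet());
    }

    int getRefcount() {
        return refcount.get();
    }

    // Returns a snapshot copy so callers can iterate without extra locking.
    List<String> getEvents() {
        synchronized (events) {
            return new ArrayList<>(events);
        }
    }

    private void record(String op, int countAfter) {
        StackTraceElement[] stack = Thread.currentThread().getStackTrace();
        // Assumption: [0] getStackTrace, [1] record, [2] incref/decref, [3] the caller.
        String caller = stack.length > 3 ? stack[3].toString() : "unknown";
        events.add(op + " -> " + countAfter + " at " + caller);
    }

    public static void main(String[] args) {
        TracingRefCounted<String> holder = new TracingRefCounted<>("searcher-1");
        holder.incref();
        holder.incref();
        holder.decref(); // one reference is never released: a simulated leak
        holder.getEvents().forEach(System.out::println);
        System.out.println("remaining references: " + holder.getRefcount());
    }
}
```

In a real investigation the events would go to the log instead of an in-memory list, and after a leak you would pair up incref and decref callers to find the code path that never releases its reference.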