Hi,

we noticed the same problems here in a rather small setup: about 40,000 
metadata documents with nearly as many files that have "literal.*" fields with 
them. While 7.2.1 brought some Tika issues, the real problems started to appear 
with version 7.3.0 and are still unresolved in 7.4.0. Memory consumption has 
gone through the roof: where a 512 MB heap was previously enough, 6 GB is now 
not enough to index all files.

kind regards,

Thomas

> On 04.07.2018 at 15:03, Markus Jelsma <markus.jel...@openindex.io> wrote:
> 
> Hello Andrey,
> 
> I didn't think of that! I will try it when I have the courage again, probably 
> next week or so.
> 
> Many thanks,
> Markus
> 
> 
> -----Original message-----
>> From:Kydryavtsev Andrey <werde...@yandex.ru>
>> Sent: Wednesday 4th July 2018 14:48
>> To: solr-user@lucene.apache.org
>> Subject: Re: 7.3 appears to leak
>> 
>> If the resource leak cannot be found by code analysis and there are no 
>> better ideas, I can suggest a brute-force approach:
>> - Clone Solr's sources from the appropriate branch: 
>> https://github.com/apache/lucene-solr/tree/branch_7_3
>> - Log every increment/decrement operation on the searcher's holder in a way 
>> that captures the caller (use Thread.currentThread().getStackTrace() or 
>> something similar): 
>> https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
>> - Build custom artifacts and deploy them to production.
>> - After the memory leak has occurred, analyse the logs to see which part of 
>> the code does not decrement the searcher's counter after incrementing it. 
>> If searchers are leaked, there should be such code, I guess.
>> 
>> This is not something anyone would like to do, but it is what it is.
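
The logging step above can be sketched roughly as follows. This is only an illustration under assumptions: TracedRefCount and RefCountTraceDemo are hypothetical names, not Solr's RefCounted class, and in a patched Solr build the log lines would go to the normal logger rather than an in-memory list. The only real API it relies on is Thread.currentThread().getStackTrace().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RefCountTraceDemo {

    /** Hypothetical stand-in for a ref-counted holder; this is NOT Solr's RefCounted. */
    static final class TracedRefCount {
        private int refCount = 0;
        private final List<String> log = new ArrayList<>();

        synchronized void incref() {
            refCount++;
            log.add("incref -> " + refCount + " at " + caller());
        }

        synchronized void decref() {
            refCount--;
            log.add("decref -> " + refCount + " at " + caller());
        }

        synchronized int count() {
            return refCount;
        }

        /** Returns a read-only copy of the operations recorded so far. */
        synchronized List<String> log() {
            return Collections.unmodifiableList(new ArrayList<>(log));
        }

        /** First stack frame that belongs neither to Thread nor to this class. */
        private static String caller() {
            for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
                String cls = frame.getClassName();
                boolean internal = cls.equals(Thread.class.getName())
                        || cls.equals(TracedRefCount.class.getName());
                if (!internal) {
                    return frame.toString();
                }
            }
            return "unknown";
        }
    }

    /** Increments and always decrements again. */
    static void wellBehaved(TracedRefCount holder) {
        holder.incref();
        try {
            // ... use the resource ...
        } finally {
            holder.decref();
        }
    }

    /** Increments but never decrements: the kind of caller the logs should expose. */
    static void leaky(TracedRefCount holder) {
        holder.incref();
    }

    public static void main(String[] args) {
        TracedRefCount holder = new TracedRefCount();
        wellBehaved(holder);
        leaky(holder);
        holder.log().forEach(System.out::println);
        System.out.println("outstanding references: " + holder.count());
    }
}
```

Running main records three operations and leaves one outstanding reference; the incref entry without a matching decref points at leaky(). In a real log the same pairing would have to be done per searcher instance.
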
>> 
>> 
>> 
>> Thank you,
>> 
>> Andrey Kudryavtsev
>> 
>> 
>> 03.07.2018, 14:26, "Markus Jelsma" <markus.jel...@openindex.io>:
>>> Hello Erick,
>>> 
>>> Even the silliest ideas may help us, but unfortunately this is not the 
>>> case. All our Solr nodes run binaries from the same source from our central 
>>> build server, with the same libraries thanks to provisioning. Only schema 
>>> and config are different, but the <lib/> directive is the same all over.
>>> 
>>> Are there any other ideas, speculations, whatever, on why only our main 
>>> text collection leaks a SolrIndexSearcher instance on commit since 7.3.0 
>>> and in every later version?
>>> 
>>> Many thanks,
>>> Markus
>>> 
>>> -----Original message-----
>>>>  From:Erick Erickson <erickerick...@gmail.com>
>>>>  Sent: Friday 29th June 2018 19:34
>>>>  To: solr-user <solr-user@lucene.apache.org>
>>>>  Subject: Re: 7.3 appears to leak
>>>> 
>>>>  This is truly puzzling then, I'm clueless. It's hard to imagine this
>>>>  is lurking out there and nobody else notices, but you've eliminated
>>>>  the custom code. And this is also very peculiar:
>>>> 
>>>>  * it occurs only in our main text search collection, all other
>>>>  collections are unaffected;
>>>>  * despite what i said earlier, it is so far unreproducible outside
>>>>  production, even when mimicking production as well as we can;
>>>> 
>>>>  Here's a tedious idea. Restart Solr with the -v option, I _think_ that
>>>>  shows you each and every jar file Solr loads. Is it "somehow" possible
>>>>  that your main collection is loading some jar from somewhere that's
>>>>  different than you expect? 'cause silly ideas like this are all I can
>>>>  come up with.
>>>> 
>>>>  Erick
>>>> 
>>>>  On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
>>>>  <markus.jel...@openindex.io> wrote:
>>>>  > Hello Erick,
>>>>  >
>>>>  > The custom search handler doesn't interact with SolrIndexSearcher, this
>>>>  > is really all it does:
>>>>  >
>>>>  >   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
>>>>  >     super.handleRequestBody(req, rsp);
>>>>  >
>>>>  >     if (rsp.getToLog().get("hits") instanceof Integer) {
>>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Integer)rsp.getToLog().get("hits")));
>>>>  >     }
>>>>  >     if (rsp.getToLog().get("hits") instanceof Long) {
>>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Long)rsp.getToLog().get("hits")));
>>>>  >     }
>>>>  >   }
>>>>  >
>>>>  > I am not sure this qualifies as one more to go.
>>>>  >
>>>>  > Re: compiler warnings on resources, yes! Those, and tests failing due to
>>>>  > resource leaks, have always warned me when I forgot to release something
>>>>  > or decrement a reference. But the above method (and the token filters,
>>>>  > which I really can't disable) is all that is left.
>>>>  >
>>>>  > I am quite desperate about this problem, so although I am unwilling to
>>>>  > disable stuff, I can do it if I must. But I see no reason, yet, to remove
>>>>  > the search handler or the token filter stuff. I mean, how could those
>>>>  > leak a SolrIndexSearcher?
>>>>  >
>>>>  > Let me know :)
>>>>  >
>>>>  > Many thanks!
>>>>  > Markus
>>>>  >
>>>>  > -----Original message-----
>>>>  >> From:Erick Erickson <erickerick...@gmail.com>
>>>>  >> Sent: Friday 29th June 2018 18:46
>>>>  >> To: solr-user <solr-user@lucene.apache.org>
>>>>  >> Subject: Re: 7.3 appears to leak
>>>>  >>
>>>>  >> bq. The only custom stuff left is an extension of SearchHandler that
>>>>  >> only writes numFound to the response headers.
>>>>  >>
>>>>  >> Well, one more to go ;). It's incredibly easy to overlook
>>>>  >> innocent-seeming calls that increment the underlying reference count
>>>>  >> of some objects but don't decrement them, usually through a close
>>>>  >> call. Which isn't necessarily a close if the underlying reference
>>>>  >> count is still > 0.
>>>>  >>
>>>>  >> You may infer that I've been there and done that ;). Sometimes the
>>>>  >> compiler warnings about "resource leak" can help pinpoint those too.
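
To make the point about close calls concrete, here is a minimal sketch of a reference-counted resource whose release only really closes it once the count reaches zero. CountedResource and RefCountCloseDemo are hypothetical names for illustration; this is not how Solr implements its searcher handling, just the general pattern being described.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RefCountCloseDemo {

    /** Hypothetical resource standing in for something expensive, such as a searcher. */
    static final class CountedResource {
        // The creator holds the first reference.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed = false;

        CountedResource incref() {
            refCount.incrementAndGet();
            return this;
        }

        /** Releases one reference; the resource is only closed when none are left. */
        void decref() {
            if (refCount.decrementAndGet() == 0) {
                closed = true; // real code would free memory, files, caches here
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) {
        CountedResource resource = new CountedResource();

        // Correct use: every incref is paired with a decref in a finally block.
        CountedResource borrowed = resource.incref();
        try {
            // ... use borrowed ...
        } finally {
            borrowed.decref();
        }

        // Innocent-seeming call that takes a reference and never gives it back.
        resource.incref();

        // The owner "closes" the resource, but one reference is still outstanding.
        resource.decref();
        System.out.println("closed after owner's release: " + resource.isClosed()); // false

        // Only once the forgotten reference is released does the resource close.
        resource.decref();
        System.out.println("closed after last release: " + resource.isClosed()); // true
    }
}
```

A single unmatched incref per commit is enough to keep one such resource, and everything it references, alive indefinitely.
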
>>>>  >>
>>>>  >> Best,
>>>>  >> Erick
>>>>  >>
>>>>  >> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma
>>>>  >> <markus.jel...@openindex.io> wrote:
>>>>  >> > Hello Yonik,
>>>>  >> >
>>>>  >> > I took one node of the 7.2.1 cluster out of the load balancer so it
>>>>  >> > would only receive shard queries. This way I could kind of 'safely'
>>>>  >> > disable our custom components one by one, while keeping functionality
>>>>  >> > in place by letting the other 7.2.1 nodes continue with the full
>>>>  >> > configuration.
>>>>  >> >
>>>>  >> > I am now at a point where literally all custom components are deleted
>>>>  >> > or commented out in the config for the node running 7.4. The only
>>>>  >> > custom stuff left is an extension of SearchHandler that only writes
>>>>  >> > numFound to the response headers, and all the token filters in our
>>>>  >> > schema.
>>>>  >> >
>>>>  >> > You were right, it was leaking exactly one SolrIndexSearcher instance
>>>>  >> > on each commit. But, with all our stuff gone, the leak is still
>>>>  >> > there! I triple checked it! Of course, the bastard is locally still
>>>>  >> > not reproducible.
>>>>  >> >
>>>>  >> > So, what is next? I have no clues left.
>>>>  >> >
>>>>  >> > Many, many thanks,
>>>>  >> > Markus
>>>>  >> >
>>>>  >> > -----Original message-----
>>>>  >> >> From:Markus Jelsma <markus.jel...@openindex.io>
>>>>  >> >> Sent: Thursday 28th June 2018 23:52
>>>>  >> >> To: solr-user@lucene.apache.org
>>>>  >> >> Subject: RE: 7.3 appears to leak
>>>>  >> >>
>>>>  >> >> Hello Yonik,
>>>>  >> >>
>>>>  >> >> If leaking a whole SolrIndexSearcher causes this problem, then the
>>>>  >> >> only candidate among our custom components, our
>>>>  >> >> copy/paste-and-enhance version of the elevator component, would be
>>>>  >> >> the root of all problems. It is a direct copy of the 7.2 source
>>>>  >> >> where only things like getAnalyzedQuery, the ElevationObj and the
>>>>  >> >> loop over the map entries are changed.
>>>>  >> >>
>>>>  >> >> There are no changes to code related to the searcher. Other
>>>>  >> >> components where we get a RefCounted searcher are used without
>>>>  >> >> issues; we always decrement the reference after using it. But those
>>>>  >> >> components are not in use in this collection.
>>>>  >> >>
>>>>  >> >> The source has changed a lot with 7.4 but we still use the old code.
>>>>  >> >> I will investigate the component thoroughly, and even revert to the
>>>>  >> >> vanilla 7.2 component for a brief period in production on one
>>>>  >> >> machine. It may not be a problem if I don't let our load balancer
>>>>  >> >> access it directly, so it only serves shard queries.
>>>>  >> >>
>>>>  >> >> I will get back to this topic tomorrow!
>>>>  >> >>
>>>>  >> >> Many thanks,
>>>>  >> >> Markus
>>>>  >> >>
>>>>  >> >>
>>>>  >> >>
>>>>  >> >> -----Original message-----
>>>>  >> >> > From:Yonik Seeley <ysee...@gmail.com>
>>>>  >> >> > Sent: Thursday 28th June 2018 23:30
>>>>  >> >> > To: solr-user@lucene.apache.org
>>>>  >> >> > Subject: Re: 7.3 appears to leak
>>>>  >> >> >
>>>>  >> >> > > * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry
>>>>  >> >> > > instances are both leaked on commit;
>>>>  >> >> >
>>>>  >> >> > If these are actually filterCache entries being leaked, it stands to
>>>>  >> >> > reason that a whole searcher is being leaked somewhere.
>>>>  >> >> >
>>>>  >> >> > -Yonik
>>>>  >> >> >
>>>>  >> >>
>>>>  >>
>> 

