Solrcloud 6.6 becomes nuts

2020-05-17 Thread Dominique Bejean
Hi, I have a six-node SolrCloud in which all six nodes suddenly failed with OOM at the same time. This can happen even when the SolrCloud is not under heavy load and there is no indexing. I do not see any reason for this to happen. Here is the description of the issue. Thank you for your suggesti

RE: Filtering large amount of values

2020-05-17 Thread Rudenko, Artur
Hi Mikhail, Thank you for the help; with your suggestion we actually managed to improve the results. We now get and store the docValues in this method instead of inside the collect() method: @Override protected void doSetNextReader(LeafReaderContext context) throws IOException { super.doSetNext

Re: Filtering large amount of values

2020-05-17 Thread Mikhail Khludnev
On Sun, May 17, 2020 at 4:57 PM Rudenko, Artur wrote: > Hi Mikhail, > > Thank you for the help; with your suggestion we actually managed to improve > the results. > > We now get and store the docValues in this method instead of inside the > collect() method: > > @Override > protected void doSetNextRea

Re: Solrcloud 6.6 becomes nuts

2020-05-17 Thread Mikhail Khludnev
Hello, Dominique. What did it log? Which exception? Have you had a chance to review a heap dump? What consumed the whole heap? On Sun, May 17, 2020 at 11:05 AM Dominique Bejean wrote: > Hi, > > I have a six-node SolrCloud in which all six nodes suddenly failed with OOM > at the same time. > This can

Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
Hi Isabelle, Two things to keep in mind with Solr's Rule-Based Authorization. 1. Each request is controlled by the first permission that matches the request. 2. With the permissions you have present, Solr will check them in descending list order. (This isn't always true - collection-specific

Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
One slight correction: I missed that you actually do have a path/collection-specific permission in your list there. So Solr will check the permissions in descending list-order for most requests - the exception being /luke requests when the /luke permission filters to the top and is checked first.
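The "first matching permission wins" behaviour described in this thread can be illustrated with a minimal sketch. This is not Solr's implementation: the Permission class, the prefix-based matching, and the permission names and roles below are all hypothetical, and the sketch deliberately ignores the reordering of path/collection-specific permissions mentioned above. It only shows why a broad rule listed first can shadow a narrower rule listed after it.

```java
import java.util.List;
import java.util.Optional;

public class FirstMatchSketch {
    // Hypothetical, simplified permission: a name, a path prefix, and a required role.
    public static final class Permission {
        public final String name;
        public final String pathPrefix;
        public final String role;

        public Permission(String name, String pathPrefix, String role) {
            this.name = name;
            this.pathPrefix = pathPrefix;
            this.role = role;
        }
    }

    // Walk the list top to bottom and return the first permission whose
    // prefix matches the request path; later permissions are never consulted.
    public static Optional<Permission> firstMatch(List<Permission> permissions, String requestPath) {
        for (Permission p : permissions) {
            if (requestPath.startsWith(p.pathPrefix)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical list: the broad rule comes first, the narrow rule second.
        List<Permission> permissions = List.of(
            new Permission("everything", "/", "admin"),
            new Permission("update", "/update", "indexer"));

        // The broad rule matches first, so the "update" rule is never reached.
        Optional<Permission> match = firstMatch(permissions, "/update");
        System.out.println(match.map(p -> p.name + " requires " + p.role).orElse("no match"));
    }
}
```

Under these assumptions, moving the narrower "update" entry above the broad one would let it take effect, which is the kind of ordering issue the thread is describing.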

Re: Solrcloud 6.6 becomes nuts

2020-05-17 Thread Shawn Heisey
On 5/17/2020 2:05 AM, Dominique Bejean wrote: One or two hours before the nodes stop with OOM, we see this scenario on all six nodes during the same five-minute time frame: * a little bit more young GC: from one every second (duration < 0.05 secs) to one every two or three seconds (duration <0.15 s

Re: Solrcloud 6.6 becomes nuts

2020-05-17 Thread Dominique Bejean
Mikhail, Thank you for your response. --- For the logs: On the non-leader replicas, there are no errors in the log, only WARN entries due to slow queries. On the leader replica, there are these errors: * Twice per minute during the whole day before the problem starts and also after the problem starts RequestHandlerB

Re: Solrcloud 6.6 becomes nuts

2020-05-17 Thread Dominique Bejean
Hi Shawn, There is no OOM error in the logs. I gave more details in my response to Mikhail. The problem starts with a full GC near 15h20, but young GC changed a little starting at 15h10. Here is the before and after heap usage during this period. https://www.eolya.fr/solr_issue_heap_before_after.png There

Re: Solrcloud 6.6 becomes nuts

2020-05-17 Thread Shawn Heisey
On 5/17/2020 4:18 PM, Dominique Bejean wrote: I did not think that queries faceting on fields with a high number of unique values but a low hit count could be the origin of this problem. Performance for most things does not depend on numFound (hit count) or the rows parameter. The numb