Right. Each shard has to sort through 30M+ documents and ship roughly 30M candidates to the aggregator, which then sorts them all down to the final top 10 (assuming rows=10). gah...
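To make the cost concrete, here is a sketch of the kind of request involved (SolrJ; the query, collection, and offset are illustrative values, not anything taken from this thread):

import org.apache.solr.client.solrj.SolrQuery;

public class DeepPagingCost {
    public static void main(String[] args) {
        // A deep-paging request of the kind described above (values illustrative).
        SolrQuery q = new SolrQuery("*:*");
        q.setStart(30_000_000); // offset of 30M
        q.setRows(10);

        // Every shard has to keep a ranked queue of start + rows entries and
        // return that many candidates to the aggregator, which merges them all
        // just to hand back the final 10 documents.
        System.out.println("Candidates each shard must rank and return: "
                + (30_000_000L + 10));
    }
}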
You want to look at either the cursorMark stuff or the export handler, depending on whether the goal is to return one page at a time or the entire set. Note that export has some restrictions (e.g. it only returns docValues fields). The cursorMark capability was added explicitly to handle this case, although it does _not_ handle something like "go to last page"; rather, it handles paging all the way through to the last page (which for something like this would really only be practical from a program of some sort). BTW, an interesting trick for "go to last page" is to reverse the sort order, i.e. sort by score _ascending_. Then last becomes first... In general, though, "go to last page" isn't all that useful considering what it takes to support it.

https://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/

Best,
Erick
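(For reference, a minimal cursorMark paging loop in SolrJ might look like the sketch below; the collection URL, the uniqueKey field name "id", the sort, and the page size are illustrative assumptions, not anything from this thread.)

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorMarkPager {
    public static void main(String[] args) throws Exception {
        // Collection URL and uniqueKey field name ("id") are assumptions.
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {

            SolrQuery q = new SolrQuery("*:*");
            q.setRows(1000);                      // page size; start stays at its default of 0
            q.addSort("score", SolrQuery.ORDER.desc);
            q.addSort("id", SolrQuery.ORDER.asc); // sort must include the uniqueKey as a tiebreaker

            String cursorMark = CursorMarkParams.CURSOR_MARK_START; // "*"
            boolean done = false;
            while (!done) {
                q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
                QueryResponse rsp = client.query(q);
                // ... process rsp.getResults() for this page ...
                String next = rsp.getNextCursorMark();
                done = cursorMark.equals(next);   // an unchanged mark means no more pages
                cursorMark = next;
            }
        }
    }
}

Each cursorMark request asks every shard for only about rows candidates instead of start + rows, which is what keeps deep iteration cheap compared to large start offsets.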
On Thu, Nov 17, 2016 at 8:39 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
> Hi Erick, you got it. I left out the rest of the query; the parameter
> that caused the issue is the start parameter. The start parameter for
> this query was set to 30+ million by the user due to bad UI design
> (deep pagination issue), bringing the whole cluster down.
>
> Thnx
>
> On Thu, Nov 17, 2016 at 11:08 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> That query frankly doesn't seem like it'd lead to OOM or run for a
>> very long time unless there are (at least) hundreds of terms and a
>> _lot_ of documents. Or you're trying to return a zillion rows. Or
>> you're faceting on a high-cardinality field. Or...
>>
>> The terms should be kept in MMapDirectory space.
>>
>> My guess is that you aren't showing the part that's really causing the
>> problem; perhaps try peeling parts of the query off until you find the
>> culprit?
>>
>> And if you're sorting, faceting, or the like, docValues will help
>> prevent OOM problems.
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 17, 2016 at 7:17 AM, Davis, Daniel (NIH/NLM) [C]
>> <daniel.da...@nih.gov> wrote:
>> > Mikhail,
>> >
>> > If the query is not asynchronous, it would certainly be OK to stop the
>> > long-running query if the client socket is disconnected. I know that is
>> > a feature of the niche indexer used in the products of
>> > www.indexengines.com, because I wrote it. We did not have asynchronous
>> > queries, and because of the content and query-time deduplication, some
>> > queries could take hours. That's 72 billion objects on a 2U box for you.
>> > Hope they've added better index-time deduplication by now.
>> >
>> > Thanks,
>> >
>> > -dan
>> >
>> > -----Original Message-----
>> > From: Mikhail Khludnev [mailto:m...@apache.org]
>> > Sent: Thursday, November 17, 2016 6:55 AM
>> > To: solr-user <solr-user@lucene.apache.org>
>> > Subject: Re: How to stop long running/memory eating query
>> >
>> > There is a circuit breaker:
>> > https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter
>> > If I'm right, it does not interrupt faceting.
>> >
>> > On Thu, Nov 17, 2016 at 2:07 PM, Susheel Kumar <susheel2...@gmail.com>
>> > wrote:
>> >
>> >> Hello,
>> >>
>> >> We found a query which was running forever and thus causing OOM
>> >> (q=+AND++AND+Tom+AND+Jerry...). Is there any way, similar to the
>> >> SQL/NoSQL world, to watch currently executing queries and kill them?
>> >> That would be a desirable feature in these situations and would keep
>> >> the whole cluster from going down. Is there an existing JIRA, or can
>> >> I create one?
>> >>
>> >> Also, what would be the different ways we can examine and stop such
>> >> queries from executing?
>> >>
>> >> Thanks,
>> >> Susheel
>> >
>> >
>> > --
>> > Sincerely yours
>> > Mikhail Khludnev
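(On the circuit breaker Mikhail mentions: timeAllowed can be set per request, or as a default in the request handler configuration. A sketch follows; the limit value, query, and collection URL are purely illustrative assumptions.)

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TimeLimitedQuery {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {

            SolrQuery q = new SolrQuery("Tom AND Jerry");
            q.setTimeAllowed(5000); // give up collecting results after ~5 seconds (illustrative)

            QueryResponse rsp = client.query(q);
            // When the limit is hit, Solr returns whatever it collected so far
            // and sets partialResults=true in the response header.
            Object partial = rsp.getResponseHeader().get("partialResults");
            System.out.println("Hit the time limit? " + partial);
        }
    }
}

As noted above, this mainly bounds the document-collection phase, so a request can still spend time in stages it does not cover (e.g. faceting).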