Some bots will do that, too. Maybe only badly written ones, but we saw it at 
Netflix. Deep paging by bots was causing search timeouts just before a peak 
traffic period, so we set a page limit in the front end, something like 200 
pages.
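
A front-end page limit of that kind could look roughly like the sketch below. This is not the Netflix code; the function name, the 10 rows per page, and the exact cap of 200 are assumptions for illustration.

```python
MAX_PAGES = 200      # assumed cap ("something like 200 pages")
ROWS_PER_PAGE = 10   # assumed page size


def solr_start_for_page(page, rows=ROWS_PER_PAGE, max_pages=MAX_PAGES):
    """Return the Solr `start` offset for a 1-based page number.

    Pages past the cap are rejected before any query reaches Solr.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page > max_pages:
        raise ValueError(f"page {page} exceeds the limit of {max_pages}")
    return (page - 1) * rows
```

With these assumed values the largest `start` the front end ever sends is 1990, so a bot walking the result list cannot push Solr into very deep offsets.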

It makes sense for that to be very slow, because a request for hit 28838540 
means that Solr has to collect and rank the top 28838540 + 10 documents 
before it can return the last 10.
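
To make the cost concrete, here is a simplified illustration of ranked paging. It is not Solr's actual code, and `page_of_hits` is a hypothetical name. The point is only that the number of candidates kept in memory grows with start + rows, not with rows.

```python
import heapq


def page_of_hits(scored_docs, start, rows=10):
    """Return the doc ids for hits start .. start+rows-1, best score first.

    scored_docs is an iterable of (score, doc_id) pairs.
    """
    window = start + rows
    # The best `window` candidates must be kept and ordered, even though
    # only the last `rows` of them are returned to the caller.
    top = heapq.nlargest(window, scored_docs)
    return [doc_id for _score, doc_id in top[start:]]
```

With start=0 the window is 10 entries. With start=28838540 it is 28838550 entries, which fits with both the long response times and the memory errors reported below.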

Fuad: Why are you benchmarking this? What user is looking at 20M documents? 

wunder

On Dec 24, 2009, at 10:44 AM, Erik Hatcher wrote:

> 
> On Dec 24, 2009, at 11:36 AM, Walter Underwood wrote:
>> When do users do a query like that? --wunder
> 
> Well, SolrEntityProcessor "users" do :)
> 
>  http://issues.apache.org/jira/browse/SOLR-1499
>  (which by the way I plan on polishing and committing over the holidays)
> 
>       Erik
> 
> 
> 
>> 
>> On Dec 24, 2009, at 8:09 AM, Fuad Efendi wrote:
>> 
>>> I used pagination for a while till found this...
>>> 
>>> 
>>> I have filtered query ID:[* TO *] returning 20 millions results (no
>>> faceting), and pagination always seemed to be fast. However, fast only with
>>> low values for start=12345. Queries like start=28838540 take 40-60 seconds,
>>> and even cause OutOfMemoryException.
>>> 
>>> I use highlight, faceting on nontokenized "Country" field, standard handler.
>>> 
>>> 
>>> It even seems to be a bug...
>>> 
>>> 
>>> Fuad Efendi
>>> +1 416-993-2060
>>> http://www.linkedin.com/in/liferay
>>> 
>>> Tokenizer Inc.
>>> http://www.tokenizer.ca/
>>> Data Mining, Vertical Search
>>> 
>> 
> 
