Conceptually, asking for docs 900-1000 works something like this. Solr (well, 
Lucene actually) has to keep a sorted list 1,000 items long of scores and doc 
IDs, because you can’t know whether doc N+1 will be in the list, or where. So 
the list manipulation is what takes the extra time. For even 1,000 docs, that 
shouldn’t be very much overhead, but when it gets up into the 10s of K (or, I’ve 
seen, millions) it’s _very_ noticeable.
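
To make that concrete, here’s a minimal sketch of the idea, not Lucene’s actual 
collector code (the class and method names below, TopNSketch and topHits, are 
made up for illustration). It keeps a bounded min-heap of start + rows hits, so 
every matching doc has to be compared against the weakest hit kept so far, which 
is the per-hit bookkeeping that grows with the requested depth:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import java.util.PriorityQueue;

    // Illustrative sketch only -- not Lucene's real implementation.
    public class TopNSketch {
        record Hit(int docId, float score) {}

        static List<Hit> topHits(Iterable<Hit> matches, int start, int rows) {
            int n = start + rows;                     // e.g. 900 + 100 = 1,000
            // Min-heap by score: the root is the weakest hit kept so far.
            PriorityQueue<Hit> queue =
                    new PriorityQueue<>(Comparator.comparingDouble(Hit::score));

            for (Hit hit : matches) {                 // every matching doc is scored
                if (queue.size() < n) {
                    queue.add(hit);                   // still filling the list
                } else if (hit.score() > queue.peek().score()) {
                    queue.poll();                     // doc N+1 displaces a weaker hit
                    queue.add(hit);
                }
                // Maintaining this n-sized queue is the extra work per hit.
            }

            // Sort descending by score, then keep only the requested page.
            List<Hit> sorted = new ArrayList<>(queue);
            sorted.sort(Comparator.comparingDouble(Hit::score).reversed());
            return sorted.subList(Math.min(start, sorted.size()),
                                  Math.min(start + rows, sorted.size()));
        }
    }

The point is that docs 1-900 are never “fetched”, but they still occupy slots in 
that structure, so asking for a deep page costs more than asking for page one.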

With the example you’ve talked about, I doubt this is really a problem.

FWIW,
Erick

> On Jan 14, 2020, at 1:40 PM, Gael Jourdan-Weil 
> <gael.jourdan-w...@kelkoogroup.com> wrote:
> 
> Ok I understand better.
> Solr does not "read" docs 1 to 900 in order to retrieve 901 to 1000, but it 
> still needs to compute some things (docset intersection or something like 
> that, right?) and sort, which is costly, and only then "read" the docs.
> 
>> Are those 10 requests happening simultaneously, or consecutively?  If 
>> it's simultaneous, then they won't benefit from Solr caching.  Because 
>> Solr can cache certain things, it would probably be faster to make 10 
>> consecutive requests than 10 simultaneous.
> 
> The 10 requests are simultaneous, which I think explains part of the issues 
> we encounter. If they were consecutive, I'd indeed expect them to benefit from 
> the cache.
> 
>> What are you trying to accomplish when you make these queries?  If we 
>> understand that, perhaps we can come up with something better.
> 
> Actually, we are exposing a search engine, and this is behavior coming from 
> some of our clients.
> It's not a behavior we are deliberately doing or encouraging.
> But before discussing it with them, we wanted to understand a bit better what 
> in Solr explains those response times.
> 
> Regards,
> Gaël
> 
