On 3/23/2018 3:47 PM, Webster Homer wrote:
> Just FYI I had a project recently where I tried to use cursorMark in
> Solrcloud and solr 7.2.0 and it was very unreliable. It couldn't even
> return consistent numberFound values. I posted about it in this forum.
> Using the start and rows arguments in SolrQuery did work reliably so I
> abandoned cursorMark as just too buggy
>
> I had originally wanted to try using streaming expressions, but they don't
> return results ordered by relevancy, a major limitation for a search
> engine, in my opinion.

The problems that can affect cursorMark are also problems when using
start/rows pagination.

You've mentioned relevancy ordering, so I think this is what you're
running into:

Trying to use relevancy ranking on SolrCloud with NRT replicas can break
pagination.  The problem happens both with cursorMark and start/rows.
Each NRT replica indexes and merges segments independently, so replicas
of the same shard can hold different numbers of deleted documents.  Even
though deleted documents do not appear in search results, they ARE still
part of the index until they are merged away, and they count toward the
term statistics used for scoring.  Since SolrCloud load balances
requests across replicas, page 1 may be served by different replicas
than page 2 and end up with different scores, which can change the order
of results and which page a document lands on.  Using TLOG or PULL
replicas (available since 7.0) usually fixes that problem, because those
replica types copy index segments from the leader instead of indexing
independently, so the replicas end up with identical indexes.
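
To make the scoring effect concrete, here is a rough sketch in Python.
It is not Solr code.  It uses the IDF formula from Lucene's BM25
similarity, and it assumes the term-frequency part of the score is equal
for both documents so that only IDF decides the order.  The document
counts, the terms "t1"/"t2" and the documents "X"/"Y" are all made up
for illustration.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """IDF component of BM25 as computed by Lucene's BM25Similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def ranking(doc_count, doc_freq_by_term):
    """Rank two hypothetical docs: X matches only t1, Y matches only t2.

    Simplification: the term-frequency part of the score is assumed
    equal for both docs, so each score is just the IDF of its term.
    """
    scores = {
        "X": bm25_idf(doc_count, doc_freq_by_term["t1"]),
        "Y": bm25_idf(doc_count, doc_freq_by_term["t2"]),
    }
    return sorted(scores, key=scores.get, reverse=True)


# Replica A: 1000 docs in the index, no leftover deleted docs.
replica_a = ranking(1000, {"t1": 100, "t2": 120})

# Replica B: same live docs, plus 200 deleted docs not yet merged away,
# 150 of which contain t1.  They still count toward the statistics.
replica_b = ranking(1200, {"t1": 250, "t2": 120})

print("replica A order:", replica_a)
print("replica B order:", replica_b)
```

With these invented numbers the two replicas rank X and Y in opposite
order, so a document can sit on one page when replica A answers and on
another page when replica B answers.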

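Changing the index in the middle of trying to page through results can
also cause issues with pagination: documents can shift position between
requests, so some may show up twice and others may be skipped.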
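
A toy illustration of that, again in Python and not Solr code: a plain
list stands in for the sorted result set, and the page() helper is a
hypothetical stand-in for start/rows paging.  It only models the
start/rows case, not cursorMark.

```python
def page(index, start, rows):
    """Mimic start/rows paging over a list already sorted best-first."""
    return index[start:start + rows]


# Hypothetical result set, best match first.
index = ["d1", "d2", "d3", "d4", "d5", "d6"]

page1 = page(index, 0, 3)

# Between the two requests a new document is indexed and ranks first.
index.insert(0, "d0")

page2 = page(index, 3, 3)

repeated = set(page1) & set(page2)          # docs shown on both pages
never_seen = set(index[:6]) - set(page1) - set(page2)

print("page 1:", page1)
print("page 2:", page2)
print("repeated:", repeated, "never seen:", never_seen)
```

The new document pushes everything down one slot, so the last document
of page 1 comes back at the top of page 2, and the new document is never
returned at all.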

Thanks,
Shawn
