Shawn,
Thanks. It's been a while now, but we did find issues with both cursorMark
AND start/rows. The effect was much more obvious with cursorMark.
We were able to address this by switching to use TLOG replicas. These give
consistent results. It's nice to know that the cursorMark problems were
related to relevancy retrieval order.
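
To illustrate how I understand the scoring effect you describe below, here
is a rough sketch. It assumes the BM25 IDF formula that Lucene uses by
default and that the per-replica term statistics still count deleted
documents that have not been merged away. All of the numbers are made up.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF component in the form used by Lucene's BM25 similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers: both replicas hold the same 1000 live documents,
# 100 of which contain the term. Replica B also still carries 200 deleted
# documents (50 containing the term) that have not been merged away yet.
idf_replica_a = bm25_idf(doc_count=1000, doc_freq=100)
idf_replica_b = bm25_idf(doc_count=1200, doc_freq=150)

# The two values differ, so the same document can score differently
# depending on which replica happens to answer the request.
print(idf_replica_a, idf_replica_b)
```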

We found one major drawback with TLOG replicas: CDCR was broken for them.
There is a Jira on this, and it is being addressed. NRT may have a use
case, but I think that reproducible, correct results should trump
performance every time. We use Solr as a search engine, so we almost
always want to retrieve results in order of relevancy.
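
For reference, paging in relevancy order with cursorMark looks roughly like
the sketch below. The `fetch` callable is a hypothetical helper that sends
the parameters to /select and returns the parsed JSON response, and "id" is
assumed to be the uniqueKey field. This is not our production code.

```python
from typing import Callable, Dict, Iterator


def iter_by_cursor(fetch: Callable[[Dict[str, str]], dict],
                   query: str, rows: int = 100) -> Iterator[dict]:
    """Yield documents page by page using cursorMark."""
    cursor = "*"  # starting value for the first request
    while True:
        params = {
            "q": query,
            "rows": str(rows),
            # cursorMark needs the uniqueKey field in the sort as a
            # tie-breaker; "id" is an assumption here.
            "sort": "score desc,id asc",
            "cursorMark": cursor,
        }
        response = fetch(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # same mark returned: nothing left
            break
        cursor = next_cursor
```

Even with a loop like this, the pages only line up if every request sees
the same scores, which is where the replica type comes in.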

I think that we will phase out the use of NRT replicas in favor of TLOG
replicas.
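
In case it helps anyone else, a collection with only TLOG replicas can be
requested through the Collections API CREATE action using the nrtReplicas
and tlogReplicas parameters. The sketch below only builds the request URL.
The host, collection name and counts are placeholders, and other settings
you would normally pass (such as the configset) are left out.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str,
                          num_shards: int, tlog_replicas: int) -> str:
    """Build a Collections API CREATE URL that asks for TLOG replicas only."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "nrtReplicas": 0,               # no NRT replicas
        "tlogReplicas": tlog_replicas,  # TLOG replicas per shard
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values, for illustration only.
print(create_collection_url("http://localhost:8983/solr", "products", 2, 2))
```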

On Fri, Mar 23, 2018 at 7:04 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 3/23/2018 3:47 PM, Webster Homer wrote:
> > Just FYI I had a project recently where I tried to use cursorMark in
> > SolrCloud and Solr 7.2.0 and it was very unreliable. It couldn't even
> > return consistent numFound values. I posted about it in this forum.
> > Using the start and rows arguments in SolrQuery did work reliably so I
> > abandoned cursorMark as just too buggy.
> >
> > I had originally wanted to try using streaming expressions, but they don't
> > return results ordered by relevancy, a major limitation for a search
> > engine, in my opinion.
>
> The problems that can affect cursorMark are also problems when using
> start/rows pagination.
>
> You've mentioned relevancy ordering, so I think this is what you're
> running into:
>
> Trying to use relevancy ranking on SolrCloud with NRT replicas can break
> pagination.  The problem happens both with cursorMark and start/rows.
> NRT replicas in a SolrCloud index can have different numbers of deleted
> documents.  Even though deleted documents do not appear in search
> results, they ARE still part of the index, and can affect scoring.
> Since SolrCloud load balances requests across replicas, page 1 may use
> different replicas than page 2, and end up with different scoring, which
> can affect the order of results and change which page number they end up
> on.  Using TLOG or PULL replicas (available since 7.0) usually fixes
> that problem, because different replicas are 100% identical with those
> replica types.
>
> Changing the index in the middle of trying to page through results can
> also cause issues with pagination.
>
> Thanks,
> Shawn
>
>
