Shawn,

Thanks. It's been a while now, but we did find issues with both cursorMark and start/rows; the effect was much more obvious with cursorMark. We were able to address this by switching to TLOG replicas, which give consistent results. It's nice to know that the cursorMark problems were related to retrieving results in relevancy order.
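For anyone following the thread, here is a minimal sketch of the cursorMark paging loop being discussed. It is not our production code. The `fetch` callable is hypothetical (it stands in for whatever performs the /select request and returns the parsed JSON response), and `id` is assumed to be the uniqueKey field. The parts taken from Solr's documented behaviour are the starting value `*`, the need for a uniqueKey tiebreaker in the sort, and stopping when `nextCursorMark` comes back unchanged.

```python
from typing import Callable, Dict, Iterator, List


def iterate_cursor(fetch: Callable[[Dict[str, str]], dict],
                   query: str,
                   sort: str = "score desc, id asc",  # assumes uniqueKey is "id"
                   rows: int = 100) -> Iterator[List[dict]]:
    """Yield pages of documents until the cursor stops advancing."""
    cursor = "*"  # documented starting value for cursorMark
    while True:
        params = {
            "q": query,
            "sort": sort,
            "rows": str(rows),
            "cursorMark": cursor,
        }
        # Hypothetical helper: sends params to /select, returns parsed JSON.
        response = fetch(params)
        docs = response["response"]["docs"]
        if docs:
            yield docs
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # The same mark came back, so there are no more results.
            break
        cursor = next_cursor
```

Each request in this loop can be routed to a different replica, which is why any scoring differences between replicas show up as inconsistent pages.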
We found one major drawback with TLOG replicas: CDCR was broken for them. There is a Jira on this, and it is being addressed. NRT may have a use case, but I think that reproducible, correct results should trump performance every time. We use Solr as a search engine, so we almost always want to retrieve results in order of relevancy. I think that we will phase out the use of NRT replicas in favor of TLOG replicas.

On Fri, Mar 23, 2018 at 7:04 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 3/23/2018 3:47 PM, Webster Homer wrote:
> > Just FYI I had a project recently where I tried to use cursorMark in
> > Solrcloud and solr 7.2.0 and it was very unreliable. It couldn't even
> > return consistent numberFound values. I posted about it in this forum.
> > Using the start and rows arguments in SolrQuery did work reliably so I
> > abandoned cursorMark as just too buggy
> >
> > I had originally wanted to try using streaming expressions, but they don't
> > return results ordered by relevancy, a major limitation for a search
> > engine, in my opinion.
>
> The problems that can affect cursorMark are also problems when using
> start/rows pagination.
>
> You've mentioned relevancy ordering, so I think this is what you're
> running into:
>
> Trying to use relevancy ranking on SolrCloud with NRT replicas can break
> pagination. The problem happens both with cursorMark and start/rows.
> NRT replicas in a SolrCloud index can have different numbers of deleted
> documents. Even though deleted documents do not appear in search
> results, they ARE still part of the index, and can affect scoring.
> Since SolrCloud load balances requests across replicas, page 1 may use
> different replicas than page 2, and end up with different scoring, which
> can affect the order of results and change which page number they end up
> on.
> Using TLOG or PULL replicas (available since 7.0) usually fixes
> that problem, because different replicas are 100% identical with those
> replica types.
>
> Changing the index in the middle of trying to page through results can
> also cause issues with pagination.
>
> Thanks,
> Shawn
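
As a rough illustration of the scoring effect described in the quoted explanation, the sketch below shows how two replicas holding the same live documents can rank them differently when one replica still carries unmerged deleted documents. All numbers, term names, and document names are made up. The idf formula is the BM25-style one, and scoring each document by the idf of a single matching term is a deliberate simplification: real scoring also includes term frequency and length normalization.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """BM25-style inverse document frequency."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def rank(doc_count, doc_freq, doc_terms):
    """Order documents by the idf of the one term each matches (simplified)."""
    scores = {doc: bm25_idf(doc_count, doc_freq[term])
              for doc, term in doc_terms.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical live documents: X matches "tlog", Y matches "cdcr".
doc_terms = {"X": "tlog", "Y": "cdcr"}

# Replica A: deletes already merged away.
order_a = rank(1000, {"tlog": 50, "cdcr": 40}, doc_terms)

# Replica B: 30 deleted documents containing "cdcr" are still in the
# index, so they still count toward doc_count and doc_freq.
order_b = rank(1030, {"tlog": 50, "cdcr": 70}, doc_terms)

print(order_a, order_b)
```

With these made-up statistics the two replicas return X and Y in opposite orders, so a document can move between pages depending on which replica serves each request.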