Did you try enabling distributed IDF (statsCache)? See: https://lucene.apache.org/solr/guide/6_6/distributed-requests.html
It may not totally fix the issue, but it's worth trying. It does come with a performance penalty, of course.

Best,
Erick

On Mon, Feb 26, 2018 at 11:00 AM, Webster Homer <webster.ho...@sial.com> wrote:
> Thanks Shawn, I had settled on this as a solution.
>
> All our use cases for Solr is to return results in order of relevancy to
> the query, so having a deterministic sort would defeat that purpose. Since
> we wanted to be able to return all the results for a query, I originally
> looked at using the Streaming API, but that doesn't support returning
> results sorted by relevancy
>
> I disagree with you about NRT replicas though. They may function as
> designed, but since they cannot guarantee consistent results their design
> is buggy, at least it is for a search engine.
>
>
> On Mon, Feb 26, 2018 at 12:20 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 2/26/2018 10:26 AM, Webster Homer wrote:
>> > We need the results by relevancy so the application sorts the results by
>> > score desc, and the unique id ascending as the tie breaker
>>
>> This is the reason for the discrepancy, and why the different replica
>> types don't have the same issue.
>>
>> Each NRT replica can have different deleted documents than the others,
>> just due to the way that NRT replicas work. Deleted documents affect
>> relevancy scoring. When one replica has say 5000 deleted documents and
>> another has 200, or has 5000 but they're different docs, a relevancy
>> sort can end up different. So when Solr goes to one replica for page 1
>> and another for page 2 (which is expected due to SolrCloud's internal
>> load balancing), you may end up with duplicate documents or documents
>> missing. Because deleted documents are not counted or returned,
>> numFound will be consistent, as long as the index doesn't change between
>> the queries for pages.
>>
>> If you were using a deterministic sort rather than relevancy, this
>> wouldn't be happening, because deleted documents have no influence on
>> that kind of sort.
>>
>> With TLOG or PULL, the replicas are absolutely identical, so there is no
>> difference, unless the index is changing as you page through the results.
>>
>> I think changing replica types is the only solution here. NRT replicas
>> are working as they were designed -- there's no bug, even though
>> problems like this do sometimes turn up.
>>
>> Thanks,
>> Shawn
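
For anyone following the thread, here is a rough sketch of the effect Shawn describes. It is a toy model, not Solr code: it assumes a BM25-style IDF, assumes each replica's term statistics still include its own not-yet-merged deleted documents, and all the numbers, replica names, terms, and document ids are made up for illustration.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25-style inverse document frequency (assumed formula for this sketch)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-replica statistics. The live documents are the same on
# both replicas, but each one carries a different set of deleted documents,
# so the doc counts and term frequencies used for scoring differ.
REPLICAS = {
    "replica_x": {"doc_count": 15000, "doc_freq": {"alpha": 120, "beta": 150}},
    "replica_y": {"doc_count": 10200, "doc_freq": {"alpha": 140, "beta": 110}},
}

# Two live documents; each matches exactly one query term. Term frequency
# and field length are taken to be equal, so only IDF decides the order.
DOCS = {"A": "alpha", "B": "beta"}


def ranked_ids(replica):
    """Sort by score desc, then unique id asc as the tie breaker."""
    stats = REPLICAS[replica]
    scores = {
        doc_id: bm25_idf(stats["doc_freq"][term], stats["doc_count"])
        for doc_id, term in DOCS.items()
    }
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def page(replica, start, rows):
    """Return one page of results as served by a single replica."""
    return ranked_ids(replica)[start:start + rows]


# Load balancing sends page 1 to one replica and page 2 to the other.
page_1 = page("replica_x", start=0, rows=1)
page_2 = page("replica_y", start=1, rows=1)
print(page_1, page_2)
```

In this toy setup the two replicas rank A and B in opposite order, so paging across them returns A on both pages and never returns B, even though both replicas agree there are two matching documents.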