By "1 replica" I mean a single copy of each shard, with no redundancy. As far as I'm aware, we haven't encountered any unexpected problems with the Solr instances in the testing environment.
I do have the logs saved from the time frame in which the issue occurred, if those would be useful. We're running Solr 6.3.0 on Ubuntu 16.04 virtual machines.

On Thu, Aug 3, 2017 at 9:18 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 8/3/2017 6:30 AM, Chris Ulicny wrote:
> > I've run into an issue in a test environment where a document exists, but
> > fails to be retrieved consistently by /get requests. In a series of 10
> > requests for the specific document across a few minute timespan, one of the
> > middle requests returned a null document.
> >
> > Currently, nothing is updating existing records in the collection, so it
> > couldn't have actually been deleted.
> >
> > The test cloud and collection have 3 nodes, 6 shards, and 1 replica per
> > shard. Based on the fact that the node that was queried was not the node
> > the document resided on, I assume that there may have been a temporary
> > connectivity issue that we're unaware of and the request couldn't find the
> > document and returned null.
>
> When you say "1 replica" do you mean that there are two copies of each
> shard (leader and replica) or one copy (no redundancy)? I ask because
> this is a common point of confusion about SolrCloud terminology. If you
> have two copies, then you have two replicas -- because the leader IS a
> replica.
>
> If there are two copies, you might be in a situation where the two
> copies are out of sync for some reason, and one copy has the document
> but the other doesn't. Because SolrCloud load balances requests,
> sometimes the query will be serviced by one copy, sometimes by the other.
>
> If there is only one copy of each shard, then I do not know how this
> could happen, and it might indicate some kind of a problem with your
> install.
>
> Thanks,
> Shawn
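
For anyone trying to reproduce the intermittent miss described above, here is a minimal sketch, not the procedure actually used in this thread. It repeats a real-time get request and records which attempts come back with a null document. The helper names are hypothetical, and the base URL, collection name, and document id in the usage comment are placeholders. It assumes the single-id form of /get, which returns the document under a "doc" key.

```python
import json
import urllib.parse
import urllib.request


def realtime_get(base_url, collection, doc_id, timeout=10):
    """Fetch one document via the /get handler and return the parsed JSON body.

    base_url is assumed to look like "http://host:8983/solr" (placeholder).
    """
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json"})
    url = f"{base_url}/{collection}/get?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def find_null_responses(fetch, attempts):
    """Call fetch() `attempts` times; return the 0-based indexes of the
    attempts whose body had a null or missing "doc" entry."""
    misses = []
    for i in range(attempts):
        body = fetch()
        if body.get("doc") is None:
            misses.append(i)
    return misses


# Example usage (placeholder host, collection, and id):
# misses = find_null_responses(
#     lambda: realtime_get("http://localhost:8983/solr", "mycollection", "doc-1"),
#     attempts=10,
# )
# print("attempts that returned a null doc:", misses)
```

Pointing the same loop at each node in turn would show whether the misses only occur when the queried node is not the one hosting the document's shard.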