Appreciate the response. To answer your questions:

* Do you see this happen often? How often?
It has happened twice in five days, once on each of the first two days after
deployment.
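
To keep a more exact count going forward, here is a rough sketch of how I could tally these events per day from the leader's log. It assumes the log lines look like the excerpt in my first mail (a timestamp/level/logger header line followed by the message); the function name and the regex are my own, not anything Solr ships.

```python
import re
from collections import Counter

# Header line format assumed from the excerpt in my first mail, e.g.
# "4/23/2015, 2:34:37 PM WARNING ZkController"
HEADER = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4}), \d{1,2}:\d{2}:\d{2} [AP]M (\w+) (\w+)\s*$"
)
# Text the leader logs when it marks a replica as down
MARKER = "state=down on behalf"


def count_lir_events(log_text):
    """Return a Counter mapping date string -> number of 'state=down' entries."""
    counts = Counter()
    current_date = None
    for line in log_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            # A new log entry starts; remember its date.
            current_date = header.group(1)
        elif current_date is not None and MARKER in line:
            counts[current_date] += 1
            # Count each entry at most once, even if the marker repeats.
            current_date = None
    return counts
```

Feeding it the text of the leader's log should give one count per day, which would make it easier to see whether this stays at roughly once a day or gets worse.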

* Are there any known network issues?
There are no obvious network issues, but as these instances reside in AWS I
cannot rule out network blips.
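
To get some data on this, a rough sketch of a probe I could run from the leader against the replica. The `probe` helper and the 2 second timeout are my own assumptions; it only measures a plain TCP connect and says nothing about what Solr itself is doing.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the with-block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return None
```

Calling `probe("production-solrcloud-004", 8080)` every few seconds and logging any `None` or unusually slow result alongside a timestamp should show whether connection failures line up with the errors in the Solr log.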

* Do you have any idea about the GC on those replicas?
I have been monitoring the memory usage, and all instances are using no more
than 30% of their JVM memory allocation.




On 27 April 2015 at 21:36, Anshum Gupta <ans...@anshumgupta.net> wrote:

> Looks like LeaderInitiatedRecovery or LIR. When a leader receives a
> document (update) but fails to successfully forward it to a replica, it
> marks that replica as down and asks the replica to recover (hence the name,
> Leader Initiated Recovery). It could be due to multiple reasons e.g.
> network issue/GC. The replica generally comes back up and syncs with the
> leader transparently. As an end-user, you don't have to really worry much
> about this but if you want to dig deeper, here are a few questions that
> might help us in suggesting what to do/look at.
> * Do you see this happen often? How often?
> * Are there any known network issues?
> * Do you have any idea about the GC on those replicas?
>
>
> On Mon, Apr 27, 2015 at 1:25 PM, Amit L <amitlal...@gmail.com> wrote:
>
> > Hi,
> >
> > A few days ago I deployed a Solr 4.9.0 cluster, which consists of 2
> > collections. Each collection has 1 shard with 3 replicas on 3 different
> > machines.
> >
> > On the first day I noticed this error appear on the leader. Full Log -
> > http://pastebin.com/wcPMZb0s
> >
> > 4/23/2015, 2:34:37 PM SEVERE SolrCmdDistributor
> > org.apache.solr.client.solrj.SolrServerException: IOException occured when
> > talking to server at:
> > http://production-solrcloud-004:8080/solr/bookings_shard1_replica2
> >
> > 4/23/2015, 2:34:37 PM WARNING DistributedUpdateProcessor
> > Error sending update
> >
> > 4/23/2015, 2:34:37 PM WARNING ZkController
> > Leader is publishing core=bookings_shard1_replica2 state=down on behalf of
> > un-reachable replica
> > http://production-solrcloud-004:8080/solr/bookings_shard1_replica2/;
> > forcePublishState? false
> >
> >
> > The other 2 replicas had 0 errors.
> >
> > I thought it may be a one-off, but the same error occurred on day 2, which
> > has got me slightly concerned. During these periods I didn't notice any
> > issues with the cluster and everything looks healthy in the cloud summary.
> > All of the instances are hosted on AWS.
> >
> > Any idea what may be causing this issue and what I can do to mitigate?
> >
> > Thanks
> > Amit
> >
>
>
>
> --
> Anshum Gupta
>
