On Wed, Mar 25, 2015 at 12:51 PM, Shai Erera <ser...@gmail.com> wrote:

> Thanks.
>
> Does Solr ever clean up those states? I.e. does it ever remove "down"
> replicas, or replicas belonging to non-live_nodes after some time? Or will
> these remain in the cluster state forever (assuming they never come back
> up)?
>

No, they remain there forever. You can still call the deletereplica API to
clean them up. There's even a param onlyIfDown=true which will remove a
replica only if it's already 'down'.
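
For illustration, here is a minimal sketch of how such a request URL could be
assembled. It only builds the string and sends nothing. The host, collection,
shard and replica names are made-up placeholders, and the helper name
delete_replica_url is hypothetical, not part of Solr.

```python
from urllib.parse import urlencode


def delete_replica_url(base_url, collection, shard, replica, only_if_down=True):
    """Build a Collections API DELETEREPLICA URL (hypothetical helper; sends nothing)."""
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,  # the replica's core node name
    }
    if only_if_down:
        # Ask Solr to remove the replica only if it is already 'down'.
        params["onlyIfDown"] = "true"
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# Placeholder values, assumed for this example only.
url = delete_replica_url(
    "http://localhost:8983/solr", "collection1", "shard1", "core_node3"
)
print(url)
```

Keeping onlyIfDown=true in the request is the safer default here, since a
replica that has come back in the meantime would not be removed by mistake.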


>
> If they remain there, is there any penalty? E.g. Solr tries to send them
> updates, maybe tries to route search requests to? I'm talking about
> replicas that stay in ACTIVE state, but their nodes aren't under
> /live_nodes.
>

No, there is no penalty because we always check for state=active and for
the node's liveness before routing any requests to a replica.
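
As a rough illustration of that check (a simplified sketch, not Solr's actual
routing code), a replica is only considered when both conditions hold. The
dictionary layout, node names and the function routable_replicas are
assumptions made up for this example.

```python
def routable_replicas(replicas, live_nodes):
    """Return names of replicas that are 'active' AND hosted on a live node."""
    return [
        name
        for name, props in replicas.items()
        if props.get("state") == "active" and props.get("node_name") in live_nodes
    ]


# Hypothetical cluster state for one shard.
replicas = {
    "core_node1": {"state": "active", "node_name": "host1:8983_solr"},
    # Still marked 'active', but its JVM crashed so the node is not live.
    "core_node2": {"state": "active", "node_name": "host2:8983_solr"},
    # Node is live, but the replica itself is 'down'.
    "core_node3": {"state": "down", "node_name": "host1:8983_solr"},
}
live_nodes = {"host1:8983_solr"}

print(routable_replicas(replicas, live_nodes))
```

With this sample data only core_node1 passes both checks, so the stale
'active' entry for core_node2 never receives a request.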


>
> Shai
>
> On Wed, Mar 25, 2015 at 8:05 PM, Shalin Shekhar Mangar
> <shalinman...@gmail.com> wrote:
>
> > Comments inline:
> >
> > On Wed, Mar 25, 2015 at 8:30 AM, Shai Erera <ser...@gmail.com> wrote:
> >
> > > Hi
> > >
> > > Is it possible for a replica to be DOWN, while the node it resides on is
> > > under /live_nodes? If so, what can lead to it, aside from someone
> > > unloading a core?
> > >
> >
> > Yes, aside from someone unloading the index, this can happen in two ways:
> > 1) during startup each core publishes its state as 'down' before it enters
> > recovery, and 2) the leader force-publishes a replica as 'down' if it is
> > not able to forward updates to that replica (this mechanism is called
> > Leader-Initiated-Recovery, or LIR for short).
> >
> > Case #2 above can happen when the replica is partitioned from the leader
> > but both are able to talk to ZooKeeper.
> >
> >
> > >
> > > I don't know if each SolrCore reports status to ZK independently, or it's
> > > done by the Solr process as a whole.
> > >
> > >
> > It is done on a per-core basis for now. But the 'live' node entry is
> > maintained once per Solr instance (JVM).
> >
> >
> > > Also, is it possible for a replica to report ACTIVE, while the node it
> > > lives on is no longer under /live_nodes? Are there any ZK timings that can
> > > cause that?
> > >
> >
> > Yes, this can happen if the JVM crashed. A replica publishes itself as
> > 'down' on shutdown so if the graceful shutdown step is skipped then the
> > replica will continue to be 'active' in the cluster state. Even LIR doesn't
> > apply here because there's no point in the leader marking a node as 'down'
> > if it is not 'live' already.
> >
> >
> > >
> > > Shai
> > >
> >
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
>



-- 
Regards,
Shalin Shekhar Mangar.
