Exactly how did you kill the instance? If I stop Solr gracefully (bin/solr 
stop…) it’s fine. If I do a “kill -9” on it, I see the same thing you do on 
master.

It’s a bit tricky. When a node dies without a chance to shut down gracefully, 
it can’t update its replicas’ states in the collection’s “state.json” znode. 
However, the node is removed from the “live_nodes” list, and a replica is not 
truly active unless its state is “active” in state.json _and_ its node appears 
in live_nodes.

CLUSTERSTATUS pretty clearly understands this, but COLSTATUS apparently doesn’t.
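In other words, the effective state of a replica can be reconstructed by cross-checking state.json against live_nodes. A rough sketch in Python (the sample response below is hypothetical and trimmed to the relevant fields of a CLUSTERSTATUS-style payload):

```python
# Hypothetical, trimmed CLUSTERSTATUS-style response: only "solr1" is live,
# but state.json still reports both replicas as "active" after a kill -9.
cluster_status = {
    "cluster": {
        "live_nodes": ["solr1:8983_solr"],
        "collections": {
            "coll": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active", "node_name": "solr1:8983_solr"},
                            "core_node2": {"state": "active", "node_name": "solr2:8983_solr"},
                        }
                    }
                }
            }
        }
    }
}

def effective_state(replica, live_nodes):
    # state.json alone can be stale; a replica only counts as active
    # if its state is "active" AND its node appears in live_nodes.
    if replica["state"] == "active" and replica["node_name"] in live_nodes:
        return "active"
    return "down"

live = set(cluster_status["cluster"]["live_nodes"])
shard = cluster_status["cluster"]["collections"]["coll"]["shards"]["shard1"]
states = {name: effective_state(r, live) for name, r in shard["replicas"].items()}
print(states)  # core_node2 reports "down" because its node is gone from live_nodes
```

COLSTATUS appears to count only the state.json side of that check, which is why the "down" count stays at zero.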

I’ll raise a JIRA.

Thanks for letting us know

Erick

> On Oct 29, 2019, at 2:10 PM, Elizaveta Golova <egol...@uk.ibm.com> wrote:
> 
> colStatus (and clusterStatus) from the Collections api.
> https://lucene.apache.org/solr/guide/8_1/collections-api.html#colstatus
> 
> 
> Running something like this in the browser where the live solr node is 
> accessible on port 8983 (but points at a Docker container which is running 
> the Solr node):
> http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=coll
> 
> 
> 
> 
> -----Erick Erickson <erickerick...@gmail.com> wrote: -----
> To: solr-user@lucene.apache.org
> From: Erick Erickson <erickerick...@gmail.com>
> Date: 10/29/2019 05:39PM
> Subject: [EXTERNAL] Re: colStatus response not as expected with Solr 8.1.1 in 
> a distributed deployment
> 
> 
> Uhm, what is colStatus? You need to show us _exactly_ what Solr commands 
> you’re running for us to make any intelligent comments.
> 
>> On Oct 29, 2019, at 1:12 PM, Elizaveta Golova <egol...@uk.ibm.com> wrote:
>> 
>> Hi,
>> 
>> We're seeing an issue with colStatus in a distributed Solr deployment.
>> 
>> Scenario:
>> Collection with:
>> - 1 zk
>> - 2 solr nodes on different boxes (simulated using Docker containers)
>> - replication factor 5
>> 
>> When we take down one node, our clusterStatus response is as expected (only 
>> listing the live node as live, and anything on the "down" node shows the 
>> state as down).
>> 
>> Our colStatus response, however, continues to show every shard as "active", 
>> with the replica breakdown on every shard reporting "total" == "active" and 
>> "down" always zero.
>> i.e.
>> "shards":{
>>   "shard1":{
>>     "state":"active",
>>     "range":"80000000-ffffffff",
>>     "replicas":{
>>       "total":5,
>>       "active":5,
>>       "down":0,
>>       "recovering":0,
>>       "recovery_failed":0},
>> 
>> We expect the "down" count to be 2 or 3 depending on the shard (and the 
>> "active" count correspondingly lower).
>> 
>> When testing this situation with both Solr nodes on the same box, the 
>> colStatus response is as expected with regard to the replica counts.
>> 
>> Thanks!
>> 
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number 
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>> 
> 
> 
