There may be other ways, but the easiest is to write a script that gets the
cluster status; for each replica of each collection you will have these details:
"collections":{
  "collection1":{
    "pullReplicas":"0",
    "replicationFactor":"1",
    "shards":{
      "shard1":{
        "range":"80000000-8ccbffff",
        "state":"active",
        "replicas":{
          "core_node33":{
            "core":"collection1_shard1_replica_n30",
            "base_url":"http://host:port/solr",
            "node_name":"host:port",
            "state":"active",
            "type":"NRT",
            "force_set_state":"false",
            "leader":"true"}}},
For each replica of each shard, make a localized call for the record count (numFound):
base_url/core/select?q=*:*&shard=shardX&distrib=false&rows=0
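
A rough sketch of such a script in Python, assuming the Collections API CLUSTERSTATUS action and JSON responses. The function names (replica_count_urls, fetch_json, replica_counts) and the solr_url argument are made up for illustration, and the shard parameter is left out because distrib=false against a core URL already keeps the query on that one replica:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def replica_count_urls(cluster_status):
    """Yield (collection, shard, core, url) for every replica in a CLUSTERSTATUS response."""
    collections = cluster_status["cluster"]["collections"]
    for coll_name, coll in collections.items():
        for shard_name, shard in coll["shards"].items():
            for replica in shard["replicas"].values():
                # distrib=false keeps the query local to this core; rows=0 returns only counts.
                params = urlencode({"q": "*:*", "distrib": "false", "rows": 0, "wt": "json"})
                url = f"{replica['base_url']}/{replica['core']}/select?{params}"
                yield coll_name, shard_name, replica["core"], url


def fetch_json(url, timeout=10):
    """GET a URL and parse the body as JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def replica_counts(solr_url):
    """Return {(collection, shard): {core: numFound}} for the whole cluster."""
    status = fetch_json(f"{solr_url}/admin/collections?action=CLUSTERSTATUS&wt=json")
    counts = {}
    for coll, shard, core, url in replica_count_urls(status):
        counts.setdefault((coll, shard), {})[core] = fetch_json(url)["response"]["numFound"]
    return counts
```

Calling replica_counts with your own Solr base URL (the part ending in /solr) would give the numFound of every core, grouped by collection and shard.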
If the replicas of a shard disagree with each other on the number of records,
then you have an issue with replicas not being in sync for a collection.
This is what I meant when I said "replicas out of sync".
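
The comparison itself could look something like this in Python. It is only a sketch: out_of_sync_shards is a made-up helper name, and the input is assumed to be a dict of per-core numFound values keyed by (collection, shard), which you would have to collect yourself from the per-replica calls:

```python
def out_of_sync_shards(counts):
    """Return the shards whose replicas report different numFound values.

    counts: {(collection, shard): {core_name: numFound}}
    """
    return {
        key: per_core
        for key, per_core in counts.items()
        if len(set(per_core.values())) > 1
    }


# Hypothetical numbers, for illustration only.
sample_counts = {
    ("collection1", "shard1"): {"replica_n30": 1000, "replica_n32": 1000},
    ("collection1", "shard2"): {"replica_n34": 998, "replica_n36": 1000},
}

for (collection, shard), per_core in out_of_sync_shards(sample_counts).items():
    print(f"{collection}/{shard} replicas disagree: {per_core}")
```

Any shard this returns has at least two replicas with different counts, which is the "out of sync" case.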
Your situation was actually very simple :) One of your collections has less data.
You seem to have a sync requirement between collections, which is interesting,
but that's beyond Solr.
Your inter-collection sync script most likely needs some debugging :)
> On Aug 12, 2020, at 4:29 PM, Jae Joo <[email protected]> wrote:
>
> Good question. How can I validate if the replicas are all synched?
>
>
> On Wed, Aug 12, 2020 at 7:28 PM Jae Joo <[email protected]> wrote:
>
>> numFound is the same, but the scores differ.
>> <result name="response" numFound="755970" start="0" maxScore="4.70519">
>> <result name="response" numFound="755970" start="0" maxScore="4.70519">
>> <result name="response" numFound="755970" start="0" maxScore="4.70519">
>> <result name="response" numFound="755970" start="0" maxScore="4.7738605">
>> <result name="response" numFound="755970" start="0" maxScore="4.659804">
>> <result name="response" numFound="755970" start="0" maxScore="4.659804">
>> <result name="response" numFound="755970" start="0" maxScore="4.659804">
>>
>> On Wed, Aug 12, 2020 at 6:01 PM Aroop Ganguly
>> <[email protected]> wrote:
>>
>>> Try a simple test of querying each collection 5 times in a row; if the
>>> numFound values differ for a single collection within these 5 calls, then
>>> you have it.
>>> Please try it; what you think is synced may actually not be. How do
>>> you validate correct sync?
>>>
>>>> On Aug 12, 2020, at 10:55 AM, Jae Joo <[email protected]> wrote:
>>>>
>>>> The replications are all synched and there are no updates while I was
>>>> testing.
>>>>
>>>>
>>>> On Wed, Aug 12, 2020 at 1:49 PM Aroop Ganguly
>>>> <[email protected]> wrote:
>>>>
>>>>> Most likely you have 1 or more collections behind the alias that have
>>>>> replicas out of sync :)
>>>>>
>>>>> Try querying each collection to find the one out of sync.
>>>>>
>>>>>> On Aug 12, 2020, at 10:47 AM, Jae Joo <[email protected]> wrote:
>>>>>>
>>>>>> I have 10 collections in a single alias and get different result sets
>>>>>> every time with the same query.
>>>>>>
>>>>>> Is it as designed, or am I missing something?
>>>>>>
>>>>>> The configuration and schema for all 10 collections are identical.
>>>>>> Thanks,
>>>>>>
>>>>>> Jae
>>>>>
>>>>>
>>>
>>>