To add two more pieces of data:
1) This occurs with real, conditional queries as well (eg:
"q=key:timvaillancourt"), not just the "q=*:*" I provided in my email.
2) I've noticed when I bring a node of the SolrCloud down it is
remaining "state: active" in my /clusterstate.json - something is really
wrong with this cloud! Would a Zookeeper issue explain my varied results
when querying a core directly?
Thanks again!
Tim
On 04/12/13 02:17 PM, Tim Vaillancourt wrote:
Hey guys,
I'm looking into a strange issue on an unhealthy 4.3.1 SolrCloud with
3-node external Zookeeper and 1 collection (2 shards, 2 replicas).
Currently we are noticing inconsistent results from the SolrCloud when
performing the same simple /select query many times to our collection.
Almost every other query the numFound count (and the returned data)
jumps between two very different values.
Initially I suspected a replica in a shard of the collection was
inconsistent (and every other request hit that node) and started
performing the same /select query direct to the individual cores of
the SolrCloud collection on each instance, only to notice the same
problem - the count jumps between two very different values!
I may be incorrect here, but I assumed when querying a single core of
a SolrCloud collection, the SolrCloud routing is bypassed and I am
talking directly to a plain/non-SolrCloud core.
As you can see here, the count for 1 core of my SolrCloud collection
fluctuates wildly, and is only receiving updates and no deletes to
explain the jumps:
"solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s
'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep
numFound
"response":{"numFound":123596839,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s
'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep
numFound
"response":{"numFound":84739144,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s
'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep
numFound
"response":{"numFound":123596839,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s
'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep
numFound
"response":{"numFound":84771358,"start":0,"maxScore":1.0,"docs":[]"
Could anyone help me understand why the same /select query direct to a
single core would return inconsistent, flapping results if there are
no deletes issued in my app to cause such jumps? Am I incorrect in my
assumption that I am querying the core "directly"?
An interesting observation is when I do an /admin/cores call to see
the docCount of the core's index, it does not fluctuate, only the
query result.
That was hard to explain, hopefully someone has some insight! :)
Thanks!
Tim