Ah, OK. I didn't get that when I read your first e-mail... Hmmm, this is still a puzzle then. Tail the respective Solr logs, you _should_ be seeing the sub-query go to each of them and the sub-query _should_ carry along all of the faceting information. Or this might just be a flat bug...
Best,
Erick

On Tue, Dec 16, 2014 at 2:46 PM, David Smith <dsmiths...@yahoo.com.invalid> wrote:
> Hi Erick,
>
> Thanks for your reply.
>
> My test environment only has one shard and one replica per collection. So, I
> think there is no possibility of replicas getting out of sync. Here is how I
> create each (month-based) collection:
>
> http://192.168.59.103:8983/solr/admin/collections?action=CREATE&name=2014_01&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=main_conf
> http://192.168.59.103:8983/solr/admin/collections?action=CREATE&name=2014_02&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=main_conf
> http://192.168.59.103:8983/solr/admin/collections?action=CREATE&name=2014_03&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=main_conf
> ...etc, etc...
>
> Still, I think you are on to something. I had already noticed that querying
> one collection at a time works. For example, if I change my query
> oh-so-slightly from this:
>
> "....collection=2014_04,2014_03...."
>
> to this
>
> "...collection=2014_04...."
>
> Then, the results are correct 100% of the time. I think substantively this is
> the same as specifying the name of the shard since, again, in my test
> environment I only have one shard per collection anyway.
>
> I should mention that the "2014_03" collection is empty. 0 documents. All 3
> documents which satisfy the facet range are in the "2014_04" collection. So,
> it's a real head-scratcher that introducing that collection name into the
> query makes the results misbehave.
>
> Kind regards,
> David
>
> On Tuesday, December 16, 2014 2:25 PM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
> bq: Facet counts include deleted documents until the segments merge
>
> Whoa! Facet counts do _not_ require segment merging to be accurate.
> What merging does is remove the _term_ information associated with
> deleted documents, and removes their contribution to the TF/IDF
> scores.
>
> David:
> Hmmm, what happens if you direct the query not only to a single
> collection, but to a single shard? Add &distrib=false to the query and
> point it to each of your replicas (one collection at a time). The
> expectation is that each replica for a slice within a collection has
> identical documents.
>
> One possibility is that somehow your shards are out of sync on a
> collection. So the internal load balancing that happens sometimes
> sends the query to one replica and sometimes to another. 2 replicas
> (leader and follower) and 50% failure, coincidence?
>
> That just bumps the question up another level of course, the next
> question is _why_ is the shard out of sync. So in that case I'd issue
> a commit to all the collections on the off chance that somehow that
> didn't happen and try again (very low probability that this is the
> root cause, but you never know).
>
> But it sure sounds like one replica doesn't agree with another, so the
> above will give us a place to look.
>
> Best,
> Erick
>
> On Tue, Dec 16, 2014 at 12:12 PM, David Smith
> <dsmiths...@yahoo.com.invalid> wrote:
>> Alex,
>> Good suggestion, but in this case, no. This example is from a cleanroom
>> type test environment where the collections have very recently been created,
>> there are only 4 documents total across all collections, and no deletes
>> have been issued.
>> Kind regards,
>> David
>>
>> On Tuesday, December 16, 2014 12:01 PM, Alexandre Rafalovitch
>> <arafa...@gmail.com> wrote:
>>
>> Facet counts include deleted documents until the segments merge. Could that
>> be an issue?
>>
>> Regards,
>> Alex
>>
>> On 16/12/2014 12:18 pm, "David Smith" <dsmiths...@yahoo.com.invalid> wrote:
>>
>>> I have a prototype SolrCloud 4.10.2 setup with 13 collections (of 1
>>> replica, 1 shard each) and a separate 1-node Zookeeper 3.4.6.
>>> The very first app test case I wrote is failing intermittently in this
>>> environment, when I only have 4 documents ingested into the cloud.
>>> I dug in and found when I query against multiple collections, using the
>>> "collection=" parameter, the aggregates I request are correct about 50% of
>>> the time. The other 50% of the time, the aggregate returned by Solr is not
>>> correct. Note this is for the identical query. In other words, I can run
>>> the same query multiple times in a row, and get different answers.
>>>
>>> The simplest version of the query that still exhibits the odd behavior is
>>> as follows:
>>>
>>> http://192.168.59.103:8985/solr/query_handler/query?facet.range=eventDate&f.eventDate.facet.range.end=2014-12-31T23:59:59.999Z&f.eventDate.facet.range.gap=%2B1DAY&fl=eventDate,id&start=0&collection=2014_04,2014_03&rows=10&f.eventDate.facet.range.start=2014-01-01T00:00:00.000Z&q=*:*&f.eventDate.facet.mincount=1&facet=true
>>>
>>> When it SUCCEEDS, the aggregate correctly appears like this:
>>>
>>>   "facet_counts":{
>>>     "facet_queries":{},
>>>     "facet_fields":{},
>>>     "facet_dates":{},
>>>     "facet_ranges":{
>>>       "eventDate":{
>>>         "counts":[
>>>           "2014-04-01T00:00:00Z",3],
>>>         "gap":"+1DAY",
>>>         "start":"2014-01-01T00:00:00Z",
>>>         "end":"2015-01-01T00:00:00Z"}},
>>>     "facet_intervals":{}}}
>>>
>>> When it FAILS, note that the counts[] array is empty:
>>>
>>>   "facet_counts":{
>>>     "facet_queries":{},
>>>     "facet_fields":{},
>>>     "facet_dates":{},
>>>     "facet_ranges":{
>>>       "eventDate":{
>>>         "counts":[],
>>>         "gap":"+1DAY",
>>>         "start":"2014-01-01T00:00:00Z",
>>>         "end":"2015-01-01T00:00:00Z"}},
>>>     "facet_intervals":{}}}
>>>
>>> If I further simplify the query, by removing range options or reducing to
>>> one (1) collection name, then the problem goes away.
>>>
>>> The solr logs are clean at INFO level, and there is no substantive
>>> difference in log output when the query succeeds vs fails, leaving me
>>> stumped where to look next. Suggestions welcome.
>>>
>>> Regards,
>>> David
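
For anyone trying to reproduce the intermittent result described in this thread, a small script can help put a number on the "about 50%" observation. The following is only a rough sketch and is not something posted by anyone on the list. The host, port, handler path, and query parameters are copied from David's query. The function names (`build_url`, `range_counts`, `tally`, `fetch`) are hypothetical, and the `wt=json` parameter is an added assumption so that the response is certain to parse as JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host, port, and handler path are taken from the query quoted in the thread.
BASE = "http://192.168.59.103:8985/solr/query_handler/query"


def build_url(collections, base=BASE):
    """Build the facet.range query from the thread for the given collections."""
    params = [
        ("q", "*:*"),
        ("fl", "eventDate,id"),
        ("start", "0"),
        ("rows", "10"),
        ("collection", ",".join(collections)),
        ("facet", "true"),
        ("facet.range", "eventDate"),
        ("f.eventDate.facet.range.start", "2014-01-01T00:00:00.000Z"),
        ("f.eventDate.facet.range.end", "2014-12-31T23:59:59.999Z"),
        ("f.eventDate.facet.range.gap", "+1DAY"),  # urlencode emits %2B1DAY
        ("f.eventDate.facet.mincount", "1"),
        ("wt", "json"),  # assumption: not in the original query
    ]
    return base + "?" + urlencode(params)


def range_counts(response, field="eventDate"):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_ranges"][field]["counts"]
    return dict(zip(flat[0::2], flat[1::2]))


def tally(responses, field="eventDate"):
    """Count how many responses have an empty range-facet counts list."""
    empty = sum(1 for r in responses if not range_counts(r, field))
    return {"empty": empty, "non_empty": len(responses) - empty}


def fetch(url):
    """Fetch one response from a live Solr node and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)
```

Against a live cluster, something like `tally([fetch(build_url(["2014_04", "2014_03"])) for _ in range(20)])` would show how often the counts come back empty, and the same call with `["2014_04"]` alone gives the single-collection baseline that David reports as always correct.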