I am seeing some odd behavior with range facets across multiple shards. 
When querying each node directly with distrib=false, the facet returned matches 
what is expected. When the same query is run against the collection and spans 
the two shards, the facet "after" and "between" counts are wrong.


I can re-create a similar problem using the out-of-the-box example scripts and 
data. I am running on Windows and tested both Solr 5.0.0 and 5.1.0. These are the 
steps to reproduce:


c:\solr-5.1.0\solr -e cloud

These are the selections I made:


(specify 1-4 nodes) [2]: 2
Please enter the port for node1 [8983]: 8983
Please enter the port for node2 [7574]: 7574
Please provide a name for your new collection: [gettingstarted] gettingstarted
How many shards would you like to split gettingstarted into? [2] 2
How many replicas per shard would you like to create? [2] 1
Please choose a configuration ...  [data_driven_schema_configs] sample_techproducts_configs


I then posted some of the sample XMLs:

C:\solr-5.1.0\example\exampledocs> java -Dc=gettingstarted -jar post.jar vidcard.xml hd.xml ipod_other.xml ipod_video.xml mem.xml monitor.xml monitor2.xml mp500.xml sd500.xml


This first query is against node1 with distrib=false:

http://localhost:8983/solr/gettingstarted/select/?q=*:*&wt=json&indent=true&distrib=false&facet=true&facet.range=price&f.price.facet.range.start=0.00&f.price.facet.range.end=100.00&f.price.facet.range.gap=20&f.price.facet.range.other=all&defType=edismax&q.op=AND

There are 7 results (results omitted):
    "facet_ranges":{
      "price":{
        "counts":[
          "0.0",1,
          "20.0",0,
          "40.0",0,
          "60.0",0,
          "80.0",1],
        "gap":20.0,
        "start":0.0,
        "end":100.0,
        "before":0,
        "after":5,
        "between":2}},
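
The query above is easier to read with its parameters broken out. The following is a minimal sketch in Python that rebuilds the same URL; the parameter names and values are copied from this post, while `BASE_PARAMS` and `build_url` are hypothetical names used only for illustration. The remaining queries in this post reuse the same parameters and change only the port and the distrib flag.

```python
from urllib.parse import urlencode

# Parameters copied from the query URLs in this post.
BASE_PARAMS = {
    "q": "*:*",
    "wt": "json",
    "indent": "true",
    "facet": "true",
    "facet.range": "price",
    "f.price.facet.range.start": "0.00",
    "f.price.facet.range.end": "100.00",
    "f.price.facet.range.gap": "20",
    "f.price.facet.range.other": "all",
    "defType": "edismax",
    "q.op": "AND",
}


def build_url(port, distrib=True):
    """Build the select URL for the node on `port`.

    With distrib=False the request adds distrib=false, so only the
    core on that node answers; otherwise the whole collection does.
    """
    params = dict(BASE_PARAMS)
    if not distrib:
        params["distrib"] = "false"
    # Keep '*' and ':' unescaped so q=*:* stays readable.
    query = urlencode(params, safe="*:")
    return "http://localhost:%d/solr/gettingstarted/select/?%s" % (port, query)


if __name__ == "__main__":
    print(build_url(8983, distrib=False))  # node1 only
    print(build_url(7574, distrib=False))  # node2 only
    print(build_url(7574))                 # whole collection
```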


This second query is against node2 with distrib=false:
http://localhost:7574/solr/gettingstarted/select/?q=*:*&wt=json&indent=true&distrib=false&facet=true&facet.range=price&f.price.facet.range.start=0.00&f.price.facet.range.end=100.00&f.price.facet.range.gap=20&f.price.facet.range.other=all&defType=edismax&q.op=AND

There are 7 results (one product does not have a price):
    "facet_ranges":{
      "price":{
        "counts":[
          "0.0",1,
          "20.0",0,
          "40.0",0,
          "60.0",1,
          "80.0",0],
        "gap":20.0,
        "start":0.0,
        "end":100.0,
        "before":0,
        "after":4,
        "between":2}},


Finally querying the entire collection:
http://localhost:7574/solr/gettingstarted/select/?q=*:*&wt=json&indent=true&facet=true&facet.range=price&f.price.facet.range.start=0.00&f.price.facet.range.end=100.00&f.price.facet.range.gap=20&f.price.facet.range.other=all&defType=edismax&q.op=AND

There are 14 results (one without a price):
    "facet_ranges":{
      "price":{
        "counts":[
          "0.0",2,
          "20.0",0,
          "40.0",0,
          "60.0",1,
          "80.0",1],
        "gap":20.0,
        "start":0.0,
        "end":100.0,
        "before":0,
        "after":5,
        "between":2}},


Notice that both the "after" and the "between" are wrong here. The individual 
buckets do show the right values, but summing the per-shard responses I would 
expect "between" to be 4 and "after" to be 9.
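
As a cross-check, the two per-shard responses can be merged by hand. This is a minimal sketch in Python with the values copied from the two distrib=false responses above; `merge_range_facets` is a hypothetical helper, not Solr's own merge logic, and it assumes a correct merge simply sums matching buckets and the before/after/between values.

```python
# Values copied from the two distrib=false responses in this post.
node1 = {
    "counts": ["0.0", 1, "20.0", 0, "40.0", 0, "60.0", 0, "80.0", 1],
    "before": 0, "after": 5, "between": 2,
}
node2 = {
    "counts": ["0.0", 1, "20.0", 0, "40.0", 0, "60.0", 1, "80.0", 0],
    "before": 0, "after": 4, "between": 2,
}


def merge_range_facets(shards):
    """Sum bucket counts and before/after/between across shard responses."""
    merged_counts = {}
    totals = {"before": 0, "after": 0, "between": 0}
    for shard in shards:
        flat = shard["counts"]
        # "counts" is a flat list of alternating label, count pairs.
        for label, count in zip(flat[0::2], flat[1::2]):
            merged_counts[label] = merged_counts.get(label, 0) + count
        for key in totals:
            totals[key] += shard[key]
    return merged_counts, totals


if __name__ == "__main__":
    counts, totals = merge_range_facets([node1, node2])
    print(counts)
    print(totals)
```

The merged buckets can then be compared against the "counts" in the collection-wide response, and the totals against its "after" and "between" values.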


There appears to be a recently fixed issue 
(https://issues.apache.org/jira/browse/SOLR-6154) with range facets in 
distributed queries, but it was related to buckets not always appearing with 
mincount=1 for the field. This looks like a different problem.


Does anyone have any suggestions or notice anything wrong with my query parameters? 
I can open a Jira ticket but wanted to run it by the larger audience first to 
see if I am missing anything obvious.


Thanks,

Will
