Hi,

We encountered an issue when using the refine parameter on a terms subfacet
nested inside a range facet.
With refine enabled, every count in the response is double the
corresponding count in the response without refine.
We are running Solr 6.6.1 in a cloud setup.

If I execute the query:

curl http://localhost:8899/solr/data/select -d '{ "params" :
{"wt":"json","rows":0,"json.facet":"
  {

    \"MaximumAge_f\":
    {
      \"type\":\"range\",
      \"field\":\"MaximumAge_f\",
      \"start\":0.0,
      \"end\":55000.0,
      \"gap\":1000.0,
      \"other\":\"between\",
      \"facet\":
      {
        \"Gender_sf\":
        {
          \"type\":\"terms\",
          \"field\":\"Gender_sf\",
          \"missing\":true,
          \"refine\":true,
          \"overrequest\":24,
          \"limit\":12,
          \"offset\":0
        }
      }
    }
  }",
  "q":"*:*"
}'
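For anyone who wants to reproduce this with and without refine, here is a minimal Python sketch that builds the same request body as the curl command above. The function name `build_request_body` and its arguments are hypothetical helpers for illustration, not part of Solr; the field names and values are copied from the query.

```python
import json


def build_request_body(refine: bool, gap: float = 1000.0) -> str:
    """Build the JSON body from the post, optionally with refine on the subfacet."""
    gender_facet = {
        "type": "terms",
        "field": "Gender_sf",
        "missing": True,
        "overrequest": 24,
        "limit": 12,
        "offset": 0,
    }
    if refine:
        # The only difference between the two requests being compared.
        gender_facet["refine"] = True

    facet = {
        "MaximumAge_f": {
            "type": "range",
            "field": "MaximumAge_f",
            "start": 0.0,
            "end": 55000.0,
            "gap": gap,
            "other": "between",
            "facet": {"Gender_sf": gender_facet},
        }
    }
    # As in the curl command, json.facet is passed as a JSON string inside "params".
    params = {"wt": "json", "rows": 0, "json.facet": json.dumps(facet), "q": "*:*"}
    return json.dumps({"params": params})


if __name__ == "__main__":
    print(build_request_body(refine=True))
```

The printed string can be used as the `-d` payload of the curl command; calling it with `refine=False`, or with `gap=5000.0`, gives the other variants discussed in this post.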

I get the following response:

  "facets": {
    "count": 379417,
    "MaximumAge_f": {
      "buckets": [
        {
          "val": 0,
          "count": 8252,
          "Gender_sf": {
            "buckets": [
              {
                "val": "All",
                "count": 8152
              },
              {
                "val": "Male",
                "count": 74
              },
              {
                "val": "Female",
                "count": 26
              }
            ],
            "missing": {
              "count": 0
            }
          }
        },
...

If I execute the same query WITHOUT refine: true in the subfacet, I get the
following response:

  "facets": {
    "count": 379417,
    "MaximumAge_f": {
      "buckets": [
        {
          "val": 0,
          "count": 4126,
          "Gender_sf": {
            "buckets": [
              {
                "val": "All",
                "count": 4076
              },
              {
                "val": "Male",
                "count": 37
              },
              {
                "val": "Female",
                "count": 13
              }
            ],
            "missing": {
              "count": 0
            }
          }
        },
...

Every count in every bucket differs by a factor of 2.
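The factor can be checked mechanically. The sketch below uses only the first range bucket quoted in each response above; `count_ratios` is a hypothetical helper name, and the remaining buckets (elided with "..." in this post) are not included.

```python
def count_ratios(refined: dict, unrefined: dict) -> dict:
    """Return refined/unrefined count ratios for one range bucket and its subfacet."""
    ratios = {"bucket": refined["count"] / unrefined["count"]}
    plain = {b["val"]: b["count"] for b in unrefined["Gender_sf"]["buckets"]}
    for b in refined["Gender_sf"]["buckets"]:
        ratios[b["val"]] = b["count"] / plain[b["val"]]
    return ratios


# First bucket (val 0) of the response WITH refine, copied from the post.
refined = {
    "val": 0,
    "count": 8252,
    "Gender_sf": {"buckets": [
        {"val": "All", "count": 8152},
        {"val": "Male", "count": 74},
        {"val": "Female", "count": 26},
    ]},
}

# First bucket (val 0) of the response WITHOUT refine, copied from the post.
unrefined = {
    "val": 0,
    "count": 4126,
    "Gender_sf": {"buckets": [
        {"val": "All", "count": 4076},
        {"val": "Male", "count": 37},
        {"val": "Female", "count": 13},
    ]},
}

ratios = count_ratios(refined, unrefined)
print(ratios)  # {'bucket': 2.0, 'All': 2.0, 'Male': 2.0, 'Female': 2.0}
```

Note that the range bucket's own count doubles as well, not only the counts of the refined terms subfacet.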

If I perform the same queries with a larger range gap, e.g.
      \"start\":0.0,
      \"end\":55000.0,
      \"gap\":5000.0,
there is no difference between the response with and without refine: true.

Is this a known issue, or is there something we are overlooking?
Also, is it known whether Solr 7 shows the same behavior?

Kind regards, Tom
