How can I group my Solr query results using a numeric field into x buckets,
where the bucket start and end values are determined when the query is run?

For example, if I want to count and group documents into 5 buckets by a
wordCount field, the results should be:
250-500 words: 3438 results
500-750 words: 4554 results
750-1000 words: 9854 results
1000-1250 words: 3439 results
1250-1500 words: 38 results

Solr's faceting API docs assume that the bucket boundaries are known in
advance, but that isn't possible here: for a numeric field, the lowest and
highest bucket boundaries depend on the search results.

My current query (which doesn't work) is:


curl http://localhost:8983/solr/pages/query -d '
q=*:*&
rows=0&
json.facet={
  wordCount : {
    type: range,
    field : wordCount,
    start : min(wordCount),
    end : max(wordCount),
    gap : 1000
  }
}'

I have read this question, which suggests calculating the bucket boundaries
in application code before sending them to Solr for counting. This is not
ideal because it involves querying Solr multiple times. The answer is also
several years out of date: since then Solr has added the JSON faceting API,
which allows more complicated faceting settings.
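
For reference, the application-side approach I am trying to avoid would look
roughly like the sketch below. It is only a sketch, under these assumptions:
wordCount is an integer field, at least one document matches the query, and
the requests are sent as JSON Request API bodies to the same /query handler
as above. The helper names (solr_post, build_range_facet, dynamic_buckets)
and the facet labels (lo, hi, by_range) are made up for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this post.
SOLR_QUERY_URL = "http://localhost:8983/solr/pages/query"


def solr_post(body, url=SOLR_QUERY_URL):
    """Hypothetical helper: POST a JSON request body to Solr, return parsed JSON."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def build_range_facet(field, lo, hi, buckets):
    """Build a range facet with `buckets` equal-width buckets covering lo..hi.

    Assumes integer values. The gap is rounded up so that `hi` falls strictly
    below `end`, since a range bucket excludes its upper bound by default.
    """
    span = hi - lo + 1
    gap = -(-span // buckets)  # integer ceiling division
    return {
        "type": "range",
        "field": field,
        "start": lo,
        "end": lo + gap * buckets,
        "gap": gap,
    }


def dynamic_buckets(field="wordCount", buckets=5, query="*:*"):
    """Two round trips: fetch min/max first, then count per bucket."""
    stats = solr_post({
        "query": query,
        "limit": 0,
        "facet": {"lo": f"min({field})", "hi": f"max({field})"},
    })["facets"]
    # Assumption: the query matched at least one document; otherwise the
    # "lo"/"hi" keys are absent and this raises KeyError.
    facet = build_range_facet(field, int(stats["lo"]), int(stats["hi"]), buckets)
    result = solr_post({"query": query, "limit": 0, "facet": {"by_range": facet}})
    return result["facets"]["by_range"]["buckets"]
```

With a minimum of 250 and a maximum of 1499, build_range_facet produces
start 250, end 1500 and gap 250, i.e. the five buckets in the example above.
The drawback is the extra round trip: the min/max request has to finish
before the range facet can be built.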

In SQL, this type of dynamic bucketing is possible with union queries, in
which each query in the union calculates a specific bucket's lower and upper
bounds and counts the results in that bucket. So it seems weird that in
Solr, where a lot of effort has gone into making faceting easy, this kind of
query is not possible.


