On Dec 17, 2009, at 11:59 AM, Aleksander Stensby wrote:
A follow up question on this Hoss:
If I have a set of documents, let's say this email thread. Each email has a
unique author. All emails in the thread are indexed with "threadid=33". If I
want to count the number of unique authors in this email thread, I could go
along the lines you mention at the end:
rows=0&q=threadid:33&facet=true&facet.field=author&facet.limit=-1
then count all returned facets. This works, but becomes infeasible when the
number of unique author values in the index is large. Right?
So the facet.limit=-1 solution just does not work for such fields. But it
would work well for "category" if the number of unique categories is low...
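
A minimal sketch of the facet-counting step described above, in Python. It
assumes the response is requested with wt=json and uses Solr's default flat
facet layout ([value, count, value, count, ...]). The function names are
hypothetical, and facet.mincount=1 is an addition not in the original query;
it asks Solr to leave out authors with no hits in the thread.

```python
from urllib.parse import urlencode


def build_facet_query(thread_id, field="author"):
    """Build the query string for a facet-only request on one thread."""
    params = [
        ("q", f"threadid:{thread_id}"),
        ("rows", 0),             # return no documents, only facet counts
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", -1),     # no cap on the number of facet values
        ("facet.mincount", 1),   # assumption: drop values with zero hits
        ("wt", "json"),
    ]
    return urlencode(params)


def count_unique_values(response, field="author"):
    """Count facet values with a non-zero count in a parsed JSON response.

    Assumes the flat layout [value, count, value, count, ...].
    """
    flat = response["facet_counts"]["facet_fields"][field]
    counts = flat[1::2]  # every second element is a count
    return sum(1 for c in counts if c > 0)
```

With facet.mincount=1 the response only lists authors that actually appear in
the thread, but the work Solr does to compute the facets may still grow with
the number of distinct author values in the index.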
It's almost faster to retrieve all entries from the thread and count the
number of unique authors programmatically... But obviously, I don't want
to do that!
So, how would you go about finding the number of unique authors in this
scenario?
One possible solution is "tree" faceting:
https://issues.apache.org/jira/browse/SOLR-792
&facet.tree=threadid,author
Could be a LARGE amount of data though!
Erik