Hi, I'm using the grouping feature of Solr to return a list of unique
documents together with a count of the duplicates.

Essentially I use Solr's signature algorithm to create the "signature"
field and use grouping on it.

To provide good numbers for paging through my result list, I'd like to
compute the total number of documents found (= matches) and the number
of unique documents (= ngroups). Unfortunately, enabling
"group.ngroups" slows the query down considerably (from 500 ms to
23,000 ms for a result set of roughly 300,000 documents).
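
A minimal sketch of the kind of request described above, showing the
only parameter that differs between the fast and the slow query. The
host, core name, query string, and the helper name build_group_query
are placeholders for illustration; "group", "group.field", and
"group.ngroups" are the standard Solr result grouping parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_group_query(q, with_ngroups, start=0, rows=10):
    """Build a grouped select URL on the "signature" field.

    with_ngroups toggles group.ngroups, the parameter that makes Solr
    count the distinct groups in addition to the total matches.
    """
    params = {
        "q": q,
        "start": start,
        "rows": rows,
        "wt": "json",
        "group": "true",
        "group.field": "signature",
        "group.ngroups": "true" if with_ngroups else "false",
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Without ngroups: the response reports only the total "matches".
fast_url = build_group_query("some query", with_ngroups=False)

# With ngroups: the response also reports "ngroups", at a much higher cost.
slow_url = build_group_query("some query", with_ngroups=True)
```

The two URLs are identical apart from the value of group.ngroups, so
the difference in response time can be attributed to the group count
alone.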

Is there a faster way to compute the number of groups (or unique
values in the signature field) in the search result? My Solr instance
currently contains about 50 million documents and around 10% of them
are duplicates.

Thank you,
Michael
