Hi Toke,

Thanks for your reply.

> StatsComponent has countDistinct which is accurate, but has a warning
> that is might be very heavy.

Does this mean that using this method will probably be slow as well?

By the way, I found that all the slow queries occur in searches that
return more than 300,000 ngroups, and the time taken grows in proportion
to the number of records. At 300,000 ngroups, I am still able to get a
response within 3 seconds, but a search with more than 6,000,000 ngroups
takes almost 2 minutes to return.

Regards,
Edwin

On 11 March 2016 at 17:19, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Fri, 2016-03-11 at 12:11 +0800, Zheng Lin Edwin Yeo wrote:
> > I would like to check, will using the results grouping with group.ngroups
> > (which will include the number of groups that have matched the query) in
> > the search affects the performance of the Solr?
>
> Yes. Calculating ngroups is done by collecting all the groups in a
> shard. They are kept as BytesRefs, which means a lot of lookups plus
> memory overhead proportional to the ngroups count.
>
> > I required the value of the number of groups that have matched the query.
> > Besides this, is there other way which I can retrieve that value?
>
> JSON Facets has numBuckets which is fast, but might not be accurate:
> https://issues.apache.org/jira/browse/SOLR-8741
>
> StatsComponent has countDistinct which is accurate, but has a warning
> that is might be very heavy.
>
>
> If you want accurate counts and if you are using SolrCloud, each shard
> must return the full list of values, independent of whether you use
> grouping, faceting or stats. Depending on cardinality this can be very
> heavy.
>
> > I have more than 10 million documents, with an index size of more than
> > 500GB, and I'm using Solr 5.4.0.
>
> - Toke Eskildsen, State and University Library, Denmark
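
For reference, the three ways of counting groups mentioned in this thread (group.ngroups, JSON Facets numBuckets, and StatsComponent countDistinct) could be requested roughly as sketched below. This is only an illustration: the field name "group_field", the helper name group_count_params, and the host and collection in the usage note are hypothetical placeholders, not values from this thread.

```python
import json
from urllib.parse import urlencode


def group_count_params(field, query="*:*"):
    """Build request parameters for three ways of counting distinct groups.

    `field` is a hypothetical field name; substitute the field you group on.
    """
    base = {"q": query, "rows": 0, "wt": "json"}

    # 1. Result grouping with group.ngroups (exact, but collects all groups).
    grouping = dict(base)
    grouping.update({
        "group": "true",
        "group.field": field,
        "group.ngroups": "true",
    })

    # 2. JSON Facets terms facet with numBuckets (fast, may be inexact
    #    in distributed mode, see SOLR-8741).
    json_facet = dict(base)
    json_facet["json.facet"] = json.dumps({
        "groups": {"type": "terms", "field": field, "numBuckets": True}
    })

    # 3. StatsComponent countDistinct (exact, potentially very heavy).
    stats = dict(base)
    stats.update({
        "stats": "true",
        "stats.field": "{!countDistinct=true}" + field,
    })

    return {"grouping": grouping, "json_facet": json_facet, "stats": stats}


if __name__ == "__main__":
    for name, params in group_count_params("group_field").items():
        # Print the query string that would follow ".../select?"
        print(name, urlencode(params))
```

Each query string would be appended to a select handler URL such as http://localhost:8983/solr/mycollection/select? (host and collection name assumed). Timing the three variants against the same query should show whether the alternatives are any faster at this cardinality.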