In our case, we have fewer than 20 distinct groups, and a typical search result will return about 10 of those groups (usually 3 documents per group). We use default sorting by score. There are 12 million docs spread across 3 shards. We set group.facet=false. The wkcluster field is a string field indexed using DocValues. Each document will have a value for the wkcluster field. Sample query:
?q=*%3A*&rows=100&wt=xml&indent=true&group=true&group.field=wkcluster&group.limit=3&hl=false&facet=false&group.facet=false

This query returned 18 groups and took about 1.7 seconds, even after executing it a few times.

The main drag we see is that there are 2 internal queries (on each shard) generated when we have group=true. They are essentially the same except for additional group.topgroups params in the 2nd query. These queries seem to be done serially, so it almost doubles the latency. I'm not sure if it's something we're doing (or not doing) in the query, or this is just the way it is.

I don't think we can use the aforementioned block-join feature here, as it would be difficult for us to build document blocks based on the group (and there have been requirements to group on different fields).

Unfortunately the grouping feature has been extremely popular in the production applications running on our search platform (we’re migrating from Fast, where grouping performance was quite good). We do have other performance issues (currently we are investigating an issue with a scale function); we are hoping we can resolve those to the point where the double query for grouping isn't so noticeable.

-----Original Message-----
From: Joel Bernstein [mailto:joels...@gmail.com]
Sent: Friday, January 30, 2015 6:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Does DocValues improve Grouping performance ?

A few questions so we can better understand the scale of grouping you're trying to accomplish:

How many distinct groups do you typically have in a search result?

How many distinct groups are there in the field you are grouping on?

How many results are you trying to group in a query?

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Jan 30, 2015 at 4:10 PM, Cario, Elaine <elaine.ca...@wolterskluwer.com> wrote:

> Hi Shamik,
>
> We use DocValues for grouping, and although I have nothing to compare
> it to (we started with DocValues), we are also seeing similarly poor
> results: easily 60% overhead compared to non-group queries. Looking
> around for some solution, no quick fix is presenting itself unfortunately.
> CollapsingQParserPlugin also is too limited for our needs.
>
> -----Original Message-----
> From: Shamik Bandopadhyay [mailto:sham...@gmail.com]
> Sent: Thursday, January 15, 2015 6:02 PM
> To: solr-user@lucene.apache.org
> Subject: Does DocValues improve Grouping performance ?
>
> Hi,
>
> Does use of DocValues provide any performance improvement for Grouping ?
> I've looked into the blog which mentions improving Grouping performance
> through DocValues.
>
> https://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/
>
> Right now, Group by queries (which I sadly can't avoid) have become a
> huge bottleneck. They have an overhead of 60-70% compared to the same
> query sans group by. Unfortunately, I'm not able to use
> CollapsingQParserPlugin as it doesn't have support similar to the
> "group.facet" feature.
>
> My understanding of DocValues is that it's intended for faceting and
> sorting. Just wondering if anyone has tried DocValues for Grouping
> and seen any improvements ?
>
> -Thanks,
> Shamik
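
For anyone trying to reproduce the sample query from the top of this thread, here is a minimal Python sketch that decodes its query string into individual parameters and rebuilds it. The helper names (parse_params, build_query) are made up for illustration and are not part of Solr or any client library; only the parameter names and values come from the query quoted in the message.

```python
from urllib.parse import parse_qsl, urlencode

# Query string copied from the sample query in this thread (leading "?" dropped).
SAMPLE_QUERY = (
    "q=*%3A*&rows=100&wt=xml&indent=true&group=true&group.field=wkcluster"
    "&group.limit=3&hl=false&facet=false&group.facet=false"
)


def parse_params(query_string):
    """Decode a URL query string into a dict of parameters (hypothetical helper)."""
    return dict(parse_qsl(query_string, keep_blank_values=True))


def build_query(params):
    """Encode parameters back into a query string, leaving '*' unescaped."""
    return urlencode(params, safe="*")


params = parse_params(SAMPLE_QUERY)

# Only the parameters that control grouping.
grouping = {name: value for name, value in params.items() if name.startswith("group")}

if __name__ == "__main__":
    for name, value in params.items():
        print(f"{name} = {value}")
```

Keeping the parameters in a dict makes it easy to time the same request with and without group=true when measuring the grouping overhead discussed above.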