Michael, please see my two questions inline below.
On Sat, Jan 31, 2015 at 10:14 PM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:

> We were using grouping (no DocValues, though) and recently switched to
> using block-indexing and joins (see
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers).
> We got a nice speedup on average (perhaps 2x faster) and an even better
> improvement in the worst times; overall the performance is much more
> predictable and better, and I suspect (haven't checked) that we may be
> using less heap too. The block indexing is cutting edge, a little
> complicated to get right, and I had to make some custom java code to get
> things just the way I wanted, but for best performance it does seem to be
> the way to go.
>
> Beware some gotchas:
>
> You have to reindex all the docs that participate in the parent-child
> relation so that each parent-child block gets indexed at once. This might
> cause difficulties, but for us and I suspect most people, it's the natural
> thing to do anyway.
>
> You can only handle a single relation this way since you have to
> restructure your index to use it; grouping is more flexible.
>

Michael, would you mind commenting on which relations in particular you need
to model? BJQ is definitely much more restrictive than grouping, but it still
has enough flexibility to cover the most frequent demands.

>
> Clients may not support the new block-indexing syntax (I think SolrJ has
> it, but the python client we were using did not);
>
> Converting an existing index requires special care; you basically have to
> delete all documents you are re-indexing
>
> The Solr query parsers don't support scoring the joined-from documents
> (child docs in the to-parent query, parent docs in the to-child query).
> This might not matter to you, but it was important for our use case
>

Would you mind leaving your vote at
https://issues.apache.org/jira/browse/SOLR-5662? It's not a big deal to
implement.
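For anyone following the thread who has not tried block join yet, here is a
minimal sketch of what a parent-child block and the two block join queries
look like. It assumes a Solr version that accepts nested JSON documents
through the `_childDocuments_` key; the field names (`doc_type`, `title`,
`text`) and the ids are hypothetical, and the script only builds the payload
and query strings without contacting a server.

```python
import json
from urllib.parse import urlencode


def make_block(parent_id, title, children):
    """Build one parent document with its children nested inside it.

    Sending parent and children together in one document is what keeps
    the whole block indexed at once.
    """
    return {
        "id": parent_id,
        "doc_type": "parent",  # hypothetical marker field for parents
        "title": title,
        "_childDocuments_": [
            {"id": f"{parent_id}-{i}", "doc_type": "child", "text": text}
            for i, text in enumerate(children, start=1)
        ],
    }


def to_parent_query(child_query, parent_filter="doc_type:parent"):
    """Return parents whose children match child_query.

    parent_filter must match all parent documents and nothing else.
    """
    return '{!parent which="%s"}%s' % (parent_filter, child_query)


def to_child_query(parent_query, parent_filter="doc_type:parent"):
    """Return children whose parent matches parent_query."""
    return '{!child of="%s"}%s' % (parent_filter, parent_query)


block = make_block(
    "book1", "Example book", ["chapter on joins", "chapter on grouping"]
)
payload = json.dumps([block], indent=2)  # body for a JSON update request
params = urlencode({"q": to_parent_query("text:joins"), "wt": "json"})

print(payload)
print(params)
```

The same filter string is reused in both directions because, in either query,
it has to identify the complete set of parent documents rather than the
particular parents you are searching for.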
> So there are some kinks still, but if you can make it work for you, it
> does seem to perform better than grouping.
>
> -Mike
>
>
> On 1/30/2015 4:10 PM, Cario, Elaine wrote:
>
>> Hi Shamik,
>>
>> We use DocValues for grouping, and although I have nothing to compare it
>> to (we started with DocValues), we are also seeing similar poor results as
>> you: easily 60% overhead compared to non-group queries. Looking around for
>> some solution, no quick fix is presenting itself unfortunately.
>> CollapsingQParserPlugin also is too limited for our needs.
>>
>> -----Original Message-----
>> From: Shamik Bandopadhyay [mailto:sham...@gmail.com]
>> Sent: Thursday, January 15, 2015 6:02 PM
>> To: solr-user@lucene.apache.org
>> Subject: Does DocValues improve Grouping performance ?
>>
>> Hi,
>>
>> Does use of DocValues provide any performance improvement for
>> Grouping ?
>> I've looked into the blog which mentions improving Grouping performance
>> through DocValues.
>>
>> https://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/
>>
>> Right now, Group by queries (which I can't sadly avoid) have become a huge
>> bottleneck. They have an overhead of 60-70% compared to the same query sans
>> group by. Unfortunately, I'm not able to use CollapsingQParserPlugin as it
>> doesn't have support for anything similar to the "group.facet" feature.
>>
>> My understanding of DocValues is that it's intended for faceting and
>> sorting. Just wondering if anyone has tried DocValues for Grouping and seen
>> any improvements ?
>>
>> -Thanks,
>> Shamik
>>
>
>

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>