Thank you so much, Erick. The CollapsingQParserPlugin was exactly what I needed. I'm able to group by the field and then do the collapse from products into items and get the correct answer. Collapsing is also more appropriate for the general grouping we now need to do all the time, so we'll probably use that instead.
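For anyone who wants to try this from a script, here is a minimal Python sketch of the request parameters involved: a collapse filter on int_item_id carrying a tag, plus a JSON query facet that excludes that tag so it counts documents before the collapse. The function name build_params, the q=*:* main query and wt=json are illustrative assumptions, not part of our real query; the sketch only builds the query string and does not contact Solr.

```python
import json
from urllib.parse import urlencode


def build_params(collapse_field="int_item_id", tag="collapse"):
    """Build Solr request parameters for a tagged collapse filter and a
    facet that ignores that filter (hypothetical helper for illustration)."""
    # Query facet whose domain excludes the tagged filter, so its count
    # reflects the result set before collapsing.
    facet = {
        "pre_collapse_count": {
            "type": "query",
            "domain": {"excludeTags": tag},
        }
    }
    return [
        ("q", "*:*"),  # assumed main query; substitute your own
        ("fq", "{!collapse field=%s tag=%s}" % (collapse_field, tag)),
        ("json.facet", json.dumps(facet)),
        ("wt", "json"),  # assumed response format
    ]


# URL-encode the parameters; append the result to your collection's
# /select URL to run the query.
query_string = urlencode(build_params())
print(query_string)
```

The tag value must be the same string in both the fq local params and excludeTags, which is why the sketch passes it through a single argument.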
There was one part that took me a while to figure out, though: getting the unique counts of a field with facets from BEFORE the collapse kicked in. For anyone else who wants to know how to do this, I put a tag on the collapse fq like this:

    fq={!collapse field=int_item_id tag=collapse}

Then, by expressing my facet count query with excludeTags like this, I'm able to get the pre_collapse_count:

    json.facet={'pre_collapse_count':{'type':'query','domain':{'excludeTags':'collapse'}}}

Which works perfectly. According to http://yonik.com/multi-select-faceting/, you need Solr 5.4 for this to work with JSON facets. But I think you can also do it with old-style facets by putting {!ex=collapse}int_item_id in the facet.field parameter.

On Thu, May 12, 2016 at 5:10 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> A couple of ideas. If this is 5x consider Streaming Aggregation.
> The idea here is that you stream the docs back to a SolrJ client and
> slice and dice them there. SA is designed to export 400K docs/sec,
> but the returned values must be DocValues (i.e. no text types, strings
> are OK).
>
> Have you seen the CollapsingQParserPlugin? That might help.
>
> Or push back at the product manager and say "why are we wasting
> time supporting something nobody uses?" ;)
>
> Best,
> Erick
>
> On Wed, May 11, 2016 at 1:45 AM, Callum Lamb <cl...@mintel.com> wrote:
> > We have a horrible Solr query that groups by a field and then sorts by
> > another. My understanding is that for this to happen it has to sort by the
> > grouping field, group it and then sort the resulting result set. It's not a
> > fast query.
> >
> > Unfortunately our documents now need to be grouped as well (product
> > variants into items) and that grouping query needs to work on that grouping
> > instead. As far as I'm aware you can't do nested grouping in Solr.
> >
> > In summary we want to have product variants that get grouped into Items and
> > then they get grouped by field and then sorted by another.
> >
> > The solution doesn't need to be fast, it's a rarely ever used legacy part
> > of our application that's basically never used and we just need it to work.
> > Our dataset isn't huge so it doesn't matter if Solr has to scan the entire
> > index (I think the query does this atm anyway). But downloading the entire
> > document set and doing the operations in ETL isn't something we really want
> > to dedicate time to unless it's impossible to represent this in Solr
> > queries.
> >
> > Any ideas?
> >
> > Cheers,
> >
> > Callum.
> >
> > --
> >
> > Mintel Group Ltd | 11 Pilgrim Street | London | EC4V 6RN
> > Registered in England: Number 1475918. | VAT Number: GB 232 9342 72
> >
> > Contact details for our other offices can be found at
> > http://www.mintel.com/office-locations.
> >
> > This email and any attachments may include content that is confidential,
> > privileged or otherwise protected under applicable law. Unauthorised
> > disclosure, copying, distribution or use of the contents is prohibited
> > and may be unlawful. If you have received this email in error, including
> > without appropriate authorisation, then please reply to the sender about
> > the error and delete this email and any attachments.