Sounds like it's worth a try! Thanks Andre.

Tom

On 5 Dec 2012, at 17:49, Andre Bois-Crettez <andre.b...@kelkoo.com> wrote:
> If you do grouping on source_id, it should be enough to request 3 times
> more documents than you need, then reorder and drop the bottom.
>
> Is a 3x overhead acceptable?
>
> On 12/05/2012 12:04 PM, Tom Mortimer wrote:
>> Hi everyone,
>>
>> I've got a problem where I have docs with a source_id field, and there can
>> be many docs from each source. Searches will typically return docs from many
>> sources. I want to restrict the number of docs from each source in results,
>> so there will be no more than (say) 3 docs from source_id=123 etc.
>>
>> Field collapsing is the obvious approach, but I want to get the results back
>> in relevancy order, not grouped by source_id. So it looks like I'll have to
>> fetch more docs than I need to and re-sort them. It might even be better to
>> count source_ids in the client code and drop excess docs that way, but the
>> potential overhead is large.
>>
>> Is there any way of doing this in Solr without hacking in a custom Lucene
>> Collector? (which doesn't look all that straightforward).
>>
>> cheers,
>> Tom
>
> --
> André Bois-Crettez
>
> Search technology, Kelkoo
> http://www.kelkoo.com/
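
For anyone following the thread, here is a rough sketch of the client-side step discussed above: take an over-fetched result list, put it back in relevancy order, and drop anything beyond the per-source cap. It assumes each doc has already been fetched from Solr as a dict with `source_id` and `score` keys; the function name `cap_per_source` and its parameters are made up for illustration and are not part of Solr or any client library.

```python
from collections import Counter


def cap_per_source(docs, max_per_source=3, limit=10):
    """Return up to `limit` docs in descending score order,
    keeping at most `max_per_source` docs for any one source_id.

    Assumes each doc is a dict with "source_id" and "score" keys
    (hypothetical field layout, adjust to your schema).
    """
    seen = Counter()  # number of docs kept so far per source_id
    kept = []
    # Re-sort by relevancy, since grouped results come back per source.
    for doc in sorted(docs, key=lambda d: d["score"], reverse=True):
        if len(kept) >= limit:
            break  # we have enough; drop the rest
        if seen[doc["source_id"]] >= max_per_source:
            continue  # this source already has its quota
        seen[doc["source_id"]] += 1
        kept.append(doc)
    return kept
```

If the docs come from a grouped query that already limits each group to 3, the cap check never fires and only the re-sort and the final cut-off do any work. If they come from a plain over-fetched query, the cap is what enforces the limit, and how much to over-fetch depends on how skewed the sources are.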