Sounds like it's worth a try! Thanks Andre.
Tom

On 5 Dec 2012, at 17:49, Andre Bois-Crettez <andre.b...@kelkoo.com> wrote:

> If you group on source_id (keeping at most 3 docs per group), it should be
> enough to request 3 times as many documents as you need, then re-sort them
> by score and drop the lowest-ranked ones.
> 
> Is a 3x overhead acceptable?
> 
> 
> 
> On 12/05/2012 12:04 PM, Tom Mortimer wrote:
>> Hi everyone,
>> 
>> I've got a problem where I have docs with a source_id field, and there can 
>> be many docs from each source. Searches will typically return docs from many 
>> sources. I want to restrict the number of docs from each source in results, 
>> so there will be no more than (say) 3 docs from source_id=123 etc.
>> 
>> Field collapsing is the obvious approach, but I want to get the results back 
>> in relevancy order, not grouped by source_id. So it looks like I'll have to 
>> fetch more docs than I need to and re-sort them. It might even be better to 
>> count source_ids in the client code and drop excess docs that way, but the 
>> potential overhead is large.
>> 
>> Is there any way of doing this in Solr without hacking in a custom Lucene 
>> Collector (which doesn't look all that straightforward)?
>> 
>> cheers,
>> Tom
>> 
>> 
>> --
>> André Bois-Crettez
>> 
>> Search technology, Kelkoo
>> http://www.kelkoo.com/
> 
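For anyone finding this thread later, here is a minimal client-side sketch of the two ideas discussed above: grouping on source_id and re-sorting by score, and counting source_ids over a plain relevance-ordered result list. It is an illustration under assumptions, not a tested recipe. The function names are hypothetical, the response layout assumed is Solr's default grouped JSON format, and each doc is assumed to carry a "score" field (hence fl=*,score).

```python
from collections import Counter
from urllib.parse import urlencode


def grouped_query_params(query, wanted, per_source=3):
    """Build query-string parameters for the grouping approach (hypothetical helper)."""
    return urlencode({
        "q": query,
        "fl": "*,score",             # score is needed to re-sort on the client
        "group": "true",
        "group.field": "source_id",
        "group.limit": per_source,   # at most this many docs per source
        "rows": wanted,              # with grouping on, rows counts groups
        "wt": "json",
    })


def flatten_by_relevance(solr_response, wanted, field="source_id"):
    """Merge all groups into one list, best score first, and keep the top `wanted`."""
    groups = solr_response["grouped"][field]["groups"]
    docs = [doc for group in groups for doc in group["doclist"]["docs"]]
    docs.sort(key=lambda d: d["score"], reverse=True)
    return docs[:wanted]


def cap_per_source(ranked_docs, wanted, per_source=3, field="source_id"):
    """Alternative: walk an ungrouped, relevance-ordered list and drop excess docs."""
    seen = Counter()
    kept = []
    for doc in ranked_docs:
        if seen[doc[field]] < per_source:
            seen[doc[field]] += 1
            kept.append(doc)
            if len(kept) == wanted:
                break
    return kept
```

With the grouping approach, asking for `wanted` groups of up to 3 docs each gives at most 3 x `wanted` docs back, which is the 3x overhead mentioned above. With cap_per_source there is no fixed bound on how many docs must be fetched if a few sources dominate the top of the results, which is the "potentially large" overhead from the original question.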
