Of course -- and now I feel silly for not having thought of that :-). Thanks!
On Jan 6, 2008 4:37 PM, Walter Underwood <[EMAIL PROTECTED]> wrote:
> Field collapsing might work for you. I haven't looked at the details
> of the implementation and it is still in development, but it is the
> right sort of feature. You'd like to see the top N matches for
> each value of the author field, right?
>
> wunder
>
> On 1/6/08 3:25 PM, "Charles Hornberger" <[EMAIL PROTECTED]> wrote:
>
> > I've got a problem that I'm not quite sure how to solve and am wondering if
> > anyone has any insight or similar experience to share.
> >
> > Here's the situation: Documents in our Solr index include a field
> > identifying their author (we have 1000s of authors). When displaying an
> > individual document, we also want to display a list of related documents by
> > other authors*, so we do a search using the current document's title, author
> > name, summary, and keywords as the query. Sometimes the search yields a
> > results set in which all of the top n documents (in reality, n is ~10) are
> > from one author.
> >
> > Apparently, people don't like this.
> >
> > So what is being asked for is a result set in which no more than m (where m
> > is probably 3) of the top n are from any single author. (It's not that we
> > want to exclude documents m+1, m+2, etc. by each author from the result set
> > entirely; we just don't want them in the top n.)
> >
> > More generically, I can imagine this as a feature that might be occasionally
> > useful, e.g. as a kind of "diversity boost function" to be used when scoring
> > results, where you specify the fields for which you want to enforce
> > diversity (e.g., author name, genre, color, etc.), and provide your values
> > for n and m, and Solr, uhm, obliges. :-)
> >
> > Any tips or ideas on how to proceed? (We're using Solr 1.2 so we don't have
> > MoreLikeThis, but we can upgrade to a newer version if it's likely that
> > MoreLikeThis can provide what we're looking for.)
> >
> > -Charlie
> >
> > * In fact, we wouldn't mind if additional documents by the same author were
> > included, but we found that when we didn't exclude the original author from
> > the result set, we almost always had the same problem: The first n documents
> > were always by the original author.
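
For reference, the "no more than m of the top n from any single author" rule in the quoted message can also be approximated on the client side, after Solr returns its ranked results. The following is only a rough sketch under stated assumptions, not Solr's field collapsing and not any Solr API: the function name `cap_per_author` is made up for illustration, and each result is assumed to be a plain dict with an "author" key. Documents over the cap are pushed below the top n rather than dropped, as the original message asks.

```python
from collections import Counter


def cap_per_author(ranked_docs, n=10, m=3, key="author"):
    """Reorder ranked_docs so that at most m of the first n share a key value.

    Hypothetical client-side helper: ranked_docs is assumed to be a list of
    dicts already sorted by relevance, each carrying the field named by key.
    """
    counts = Counter()   # how many docs per author are already in the top block
    top = []             # docs admitted to the top n
    deferred = []        # docs pushed below the top n, original order preserved

    for doc in ranked_docs:
        value = doc[key]
        if len(top) < n and counts[value] < m:
            counts[value] += 1
            top.append(doc)
        else:
            deferred.append(doc)

    # Nothing is discarded; over-represented authors simply rank lower.
    return top + deferred


if __name__ == "__main__":
    docs = [{"id": i, "author": a} for i, a in enumerate("AAAAABBC")]
    for doc in cap_per_author(docs, n=5, m=2):
        print(doc["id"], doc["author"])
```

If the fetched results contain too few distinct authors to fill n slots under the cap, the top block comes out shorter than n and the deferred documents follow it directly, so it helps to fetch noticeably more than n rows from Solr before reordering.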