: Some searches will obviously be saturated by docs from any given author if
: they've simply written more.
:
: I'd like to give a negative boost to these matches, thereby making sure that
: one author doesn't saturate the results just because they've written 500
: documents, compared to others who may have written only 2-3 documents.
:
: The actual author value doesn't matter, I just want to bring down the score of
: docs by any common author to give more varied results.
:
: What's the easiest approach for this, and is it even possible at query time?
: I could do this at index time but would prefer a Solr solution.
w/o a custom plugin, the only way i know of to do something like this
would be to index a numeric "author_prolificness" field in each doc and
use that as the basis of a function query.
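A minimal sketch of that idea, in Python purely for illustration: count docs per author at index time, store the count in the "author_prolificness" field, and wrap the user's query in Solr's {!boost} parser with a recip() function so the score shrinks as the count grows. The helper names, the sample docs, the "author" field name, and the recip constants (1,10,10) are all assumptions to tune, not anything taken from a real schema.

```python
from collections import Counter


def add_author_prolificness(docs):
    """Return copies of docs with a numeric author_prolificness field.

    Assumes each doc is a dict with an "author" key (hypothetical schema).
    """
    counts = Counter(d["author"] for d in docs)
    return [dict(d, author_prolificness=counts[d["author"]]) for d in docs]


def recip(x, m, a, b):
    """Same formula as Solr's recip(x,m,a,b) function: a / (m*x + b)."""
    return a / (m * x + b)


def build_params(user_query, a=10, b=10):
    """Build request params that multiply the score by recip(author_prolificness).

    Uses the {!boost} query parser; the user's query is passed via $qq.
    The constants a and b are illustrative defaults only.
    """
    return {
        "q": f"{{!boost b=recip(author_prolificness,1,{a},{b}) v=$qq}}",
        "qq": user_query,
    }


if __name__ == "__main__":
    docs = [
        {"id": "1", "author": "Isaac Asimov", "title": "Nightfall"},
        {"id": "2", "author": "Isaac Asimov", "title": "Foundation"},
        {"id": "3", "author": "Stephen Leather", "title": "Nightfall"},
    ]
    for doc in add_author_prolificness(docs):
        # Show the multiplier each doc would receive from the boost function.
        print(doc["title"], doc["author"], recip(doc["author_prolificness"], 1, 10, 10))
    print(build_params("nightfall"))
```

With these constants, a doc by an author with 1 document is multiplied by 10/11 (about 0.91), while a doc by an author with 500 documents is multiplied by 10/510 (about 0.02), so the penalty is steep unless the constants are softened.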
but honestly: i *really* don't think you want to do this - not if you are
dealing with real user queries (maybe if this is for some synthetically
generated "related documents" or "interesting documents" query).
Imagine a user is searching for a *very* specific title (eg: "Nightfall")
by a very prolific author ("Isaac Asimov"). What you're describing would
penalize the desired match just because the author is prolific -- even if
the user types in the exact title of a document -- so that some much more
esoteric document with the same title by an author who has written nothing
else ("Stephen Leather") would likely score higher.
I mean: if someone types in "Romeo and Juliet" do you really want to score
documents by "Shakespeare" lower than documents by "Stanley W. Wells" just
because Wells has written fewer total books?
-Hoss