On 5-Oct-07, at 2:06 PM, Kyle Banerjee wrote:

Howdy all,

We are attempting to provide access to about 8 million records of
highly variable quality and length. In a nutshell, we are trying to
find a way to deprioritize "suspect" records without discriminating
against useful records that happen to be short. We do not wish to
eliminate suspect records from the results -- just deprioritize them a
bit.

We have been indexing a field that marks a record as likely to be good
or bad, and I'm trying to figure out the most efficient way to use it
(should I be trying this at all?). As a newbie, my first inclination
was to OR the search terms with the same terms combined with a "good
record marker" with a modest boost.

However, this method seems really clunky, and I'm wondering if there's
a better way to accomplish what we're trying to do. Thanks,

If you know at index time that the document is shady, the easiest way to de-emphasize it globally is to set the document boost to a value below the default of 1.0:

<doc boost="0.5">...
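
For illustration, a fuller update message might look roughly like the sketch below. The field names (id, title, record_quality) and their values are made up for the example and are not taken from your schema; the only part that matters is the boost attribute on <doc>.

```xml
<add>
  <!-- Suspect record: a boost below 1.0 lowers its score for every query -->
  <doc boost="0.5">
    <field name="id">rec-0001</field>                  <!-- hypothetical field -->
    <field name="title">Short suspect record</field>   <!-- hypothetical field -->
    <field name="record_quality">suspect</field>       <!-- hypothetical marker field -->
  </doc>
  <!-- Trusted record: boost omitted, so the default of 1.0 applies -->
  <doc>
    <field name="id">rec-0002</field>
    <field name="title">Short but useful record</field>
    <field name="record_quality">good</field>
  </doc>
</add>
```

Because the boost is applied at index time, changing it later means re-indexing the affected documents. In exchange, your queries stay as they are, with no extra OR clause.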

cheers,
-Mike
