Re: Index-time Boosting

Yonik Seeley Fri, 20 Oct 2006 10:43:09 -0700

On 10/20/06, Mike Klaas <[EMAIL PROTECTED]> wrote:

Index-time boosts can be set per-document or per-document-field.
There is no facility for setting the boost of a part of the text added
to a field, as you suggest above. That is really a shame, as such
functionality would lend huge flexibility to index-time boosting!
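
For illustration, here is a rough sketch of how the two supported levels (per-document and per-field) might be expressed when building a Solr XML add message. The helper name build_add_message and the field names are made up for this example. It assumes the update format accepts a boost attribute on both the doc and field elements, and it puts a field boost on the first value only.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, doc_boost=None, field_boosts=None):
    """Build an XML <add> message for a single document.

    fields       -- list of (name, value) pairs; repeat a name for multi-valued fields
    doc_boost    -- optional per-document boost
    field_boosts -- optional dict of field name -> boost (hypothetical helper argument)
    """
    remaining = dict(field_boosts or {})
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))
    for name, value in fields:
        field = ET.SubElement(doc, "field", {"name": name})
        # Attach the boost to the first value of a field only, so a
        # multi-valued field does not receive the boost more than once.
        boost = remaining.pop(name, None)
        if boost is not None:
            field.set("boost", str(boost))
        field.text = value
    return ET.tostring(add, encoding="unicode")


message = build_add_message(
    [("title", "Index-time boosting"), ("tag", "lucene"), ("tag", "solr")],
    doc_boost=2.0,
    field_boosts={"tag": 3.0},
)
print(message)
```

Note that nothing here can boost only part of the text inside a single field value, which is the limitation described above.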


I wonder what the index-size cost would be though...
Anyway, there has been discussion of flexible indexing on the Lucene
list in the past few months, with one application being
boost-per-position.

You must do this for every document.  (Be careful with multi-valued
fields: you should set the boost on only _one_ of the values added to
the field.)


Good point... I believe they are all multiplied together in Lucene.
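
A small sketch of why that matters, assuming the boosts really are multiplied together as I believe. The function name is made up for this example, and it ignores length normalization and the lossy encoding of the stored norm, so treat the numbers as illustrative only.

```python
from math import prod


def effective_field_boost(doc_boost, value_boosts):
    """Combine a document boost with the boosts set on each value of one field.

    Assumes plain multiplication; this is a simplified model, not the
    exact norm that ends up in the index.
    """
    return doc_boost * prod(value_boosts)


# Boost of 2.0 set on every one of three values: the boosts compound.
compounded = effective_field_boost(1.0, [2.0, 2.0, 2.0])

# Boost of 2.0 set on one value only, the others left at the default 1.0.
intended = effective_field_boost(1.0, [2.0, 1.0, 1.0])

print(compounded, intended)
```

Setting the same boost on every value of a three-valued field gives 8.0 in this model rather than the 2.0 that was probably intended.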

There are a few optimizations in Solr that only
trigger when boosts are one, but I'm not sure exactly what those are.


There were optimizations that hoisted mandatory boolean clauses with a
zero boost into a cached filter (I got that optimization from
Doug/Nutch).  That optimization is no longer in the normal code paths
that return DocSets/DocLists, and it probably doesn't matter given
that one can now explicitly specify filter queries themselves via fq
params.

Is fq documented anywhere?  It's very useful for speeding up complex
queries, since filter queries are cached independently of the main query.
Just yesterday I sped up some queries from an average latency of 0.550
seconds to 0.004 seconds by pulling some mandatory clauses that
matched the majority of documents in the index out into an fq.
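
As a rough illustration of pulling a mandatory clause out into fq, the sketch below builds a request URL with the main query in q and the filter in fq. The base URL, the field names, and the helper name are assumptions made for this example, and it assumes the request handler accepts repeated fq parameters if more than one filter is passed.

```python
from urllib.parse import urlencode


def build_select_url(base_url, main_query, filter_queries=()):
    """Return a query URL with the scored query in q and filters in fq."""
    params = [("q", main_query)]
    # Each filter goes in its own fq parameter so it can be cached
    # separately from the main query.
    params.extend(("fq", fq) for fq in filter_queries)
    return base_url + "?" + urlencode(params)


# Before: the broad mandatory clause is part of the main query.
before = build_select_url(
    "http://localhost:8983/solr/select",  # assumed local endpoint
    "+title:lucene +inStock:true",
)

# After: the clause that matches most of the index moves into fq.
after = build_select_url(
    "http://localhost:8983/solr/select",
    "title:lucene",
    ["inStock:true"],
)
print(after)
```

The clause in fq no longer contributes to the score, which is fine for clauses that are only there to restrict the result set.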

Finally, it can be much faster to search a single field rather than
multiple fields.  One hacky way of achieving this is to make a field
which receives a single copy of contents and eight copies of title.
This is imperfect, as it messes up length normalization and
summarizing.


And you can't make the title field count 8 times as much :-)

I've seen people simply *add* the title field multiple times to the
general search field in an attempt to boost it.  I can't say how well
it worked.
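
For reference, the hack amounts to something like the sketch below, where the function name and the sample text are made up. It is only meant to show the mechanics; repeating the title inflates the field length, so length normalization is skewed exactly as noted above.

```python
def build_search_text(title, contents, title_copies=8):
    """Concatenate repeated copies of the title with one copy of the contents.

    The result would be indexed as a single general search field.
    """
    return " ".join([title] * title_copies + [contents])


text = build_search_text("Index-time Boosting", "Boosts can be set per document.")
print(text.count("Index-time Boosting"))
```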

-Yonik
