Re: Index-time field boost with DIH

Erick Erickson Fri, 16 Mar 2012 06:33:34 -0700

I'd go ahead and use query-time boosts. The "penalty" will
be a single multiplication per doc (I think), and probably not
noticeable. And it's much more flexible and easier...
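
For illustration, here is a minimal sketch of what a query-time boost could look like from a client, assuming the (e)dismax query parser and its "qf" parameter are used to weight title matches twice as much as summary matches. The host, port, handler path, the helper name build_boosted_query and the lowercase field name "summary" are assumptions made for the example, not details from this thread.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, base_url="http://localhost:8983/solr/select"):
    """Build a Solr select URL that boosts title matches at query time.

    base_url and the field names are assumed; adjust them to the real setup.
    """
    params = {
        "q": user_query,
        # edismax reads per-field weights from the qf parameter
        "defType": "edismax",
        # title matches count twice as much as summary matches
        "qf": "title^2.0 summary",
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_boosted_query("field boost"))
```

Changing the weight then only means changing the "qf" string; nothing has to be reindexed.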


Best
Erick

On Thu, Mar 15, 2012 at 9:21 PM, Arcadius Ahouansou
<arcad...@menelic.com> wrote:
> Hello.
>
> I have an SQL database with documents having an ID, TITLE and SUMMARY.
> I am using the DIH to index the data.
>
> In the DIH dataConfig, for every document, I would like to do something
> like:
>
> <field column="TITLE" name="title" boost="2.0" />
>
> In other words, "a match on any document's title is worth twice as much as
> a match on other fields".
>
> In my schema, I have omitNorms set to false.
>
> 1) How can I do this in the DIH?
>
> 2) Apart from norms (omitNorms=false) making the index bigger, I thought
> that an index-time boost would perform better than applying the very same
> boost at query time over and over again.
> Is that correct?
>
> 3) I also came across the Lucene FAQ at
> http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_the_difference_between_field_.28or_document.29_boosting_and_query_boosting.3F
>
> where the following interesting statement seems to contradict what I'm
> trying to achieve:
>
> *Index time field boosts are worthless if you set them on every document.*
>
> Any hint would be much appreciated.
>
>
> Thanks.
>
> Arcadius.
