Is it necessary that a document 1 year old be more relevant than one
that's 1 year and 1 hour old? In other words, can the boost decay
logarithmically with respect to time instead of linearly?
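
To make the question concrete, here is a rough sketch in plain Python
(not Solr function-query syntax). The function names, the 2-year cutoff
for the linear boost, and the exact logarithmic form are all made up for
illustration; they are not taken from Asif's setup.

```python
import math

HOUR_MS = 3600 * 1000
YEAR_MS = 365 * 24 * HOUR_MS


def linear_boost(age_ms, max_age_ms=2 * YEAR_MS):
    """Hypothetical linear boost: every extra hour of age costs the same amount."""
    return max(0.0, 1.0 - age_ms / max_age_ms)


def log_boost(age_ms):
    """Hypothetical logarithmic boost: an extra hour matters less as age grows."""
    age_hours = age_ms / HOUR_MS
    return 1.0 / (1.0 + math.log1p(age_hours))


def boost_gap(boost, younger_ms, older_ms):
    """How much more boost the younger document gets than the older one."""
    return boost(younger_ms) - boost(older_ms)


# One hour of extra age, applied to a fresh document and to a year-old one.
fresh_linear = boost_gap(linear_boost, HOUR_MS, 2 * HOUR_MS)
old_linear = boost_gap(linear_boost, YEAR_MS, YEAR_MS + HOUR_MS)
fresh_log = boost_gap(log_boost, HOUR_MS, 2 * HOUR_MS)
old_log = boost_gap(log_boost, YEAR_MS, YEAR_MS + HOUR_MS)
```

With the linear boost the two gaps are the same. With the logarithmic
one, the hour separates the two fresh documents clearly but barely
separates the two year-old ones, which is the case where hourly
precision stops mattering.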

A schema design tip: you can store a separate date field which is
rounded down to the hour. This will make for a much smaller term
dictionary and therefore faster searching & range queries.
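
A small sketch of the rounding, again in plain Python and with made-up
sample data (10,000 timestamps spaced 37 seconds apart). It only counts
distinct values as a stand-in for the number of terms the field would
produce; the real term dictionary depends on the field type and how it
is indexed.

```python
from datetime import datetime, timedelta, timezone


def round_down_to_hour(ts):
    """Drop minutes, seconds and microseconds, keeping the hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Hypothetical sample data: one document every 37 seconds.
start = datetime(2010, 6, 7, 0, 0, 0, tzinfo=timezone.utc)
timestamps = [start + timedelta(seconds=37 * i) for i in range(10_000)]

# Distinct values if the full timestamp is indexed as-is.
precise_terms = set(timestamps)

# Distinct values if a second field holds the hour-rounded date.
hourly_terms = {round_down_to_hour(ts) for ts in timestamps}
```

The precise field has one distinct value per document, while the
hourly field has one per hour covered by the data.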

On Mon, Jun 7, 2010 at 4:08 AM, Asif Rahman <a...@newscred.com> wrote:
> I still need a relatively precise boost.  No less precise than hourly.  I
> think that would make for a pretty messy field query.
>
>
> On Mon, Jun 7, 2010 at 2:15 AM, Lance Norskog <goks...@gmail.com> wrote:
>
>> If you are unhappy with the performance overhead of a function boost,
>> you can push it into a field query by boosting date ranges.
>>
>> You would group in date ranges: documents in September would be
>> boosted 1.0, October 2.0, November 3.0 etc.
>>
>>
>> On 6/5/10, Asif Rahman <a...@newscred.com> wrote:
>> > Thanks everyone for your help so far.  I'm still trying to get to the
>> > bottom of whether switching over to index-time boosts will give me a
>> > performance improvement, and if so, whether it will be noticeable.  This
>> > is all under the assumption that I can achieve the scoring functionality
>> > that I need with either index-time or search-time boosting (given the
>> > loss of precision).  I can always dust off the old profiler to see what's
>> > going on with the search-time boosts, but testing the index-time boosts
>> > will require a full reindex, which could take days with our dataset.
>> >
>> > On Sat, Jun 5, 2010 at 9:17 AM, Robert Muir <rcm...@gmail.com> wrote:
>> >
>> >> On Fri, Jun 4, 2010 at 7:50 PM, Asif Rahman <a...@newscred.com> wrote:
>> >>
>> >> > Perhaps I should have been more specific in my initial post.  I'm
>> >> > doing date-based boosting on the documents in my index, so as to
>> >> > assign a higher score to more recent documents.  Currently I'm using
>> >> > a boost function to achieve this.  I'm wondering if there would be a
>> >> > performance improvement if instead of using the boost function at
>> >> > search time, I indexed the documents with a date-based boost.
>> >> >
>> >> >
>> >> Asif, without knowing more details, before you look at performance you
>> >> might want to consider the relevance impacts of switching to index-time
>> >> boosting for your use case too.
>> >>
>> >> You can read more about the differences here:
>> >> http://lucene.apache.org/java/3_0_1/scoring.html
>> >>
>> >> But I think the most important for this date-influenced use case is:
>> >>
>> >> "Indexing time boosts are preprocessed for storage efficiency and
>> >> written to the directory (when writing the document) in a single byte (!)"
>> >>
>> >> If you do this as an index-time boost, your boosts will lose lots of
>> >> precision for this reason.
>> >>
>> >> --
>> >> Robert Muir
>> >> rcm...@gmail.com
>> >>
>> >
>> >
>> >
>> > --
>> > Asif Rahman
>> > Lead Engineer - NewsCred
>> > a...@newscred.com
>> > http://platform.newscred.com
>> >
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>
>
>
> --
> Asif Rahman
> Lead Engineer - NewsCred
> a...@newscred.com
> http://platform.newscred.com
>



-- 
Lance Norskog
goks...@gmail.com
