Is it necessary that a document 1 year old be more relevant than one that's 1 year and 1 hour old? In other words, can the boosting be logarithmic wrt time instead of linear?
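To make the question concrete, here is a rough sketch comparing a linear age penalty with a logarithmic one. The function names, the two-year horizon and the one-hour scale are all assumptions made up for illustration; this is not the boost function from the thread, nor a Solr API.

```python
import math

HOUR = 3600.0
YEAR = 365 * 24 * HOUR

def linear_boost(age_seconds, horizon=2 * YEAR):
    """Hypothetical linear decay: every extra second of age costs the same."""
    return max(0.0, 1.0 - age_seconds / horizon)

def log_boost(age_seconds, scale=HOUR):
    """Hypothetical logarithmic decay: age is measured in units of `scale`,
    so an extra hour matters a lot for fresh documents and very little
    for old ones."""
    return 1.0 / (1.0 + math.log1p(age_seconds / scale))

if __name__ == "__main__":
    ages = [("1 hour", HOUR), ("2 hours", 2 * HOUR),
            ("1 year", YEAR), ("1 year + 1 hour", YEAR + HOUR)]
    for label, age in ages:
        print(f"{label:>16}: linear={linear_boost(age):.6f} "
              f"log={log_boost(age):.6f}")
```

With the logarithmic form, the gap between a 1-hour-old and a 2-hour-old document is large, while the gap between 1 year and 1 year + 1 hour is tiny. If that matches what you want from the ranking, the date used for boosting does not need fine resolution for older documents.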
A schema design tip: you can store a separate date field which is rounded down to the hour. This will make for a much smaller term dictionary and therefore faster searching & range queries.

On Mon, Jun 7, 2010 at 4:08 AM, Asif Rahman <a...@newscred.com> wrote:
> I still need a relatively precise boost. No less precise than hourly. I
> think that would make for a pretty messy field query.
>
>
> On Mon, Jun 7, 2010 at 2:15 AM, Lance Norskog <goks...@gmail.com> wrote:
>
>> If you are unhappy with the performance overhead of a function boost,
>> you can push it into a field query by boosting date ranges.
>>
>> You would group in date ranges: documents in September would be
>> boosted 1.0, October 2.0, November 3.0 etc.
>>
>>
>> On 6/5/10, Asif Rahman <a...@newscred.com> wrote:
>> > Thanks everyone for your help so far. I'm still trying to get to the bottom
>> > of whether switching over to index-time boosts will give me a performance
>> > improvement, and if so if it will be noticeable. This is all under the
>> > assumption that I can achieve the scoring functionality that I need with
>> > either index-time or search-time boosting (given the loss of precision). I
>> > can always dust off the old profiler to see what's going on with the
>> > search-time boosts, but testing the index-time boosts will require a full
>> > reindex, which could take days with our dataset.
>> >
>> > On Sat, Jun 5, 2010 at 9:17 AM, Robert Muir <rcm...@gmail.com> wrote:
>> >
>> >> On Fri, Jun 4, 2010 at 7:50 PM, Asif Rahman <a...@newscred.com> wrote:
>> >>
>> >> > Perhaps I should have been more specific in my initial post. I'm doing
>> >> > date-based boosting on the documents in my index, so as to assign a higher
>> >> > score to more recent documents. Currently I'm using a boost function to
>> >> > achieve this. I'm wondering if there would be a performance improvement if
>> >> > instead of using the boost function at search time, I indexed the documents
>> >> > with a date-based boost.
>> >> >
>> >> >
>> >> Asif, without knowing more details, before you look at performance you might
>> >> want to consider the relevance impacts of switching to index-time boosting
>> >> for your use case too.
>> >>
>> >> You can read more about the differences here:
>> >> http://lucene.apache.org/java/3_0_1/scoring.html
>> >>
>> >> But I think the most important for this date-influenced use case is:
>> >>
>> >> "Indexing time boosts are preprocessed for storage efficiency and written to
>> >> the directory (when writing the document) in a single byte (!)"
>> >>
>> >> If you do this as an index-time boost, your boosts will lose lots of
>> >> precision for this reason.
>> >>
>> >> --
>> >> Robert Muir
>> >> rcm...@gmail.com
>> >>
>> >
>> >
>> >
>> > --
>> > Asif Rahman
>> > Lead Engineer - NewsCred
>> > a...@newscred.com
>> > http://platform.newscred.com
>> >
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>
>
>
> --
> Asif Rahman
> Lead Engineer - NewsCred
> a...@newscred.com
> http://platform.newscred.com
>

--
Lance Norskog
goks...@gmail.com
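
P.S. A small sketch of the hour-rounding tip from the top of this message. It only illustrates how rounding collapses many precise timestamps into a few distinct values; the helper name and the sample data (one document every 37 seconds for a day) are invented for the example, and nothing here is Solr or Lucene code.

```python
from datetime import datetime, timedelta, timezone

def round_down_to_hour(ts):
    """Drop minutes, seconds and microseconds from a timestamp."""
    return ts.replace(minute=0, second=0, microsecond=0)

# Hypothetical sample: one document every 37 seconds over a single day.
start = datetime(2010, 6, 7, 0, 0, 0, tzinfo=timezone.utc)
precise = [start + timedelta(seconds=37 * i) for i in range(24 * 3600 // 37)]

# The value you would store in the separate, coarser date field.
hourly = [round_down_to_hour(ts) for ts in precise]

if __name__ == "__main__":
    print(len(set(precise)), "distinct precise timestamps")
    print(len(set(hourly)), "distinct hour-rounded timestamps")
```

Every distinct value in the field becomes its own term, so the rounded field has far fewer terms to walk during a range query. The precise field can stay in the schema for display or sorting; only the boost and range queries would need to use the rounded one.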