Re: Index-time vs. search-time boosting performance

2010-06-09 Thread Lance Norskog
Is it necessary that a document 1 year old be more relevant than one that's 1 year and 1 hour old? In other words, can the boosting be logarithmic wrt time instead of linear? A schema design tip: you can store a separate date field which is rounded down to the hour. This will make for a much small
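A minimal sketch of the two ideas in the message above: rounding a timestamp down to the hour before indexing, and a boost that decays logarithmically with age. The function names, the `scale_hours` parameter, and the exact decay formula are illustrative assumptions, not anything specified in the thread or provided by Solr.

```python
import math
from datetime import datetime

def round_down_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds so many documents share one value."""
    return ts.replace(minute=0, second=0, microsecond=0)

def log_recency_boost(doc_time: datetime, now: datetime, scale_hours: float = 1.0) -> float:
    """Hypothetical boost: 1.0 for a brand-new document, decaying with log(age)."""
    age_hours = max((now - doc_time).total_seconds() / 3600.0, 0.0)
    return 1.0 / (1.0 + math.log1p(age_hours / scale_hours))

if __name__ == "__main__":
    now = datetime(2010, 6, 9, 12, 0, 0)
    one_year = datetime(2009, 6, 9, 12, 0, 0)
    one_year_one_hour = datetime(2009, 6, 9, 11, 0, 0)
    print(round_down_to_hour(datetime(2010, 6, 9, 14, 37, 12)))
    print(log_recency_boost(one_year, now))
    print(log_recency_boost(one_year_one_hour, now))
```

With a logarithmic curve like this one, the boosts for a document one year old and one a year and an hour old are nearly identical, while documents a few hours apart still get clearly different boosts when they are recent.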

Re: Index-time vs. search-time boosting performance

2010-06-07 Thread Asif Rahman
I still need a relatively precise boost. No less precise than hourly. I think that would make for a pretty messy field query. On Mon, Jun 7, 2010 at 2:15 AM, Lance Norskog wrote: > If you are unhappy with the performance overhead of a function boost, > you can push it into a field query by bo

Re: Index-time vs. search-time boosting performance

2010-06-06 Thread Lance Norskog
If you are unhappy with the performance overhead of a function boost, you can push it into a field query by boosting date ranges. You would group in date ranges: documents in September would be boosted 1.0, October 2.0, November 3.0 etc. On 6/5/10, Asif Rahman wrote: > Thanks everyone for your
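The date-range grouping described above could be sketched as follows: one range clause per month, each carrying a larger boost than the one before. The field name `pub_date`, the year 2009, and the helper names are assumptions made for illustration; the thread does not give them. The clauses use standard Lucene/Solr range and `^boost` syntax.

```python
from datetime import datetime, timedelta

def next_month(year: int, month: int) -> tuple:
    """Return (year, month) of the following calendar month."""
    return (year + 1, 1) if month == 12 else (year, month + 1)

def boosted_month_clauses(field, first, count, base_boost=1.0, step=1.0):
    """Build one boosted range clause per month, starting at first=(year, month)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    clauses = []
    year, month = first
    for i in range(count):
        start = datetime(year, month, 1)
        ny, nm = next_month(year, month)
        # End one second before the next month so inclusive ranges do not overlap.
        # Assumes stored dates have at most second precision (e.g. rounded to the hour).
        end = datetime(ny, nm, 1) - timedelta(seconds=1)
        boost = base_boost + i * step
        clauses.append(
            f"{field}:[{start.strftime(fmt)} TO {end.strftime(fmt)}]^{boost:.1f}"
        )
        year, month = ny, nm
    return clauses

if __name__ == "__main__":
    # September boosted 1.0, October 2.0, November 3.0
    print(" OR ".join(boosted_month_clauses("pub_date", (2009, 9), 3)))
```

The resulting clauses would be OR-ed alongside the main query. Hourly precision would need one clause per hour, which is the "messy field query" objection raised in the reply.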

Re: Index-time vs. search-time boosting performance

2010-06-05 Thread Asif Rahman
Thanks everyone for your help so far. I'm still trying to get to the bottom of whether switching over to index-time boosts will give me a performance improvement, and if so, whether it will be noticeable. This is all under the assumption that I can achieve the scoring functionality that I need with eit

Re: Index-time vs. search-time boosting performance

2010-06-05 Thread Robert Muir
On Fri, Jun 4, 2010 at 7:50 PM, Asif Rahman wrote: > Perhaps I should have been more specific in my initial post. I'm doing > date-based boosting on the documents in my index, so as to assign a higher > score to more recent documents. Currently I'm using a boost function to > achieve this. I'm

Re: Index-time vs. search-time boosting performance

2010-06-05 Thread Asif Rahman
> From: Asif Rahman [a...@newscred.com] > Sent: Friday, June 04, 2010 11:31 PM > To: solr-user@lucene.apache.org > Subject: Re: Index-time vs. search-time boosting performance > > It seems like it would be far more efficient to calculate the boost factor > once and

RE: Index-time vs. search-time boosting performance

2010-06-04 Thread Jonathan Rochkind
...index-time boost not necessarily unreasonable. From: Asif Rahman [a...@newscred.com] Sent: Friday, June 04, 2010 11:31 PM To: solr-user@lucene.apache.org Subject: Re: Index-time vs. search-time boosting performance It seems like it would be far more efficie

Re: Index-time vs. search-time boosting performance

2010-06-04 Thread Asif Rahman
It seems like it would be far more efficient to calculate the boost factor once and store it rather than calculating it for each request in real-time. Some of our queries match tens of thousands if not hundreds of thousands of documents in a 15GB index. However, I'm not well-versed in lucene inter

Re: Index-time vs. search-time boosting performance

2010-06-04 Thread Jay Hill
I've done a lot of recency boosting to documents, and I'm wondering why you would want to do that at index time. If you are continuously indexing new documents, what was "recent" when it was indexed becomes, over time, "less recent". Are you unsatisfied with your current performance with the boost f

Re: Index-time vs. search-time boosting performance

2010-06-04 Thread Asif Rahman
Perhaps I should have been more specific in my initial post. I'm doing date-based boosting on the documents in my index, so as to assign a higher score to more recent documents. Currently I'm using a boost function to achieve this. I'm wondering if there would be a performance improvement if ins

Re: Index-time vs. search-time boosting performance

2010-06-04 Thread Erick Erickson
Index time boosting is different from search time boosting, so asking about performance is irrelevant. Paraphrasing Hossman from years ago on the Lucene list (from memory): ...index time boosting is a way of saying this document's title is more important than other documents' titles. Search time