On Thu, Jun 24, 2010 at 3:17 PM, Blargy <zman...@hotmail.com> wrote:
>
> Can someone explain how I can override the default behavior of the tf
> contributing a higher score for documents with repeated words?
>
> For example:
>
> Query: "foo"
> Doc1: "foo bar" score 1.0
> Doc2: "foo foo bar" score 1.1
>
> Doc2 contains "foo" twice so it is scored higher. How can I override this
> behavior?

Depends on the larger context of what you are trying to do.
Do you still want the idf and length norm relevancy factors?
If not, use a filter, or boost the particular clause with 0.

-Yonik
http://www.lucidimagination.com
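
To make the trade-off in the reply concrete, here is a rough toy model of the per-clause score, not Lucene's actual implementation. It assumes the classic default factors (tf = sqrt(freq), length norm = 1/sqrt(number of terms)) and leaves out idf, queryNorm, coord, and norm encoding, since idf is the same for both documents in a single-term query. The function name `score` and the `flat_tf` switch are hypothetical, made up for this sketch.

```python
import math

def score(doc_terms, term, boost=1.0, flat_tf=False):
    """Toy per-clause score: boost * tf * length_norm (idf and other factors omitted)."""
    freq = doc_terms.count(term)
    if freq == 0:
        return 0.0  # document does not match the clause
    # Default-style tf grows with repetitions; flat_tf ignores them.
    tf = 1.0 if flat_tf else math.sqrt(freq)
    # Shorter fields get a higher norm.
    length_norm = 1.0 / math.sqrt(len(doc_terms))
    return boost * tf * length_norm

doc1 = "foo bar".split()
doc2 = "foo foo bar".split()

# Default behavior: the repeated term lifts Doc2 above Doc1.
default_scores = (score(doc1, "foo"), score(doc2, "foo"))

# Clause boosted with 0: both documents still match, but the clause adds nothing.
zero_boost_scores = (score(doc1, "foo", boost=0.0), score(doc2, "foo", boost=0.0))

# Flattened tf with length norm kept: the shorter Doc1 now ranks first.
flat_scores = (score(doc1, "foo", flat_tf=True), score(doc2, "foo", flat_tf=True))
```

In this model a zero boost (written `foo^0` in the query syntax) or a filter (the `fq` parameter in Solr) drops tf, idf, and length norm for that clause all at once. If tf alone is flattened, the length norm remains and the shorter document ends up scoring higher, which is why the reply asks whether those other factors are still wanted.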