I hit ctrl-S by mistake. This is the method you are after: http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/DefaultSimilarity.html#tf(float)
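A minimal sketch of what such an override could look like. To keep it compilable without Lucene on the classpath, `DefaultSimilarityStandIn` is a made-up stand-in for `org.apache.lucene.search.DefaultSimilarity` that only mirrors the default `tf(float)` behaviour (square root of the frequency). `LinearTfSimilarity` and `TfDemo` are hypothetical names as well, and the linear formula is only an illustration, not a recommendation for the forum use case.

```java
// Stand-in for org.apache.lucene.search.DefaultSimilarity (Lucene 2.3.x),
// reduced to the single method discussed here. In a real project you would
// extend the Lucene class itself instead of this one.
class DefaultSimilarityStandIn {
    // Lucene's default tf: square root of the raw term frequency.
    public float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

// Hypothetical custom similarity: score on the raw term frequency
// instead of its square root, so repeated matches count for more.
class LinearTfSimilarity extends DefaultSimilarityStandIn {
    @Override
    public float tf(float freq) {
        return freq;
    }
}

class TfDemo {
    public static void main(String[] args) {
        DefaultSimilarityStandIn defaults = new DefaultSimilarityStandIn();
        DefaultSimilarityStandIn custom = new LinearTfSimilarity();
        // Compare both tf functions for a few term frequencies.
        for (float freq : new float[] {1f, 3f, 5f}) {
            System.out.println("freq=" + freq
                    + " default tf=" + defaults.tf(freq)
                    + " custom tf=" + custom.tf(freq));
        }
    }
}
```

With the real base class, the subclass would then be declared in Solr's schema.xml through a `<similarity class="..."/>` element so that it is used for scoring.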
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Otis Gospodnetic <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 30, 2008 4:40:08 PM
> Subject: Re: Searching Question
>
> The easiest thing is to look at Lucene javadoc and look for Similarity and
> DefaultSimilarity classes. Then have a peek at Lucene contrib to get some
> other examples of custom Similarity. You'll just need to override one
> method, for example:
>
>
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Jake Conk
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, September 30, 2008 3:11:01 PM
> > Subject: Re: Searching Question
> >
> > How would I write a custom Similarity factor that overrides the TF
> > function? Is there some documentation on that somewhere?
> >
> > On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll wrote:
> > >
> > > On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:
> > >
> > >> It might be easiest to store the thread ID and the number of replies in
> > >> the thread in each post Document in Solr.
> > >
> > > Yeah, but that would mean updating every document in a thread every time a
> > > new reply is added.
> > >
> > > I still keep going back to the solution as putting all the replies in a
> > > single document, and then using a custom Similarity factor that overrides
> > > the TF function and/or the length normalization. Still, this suffers from
> > > having to update the document for every new reply.
> > >
> > > Let's take a step back...
> > >
> > > Can I ask why you want the scoring this way? What have you seen in your
> > > results that leads you to believe it is the correct way? Note, I'm not
> > > trying to convince you it's wrong, I just want to better understand what's
> > > going on.
> > >
> > >>
> > >> Otherwise it sounds like you'll have to combine some search results or
> > >> data post-search.
> > >>
> > >> Otis
> > >> --
> > >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >>
> > >> ----- Original Message ----
> > >>>
> > >>> From: Jake Conk
> > >>> To: solr-user@lucene.apache.org
> > >>> Sent: Friday, September 26, 2008 1:50:37 PM
> > >>> Subject: Re: Searching Question
> > >>>
> > >>> Grant,
> > >>>
> > >>> Each post is its own document but I can merge them all into a single
> > >>> document under one thread if that will allow me to do what I want.
> > >>> The number of replies is stored both in Solr and the DB.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> - JC
> > >>>
> > >>> On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
> > >>>>
> > >>>> Is a thread and all of its posts a single document? In other words,
> > >>>> how are you modeling your posts as Solr documents? Also, where are
> > >>>> you keeping track of the number of replies? Is that in Solr or in a DB?
> > >>>>
> > >>>> -Grant
> > >>>>
> > >>>> On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
> > >>>>
> > >>>>> Hello,
> > >>>>>
> > >>>>> We are using Solr for our new forums search feature. If possible when
> > >>>>> searching for the word "Halo" we would like threads that contain the
> > >>>>> word "Halo" the most with the least amount of posts in that thread to
> > >>>>> have a higher score.
> > >>>>>
> > >>>>> For instance, if we have a thread with 10 posts and the word "Halo"
> > >>>>> shows up 5 times then that should have a lower score than a thread
> > >>>>> that has the word "Halo" 3 times within its posts and has 5 replies.
> > >>>>> Basically the thread that shows the search string most frequently
> > >>>>> amongst the number of posts in the thread should be the one with the
> > >>>>> highest score.
> > >>>>>
> > >>>>> Is something like this possible?
> > >>>>>
> > >>>>> Thanks,