I hit ctrl-S by mistake.  This is the method you are after:

http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/DefaultSimilarity.html#tf(float)
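
To make that concrete, here is a rough sketch of what overriding tf() might look like. It is only an illustration, not tested against a real index. The stock DefaultSimilarity in Lucene 2.3.2 returns sqrt(freq) from tf(float); the subclass below returns the raw frequency instead. The names DefaultSimilarityStandIn, LinearTfSimilarity and TfOverrideDemo are made up for this example, and the stand-in base class exists only so the snippet compiles without Lucene on the classpath. In a real project you would extend org.apache.lucene.search.DefaultSimilarity directly.

```java
public class TfOverrideDemo {

    // Hypothetical stand-in for org.apache.lucene.search.DefaultSimilarity,
    // included only so this sketch compiles without the Lucene jar.
    // It mirrors the stock tf(float) behaviour: square root of the frequency.
    static class DefaultSimilarityStandIn {
        public float tf(float freq) {
            return (float) Math.sqrt(freq);
        }
    }

    // Hypothetical custom Similarity: overrides only tf(), so each extra
    // occurrence of the term counts fully instead of being dampened by sqrt.
    // In real code: "extends org.apache.lucene.search.DefaultSimilarity".
    static class LinearTfSimilarity extends DefaultSimilarityStandIn {
        @Override
        public float tf(float freq) {
            return freq;
        }
    }

    public static void main(String[] args) {
        DefaultSimilarityStandIn stock = new DefaultSimilarityStandIn();
        DefaultSimilarityStandIn custom = new LinearTfSimilarity();

        // A term that occurs 4 times in a field.
        System.out.println("stock tf(4)  = " + stock.tf(4f));   // 2.0
        System.out.println("custom tf(4) = " + custom.tf(4f));  // 4.0
    }
}
```

Whether a linear tf (or some other curve) gives the ranking you want is something to check against your own results. In Solr a custom Similarity is normally registered with a <similarity class="..."/> element in schema.xml; check the example schema shipped with your version to confirm.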


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Otis Gospodnetic <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 30, 2008 4:40:08 PM
> Subject: Re: Searching Question
> 
> The easiest thing is to look at the Lucene javadoc for the Similarity and
> DefaultSimilarity classes.  Then have a peek at Lucene contrib to get some
> other examples of custom Similarity.  You'll just need to override one
> method, for example:
> 
> 
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> ----- Original Message ----
> > From: Jake Conk 
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, September 30, 2008 3:11:01 PM
> > Subject: Re: Searching Question
> > 
> > How would I write a custom Similarity factor that overrides the TF
> > function? Is there some documentation on that somewhere?
> > 
> > On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll wrote:
> > >
> > > On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:
> > >
> > >> It might be easiest to store the thread ID and the number of replies in
> > >> the thread in each post Document in Solr.
> > >
> > > Yeah, but that would mean updating every document in a thread every time a
> > > new reply is added.
> > >
> > > I still keep going back to the solution as putting all the replies in a
> > > single document, and then using a custom Similarity factor that overrides
> > > the TF function and/or the length normalization.  Still, this suffers from
> > > having to update the document for every new reply.
> > >
> > > Let's take a step back...
> > >
> > > Can I ask why you want the scoring this way?  What have you seen in your
> > > results that leads you to believe it is the correct way?  Note, I'm not
> > > trying to convince you it's wrong, I just want to better understand what's
> > > going on.
> > >
> > >
> > >>
> > >>
> > >> Otherwise it sounds like you'll have to combine some search results or
> > >> data post-search.
> > >>
> > >> Otis
> > >> --
> > >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >>
> > >>
> > >>
> > >> ----- Original Message ----
> > >>>
> > >>> From: Jake Conk 
> > >>> To: solr-user@lucene.apache.org
> > >>> Sent: Friday, September 26, 2008 1:50:37 PM
> > >>> Subject: Re: Searching Question
> > >>>
> > >>> Grant,
> > >>>
> > >>> Each post is its own document but I can merge them all into a single
> > >>> document under one thread if that will allow me to do what I want.
> > >>> The number of replies is stored both in Solr and the DB.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> - JC
> > >>>
> > >>> On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
> > >>>>
> > >>>> Is a thread and all of its posts a single document?  In other words,
> > >>>> how are you modeling your posts as Solr documents?  Also, where are
> > >>>> you keeping track of the number of replies?  Is that in Solr or in a
> > >>>> DB?
> > >>>>
> > >>>> -Grant
> > >>>>
> > >>>> On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
> > >>>>
> > >>>>> Hello,
> > >>>>>
> > >>>>> We are using Solr for our new forums search feature. If possible, when
> > >>>>> searching for the word "Halo" we would like threads that contain the
> > >>>>> word "Halo" the most times in the fewest posts to have a higher
> > >>>>> score.
> > >>>>>
> > >>>>> For instance, if we have a thread with 10 posts and the word "Halo"
> > >>>>> shows up 5 times then that should have a lower score than a thread
> > >>>>> that has the word "Halo" 3 times within its posts and has 5 replies.
> > >>>>> Basically the thread that shows the search string most frequently
> > >>>>> amongst the number of posts in the thread should be the one with the
> > >>>>> highest score.
> > >>>>>
> > >>>>> Is something like this possible?
> > >>>>>
> > >>>>> Thanks,
> > >>>>>
> > >>>>>
> > >
