The easiest thing is to look at the Lucene javadoc for the Similarity and 
DefaultSimilarity classes.  Then have a peek at Lucene contrib for some 
other examples of custom Similarity implementations.  You'll just need to 
override one method, such as tf() for the term-frequency factor.
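
A minimal sketch of such an override follows. It is an illustration, not a
tested Solr plugin. To keep it compilable without Lucene on the classpath, it
uses a stand-in base class (DefaultSimilarityStandIn, a hypothetical name) that
mirrors the tf(float) and lengthNorm(String, int) methods of Lucene's
DefaultSimilarity as of the 2.x releases. In a real project you would extend
org.apache.lucene.search.DefaultSimilarity instead. The linear tf below is just
one possible choice and ThreadSimilarity is a made-up class name.

```java
// Stand-in for org.apache.lucene.search.DefaultSimilarity (assumption: Lucene 2.x
// signatures). It exists only so this sketch compiles without the Lucene jar.
class DefaultSimilarityStandIn {
    // Default term-frequency factor: square root of the raw frequency.
    public float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Default length normalization: 1 / sqrt(number of terms in the field).
    public float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}

// Hypothetical custom Similarity: only tf() is overridden; everything else
// is inherited from the default implementation.
class ThreadSimilarity extends DefaultSimilarityStandIn {
    // Linear tf: each extra occurrence of the term counts fully,
    // instead of being dampened by the square root.
    @Override
    public float tf(float freq) {
        return freq;
    }

    public static void main(String[] args) {
        DefaultSimilarityStandIn standard = new DefaultSimilarityStandIn();
        ThreadSimilarity custom = new ThreadSimilarity();
        for (float freq : new float[] {1f, 3f, 5f}) {
            System.out.println("freq=" + freq
                    + " default tf=" + standard.tf(freq)
                    + " custom tf=" + custom.tf(freq));
        }
    }
}
```

With the class compiled against the real DefaultSimilarity and its jar visible
to Solr, it would typically be registered through the <similarity class="..."/>
element in schema.xml. The length normalization can be changed the same way by
overriding lengthNorm().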


 --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Jake Conk <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 30, 2008 3:11:01 PM
> Subject: Re: Searching Question
> 
> How would I write a custom Similarity factor that overrides the TF
> function? Is there some documentation on that somewhere?
> 
> On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll wrote:
> >
> > On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:
> >
> >> It might be easiest to store the thread ID and the number of replies in
> >> the thread in each post Document in Solr.
> >
> > Yeah, but that would mean updating every document in a thread every time a
> > new reply is added.
> >
> > I still keep going back to the solution as putting all the replies in a
> > single document, and then using a custom Similarity factor that overrides
> > the TF function and/or the length normalization.  Still, this suffers from
> > having to update the document for every new reply.
> >
> > Let's take a step back...
> >
> > Can I ask why you want the scoring this way?  What have you seen in your
> > results that leads you to believe it is the correct way?  Note, I'm not
> > trying to convince you it's wrong, I just want to better understand what's
> > going on.
> >
> >
> >>
> >>
> >> Otherwise it sounds like you'll have to combine some search results or
> >> data post-search.
> >>
> >> Otis
> >> --
> >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >>
> >>
> >>
> >> ----- Original Message ----
> >>>
> >>> From: Jake Conk 
> >>> To: solr-user@lucene.apache.org
> >>> Sent: Friday, September 26, 2008 1:50:37 PM
> >>> Subject: Re: Searching Question
> >>>
> >>> Grant,
> >>>
> >>> Each post is its own document but I can merge them all into a single
> >>> document under one thread if that will allow me to do what I want.
> >>> The number of replies is stored both in Solr and the DB.
> >>>
> >>> Thanks,
> >>>
> >>> - JC
> >>>
> >>> On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
> >>>>
> >>>> Is a thread and all of its posts a single document?  In other words, how
> >>>> are you modeling your posts as Solr documents?  Also, where are you keeping
> >>>> track of the number of replies?  Is that in Solr or in a DB?
> >>>>
> >>>> -Grant
> >>>>
> >>>> On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> We are using Solr for our new forums search feature. If possible when
> >>>>> searching for the word "Halo" we would like threads that contain the
> >>>>> word "Halo" the most with the least amount of posts in that thread to
> >>>>> have a higher score.
> >>>>>
> >>>>> For instance, if we have a thread with 10 posts and the word "Halo"
> >>>>> shows up 5 times then that should have a lower score than a thread
> >>>>> that has the word "Halo" 3 times within its posts and has 5 replies.
> >>>>> Basically the thread that shows the search string most frequently
> >>>>> amongst the number of posts in the thread should be the one with the
> >>>>> highest score.
> >>>>>
> >>>>> Is something like this possible?
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>
> >
