How would I write a custom Similarity factor that overrides the TF
function? Is there some documentation on that somewhere?
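
For reference, here is a minimal sketch of the ratio described in the original
post at the bottom of this thread (occurrences of the search term divided by
the number of posts in the thread). It is plain Java and does not use Lucene's
Similarity API; the class and method names are hypothetical, and it assumes
both thread sizes in the example are post counts. It only illustrates the
ordering that a custom scorer would need to produce.

```java
// Hypothetical illustration only: not part of Lucene or Solr.
class ThreadDensityScore {

    // Occurrences of the search term per post in the thread.
    static double density(int termOccurrences, int postCount) {
        if (postCount <= 0) {
            throw new IllegalArgumentException("postCount must be positive");
        }
        return (double) termOccurrences / postCount;
    }

    public static void main(String[] args) {
        // Numbers taken from the example in the original post.
        double threadA = density(5, 10); // "Halo" 5 times across 10 posts
        double threadB = density(3, 5);  // "Halo" 3 times across 5 posts
        System.out.println("thread A: " + threadA);
        System.out.println("thread B: " + threadB);
        System.out.println("B ranks above A: " + (threadB > threadA));
    }
}
```

With these numbers, thread B (0.6 occurrences per post) ranks above thread A
(0.5), which matches the ordering asked for.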

On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:
>
>> It might be easiest to store the thread ID and the number of replies in
>> the thread in each post Document in Solr.
>
> Yeah, but that would mean updating every document in a thread every time a
> new reply is added.
>
> I keep coming back to putting all the replies in a single document and
> then using a custom Similarity factor that overrides the TF function
> and/or the length normalization.  Still, this suffers from having to
> update the document for every new reply.
>
> Let's take a step back...
>
> Can I ask why you want the scoring this way?  What have you seen in your
> results that leads you to believe it is the correct way?  Note, I'm not
> trying to convince you it's wrong, I just want to better understand what's
> going on.
>
>
>>
>>
>> Otherwise it sounds like you'll have to combine some search results or
>> data post-search.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> ----- Original Message ----
>>>
>>> From: Jake Conk <[EMAIL PROTECTED]>
>>> To: solr-user@lucene.apache.org
>>> Sent: Friday, September 26, 2008 1:50:37 PM
>>> Subject: Re: Searching Question
>>>
>>> Grant,
>>>
>>> Each post is its own document, but I can merge them all into a single
>>> document under one thread if that will allow me to do what I want.
>>> The number of replies is stored both in Solr and the DB.
>>>
>>> Thanks,
>>>
>>> - JC
>>>
>>> On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
>>>>
>>>> Is a thread and all of its posts a single document?  In other words,
>>>> how are you modeling your posts as Solr documents?  Also, where are
>>>> you keeping track of the number of replies?  Is that in Solr or in
>>>> a DB?
>>>>
>>>> -Grant
>>>>
>>>> On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are using Solr for our new forums search feature. If possible, when
>>>>> searching for the word "Halo" we would like threads that contain the
>>>>> word "Halo" most often, with the fewest posts in the thread, to
>>>>> have a higher score.
>>>>>
>>>>> For instance, if we have a thread with 10 posts and the word "Halo"
>>>>> shows up 5 times then that should have a lower score than a thread
>>>>> that has the word "Halo" 3 times within its posts and has 5 replies.
>>>>> Basically the thread that shows the search string most frequently
>>>>> amongst the number of posts in the thread should be the one with the
>>>>> highest score.
>>>>>
>>>>> Is something like this possible?
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>
