On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:
It might be easiest to store the thread ID and the number of replies
in the thread in each post Document in Solr.
Yeah, but that would mean updating every document in a thread every
time a new reply is added.
I keep coming back to the solution of putting all the replies in a
single document, and then using a custom Similarity that overrides
the TF function and/or the length normalization. Still, this suffers
from having to update the document for every new reply.
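To make the idea concrete, here is a rough sketch in plain Python (not Lucene code) of what such an override would change. It assumes the classic DefaultSimilarity shapes, tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), and contrasts them with a hypothetical linear TF normalized by the number of posts. The function names are illustrative only, and IDF, boosts, and the lossy norm encoding are ignored.

```python
import math


def default_tf(freq):
    # Classic default: term frequency is damped by a square root.
    return math.sqrt(freq)


def default_length_norm(num_terms):
    # Classic default: longer fields are penalized by 1/sqrt(numTerms).
    return 1.0 / math.sqrt(num_terms)


def custom_tf(freq):
    # Hypothetical override: count every occurrence at full weight.
    return float(freq)


def custom_length_norm(num_posts):
    # Hypothetical override: normalize by posts in the thread, not terms.
    # Assumes the merged thread document records its post count somewhere.
    return 1.0 / num_posts


def default_score(freq, num_terms):
    return default_tf(freq) * default_length_norm(num_terms)


def custom_score(freq, num_posts):
    return custom_tf(freq) * custom_length_norm(num_posts)
```

With the hypothetical override, the score for a single term reduces to occurrences divided by posts, which is the ordering being asked for. The defaults normalize by term count instead, so two threads with the same number of posts but very different post lengths would be treated differently.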
Let's take a step back...
Can I ask why you want the scoring this way? What have you seen in
your results that leads you to believe it is the correct way? Note,
I'm not trying to convince you it's wrong, I just want to better
understand what's going on.
Otherwise it sounds like you'll have to combine some search results
or data post-search.
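A minimal sketch of that post-search combining step, assuming each post Document carries a thread ID and the thread's reply count as suggested above. The field names `thread_id`, `reply_count`, and `term_count` are hypothetical, and `term_count` (occurrences of the query term in a post) is assumed to be computed on the client, for example by counting matches in the stored text.

```python
from collections import defaultdict


def rank_threads(post_hits):
    """Group per-post hits by thread and rank threads by occurrences per reply.

    post_hits: iterable of dicts with the hypothetical keys
    'thread_id', 'reply_count' and 'term_count'.
    """
    totals = defaultdict(int)  # thread_id -> total occurrences across its hits
    replies = {}               # thread_id -> reply count stored on the posts

    for hit in post_hits:
        totals[hit["thread_id"]] += hit["term_count"]
        replies[hit["thread_id"]] = hit["reply_count"]

    # Guard against a reply count of zero for threads with no replies yet.
    scored = [(tid, totals[tid] / max(replies[tid], 1)) for tid in totals]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

This only re-ranks the posts that actually came back from the search, so the page size of the original query limits how many threads can be compared.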
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Jake Conk <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, September 26, 2008 1:50:37 PM
Subject: Re: Searching Question
Grant,
Each post is its own document but I can merge them all into a single
document under one thread if that will allow me to do what I want.
The number of replies is stored both in Solr and the DB.
Thanks,
- JC
On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
Is a thread and all of its posts a single document? In other words,
how are you modeling your posts as Solr documents? Also, where are
you keeping track of the number of replies? Is that in Solr or in a DB?
-Grant
On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
Hello,
We are using Solr for our new forums search feature. If possible, when
searching for the word "Halo" we would like threads that contain the
word "Halo" the most, with the fewest posts in the thread, to have
a higher score.
For instance, if we have a thread with 10 posts and the word "Halo"
shows up 5 times then that should have a lower score than a thread
that has the word "Halo" 3 times within its posts and has 5 replies.
Basically the thread that shows the search string most frequently
relative to the number of posts in the thread should be the one with
the highest score.
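As a quick check of the arithmetic in the example above, here is a small Python sketch of the ratio being described. The name `term_density` is illustrative only, and it assumes "5 replies" can be read as 5 posts in the thread.

```python
def term_density(occurrences, posts):
    """Occurrences of the search term per post in a thread."""
    if posts <= 0:
        raise ValueError("posts must be positive")
    return occurrences / posts


thread_a = term_density(5, 10)  # 10 posts, "Halo" appears 5 times -> 0.5
thread_b = term_density(3, 5)   # 5 posts, "Halo" appears 3 times -> 0.6

# Highest density first, so thread B outranks thread A.
ranking = sorted(
    {"A": thread_a, "B": thread_b}.items(),
    key=lambda item: item[1],
    reverse=True,
)
```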
Is something like this possible?
Thanks,