Re: How to Sort By a PageRank-Like Complicated Strategy?

Bing Li Sat, 28 Jan 2012 19:34:03 -0800

Dear Shashi,

As I understand it, a large data set such as a Lucene index is not suited
to frequent updates. Frequent updating hurts performance and consistency
when the index has to be replicated across a large cluster. Such a search
engine is expected to work in a write-once, read-many environment, right?
That is the model HDFS (Hadoop Distributed File System) provides. In my
experience, updating a Lucene index is really slow.
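
A minimal sketch of one way to reconcile a rarely updated index with a score
that changes on the fly, following the second-phase rerank suggested later in
this thread: keep the volatile rank in a side table outside the index and apply
it only to the top N keyword hits. All names here (`rerank_top_n`, `index_hits`,
`live_rank`) are hypothetical and are not part of Lucene or Solr; the hit list
stands in for whatever the search engine returns.

```python
# Hypothetical illustration: none of these names come from Lucene or Solr.

def rerank_top_n(index_hits, live_rank, n=3, default_rank=0.0):
    """Rerank the first n keyword hits by an externally maintained score.

    index_hits: list of (doc_id, text_score), already sorted by text_score
                in descending order, as a search engine would return them.
    live_rank:  dict doc_id -> rank value, updated without touching the index.
    """
    top = index_hits[:n]
    # Sort by live rank first, then by text score to break ties (both descending).
    return sorted(
        top,
        key=lambda hit: (live_rank.get(hit[0], default_rank), hit[1]),
        reverse=True,
    )


# The index is built once; only the side table changes between queries.
hits = [("d1", 2.4), ("d2", 1.9), ("d3", 1.1), ("d4", 0.7)]
live_rank = {"d1": 0.10, "d2": 0.55, "d3": 0.30, "d4": 0.99}

first = rerank_top_n(hits, live_rank)
live_rank["d3"] = 0.80  # the rank changes "on the fly", with no reindexing
second = rerank_top_n(hits, live_rank)
```

The trade-off is visible in the data: "d4" has the highest rank but is never
returned, because it falls outside the top N keyword hits that get reranked.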


Why did you say I could update the Lucene index frequently?

Thanks so much!
Bing

On Mon, Jan 23, 2012 at 11:02 PM, Shashi Kant <sk...@sloan.mit.edu> wrote:

> You can update the document in the index quite frequently. I don't know
> what your requirement is; another option would be to boost at query time.
>
> On Sun, Jan 22, 2012 at 5:51 AM, Bing Li <lbl...@gmail.com> wrote:
> > Dear Shashi,
> >
> > Thanks so much for your reply!
> >
> > However, I think the PageRank value is not static. It must be updated
> > on the fly. As far as I know, a Lucene index is not suited to being
> > updated too frequently. If so, how should I deal with that?
> >
> > Best regards,
> > Bing
> >
> >
> > On Sun, Jan 22, 2012 at 12:43 PM, Shashi Kant <sk...@sloan.mit.edu> wrote:
> >>
> >> Lucene has a mechanism to "boost" documents up or down using your custom
> >> ranking algorithm. So if you come up with something like PageRank,
> >> you might call something like doc.SetBoost(myboost) before writing to
> >> the index.
> >>
> >>
> >>
> >> On Sat, Jan 21, 2012 at 5:07 PM, Bing Li <lbl...@gmail.com> wrote:
> >> > Hi, Kai,
> >> >
> >> > Thanks so much for your reply!
> >> >
> >> > If retrieval is done on a string field rather than a text field, an
> >> > exact-match approach is used, according to my understanding, right?
> >> > If so, how does Lucene rank the retrieved data?
> >> >
> >> > Best regards,
> >> > Bing
> >> >
> >> > On Sun, Jan 22, 2012 at 5:56 AM, Kai Lu <lukai1...@gmail.com> wrote:
> >> >
> >> >> Solr is the retrieval step; you can customize the scoring formula in
> >> >> Lucene. But the formula should not be too complicated; ideally it can
> >> >> be factorized. It also depends on the stored information, such as
> >> >> TF, DF, positions, etc. You can do a second-phase rerank of the top N
> >> >> results you have retrieved.
> >> >>
> >> >> Sent from my iPad
> >> >>
> >> >> On Jan 21, 2012, at 1:33 PM, Bing Li <lbl...@gmail.com> wrote:
> >> >>
> >> >> > Dear all,
> >> >> >
> >> >> > I am using SolrJ to implement a system that provides search services
> >> >> > to users. I have some questions about Solr searching, as follows.
> >> >> >
> >> >> > As far as I know, Lucene retrieves data according to the degree of
> >> >> > keyword matching on a text field (partial matching).
> >> >> >
> >> >> > But if I search by a string field (exact matching), how does Lucene
> >> >> > sort the retrieved data?
> >> >> >
> >> >> > If I want to add new ways of sorting, Solr's function queries seem to
> >> >> > support this.
> >> >> >
> >> >> > However, for a complicated ranking strategy such as PageRank, can Solr
> >> >> > provide an interface for me to do that?
> >> >> >
> >> >> > My ranking methods are more complicated than PageRank. For now I have
> >> >> > to first load all of the matched data from Solr by keyword and then
> >> >> > rank it again in my own way before showing it to users. Is that
> >> >> > correct?
> >> >> >
> >> >> > Thanks so much!
> >> >> > Bing
> >> >>
> >
> >
>
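
The boost idea discussed in this thread can be pictured with a toy model in
which a per-document boost multiplies the text relevance score. This is only a
sketch for intuition, not Lucene's actual scoring formula; the function name
`boosted_order` and the sample scores are hypothetical. The same arithmetic
applies whether the factor is fixed when the document is indexed or supplied
when the query runs; the difference is that an index-time boost can only be
changed by reindexing the document.

```python
# Toy model only: a hypothetical helper, not a Lucene or Solr API.

def boosted_order(text_scores, boosts):
    """Return doc ids ordered by text_score * boost, highest first.

    text_scores: dict doc_id -> relevance score from keyword matching.
    boosts:      dict doc_id -> multiplicative boost; missing ids default to 1.0.
    """
    combined = {
        doc: score * boosts.get(doc, 1.0) for doc, score in text_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


text_scores = {"a": 2.0, "b": 1.5, "c": 1.0}
boosts = {"b": 2.0, "c": 0.5}  # e.g. derived from a PageRank-like value

order = boosted_order(text_scores, boosts)   # "b" overtakes "a" once boosted
unboosted = boosted_order(text_scores, {})   # plain text-score order
```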
