Solr can also query link (url) text and rank those matches higher if we
specify url in the qf field. The only problem is that it does not rank pages
containing both words higher when mm is set to 1<-1. It seems to me that this
is a bug.
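
For reference, here is a rough Python sketch of how I understand the
conditional mm syntax to be evaluated. It is only an illustration of the
documented behaviour, not Solr's actual code, and the function name
min_should_match is made up for this example.

```python
def min_should_match(clause_count: int, spec: str) -> int:
    """Return how many optional clauses must match for a given mm spec.

    Illustrative sketch only; the name and structure are assumptions,
    not taken from Solr.
    """
    result = clause_count
    spec = spec.strip()
    if "<" in spec:
        # Conditional form "N<expr": expr applies only when the query
        # has more than N optional clauses. Parts are checked left to right.
        for part in spec.split():
            bound, expr = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, expr)
        return result
    if spec.endswith("%"):
        # Percentage form; a negative value means "all but this share".
        raw = clause_count * int(spec[:-1]) / 100
        result = clause_count + int(raw) if raw < 0 else int(raw)
    else:
        # Plain integer; a negative value means "all but this many".
        calc = int(spec)
        result = clause_count + calc if calc < 0 else calc
    # Clamp to the range 0..clause_count.
    return max(0, min(clause_count, result))


# Two-word query such as: newspaper latimes
print(min_should_match(2, "2<-1 5<-2 6<90%"))       # 2: both words required
print(min_should_match(2, "1<-1 2<-1 5<-2 6<90%"))  # 1: either word is enough
```

If this reading is right, the original mm requires both words for a two-word
query, while 1<-1 only widens the set of matching documents. It decides which
documents match, not the order in which they are ranked.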

Thanks.
Alex.

 
 

 

-----Original Message-----
From: Ted Dunning <ted.dunn...@gmail.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Sat, Nov 5, 2011 8:59 pm
Subject: Re: how to achieve google.com like results for phrase queries


Google achieves their results by using data not found in the web pages
themselves.  This additional data critically includes link text, but also
is derived from behavioral information.



On Sat, Nov 5, 2011 at 5:07 PM, <alx...@aim.com> wrote:

> Hi Erick,
>
> The phrase "newspaper latimes" is not found in latimes.com. However,
> Google ranks it first. My guess is that the mm parameter must
> not be set to 2<-1 in order to achieve google.com-like ranking for
> two-word phrase queries.
>
> My goal is to set the mm parameter in such a way that latimes.com is
> ranked in places 1-3 and sites with both words are placed after it.
> As I wrote in my previous message, setting mm to 1<-1 solves this issue
> partially. The problem in this case is that sites with both words are
> placed at the bottom or are not in the search results at all.
>
> Thanks.
> Alex.
>
>
>
>
>
>
> -----Original Message-----
> From: Erick Erickson <erickerick...@gmail.com>
> To: solr-user <solr-user@lucene.apache.org>
> Sent: Sat, Nov 5, 2011 9:01 am
> Subject: Re: how to achieve google.com like results for phrase queries
>
>
> First, the default query operator is ignored by edismax, so that's
> not doing anything.
>
> Why would you expect "newspaper latimes" to be found at all in
> "latimes.com"? What proof do you have that the two terms are even in
> the "latimes.com" document?
>
> You can look at the Query Elevation Component to force certain known
> documents to the top of the results based on the search terms, but that's
> not a very elegant solution.
>
> What business requirement are you trying to accomplish here? Because as
> asked, there's really not enough information to provide a meaningful
> suggestion.
>
> Best
> Erick
>
> On Thu, Nov 3, 2011 at 7:30 PM,  <alx...@aim.com> wrote:
> > Hello,
> >
> > I use nutch-1.3 crawled results in solr-3.4. I noticed that for two-word
> > phrases like newspaper latimes, latimes.com is not in the results at all.
> > This may be due to the dismax defType that I use in the request handler
> >
> > <str name="defType">dismax</str>
> > <str name="qf">url^1.5 id^1.5 content^ title^1.2</str>
> > <str name="pf">url^1.5 id^1.5 content^0.5 title^1.2</str>
> >
> >
> >  with mm as
> > <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
> >
> > However, changing it to
> > <str name="mm">1&lt;-1 2&lt;-1 5&lt;-2 6&lt;90%</str>
> >
> > and q.op to OR or AND
> >
> > does not solve the problem. In this case latimes.com is ranked higher,
> > but is still not in first place.
> > Also, in this case results with both words are ranked very low, almost at
> > the end.
> >
> > We need latimes.com to be placed first, followed by results with
> > both words, and so on.
> >
> > Any ideas how to modify config to this end?
> >
> > Thanks in advance.
> > Alex.
> >
> >
>
>
>
>

 
