Hey All - 

I’m a Solr newbie in need of some help.

I’m using Apache Nutch to crawl a site and populate a Solr core, which we then 
use to query search results. I’ve got it all up and running, but the Solr 
scoring results I get don’t seem to make any sense. Let’s take the following 
query as an example:

content:devlearn 2014 registration information

I have a page with a title of "DevLearn 2014 Conference & Expo - Registration 
Information" and a url of 
"www.mydomain.com/DevLearn/content/3426/devlearn-2014-conference--expo--registration-information/",
which has multiple instances of all terms in the content field. I would expect 
this document to be returned at the top of the list, since in addition to being 
in the content field, all terms are in both the title and the url, which I’m 
boosting for. Instead, it returns as number 3320 in the results with a score of 
0. Meanwhile, 3319 other pages return with higher scores, and all of these have 
fewer instances of the terms in the content field, and one or fewer of the 
terms in the title or url.
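
For anyone who wants to reproduce this, a rough sketch of building the request with debugQuery=true (so that Solr returns its score explanation alongside the results) might look like the following. The host, port, and core name are placeholders rather than values from my setup, and the helper name build_debug_url is just for illustration.

```python
from urllib.parse import urlencode

# Placeholder values: the host, port, and core name are assumptions,
# not taken from the actual installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_debug_url(query, base_url=SOLR_SELECT_URL):
    """Return a /select URL that asks Solr to explain each document's score."""
    params = {
        "q": query,
        # Only return the fields needed to identify each hit, plus the score.
        "fl": "url,title,score",
        # debugQuery=true adds a "debug" section to the response, including
        # the parsed query and a per-document score explanation.
        "debugQuery": "true",
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_url("content:devlearn 2014 registration information"))
```

The parsed query in the debug output should show which fields each term is actually being matched against, which may help explain the ordering.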

Below is the select requestHandler section from my solrconfig.xml which shows 
the query select defaults. Let me know if I should include more of this file or 
any other information:

<requestHandler name="/select" class="solr.SearchHandler">
  
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="df">text</str>
           
       <str name="hl">on</str>
       <str name="hl.fl">content</str>
       <str name="hl.encoder">html</str>
       <str name="hl.simple.pre">&lt;strong&gt;</str>
       <str name="hl.simple.post">&lt;/strong&gt;</str>
       <str name="f.content.hl.snippets">1</str>
       <str name="f.content.hl.fragsize">200</str>
       <str name="f.content.hl.alternateField">content</str>
       <str name="f.content.hl.maxAlternateFieldLength">750</str>

       <str name="defType">edismax</str>
       <str name="qf">
          content^0.5 url^10.0 title^10.0
       </str>
       <str name="df">content</str>
       <str name="mm">100%</str>
       <str name="q.alt">*:*</str>
       <str name="rows">10</str>
       <str name="fl">*,score</str>
       <str name="pf">
           content^0.5 url^10.0 title^10.0
       </str>
       <str name="ps">100</str>

     </lst>
</requestHandler>




