Short summary:

 * Multiple simultaneous phrase boosts with different slops
   are working very nicely for me on a QA system with a few million docs.

 * I've submitted an updated patch to Jira incorporating feedback
   from the Jira comments.  I will be testing it more this week.
   https://issues.apache.org/jira/browse/SOLR-2058

On 2010-08-19 Ron Mayer wrote:
> Chris Hostetter wrote:
>> [Yonik Seeley wrote]
>> : Perhaps fold it into the pf/pf2 syntax?
>> : pf=text~1^2  // proposed syntax...
>>
>> Big +1 to this idea ... 
> ...
> I added a ticket here: https://issues.apache.org/jira/browse/SOLR-2058
> and attached my patch to that ticket.

Just wanted to comment that the patch has been working extremely well for
me, with multiple simultaneous phrase boosts using different slops.

I also cleaned up the patch based on comments in Jira and submitted a
newer version.

In particular, I find that if I use the following:
    * a high boost (500)    on pf  with a slop of 0
    * a moderate boost (50) on pf  with a slop of 50
    * a moderate boost (50) on pf2 with a slop of 0
    * a low boost (10)      on pf2 with a slop of 10
it's doing a great job of getting the most relevant document
in the #1 spot (thanks to the slop=0 boosts), and a very good
job at getting the entire first page of results filled with
highly relevant documents (thanks to the shingles and more
liberal phrase-slop boosts).
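
For anyone who wants to try that recipe, here is a rough sketch of how the
pf and pf2 values could be assembled in the field~slop^boost syntax from
SOLR-2058. It assumes the patch is applied. The field name "text_stem" and
the helper phrase_clause are only illustrative, and whether an explicit ~0
is honored (rather than falling back to ps) is an assumption about the
patch, not something confirmed here.

```python
from urllib.parse import urlencode


def phrase_clause(field, boost, slop=None):
    """Format one clause as field~slop^boost (hypothetical helper)."""
    slop_part = "" if slop is None else f"~{slop}"
    return f"{field}{slop_part}^{boost}"


# Assumed field name; substitute whatever field(s) you boost on.
FIELD = "text_stem"

# (slop, boost) pairs taken from the list above.
pf = " ".join(phrase_clause(FIELD, b, s) for s, b in [(0, 500), (50, 50)])
pf2 = " ".join(phrase_clause(FIELD, b, s) for s, b in [(0, 50), (10, 10)])

params = {
    "defType": "edismax",
    "q": "red baseball cap black leather jacket",
    "pf": pf,
    "pf2": pf2,
    "fl": "id,score",
}

# urlencode handles the escaping of spaces, '^' and ','.
query_string = urlencode(params)

print(pf)
print(pf2)
print(query_string)
```

The resulting query string can be appended to a /select URL on a patched
Solr instance.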


I'm even having some luck with a whole bunch of those clauses, like
the following, which uses a variety of phrase slops on a variety
of fields:
http://app2.fli:28983/solr/core0/select?pf=source_doc~1^500+text_stem~1^500+source_doc~50^50+text_stem~20^50&defType=edismax&hl.maxAnalyzedChars=500000&q.alt=*%3A*&ps=1&qt=fliqs&pf2=text_stem^50+text_stem~10^10+text_unstem~10^10&start=0&q=red+baseball+cap+black+leather+jacket&mm=100%25&debugQuery=on&fl=id,score
which seems to be returning quickly enough (QTime of 287 ms) on my
collection of 4 million documents:

<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">287</int>
<lst name="params">
<str name="mm">100%</str>
<str name="pf2">text_stem^50 text_stem~10^10 text_unstem~10^10</str>
<str name="q.alt">*:*</str>
<str name="hl.maxAnalyzedChars">500000</str>
<str name="defType">edismax</str>
<str name="pf">
source_doc~1^500 text_stem~1^500 source_doc~50^50 text_stem~20^50
</str>
<str name="debugQuery">on</str>
<str name="fl">id,score</str>
<str name="start">0</str>
<str name="q">red baseball cap black leather jacket</str>
<str name="qt">fliqs</str>
<str name="ps">1</str>
</lst>
</lst>
...
<str name="parsedquery">
+((DisjunctionMaxQuery((text_stem:red^0.5)~0.01)
   DisjunctionMaxQuery((text_stem:basebal^0.5)~0.01)
   DisjunctionMaxQuery((text_stem:cap^0.5)~0.01)
   DisjunctionMaxQuery((text_stem:black^0.5)~0.01)
   DisjunctionMaxQuery((text_stem:leather^0.5)~0.01)
   DisjunctionMaxQuery((text_stem:jacket^0.5)~0.01)
  )~6)
   DisjunctionMaxQuery((source_doc:"red baseball cap black leather jacket"~50^50.0)~0.01)
   DisjunctionMaxQuery((source_doc:"red baseball cap black leather jacket"~1^500.0 | text_stem:"red basebal cap black leather jacket"~1^500.0)~0.01)
   DisjunctionMaxQuery((text_stem:"red basebal cap black leather jacket"~20^50.0)~0.01)
  (DisjunctionMaxQuery((text_stem:"red basebal"~1^50.0)~0.01)
   DisjunctionMaxQuery((text_stem:"basebal cap"~1^50.0)~0.01)
   DisjunctionMaxQuery((text_stem:"cap black"~1^50.0)~0.01)
   DisjunctionMaxQuery((text_stem:"black leather"~1^50.0)~0.01)
   DisjunctionMaxQuery((text_stem:"leather jacket"~1^50.0)~0.01)
  )
  (DisjunctionMaxQuery((text_unstem:"red baseball"~10^10.0 | text_stem:"red basebal"~10^10.0)~0.01)
   DisjunctionMaxQuery((text_unstem:"baseball cap"~10^10.0 | text_stem:"basebal cap"~10^10.0)~0.01)
   DisjunctionMaxQuery((text_unstem:"cap black"~10^10.0 | text_stem:"cap black"~10^10.0)~0.01)
   DisjunctionMaxQuery((text_unstem:"black leather"~10^10.0 | text_stem:"black leather"~10^10.0)~0.01)
   DisjunctionMaxQuery((text_unstem:"leather jacket"~10^10.0 | text_stem:"leather jacket"~10^10.0)~0.01)
  )
</str>

