If you just discovered the omitTf parameter because of this post, please
be aware that I've not really explained its purpose properly, and note
that using it will prevent phrase queries from working. See this thread
for clarification on its use:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200903.mbox/%3c897559.95769...@web50301.mail.re2.yahoo.com%3e

-- Dean

-----Original Message-----
From: Dean Missikowski (Consultant), CLSA 
Sent: 16/03/2009 10:30 AM
To: solr-user@lucene.apache.org
Subject: RE: How to correctly boost results in Solr Dismax query

Hi,

My experience is that the bq parameter can be used with any query type.
You can define boosts on the query fields (qf) that are matched against
the query terms (q), and you can also define additional boosts on fields
that are not matched against the query terms, through the bq or bf
parameters.
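
As a rough illustration of that combination, here is a small Python
sketch that builds a dismax request where qf applies to the user's terms
and bq adds boosts on a field the terms never touch. The host, port and
path, and the build_dismax_url helper, are assumptions made up for this
example; only the q, qf, bq and defType parameter names are Solr's.

```python
from urllib.parse import urlencode, parse_qs

def build_dismax_url(base_url, terms, query_fields, boost_queries):
    """Hypothetical helper: return a select URL for a dismax query."""
    params = [
        ("defType", "dismax"),           # ask for the dismax query parser
        ("q", terms),                    # the user's raw search terms
        ("qf", " ".join(query_fields)),  # fields (and boosts) the terms are matched against
    ]
    # bq may be repeated; each one is an extra, optional boosting clause.
    params += [("bq", bq) for bq in boost_queries]
    return base_url + "?" + urlencode(params)

# Assumed local Solr URL, for illustration only.
url = build_dismax_url(
    "http://localhost:8983/solr/select",
    "indiana",
    ["title"],
    ["media:DVD^2", "media:BLU-RAY^1.5"],
)
print(url)
print(parse_qs(url.split("?", 1)[1])["bq"])
```

Passing the parameters as a list of pairs rather than a dict is what
allows bq to appear more than once in the request.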

I think the effect that a particular bq boost has on the overall score
depends on the other fields in your query. If you're searching on
titles, you might want to consider setting omitNorms=true (don't store
length normalization factors) for title in your schema.xml and, if
you're using Solr 1.4, omitTf=true (don't store term frequencies), so
that results aren't skewed by short versus long titles, or by titles
that contain multiple occurrences of the same term. Changing either
setting requires you to reindex. I think this should have the effect of
making bq boosts like &bq=media:DVD^2&bq=media:BLU-RAY^1.5 more
effective.
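
To make the skew concrete, here is a small Python sketch of the two
factors involved, in the form Lucene's default similarity used at the
time: tf as the square root of the term frequency, and the length norm
as one over the square root of the number of terms in the field. The
function names and the omit_tf / omit_norms flags are invented for this
illustration, and the sketch leaves out idf, boosts, coord, query
normalization and the lossy one-byte encoding of norms.

```python
import math

def tf_factor(freq, omit_tf=False):
    # With term frequencies omitted, a matching term always counts once.
    return 1.0 if omit_tf else math.sqrt(freq)

def length_norm(num_terms, omit_norms=False):
    # With norms omitted, field length no longer influences the score.
    return 1.0 if omit_norms else 1.0 / math.sqrt(num_terms)

def field_weight(freq, num_terms, omit_tf=False, omit_norms=False):
    """Simplified per-field weight: tf factor times length norm."""
    return tf_factor(freq, omit_tf) * length_norm(num_terms, omit_norms)

# A 2-term title and a 9-term title, each containing the query term once.
short_title = field_weight(freq=1, num_terms=2)
long_title = field_weight(freq=1, num_terms=9)
print(short_title, long_title)

# The same two titles with both options switched on.
print(field_weight(1, 2, omit_tf=True, omit_norms=True),
      field_weight(1, 9, omit_tf=True, omit_norms=True))
```

Under these simplifications the short title gets roughly twice the
weight of the long one for the same match, and the two are equal once
norms and term frequencies are omitted.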

-- Dean

-----Original Message-----
From: Pete Smith [mailto:pete.sm...@lovefilm.com] 
Sent: 13/03/2009 7:11 PM
To: solr-user@lucene.apache.org
Subject: Re: How to correctly boost results in Solr Dismax query

Hi,

On Fri, 2009-03-13 at 03:57 -0700, dabboo wrote:
> bq works only with q.alt query and not with q queries. So, in your case you
> would be using qf parameter for field boosting, you will have to give both
> the fields in qf parameter i.e. both title and media.
> 
> try this
> 
> <str name="qf">media^1.0 title^100.0</str>

But with that, how will it know to rank media:DVD higher than
media:BLU-RAY?
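
A toy Python model of the three records quoted below may make the
question concrete. It is emphatically not Solr's scoring formula (real
scores also involve tf, idf, norms, coord and query normalization); it
only assumes that a document's score is the best qf boost among fields
containing the term, plus the boost of every bq clause the document
matches. The toy_score and rank functions are hypothetical.

```python
DOCS = [
    {"id": 1, "media": "BLU-RAY",
     "title": "Indiana Jones and the Kingdom of the Crystal Skull"},
    {"id": 2, "media": "DVD",
     "title": "Indiana Jones and the Kingdom of the Crystal Skull"},
    {"id": 3, "media": "DVD", "title": "Casino Royale"},
]

def toy_score(doc, term, qf, bq):
    """Toy model only: best matching qf boost plus matching bq boosts."""
    term = term.lower()
    field_scores = [boost for field, boost in qf.items()
                    if term in doc[field].lower().split()]
    if not field_scores:
        return None  # main query did not match, so the doc is not returned
    bonus = sum(boost for (field, value), boost in bq.items()
                if doc[field] == value)
    return max(field_scores) + bonus

def rank(term, qf, bq):
    """Return matching doc ids, best score first (ties broken by id)."""
    scored = [(toy_score(d, term, qf, bq), d["id"]) for d in DOCS]
    hits = [(s, i) for s, i in scored if s is not None]
    return [i for s, i in sorted(hits, key=lambda h: (-h[0], h[1]))]

# qf alone: 'indiana' never occurs in the media field, so records 1 and 2 tie.
qf_only = rank("indiana", {"media": 1.0, "title": 100.0}, {})

# qf on title plus bq on media values: record 2 (DVD) moves ahead.
with_bq = rank("indiana", {"title": 1.0},
               {("media", "DVD"): 2.0, ("media", "BLU-RAY"): 1.5})
print(qf_only, with_bq)
```

In this toy model the media^1.0 entry in qf never contributes anything
for q=indiana, whereas the bq clauses separate the two Indiana Jones
records.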

Cheers,
Pete


> Pete Smith-3 wrote:
> > 
> > Hi Amit,
> > 
> > Thanks again for your reply. I am understanding it a bit better but I
> > think it would help if I posted an example. Say I have three records:
> > 
> > <doc>
> > <long name="id">1</long>
> > <str name="media">BLU-RAY</str>
> > <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > </doc>
> > <doc>
> > <long name="id">2</long>
> > <str name="media">DVD</str>
> > <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > </doc>
> > <doc>
> > <long name="id">3</long>
> > <str name="media">DVD</str>
> > <str name="title">Casino Royale</str>
> > </doc>
> > 
> > Now, if I search for indiana: select?q=indiana
> > 
> > I want the first two rows to come back (not the third as it does not
> > contain 'indiana'). I would like record 2 to be scored higher than
> > record 1 as its media type is DVD.
> > 
> > At the moment I have in my config:
> > 
> > <str name="qf">title</str>
> > 
> > And I was trying to boost by media having a specific value by using 'bq'
> > but from what you told me that is incorrect.
> > 
> > Cheers,
> > Pete
> > 
> > 
> > On Fri, 2009-03-13 at 03:21 -0700, dabboo wrote:
> >> Pete,
> >> 
> >> Sorry if I wasn't clear. Here is the explanation.
> >> 
> >> Suppose you have 2 records and they have films and media as 2 columns.
> >> 
> >> Now the first record has values like films="Indiana" and media="blue ray"
> >> and the 2nd record has values like films="Bond" and media="Indiana"
> >> 
> >> Values for qf parameters
> >> 
> >> <str name="qf">media^2.0 films^1.0</str>
> >> 
> >> Now, search for q=Indiana .. it should display both of the records, but
> >> record #2 will display above the 1st.
> >> 
> >> Let me know if you still have questions.
> >> 
> >> Cheers,
> >> amit
> >> 
> >> 
> >> Pete Smith-3 wrote:
> >> > 
> >> > Hi Amit,
> >> > 
> >> > Thanks very much for your reply. What you said makes things a bit
> >> > clearer but I am still a bit confused.
> >> > 
> >> > On Thu, 2009-03-12 at 23:14 -0700, dabboo wrote:
> >> >> If you want to boost the records with their field value then you must
> >> >> use q query parameter instead of q.alt. 'q' parameter actually uses qf
> >> >> parameters from solrConfig for field boosting.
> >> > 
> >> > From the documentation for Dismax queries, I thought that "q" is simply
> >> > a keyword parameter:
> >> > 
> >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> >> > q
> >> > The guts of the search defining the main "query". This is designed to
> >> > support raw input strings provided by users with no special escaping.
> >> > '+' and '-' characters are treated as "mandatory" and "prohibited"
> >> > modifiers for the subsequent terms. Text wrapped in balanced quote
> >> > characters '"' are treated as phrases, any query containing an odd
> >> > number of quote characters is evaluated as if there were no quote
> >> > characters at all. Wildcards in this "q" parameter are not supported.
> >> > 
> >> > And I thought 'qf' is a list of fields and boost scores:
> >> > 
> >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> >> > qf (Query Fields)
> >> > List of fields and the "boosts" to associate with each of them when
> >> > building DisjunctionMaxQueries from the user's query. The format
> >> > supported is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates that
> >> > fieldOne has a boost of 2.3, fieldTwo has the default boost, and
> >> > fieldThree has a boost of 0.4 ... this indicates that matches in
> >> > fieldOne are much more significant than matches in fieldTwo, which are
> >> > more significant than matches in fieldThree.
> >> > 
> >> > But if I want to, say, search for films with 'indiana' in the title,
> >> > with media=DVD scoring higher than media=BLU-RAY then do I need to do
> >> > something like:
> >> > 
> >> > solr/select?q=indiana
> >> > 
> >> > And in my config:
> >> > 
> >> > <str name="qf">media^2</str>
> >> > 
> >> > But I don't see where the actual *contents* of the media field would
> >> > determine the boost.
> >> > 
> >> > Sorry if I have misunderstood what you mean.
> >> > 
> >> > Cheers,
> >> > Pete
> >> > 
> >> >> Pete Smith-3 wrote:
> >> >> > 
> >> >> > Hi,
> >> >> > 
> >> >> > I have managed to build an index in Solr which I can search on keyword,
> >> >> > produce facets, query facets etc. This is all working great. I have
> >> >> > implemented my search using a dismax query so it searches predetermined
> >> >> > fields.
> >> >> > 
> >> >> > However, my results are coming back sorted by score which appears to be
> >> >> > calculated by keyword relevancy only. I would like to adjust the score
> >> >> > where fields have pre-determined values. I think I can do this with
> >> >> > boost query and boost functions but the documentation here:
> >> >> > 
> >> >> > http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3
> >> >> > 
> >> >> > is not particularly helpful. I tried adding a bq argument to my
> >> >> > search:
> >> >> > 
> >> >> > &bq=media:DVD^2
> >> >> > 
> >> >> > (yes, this is an index of films!) but I find when I start adding more
> >> >> > and more:
> >> >> > 
> >> >> > &bq=media:DVD^2&bq=media:BLU-RAY^1.5
> >> >> > 
> >> >> > I find the negative results - e.g. films that are DVD but are not
> >> >> > BLU-RAY get negatively affected in their score. In the end it all seems
> >> >> > to even out and my score is as it was before I started boosting.
> >> >> > 
> >> >> > I must be doing this wrong and I wonder whether "boost function" comes
> >> >> > in somewhere. Any ideas on how to correctly use boost?
> >> >> > 
> >> >> > Cheers,
> >> >> > Pete
> >> >> > 
> >> >> > -- 
> >> >> > Pete Smith
> >> >> > Developer
> >> >> > 
> >> >> > No.9 | 6 Portal Way | London | W3 6RU |
> >> >> > T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111
> >> >> > 
> >> >> > LOVEFiLM.com
> >> >> > 
> >> >> > 
> >> >> 
> >> 
> 