Thank you Dean. I thought I was on the right track with BQ but it was
the skewing of results that was frustrating me. I'll try out your
suggestion.
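
For reference, the request shape under discussion (dismax q and qf, plus
one bq per boosted value) could be built as in the sketch below. The host,
port, handler path and the defType=dismax parameter are assumptions, not
taken from the thread; a handler already configured as dismax in
solrconfig.xml would not need defType. build_dismax_query is a
hypothetical helper name.

```python
from urllib.parse import urlencode, parse_qs


def build_dismax_query(keywords, boosts,
                       base_url="http://localhost:8983/solr/select"):
    """Return a select URL with one bq parameter per (field, value, boost).

    base_url is an assumed local Solr address, not one from the thread.
    """
    params = [
        ("defType", "dismax"),  # assumption: handler not preconfigured
        ("q", keywords),        # the user's keywords
        ("qf", "title"),        # keywords are matched against title only
    ]
    for field, value, boost in boosts:
        # Each boost query is sent as its own bq parameter.
        params.append(("bq", f"{field}:{value}^{boost}"))
    return base_url + "?" + urlencode(params)


url = build_dismax_query(
    "indiana", [("media", "DVD", 2), ("media", "BLU-RAY", 1.5)]
)
print(url)
```

urlencode percent-escapes the ':' and '^' characters, so the bq values
arrive at Solr as media:DVD^2 and media:BLU-RAY^1.5.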

Cheers,
Pete

On Mon, 2009-03-16 at 10:29 +0800, Dean Missikowski (Consultant), CLSA
wrote:
> Hi,
> 
> My experience is that the BQ parameter can be used with any query type.
> You can define boosts on the query fields (qf) that are used with the
> query terms (q) in your query, AND you can define additional boosts for
> fields that are not used with the query terms through the bq or bf
> parameters. 
> 
> I think the effect that a particular bq boost has on the overall score
> depends on the other fields in your query. If you're searching on titles,
> you might want to set omitNorms=true (don't store length normalization
> factors) for title in your schema.xml and, if you're using Solr 1.4,
> omitTf=true (don't store term frequencies; this attribute was later
> renamed omitTermFreqAndPositions), so that results aren't skewed by
> short versus long titles, or by titles that contain multiple occurrences
> of the same term. Changing either setting requires a reindex. I think
> this should make bq boosts like &bq=media:DVD^2&bq=media:BLU-RAY^1.5
> more effective.
> 
> -- Dean
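
To illustrate the skew described above, here is a simplified sketch of how
title length and term repetition feed into a classic Lucene-style score.
It assumes the default similarity factors tf = sqrt(freq) and
lengthNorm = 1/sqrt(numTerms), plain whitespace tokenization, and it
ignores idf, queryNorm, coord and the lossy one-byte norm encoding, so the
numbers are only indicative. title_factor and the short title
"Indiana Jones" are hypothetical, introduced for illustration.

```python
import math


def title_factor(title, term, omit_norms=False, omit_tf=False):
    """Simplified tf * lengthNorm contribution of one term in a title."""
    tokens = title.lower().split()  # assumption: whitespace analysis only
    freq = tokens.count(term.lower())
    if freq == 0:
        return 0.0  # the term does not occur, so the document does not match
    # With term frequencies omitted, every match counts as a single hit.
    tf = 1.0 if omit_tf else math.sqrt(freq)
    # With norms omitted, long and short titles are weighted the same.
    norm = 1.0 if omit_norms else 1.0 / math.sqrt(len(tokens))
    return tf * norm


long_title = "Indiana Jones and the Kingdom of the Crystal Skull"
short_title = "Indiana Jones"  # hypothetical shorter title

for omit in (False, True):
    print(
        "omitNorms =", omit,
        round(title_factor(long_title, "indiana", omit_norms=omit), 3),
        round(title_factor(short_title, "indiana", omit_norms=omit), 3),
    )
```

With norms enabled the shorter title gets the larger factor for the same
match; with norms omitted both titles contribute equally, which leaves more
room for a bq boost on media to decide the ordering.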
> 
> -----Original Message-----
> From: Pete Smith [mailto:pete.sm...@lovefilm.com] 
> Sent: 13/03/2009 7:11 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to correctly boost results in Solr Dismax query
> 
> Hi,
> 
> On Fri, 2009-03-13 at 03:57 -0700, dabboo wrote:
> > bq works only with q.alt query and not with q queries. So, in your case
> > you would be using qf parameter for field boosting, you will have to
> > give both the fields in qf parameter i.e. both title and media.
> > 
> > try this
> > 
> > <str name="qf">media^1.0 title^100.0</str>
> 
> But with that, how will it know to rank media:DVD higher than
> media:BLU-RAY?
> 
> Cheers,
> Pete
> 
> 
> > Pete Smith-3 wrote:
> > > 
> > > Hi Amit,
> > > 
> > > Thanks again for your reply. I am understanding it a bit better but I
> > > think it would help if I posted an example. Say I have three records:
> > > 
> > > <doc>
> > > <long name="id">1</long>
> > > <str name="media">BLU-RAY</str>
> > > <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > > </doc>
> > > <doc>
> > > <long name="id">2</long>
> > > <str name="media">DVD</str>
> > > <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > > </doc>
> > > <doc>
> > > <long name="id">3</long>
> > > <str name="media">DVD</str>
> > > <str name="title">Casino Royale</str>
> > > </doc>
> > > 
> > > Now, if I search for indiana: select?q=indiana
> > > 
> > > I want the first two rows to come back (not the third as it does not
> > > contain 'indiana'). I would like record 2 to be scored higher than
> > > record 1 as its media type is DVD.
> > > 
> > > At the moment I have in my config:
> > > 
> > > <str name="qf">title</str>
> > > 
> > > And I was trying to boost by media having a specific value by using
> > > 'bq' but from what you told me that is incorrect.
> > > 
> > > Cheers,
> > > Pete
> > > 
> > > 
> > > On Fri, 2009-03-13 at 03:21 -0700, dabboo wrote:
> > >> Pete,
> > >> 
> > >> Sorry if I wasn't clear. Here is the explanation.
> > >> 
> > >> Suppose you have 2 records and they have films and media as 2 columns.
> > >>
> > >> Now first record has values like films="Indiana" and media="blue ray"
> > >> and 2nd record has values like films="Bond" and media="Indiana"
> > >> 
> > >> Values for qf parameters
> > >> 
> > >> <str name="qf">media^2.0 films^1.0</str>
> > >> 
> > >> Now, search for q=Indiana .. it should display both of the records but
> > >> record #2 will display above the 1st.
> > >> 
> > >> Let me know if you still have questions.
> > >> 
> > >> Cheers,
> > >> amit
> > >> 
> > >> 
> > >> Pete Smith-3 wrote:
> > >> > 
> > >> > Hi Amit,
> > >> > 
> > >> > Thanks very much for your reply. What you said makes things a bit
> > >> > clearer but I am still a bit confused.
> > >> > 
> > >> > On Thu, 2009-03-12 at 23:14 -0700, dabboo wrote:
> > >> >> If you want to boost the records with their field value then you
> > >> >> must use q query parameter instead of q.alt. 'q' parameter actually
> > >> >> uses qf parameters from solrConfig for field boosting.
> > >> > 
> > >> > From the documentation for Dismax queries, I thought that "q" is
> > >> > simply a keyword parameter:
> > >> > 
> > >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> > >> > q
> > >> > The guts of the search defining the main "query". This is designed to
> > >> > be support raw input strings provided by users with no special
> > >> > escaping. '+' and '-' characters are treated as "mandatory" and
> > >> > "prohibited" modifiers for the subsequent terms. Text wrapped in
> > >> > balanced quote characters '"' are treated as phrases, any query
> > >> > containing an odd number of quote characters is evaluated as if there
> > >> > were no quote characters at all. Wildcards in this "q" parameter are
> > >> > not supported.
> > >> > 
> > >> > And I thought 'qf' is a list of fields and boost scores:
> > >> > 
> > >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> > >> > qf (Query Fields)
> > >> > List of fields and the "boosts" to associate with each of them when
> > >> > building DisjunctionMaxQueries from the user's query. The format
> > >> > supported is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates
> > >> > that fieldOne has a boost of 2.3, fieldTwo has the default boost, and
> > >> > fieldThree has a boost of 0.4 ... this indicates that matches in
> > >> > fieldOne are much more significant than matches in fieldTwo, which
> > >> > are more significant than matches in fieldThree.
> > >> > 
> > >> > But if I want to, say, search for films with 'indiana' in the title,
> > >> > with media=DVD scoring higher than media=BLU-RAY then do I need to do
> > >> > something like:
> > >> > 
> > >> > solr/select?q=indiana
> > >> > 
> > >> > And in my config:
> > >> > 
> > >> > <str name="qf">media^2</str>
> > >> > 
> > >> > But I don't see where the actual *contents* of the media field would
> > >> > determine the boost.
> > >> > 
> > >> > Sorry if I have misunderstood what you mean.
> > >> > 
> > >> > Cheers,
> > >> > Pete
> > >> > 
> > >> >> Pete Smith-3 wrote:
> > >> >> > 
> > >> >> > Hi,
> > >> >> > 
> > >> >> > I have managed to build an index in Solr which I can search on
> > >> >> > keyword, produce facets, query facets etc. This is all working
> > >> >> > great. I have implemented my search using a dismax query so it
> > >> >> > searches predetermined fields.
> > >> >> > 
> > >> >> > However, my results are coming back sorted by score which appears
> > >> >> > to be calculated by keyword relevancy only. I would like to adjust
> > >> >> > the score where fields have pre-determined values. I think I can
> > >> >> > do this with boost query and boost functions but the documentation
> > >> >> > here:
> > >> >> >
> > >> >> > http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3
> > >> >> > 
> > >> >> > Is not particularly helpful. I tried adding a bq argument to my
> > >> >> > search:
> > >> >> > 
> > >> >> > &bq=media:DVD^2
> > >> >> > 
> > >> >> > (yes, this is an index of films!) but I find when I start adding
> > >> >> > more and more:
> > >> >> > 
> > >> >> > &bq=media:DVD^2&bq=media:BLU-RAY^1.5
> > >> >> > 
> > >> >> > I find the negative results - e.g. films that are DVD but are not
> > >> >> > BLU-RAY get negatively affected in their score. In the end it all
> > >> >> > seems to even out and my score is as it was before I started
> > >> >> > boosting.
> > >> >> > 
> > >> >> > I must be doing this wrong and I wonder whether "boost function"
> > >> >> > comes in somewhere. Any ideas on how to correctly use boost?
> > >> >> > 
> > >> >> > Cheers,
> > >> >> > Pete
> > >> >> > 
> > >> >> > -- 
> > >> >> > Pete Smith
> > >> >> > Developer
> > >> >> > 
> > >> >> > No.9 | 6 Portal Way | London | W3 6RU |
> > >> >> > T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111
> > >> >> > 
> > >> >> > LOVEFiLM.com
> > >> >> > 
> > >> >> > 
> > >> >> 
> > >> > 
> > >> 
> > > 
> > 
-- 
Pete Smith
Developer

No.9 | 6 Portal Way | London | W3 6RU |
T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111

LOVEFiLM.com