Thank you, Dean. I thought I was on the right track with BQ, but it was the skewing of results that was frustrating me. I'll try out your suggestion.

Cheers,
Pete

On Mon, 2009-03-16 at 10:29 +0800, Dean Missikowski (Consultant), CLSA wrote:
> Hi,
>
> My experience is that the BQ parameter can be used with any query type.
> You can define boosts on the query fields (qf) that are used with the
> query terms (q) in your query, AND you can define additional boosts for
> fields that are not used with the query terms through the bq or bf
> parameters.
>
> I think the relative weight that assigning a particular boost to a field
> via BQ has on the overall scoring needs to take into consideration the
> other fields in your query. If you're searching on titles, you might
> want to consider setting omitNorms=true (means don't generate length
> normalization vectors) for title in your schema.xml, and if you're using
> Solr 1.4 omitTf=true (means don't generate term frequency vectors), so
> that results aren't skewed by short and long titles, or titles that
> contain multiple occurrences of the same term (setting these requires
> you to reindex). I think this should have the effect of making BQ boosts
> like &bq=media:DVD^2&bq=media:BLU-RAY^1.5 more effective.
>
> -- Dean
>
> -----Original Message-----
> From: Pete Smith [mailto:pete.sm...@lovefilm.com]
> Sent: 13/03/2009 7:11 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to correctly boost results in Solr Dismax query
>
> Hi,
>
> On Fri, 2009-03-13 at 03:57 -0700, dabboo wrote:
> > bq works only with q.alt query and not with q queries. So, in your
> > case you would be using qf parameter for field boosting, you will
> > have to give both the fields in qf parameter i.e. both title and
> > media.
> >
> > try this
> >
> > <str name="qf">media^1.0 title^100.0</str>
>
> But with that, how will it know to rank media:DVD higher than
> media:BLU-RAY?
>
> Cheers,
> Pete
>
> > Pete Smith-3 wrote:
> > >
> > > Hi Amit,
> > >
> > > Thanks again for your reply. I am understanding it a bit better
> > > but I think it would help if I posted an example. Say I have
> > > three records:
> > >
> > > <doc>
> > >   <long name="id">1</long>
> > >   <str name="media">BLU-RAY</str>
> > >   <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > > </doc>
> > > <doc>
> > >   <long name="id">2</long>
> > >   <str name="media">DVD</str>
> > >   <str name="title">Indiana Jones and the Kingdom of the Crystal Skull</str>
> > > </doc>
> > > <doc>
> > >   <long name="id">3</long>
> > >   <str name="media">DVD</str>
> > >   <str name="title">Casino Royale</str>
> > > </doc>
> > >
> > > Now, if I search for indiana: select?q=indiana
> > >
> > > I want the first two rows to come back (not the third as it does
> > > not contain 'indiana'). I would like record 2 to be scored higher
> > > than record 1 as its media type is DVD.
> > >
> > > At the moment I have in my config:
> > >
> > > <str name="qf">title</str>
> > >
> > > And I was trying to boost by media having a specific value by
> > > using 'bq' but from what you told me that is incorrect.
> > >
> > > Cheers,
> > > Pete
> > >
> > >
> > > On Fri, 2009-03-13 at 03:21 -0700, dabboo wrote:
> > >> Pete,
> > >>
> > >> Sorry if I wasn't clear. Here is the explanation.
> > >>
> > >> Suppose you have 2 records and they have films and media as 2
> > >> columns.
> > >>
> > >> Now first record has values like films="Indiana" and
> > >> media="blue ray" and 2nd record has values like films="Bond"
> > >> and media="Indiana"
> > >>
> > >> Values for qf parameters
> > >>
> > >> <str name="qf">media^2.0 films^1.0</str>
> > >>
> > >> Now, search for q=Indiana .. it should display both of the
> > >> records but record #2 will display above the 1st.
> > >>
> > >> Let me know if you still have questions.
> > >>
> > >> Cheers,
> > >> amit
> > >>
> > >>
> > >> Pete Smith-3 wrote:
> > >> >
> > >> > Hi Amit,
> > >> >
> > >> > Thanks very much for your reply. What you said makes things a
> > >> > bit clearer but I am still a bit confused.
> > >> >
> > >> > On Thu, 2009-03-12 at 23:14 -0700, dabboo wrote:
> > >> >> If you want to boost the records with their field value then
> > >> >> you must use q query parameter instead of q.alt. 'q'
> > >> >> parameter actually uses qf parameters from solrConfig for
> > >> >> field boosting.
> > >> >
> > >> > From the documentation for Dismax queries, I thought that "q"
> > >> > is simply a keyword parameter:
> > >> >
> > >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> > >> > q
> > >> > The guts of the search defining the main "query". This is
> > >> > designed to support raw input strings provided by users with
> > >> > no special escaping. '+' and '-' characters are treated as
> > >> > "mandatory" and "prohibited" modifiers for the subsequent
> > >> > terms. Text wrapped in balanced quote characters '"' are
> > >> > treated as phrases, any query containing an odd number of
> > >> > quote characters is evaluated as if there were no quote
> > >> > characters at all. Wildcards in this "q" parameter are not
> > >> > supported.
> > >> >
> > >> > And I thought 'qf' is a list of fields and boost scores:
> > >> >
> > >> > From http://wiki.apache.org/solr/DisMaxRequestHandler:
> > >> > qf (Query Fields)
> > >> > List of fields and the "boosts" to associate with each of
> > >> > them when building DisjunctionMaxQueries from the user's
> > >> > query. The format supported is fieldOne^2.3 fieldTwo
> > >> > fieldThree^0.4, which indicates that fieldOne has a boost of
> > >> > 2.3, fieldTwo has the default boost, and fieldThree has a
> > >> > boost of 0.4 ... this indicates that matches in fieldOne are
> > >> > much more significant than matches in fieldTwo, which are
> > >> > more significant than matches in fieldThree.
> > >> >
> > >> > But if I want to, say, search for films with 'indiana' in the
> > >> > title, with media=DVD scoring higher than media=BLU-RAY then
> > >> > do I need to do something like:
> > >> >
> > >> > solr/select?q=indiana
> > >> >
> > >> > And in my config:
> > >> >
> > >> > <str name="qf">media^2</str>
> > >> >
> > >> > But I don't see where the actual *contents* of the media
> > >> > field would determine the boost.
> > >> >
> > >> > Sorry if I have misunderstood what you mean.
> > >> >
> > >> > Cheers,
> > >> > Pete
> > >> >
> > >> >> Pete Smith-3 wrote:
> > >> >> >
> > >> >> > Hi,
> > >> >> >
> > >> >> > I have managed to build an index in Solr which I can
> > >> >> > search on keyword, produce facets, query facets etc. This
> > >> >> > is all working great. I have implemented my search using a
> > >> >> > dismax query so it searches predetermined fields.
> > >> >> >
> > >> >> > However, my results are coming back sorted by score which
> > >> >> > appears to be calculated by keyword relevancy only. I
> > >> >> > would like to adjust the score where fields have
> > >> >> > pre-determined values. I think I can do this with boost
> > >> >> > query and boost functions but the documentation here:
> > >> >> >
> > >> >> > http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3
> > >> >> >
> > >> >> > Is not particularly helpful. I tried adding a bq argument
> > >> >> > to my search:
> > >> >> >
> > >> >> > &bq=media:DVD^2
> > >> >> >
> > >> >> > (yes, this is an index of films!) but I find when I start
> > >> >> > adding more and more:
> > >> >> >
> > >> >> > &bq=media:DVD^2&bq=media:BLU-RAY^1.5
> > >> >> >
> > >> >> > I find the negative results - e.g. films that are DVD but
> > >> >> > are not BLU-RAY get negatively affected in their score. In
> > >> >> > the end it all seems to even out and my score is as it was
> > >> >> > before I started boosting.
> > >> >> >
> > >> >> > I must be doing this wrong and I wonder whether "boost
> > >> >> > function" comes in somewhere. Any ideas on how to
> > >> >> > correctly use boost?
> > >> >> >
> > >> >> > Cheers,
> > >> >> > Pete
> > >> >> >
> > >> >> > --
> > >> >> > Pete Smith
> > >> >> > Developer
> > >> >> >
> > >> >> > No.9 | 6 Portal Way | London | W3 6RU |
> > >> >> > T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111
> > >> >> >
> > >> >> > LOVEFiLM.com

--
Pete Smith
Developer

No.9 | 6 Portal Way | London | W3 6RU |
T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111

LOVEFiLM.com
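
P.S. For anyone following the thread, here is a minimal sketch of the kind of request discussed above: the keyword goes in q, qf restricts matching to title, and each bq adds an optional clause that only affects scoring. The base URL and the helper name build_dismax_url are hypothetical, and it assumes the handler accepts defType=dismax; if your handler is already configured as dismax, that parameter can be dropped. It only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and path to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_dismax_url(keywords, boosts):
    """Build a dismax request URL (hypothetical helper, not part of Solr).

    keywords: the raw user query, sent as q and matched against title via qf.
    boosts:   list of (field:value, weight) pairs, each sent as its own bq.
    """
    params = [
        ("defType", "dismax"),  # assumption: dismax is selected per request
        ("q", keywords),
        ("qf", "title"),
    ]
    for clause, weight in boosts:
        # One bq parameter per boost clause, e.g. media:DVD^2
        params.append(("bq", f"{clause}^{weight}"))
    # urlencode percent-encodes ':' and '^' so the URL is safe to send as-is
    return SOLR_SELECT + "?" + urlencode(params)


url = build_dismax_url("indiana", [("media:DVD", 2), ("media:BLU-RAY", 1.5)])
print(url)
```

Adding debugQuery=on to the request should show how much each bq clause contributes to a document's score, which may help when checking whether the boosts are being drowned out by the title norms.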