Hi Tom,

This is great. It should make it into the documentation.

Regards,
Aloke

On Thu, Dec 13, 2012 at 1:23 PM, Burgmans, Tom <tom.burgm...@wolterskluwer.com> wrote:

> I am also busy getting this clear. Here are my notes so far (by
> copying and writing myself):
>
> queryWeight = the impact of the query against the field
>     implementation: boost(query)*idf*queryNorm
>
> boost(query) = boost of the field at query-time
>     Implication: hits in fields with higher boost get a higher score
>     Rationale: a term in field A could be more relevant than the
>       same term in field B
>
> idf = inverse document frequency = measure of how often the term
>   appears across the index for this field
>     implementation: log(numDocs/(docFreq+1))+1
>     Implication: the more documents a term occurs in, the lower
>       its score
>     Rationale: common terms are less important than uncommon ones
>     numDocs = the total number of documents in the index, not
>       including those that are marked as deleted but have not yet
>       been purged. This is a constant (the same value for all
>       documents in the index).
>     docFreq = the number of documents in the index which contain
>       the term in this field. This is a constant (the same value
>       for all documents in the index containing this field).
>
> queryNorm = normalization factor so that queries can be compared
>     implementation: 1/sqrt(sumOfSquaredWeights)
>     Implication: doesn't impact the relevancy of this result
>     Rationale: queryNorm is not related to the relevance of the
>       document, but rather tries to make scores between different
>       queries comparable.
>     This value is equal for all results of the query.
>
> fieldWeight = the score of a term matching the field
>     implementation: tf*idf*fieldNorm
>
> tf = term frequency in a field = measure of how often a term
>   appears in the field
>     implementation: sqrt(freq)
>     Implication: the more frequently a term occurs in a field, the
>       greater its score
>     Rationale: fields that contain more occurrences of a term are
>       generally more relevant
>     freq = termFreq = number of times the term occurs in the field
>       for this document
>
> fieldNorm = impact of a hit in this field
>     implementation: lengthNorm*boost(index)
>     lengthNorm = measure of the importance of a term according to
>       the total number of terms in the field
>         implementation: 1/sqrt(numTerms)
>         Implication: a term matched in a field with fewer terms
>           gets a higher score
>         Rationale: a term in a field with fewer terms is more
>           important than one in a field with more
>         numTerms = number of terms in a field
>     boost(index) = boost of the field at index-time
>         Implication: hits in fields with higher boost get a higher
>           score
>         Rationale: a term in field A could be more relevant than
>           the same term in field B
>
> maxDocs = the number of documents in the index, including those
>   that are marked as deleted but have not yet been purged. This is
>   a constant (the same value for all documents in the index).
>     Implication: (probably) doesn't play a role in the scoring
>       calculation
>
> coord = fraction of the terms in the query that were found in the
>   document (omitted if equal to 1)
>     implementation: overlap/maxOverlap
>     Implication: of the terms in the query, a document that
>       contains more terms will have a higher score
>     Rationale: documents that match the most optional terms score
>       highest
>     overlap = the number of query terms matched in the document
>     maxOverlap = the total number of terms in the query
>
> FunctionQuery = could be any kind of custom ranking function, whose
>   outcome is added to, or multiplied with, the default rank score.
>     Implication: various
>
> Look at the EXPLAIN information to see how the final score is calculated.
>
> Tom
>
> -----Original Message-----
> From: Sangeetha [mailto:sangeetha...@gmail.com]
> Sent: Thursday 13 December 2012 08:33
> To: solr-user@lucene.apache.org
> Subject: score calculation
>
> I want to know how the score is calculated.
>
> What are fieldWeight, fieldNorm, queryWeight and queryNorm? And what is
> the formula to get the final score using fieldWeight, fieldNorm,
> queryWeight, queryNorm, idf and tf?
>
> Can anyone explain or provide some links?
>
> Thanks,
> Sangeetha
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/score-calculation-tp4026669.html
> Sent from the Solr - User mailing list archive at Nabble.com.
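
To make the arithmetic in the notes concrete, here is a minimal Python sketch of how the pieces could fit together. It is not Lucene's code: all function and parameter names are made up for illustration, and it assumes the classic TF-IDF layout that EXPLAIN prints, where each matching term contributes queryWeight * fieldWeight and the per-term contributions are summed and multiplied by coord. Real scores will differ slightly because Lucene stores fieldNorm in a lossy single-byte encoding. Also note that EXPLAIN labels the document count fed into idf as maxDocs, so it is worth checking which count your version actually uses.

```python
import math


def idf(num_docs, doc_freq):
    # log(numDocs/(docFreq+1)) + 1, natural logarithm
    return math.log(num_docs / (doc_freq + 1)) + 1


def tf(freq):
    # sqrt(freq)
    return math.sqrt(freq)


def length_norm(num_terms):
    # 1/sqrt(numTerms)
    return 1 / math.sqrt(num_terms)


def query_norm(sum_of_squared_weights):
    # 1/sqrt(sumOfSquaredWeights)
    return 1 / math.sqrt(sum_of_squared_weights)


def term_score(freq, num_terms, num_docs, doc_freq,
               query_boost=1.0, index_boost=1.0, qnorm=1.0):
    """Contribution of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(num_docs, doc_freq)
    query_weight = query_boost * term_idf * qnorm        # boost(query)*idf*queryNorm
    field_norm = length_norm(num_terms) * index_boost    # lengthNorm*boost(index)
    field_weight = tf(freq) * term_idf * field_norm      # tf*idf*fieldNorm
    return query_weight * field_weight


def coord(overlap, max_overlap):
    # overlap/maxOverlap
    return overlap / max_overlap


def document_score(term_scores, max_overlap):
    """Sum the matching terms' contributions and scale by coord."""
    return coord(len(term_scores), max_overlap) * sum(term_scores)


if __name__ == "__main__":
    # Hypothetical numbers: a two-term query where only one term matches.
    matched = [term_score(freq=4, num_terms=4, num_docs=10, doc_freq=1)]
    print(document_score(matched, max_overlap=2))
```

Comparing the intermediate values (query_weight, field_weight, coord) with the matching entries in the EXPLAIN output for a real query is a quick way to see where a hand calculation and Lucene diverge.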