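On (2): k1 and b are the BM25 parameters. k1 controls how quickly the term-frequency contribution saturates, and b controls how strongly the score is normalized by field length (it shows as 0.0 in your output because norms are omitted for the field). Several articles discuss BM25 in depth: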
https://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/
https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables

On Tue, Jan 1, 2019 at 6:04 PM Lavanya Thirumalaisami <lav...@yahoo.co.in.invalid> wrote:

> Hi,
>
> I am trying to debug a query to find out why one document gets more score
> than the other. The below are two similar products.
>
> Below is the debug results I get from Solr admin console.
>
> "Doc1":
> 15.20965 = sum of:
>   4.7573533 = max of:
>     4.7573533 = weight(All:2x in 962) [], result of:
>       4.7573533 = score(doc=962,freq=2.0 = termFreq=2.0), product of:
>         3.4598935 = idf(docFreq=1346, docCount=42836)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>   10.452296 = max of:
>     5.9166136 = weight(All:powerpoint in 962) [], result of:
>       5.9166136 = score(doc=962,freq=2.0 = termFreq=2.0), product of:
>         4.302992 = idf(docFreq=579, docCount=42836)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>     10.452296 = weight(All:"socket outlet" in 962) [], result of:
>       10.452296 = score(doc=962,freq=2.0 = phraseFreq=2.0), product of:
>         7.60167 = idf(), sum of:
>           3.5370626 = idf(docFreq=1246, docCount=42836)
>           4.064607 = idf(docFreq=735, docCount=42836)
>         1.375 = tfNorm, computed from:
>           2.0 = phraseFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>
> "Doc15":
> 13.258003 = sum of:
>   5.7317085 = max of:
>     5.7317085 = weight(All:doubl in 2122) [], result of:
>       5.7317085 = score(doc=2122,freq=2.0 = termFreq=2.0), product of:
>         4.168515 = idf(docFreq=663, docCount=42874)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>     4.7657394 = weight(All:2x in 2122) [], result of:
>       4.7657394 = score(doc=2122,freq=2.0 = termFreq=2.0), product of:
>         3.4659925 = idf(docFreq=1339, docCount=42874)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>     5.390302 = weight(All:2g in 2122) [], result of:
>       5.390302 = score(doc=2122,freq=2.0 = termFreq=2.0), product of:
>         3.9202197 = idf(docFreq=850, docCount=42874)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>   7.526294 = max of:
>     5.8597584 = weight(All:powerpoint in 2122) [], result of:
>       5.8597584 = score(doc=2122,freq=2.0 = termFreq=2.0), product of:
>         4.2616425 = idf(docFreq=604, docCount=42874)
>         1.375 = tfNorm, computed from:
>           2.0 = termFreq=2.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>     7.526294 = weight(All:"socket outlet" in 2122) [], result of:
>       7.526294 = score(doc=2122,freq=1.0 = phraseFreq=1.0), product of:
>         7.526294 = idf(), sum of:
>           3.4955401 = idf(docFreq=1300, docCount=42874)
>           4.030754 = idf(docFreq=761, docCount=42874)
>         1.0 = tfNorm, computed from:
>           1.0 = phraseFreq=1.0
>           1.2 = parameter k1
>           0.0 = parameter b (norms omitted for field)
>
> My Questions
>
> 1. IDF: I understand from the Solr documents that IDF is calculated for
> each separate shard. I have added the following stats cache config to
> solrconfig.xml and reloaded the collection:
>
> <statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
>
> But even after that there is no change in the calculated IDF.
>
> 2. What are parameter b and parameter k1?
>
> 3. Why are there lots of parameters included in my Doc15 rather than
> Doc1?
>
> Is there any documentation I can refer to, to understand the Solr query
> calculations in depth?
>
> We are using Solr 6.1 in Cloud with 3 zookeepers and 3 masters and 3
> replicas.
>
> Regards,
> Lavanya

--
Doug Turnbull | CTO | OpenSource Connections <http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>
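
As a rough check on the explain output quoted above, the idf and tfNorm lines can be reproduced by hand. The sketch below assumes the formulas used by Lucene's BM25Similarity in the 6.x line (idf = ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)), and tfNorm = freq * (k1 + 1) / (freq + k1 * lengthNorm)); the function names bm25_idf and bm25_tf_norm are made up for illustration and are not part of Solr or Lucene.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """Assumed BM25 idf, as it appears to be computed in the explain output."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1=1.2, b=0.0, field_len=1.0, avg_field_len=1.0):
    """Assumed BM25 term-frequency normalization.

    With b = 0.0 (norms omitted for the field) the length term drops out,
    so field_len and avg_field_len have no effect on the result.
    """
    length_norm = 1 - b + b * field_len / avg_field_len
    return freq * (k1 + 1) / (freq + k1 * length_norm)


# Values taken from the "All:2x in 962" clause of Doc1.
idf = bm25_idf(doc_freq=1346, doc_count=42836)  # explain reports 3.4598935
tf_norm = bm25_tf_norm(freq=2.0)                # explain reports 1.375
print(round(idf, 4), round(tf_norm, 4), round(idf * tf_norm, 4))

# Phrase clause "socket outlet": Doc1 has phraseFreq=2.0, Doc15 has 1.0.
print(bm25_tf_norm(2.0), bm25_tf_norm(1.0))
```

If these assumptions hold, the phrase clause explains most of the gap between the two documents: Doc1 matches "socket outlet" twice (tfNorm 1.375, clause score 10.452296), while Doc15 matches it once (tfNorm 1.0, clause score 7.526294). The idf values differ only slightly between the two documents, because the docCount and docFreq reported for them differ slightly.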