I am indexing documents using the domain!id format, e.g. id = k-690kohler!670614.
This ensures that all k-690kohler documents are indexed to the same shard.
It does mean that numDocs is not evenly distributed across shards,
probably even less evenly than with the default sharding algorithm.

Here is the search on Solr Cloud
http://solrsolr/productindex/productQuery?q=categories_82_is:108996&bf=linear(popularity_82_i,1,2)^3&debugQuery=true

And on Solr 3.6
http://solr-2-build.sys.id.build.com:8080/solr-build/select?q.alt=categoryId:108996&qt=dismax&bf=linear(popularity,1,2)^3&debugQuery=true&fl=id,productID,manufacturer

Here is the debug output from Solr Cloud

<lst name="explain">
<str name="921rusticware!1210842">
48481.992 = (MATCH) sum of: 4.7323933 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 248779) [DefaultSimilarity], result
of: 4.7323933 = score(doc=248779,freq=1.0 = termFreq=1.0 ), product of:
0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181,
maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 248779,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 =
idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=248779) 48477.26 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99977.0 = 1.0*float(int(popularity_82_i)=99975)+2.0 3.0 = boost 0.16162805 =
queryNorm
</str>
<str name="4706baldwin!1223898">
48380.168 = (MATCH) sum of: 4.7323933 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 67238) [DefaultSimilarity], result
of: 4.7323933 = score(doc=67238,freq=1.0 = termFreq=1.0 ), product of:
0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181,
maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 67238,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 =
idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=67238) 48375.438 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99767.0 = 1.0*float(int(popularity_82_i)=99765)+2.0 3.0 = boost 0.16162805 =
queryNorm
</str>
<str name="yb5405moen!1748274">
48278.34 = (MATCH) sum of: 4.7323933 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 123982) [DefaultSimilarity], result
of: 4.7323933 = score(doc=123982,freq=1.0 = termFreq=1.0 ), product of:
0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181,
maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 123982,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 =
idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=123982) 48273.61 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99557.0 = 1.0*float(int(popularity_82_i)=99555)+2.0 3.0 = boost 0.16162805 =
queryNorm
</str>
<str name="bp53005amerock!1721790">
48262.008 = (MATCH) sum of: 4.7675867 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 108146) [DefaultSimilarity], result
of: 4.7675867 = score(doc=108146,freq=1.0 = termFreq=1.0 ), product of:
0.8758082 = queryWeight, product of: 5.4436426 = idf(docFreq=3131,
maxDocs=266484) 0.16088642 = queryNorm 5.4436426 = fieldWeight in 108146,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.4436426 =
idf(docFreq=3131, maxDocs=266484) 1.0 = fieldNorm(doc=108146) 48257.24 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99982.0 = 1.0*float(int(popularity_82_i)=99980)+2.0 3.0 = boost 0.16088642 =
queryNorm
</str>
<str name="bp29340amerock!1721865">
48208.918 = (MATCH) sum of: 4.7675867 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 108031) [DefaultSimilarity], result
of: 4.7675867 = score(doc=108031,freq=1.0 = termFreq=1.0 ), product of:
0.8758082 = queryWeight, product of: 5.4436426 = idf(docFreq=3131,
maxDocs=266484) 0.16088642 = queryNorm 5.4436426 = fieldWeight in 108031,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.4436426 =
idf(docFreq=3131, maxDocs=266484) 1.0 = fieldNorm(doc=108031) 48204.15 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99872.0 = 1.0*float(int(popularity_82_i)=99870)+2.0 3.0 = boost 0.16088642 =
queryNorm
</str>
<str name="bp53001amerock!1314101">
48176.516 = (MATCH) sum of: 4.7323933 = (MATCH)
weight(categories_82_is:`#8;#0;#6;SD in 47622) [DefaultSimilarity], result
of: 4.7323933 = score(doc=47622,freq=1.0 = termFreq=1.0 ), product of:
0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181,
maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 47622,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 =
idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=47622) 48171.785 =
(MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of:
99347.0 = 1.0*float(int(popularity_82_i)=99345)+2.0 3.0 = boost 0.16162805 =
queryNorm
</str>


And here is the debug output from Solr 3.6
<lst name="explain">
<str name="bp53005amerock">
15421.395 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 45538), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 45538), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=45538) 15419.735 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99982.0 =
1.0*float(int(popularity)=99980)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>
<str name="921rusticware">
15420.623 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 2394), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 2394), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=2394) 15418.964 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99977.0 =
1.0*float(int(popularity)=99975)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>
<str name="bp29340amerock">
15404.43 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 154688), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 154688), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=154688) 15402.7705 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99872.0 =
1.0*float(int(popularity)=99870)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>
<str name="4706baldwin">
15388.235 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 38679), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 38679), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=38679) 15386.576 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99767.0 =
1.0*float(int(popularity)=99765)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>
<str name="bp1586amerock">
15372.042 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 112748), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 112748), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=112748) 15370.383 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99662.0 =
1.0*float(int(popularity)=99660)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>
<str name="yb5405moen">
15355.849 = (MATCH) sum of: 1.6594616 = (MATCH)
weight(categoryId:`#8;#0;#6;SD in 3515), product of: 0.29207912 =
queryWeight(categoryId:`#8;#0;#6;SD), product of: 5.681548 =
idf(docFreq=4636, maxDocs=500504) 0.05140837 = queryNorm 5.681548 = (MATCH)
fieldWeight(categoryId:`#8;#0;#6;SD in 3515), product of: 1.0 =
tf(termFreq(categoryId:`#8;#0;#6;SD)=1) 5.681548 = idf(docFreq=4636,
maxDocs=500504) 1.0 = fieldNorm(field=categoryId, doc=3515) 15354.189 =
(MATCH) FunctionQuery(1.0*float(int(popularity))+2.0), product of: 99557.0 =
1.0*float(int(popularity)=99555)+2.0 3.0 = boost 0.05140837 = queryNorm
</str>


The problem was noticed when bp53005amerock didn't show in the first
position in Solr Cloud. The popularity values are the same, and this is just
a simple field search, so the TF should always be 1. The only discrepancy I
can see is in the IDF value: the maxDocs and docFreq values are different
per shard, which would account for the scoring differences between the two
indexes.
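
As a rough check on that reading, here is a minimal Python sketch that recomputes two of the Solr Cloud scores from the per-shard docFreq and maxDocs values in the debug output. It uses the classic Lucene TF-IDF formula idf = 1 + ln(maxDocs / (docFreq + 1)). The queryNorm formula 1 / sqrt(idf^2 + boost^2) is an assumption inferred from the numbers above, not something confirmed from the Solr source, and the function names are made up for illustration.

```python
import math


def idf(doc_freq, max_docs):
    # Classic Lucene TF-IDF (DefaultSimilarity): 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def shard_score(doc_freq, max_docs, popularity, boost=3.0):
    term_idf = idf(doc_freq, max_docs)
    # Assumption: sumOfSquaredWeights = idf^2 + boost^2 for this query,
    # chosen because it reproduces the queryNorm values in the debug output.
    query_norm = 1.0 / math.sqrt(term_idf ** 2 + boost ** 2)
    # queryWeight * fieldWeight, with tf = 1 and fieldNorm = 1
    term_score = (term_idf * query_norm) * term_idf
    # FunctionQuery linear(popularity,1,2)^3, scaled by queryNorm
    func_score = (1.0 * popularity + 2.0) * boost * query_norm
    return term_score + func_score


# Shard holding 921rusticware: docFreq=3181, maxDocs=262058, popularity=99975
shard_a = shard_score(3181, 262058, 99975)
# Shard holding bp53005amerock: docFreq=3131, maxDocs=266484, popularity=99980
shard_b = shard_score(3131, 266484, 99980)

print(shard_a, shard_b)
```

If the assumptions hold, the two printed values should land close to the 48481.992 and 48262.008 reported above. The higher IDF on the second shard lowers its queryNorm, and because queryNorm also multiplies the popularity boost, bp53005amerock ends up below 921rusticware despite having the higher popularity.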


