: Sorry for leaving the Solr version out in my previous email, I'm using
: Solr 4.10.3 running on Centos7, with the following JRE: Oracle
: Corporation OpenJDK 64-Bit Server VM (1.7.0_75 24.75-b04)

I can't reproduce using Solr 4.10.3 (or 4.10.4 - misread your email the
first time).

Are you certain you didn't *build* this index with a different Similarity
configured? Or did you perhaps build it with an older version of Solr that
might have had a bug in it?

Here's what i tried...

applied this patch to the example configs based on the fieldType you
specified...

hossman@tray:~/lucene/lucene_solr_4_10_3_tag$ svn diff
Index: solr/example/solr/collection1/conf/schema.xml
===================================================================
--- solr/example/solr/collection1/conf/schema.xml (revision 1679472)
+++ solr/example/solr/collection1/conf/schema.xml (working copy)
@@ -46,6 +46,21 @@
 -->
 
 <schema name="example" version="1.5">
+
+  <fieldType name="hoss_type" class="solr.TextField" sortMissingLast="true">
+    <analyzer>
+      <charFilter class="solr.HTMLStripCharFilterFactory"/>
+      <tokenizer class="solr.StandardTokenizerFactory"/>
+      <filter class="solr.ASCIIFoldingFilterFactory"/>
+      <filter class="solr.StopFilterFactory"
+              ignoreCase="true" words="stopwords.txt"/>
+      <filter class="solr.LowerCaseFilterFactory"/>
+      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
+    </analyzer>
+  </fieldType>
+
+  <field name="hoss_test" type="hoss_type" stored="true" indexed="true" multiValued="true"/>
+
   <!-- attribute "name" is the name of this schema and is only used for display purposes.
        version="x.y" is Solr's version number for the schema syntax and
        semantics. It should not normally be changed by applications.

...started up "java -jar start.jar" and then wrote & ran this script to
generate a doc with the number of unique terms you mentioned in my test
field, and indexed it...

hossman@tray:~/tmp$ cat make-big-field.pl
#!/usr/bin/perl

print qq{<add><doc><field name="id">hoss</field><field name="hoss_test">\n};
# emit 119669 distinct terms: term1 term2 ... term119669
for (1..119669) {
  print "term${_} ";
}
print qq{</field></doc></add>\n};
hossman@tray:~/tmp$ perl make-big-field.pl > tmp.xml
hossman@tray:~/tmp$ curl -X POST -H 'Content-Type: application/xml' --data-binary @tmp.xml "http://localhost:8983/solr/collection1/update?commit=true"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">877</int></lst>
</response>

Then confirmed i got a very small fieldNorm when querying against this
field...

hossman@tray:~/tmp$ curl 'http://localhost:8983/solr/collection1/select?q=hoss_test:term1&debug=results&wt=json&indent=true&fl=id&omitHeader=true'
{
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"hoss"}]
  },
  "debug":{
    "explain":{
      "hoss":"\n7.491524E-4 = (MATCH) weight(hoss_test:term1 in 0) [DefaultSimilarity], result of:\n 7.491524E-4 = fieldWeight in 0, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 0.30685282 = idf(docFreq=1, maxDocs=1)\n 0.0024414062 = fieldNorm(doc=0)\n"}}}


-Hoss
http://www.lucidworks.com/
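
For anyone who wants to sanity check the numbers in that explain output,
here is a rough Python sketch of the arithmetic. It is not Lucene code. It
assumes the usual DefaultSimilarity formulas (tf = sqrt(freq),
idf = 1 + ln(maxDocs / (docFreq + 1)), norm = 1 / sqrt(numTerms) with no
index-time boost), and it assumes the norm is stored in a single byte using
a lossy encoding modeled on Lucene's SmallFloat.floatToByte315. The Python
function and variable names are made up for illustration.

```python
import math
import struct

def float_to_bits(f):
    # Raw IEEE-754 single-precision bits as a signed 32-bit int.
    return struct.unpack(">i", struct.pack(">f", f))[0]

def bits_to_float(bits):
    return struct.unpack(">f", struct.pack(">i", bits))[0]

def float_to_byte315(f):
    # Assumed port of the one-byte norm encoding (zero exponent 15).
    bits = float_to_bits(f)
    small = bits >> (24 - 3)
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1   # underflow: zero or smallest value
    if small >= fzero + 0x100:
        return 255                     # overflow: largest value
    return small - fzero

def byte315_to_float(b):
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return bits_to_float(bits)

# Values taken from the test above: one doc, 119669 terms, one match.
num_terms = 119669
doc_freq = 1
max_docs = 1
term_freq = 1.0

raw_norm = 1.0 / math.sqrt(num_terms)            # before byte encoding
field_norm = byte315_to_float(float_to_byte315(raw_norm))
tf = math.sqrt(term_freq)
idf = 1.0 + math.log(max_docs / (doc_freq + 1))
score = tf * idf * field_norm

print("raw norm   =", raw_norm)
print("fieldNorm  =", field_norm)
print("idf        =", idf)
print("score      =", score)
```

Under those assumptions the raw norm of roughly 0.00289 gets rounded down by
the one-byte encoding to the 0.0024414062 fieldNorm shown in the explain
output, and multiplying by the idf gives the 7.491524E-4 score. If an index
reports a very different fieldNorm for a field of this length, that points
toward a different Similarity or a different norm encoding at index time.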