Sorry, I was too cryptic.

If you follow this link

  http://projecte01.development.barcelonamedia.org/fonetic/

you will see a "Top Words" list (in Spanish, and stemmed). The list includes the word "si", which appears in 20649 documents. If you click on this word, the system performs the query (x) content:si and returns no results at all. The same happens with "la": it appears in 17881 documents, but the query "content:la" returns no results either.

The facet list is generated by this query:

  http://projecte01.development.barcelonamedia.org/solr/select/?&rows=0&start=0&q=*:*&facet=true&facet.limit=-1&facet.field=content&facet.field=entities_misc&wt=json&json.wrf=jsonp1246437157825&jsoncallback=jsonp1246437157825&_=1246437158023

The question is: why are these two words (among others) in the list if they are stop words?

To see what is going on in the index, I tested with the analyzer at

  http://projecte01.development.barcelonamedia.org/solr/admin/analysis.jsp

If I select the field content and enter the text

  las cosas que si no pasan la proxima vez si que no veràs

I get the following tokens at the end of the analyzer:

  las cosa pasan proxima vez sí verà

so que, si, no and la are removed because they are treated as stop words.

But in the schema browser

  http://projecte01.development.barcelonamedia.org/solr/admin/schema.jsp

for the field content, "que" is the 3rd term, "no" is the 4th, and "si" and "la" are among the top 40 terms.

The analyzer for the content field can be seen on that page. It is configured as follows:

Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
  1. org.apache.solr.analysis.StopFilterFactory args:{enablePositionIncrements: true words: stopwords.txt ignoreCase: true }
  2. org.apache.solr.analysis.WordDelimiterFilterFactory args:{catenateWords: 1 catenateNumbers: 1 splitOnCaseChange: 1 catenateAll: 0 generateNumberParts: 1 generateWordParts: 1 }
  3. org.apache.solr.analysis.LowerCaseFilterFactory args:{}
  4. org.apache.solr.analysis.SnowballPorterFilterFactory args:{languange: Spanish }
  5. org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}

The field is indexed, tokenized and stored, and term vectors are stored.

So why are the stop words in the index?
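
In case it helps anyone reproduce the mismatch, here is a rough Python sketch of the check I have been doing by hand: take the facet counts for a field and list the terms that should have been removed as stop words but still have a non-zero count. It rests on several assumptions. The function names are made up for illustration, the stop-word set is only the four words I saw the analyzer drop (not the real contents of stopwords.txt), the facet response is assumed to use Solr's default flat JSON layout (term, count, term, count, ...), and the count for "cosa" in the sample is invented. It runs against a canned sample response rather than the live server.

```python
import json
from urllib.parse import urlencode

# Select handler from the post; the helper names below are hypothetical.
SOLR_SELECT = "http://projecte01.development.barcelonamedia.org/solr/select/"


def facet_url(field, limit=40):
    """Build a facet-only query (rows=0, so no documents) for one field."""
    params = [
        ("q", "*:*"),
        ("rows", 0),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def parse_facet_counts(response, field):
    """Turn the flat [term, count, term, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def stopwords_in_facets(counts, stopwords):
    """Return the terms that are stop words but still have a facet count."""
    return {term: n for term, n in counts.items()
            if term.lower() in stopwords and n > 0}


# Assumption: only the words the analyzer page removed, not the real file.
STOPWORDS = {"que", "si", "no", "la"}

# Canned response: "si" and "la" counts are from the post, "cosa" is invented.
SAMPLE = """
{"facet_counts": {"facet_fields": {"content":
    ["si", 20649, "la", 17881, "cosa", 1]}}}
"""

counts = parse_facet_counts(json.loads(SAMPLE), "content")
suspects = stopwords_in_facets(counts, STOPWORDS)
print(suspects)  # {'si': 20649, 'la': 17881}
```

To run it against the real index, the JSON fetched from facet_url("content") would replace SAMPLE. Any term it reports is one that is present in the indexed terms even though the analyzer page shows it being removed.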