Another option: assuming themes_raw is of type 'string' (I couldn't confirm that with 100% certainty), the difference between the 110 results for the fq on themes_raw and the 321 from your db could come down to case sensitivity. The fieldtype string (and thus themes_raw) is case-sensitive, while querying your db may be case-insensitive, depending on your db setup. That would also explain why your db returns the larger number of hits.
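If case does turn out to be the culprit, one possible way to get case-insensitive exact matching is a field type that keeps the whole value as a single token and lowercases it. A rough sketch only: the type name "string_ci" is made up for illustration and is not in your schema.xml, and switching themes_raw to it would require a full reindex.

```xml
<!-- Hypothetical type name; not part of the posted schema.xml -->
<fieldType name="string_ci" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- KeywordTokenizer keeps a value such as "Hotel en Restaurant" as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase at both index and query time so case no longer matters -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One side effect to keep in mind: facet values on such a field come back lowercased, since faceting returns the indexed tokens.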
Cheers,
Geert-Jan

2010/11/10 Jonathan Rochkind <rochk...@jhu.edu>

> I've had that sort of thing happen from 'corrupting' my index, by changing
> my schema.xml without re-indexing.
>
> If you change field types or other things in schema.xml, you need to
> reindex all your data. (You can add brand new fields or types without having
> to re-index, but most other changes will require a re-index).
>
> Could that be it?
>
>
> PeterKerk wrote:
>
>> LOL, very clever indeed ;)
>>
>> The thing is: when I select the amount of records matching the theme
>> 'Hotel en Restaurant' in my db, I end up with 321 records. So that is
>> correct. I don't know where the 370 is coming from.
>>
>> Now when I change the query to this: &fq=themes_raw:Hotel en Restaurant
>> I end up with 110 records... (another number even :s)
>>
>> What I did notice is that this only happens on multi-word facets, "Hotel
>> en Restaurant" being a 3-word facet. The facets work correctly on a facet
>> named "Cafe", so I suspect it has something to do with the tokenization.
>>
>> As you can see, I'm using "text" and "string".
>> For completeness, I'm posting the definitions of those in my schema.xml as well:
>>
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="index">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>
>>     <!-- in this example, we will only use synonyms at query time
>>     <filter class="solr.SynonymFilterFactory"
>>             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>>     -->
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords_dutch.txt"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.EnglishPorterFilterFactory"
>>             protected="protwords.txt"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             ignoreCase="true" expand="true"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords_dutch.txt"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.EnglishPorterFilterFactory"
>>             protected="protwords.txt"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>>
>> <fieldType name="string" class="solr.StrField" sortMissingLast="true"
>>            omitNorms="true" />
>>
>>
>