I've had that sort of thing happen from 'corrupting' my index, by
changing my schema.xml without re-indexing.
If you change field types or most other things in schema.xml, you need to
re-index all your data. (You can add brand-new fields or types without
having to re-index, but most other changes will require a re-index.)
Could that be it?
PeterKerk wrote:
LOL, very clever indeed ;)
The thing is: when I count the records matching the theme 'Hotel
en Restaurant' in my db, I end up with 321 records. So that is correct. I
don't know where the 370 is coming from.
Now when I change the query to this: &fq=themes_raw:Hotel en Restaurant
I end up with 110 records...(another number even :s)
What I did notice is that this only happens with multi-word facets, "Hotel en
Restaurant" being a three-word facet. The facets work correctly for a single-word
facet such as "Cafe", so I suspect it has something to do with the tokenization.
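One thing that may be worth checking, offered as a minimal sketch and not as a confirmed diagnosis: with the standard Lucene query parser, an unquoted value such as themes_raw:Hotel en Restaurant is split on whitespace, so only "Hotel" is matched against themes_raw and the remaining words go to the default search field. Wrapping the value in double quotes sends it as a single term. The helper name phrase_filter and the q=*:* placeholder query below are assumptions for illustration, not part of the original setup.

```python
from urllib.parse import urlencode


def phrase_filter(field: str, value: str) -> str:
    """Build an fq clause that sends `value` as one quoted phrase."""
    # Escape backslashes and double quotes so the value stays inside the quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Hypothetical request parameters; q=*:* is only a placeholder main query.
fq = phrase_filter("themes_raw", "Hotel en Restaurant")
params = urlencode({"q": "*:*", "fq": fq})

print(fq)      # the raw filter clause
print(params)  # the URL-encoded query string to append to the select URL
```

If the count for the quoted form matches the 321 records in the db, the difference was probably in how the query was parsed and not in the index itself.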
As you can see, I'm using "text" and "string".
For completeness, I'm posting the definitions of those from my schema.xml as well:
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="string" class="solr.StrField" sortMissingLast="true"
           omitNorms="true" />