I have a site which lists companies.
I'm looking to improve my search, and I want to know which of the available
analyzers, tokenizers and filters I should use for the scenario below, and
whether it is possible at all.
I want users to be able to search on the company title, for example for a
company called "The Royal Garden".
The logic for this search should be as follows: "The Royal Garden" should
be found by these queries:
"the royal garden"
"royal garden"
"the roy"
"The royal"
"RoYAl"
"garden"
So: case-insensitive, and matching on parts of words.
However, a query "the royal" should not return companies like:
"the wall"
"the room"
"the restaurant"
So words like "the", but also "a", should be ignored if they are the only
match in the search query.
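To make the intended behaviour concrete, the rules above could be written as the following Python sketch. It is only a specification of the matching I'm after, not Solr configuration. The names STOPWORDS, significant_terms and matches are made up for illustration, the stopword list contains only the two words mentioned above, and reading "parts of words" as prefix matching is an assumption based on the "the roy" example. Rejecting a query that consists only of stopwords is also an assumption.

```python
# Assumed stopword list; only "the" and "a" are named in the requirements.
STOPWORDS = {"the", "a"}


def significant_terms(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def matches(query, title):
    """Return True if every non-stopword query term is a prefix of some title word."""
    terms = significant_terms(query)
    words = significant_terms(title)
    if not terms:
        # Assumption: a query made up only of stopwords matches nothing.
        return False
    return all(any(word.startswith(term) for word in words) for term in terms)


if __name__ == "__main__":
    title = "The Royal Garden"
    for query in ["the royal garden", "royal garden", "the roy",
                  "The royal", "RoYAl", "garden"]:
        print(query, "->", matches(query, title))
    for other in ["the wall", "the room", "the restaurant"]:
        print("the royal", "vs", other, "->", matches("the royal", other))
```

Under these assumptions, all six example queries match "The Royal Garden", and "the royal" matches none of the three unwanted titles, because "the" is dropped before comparison.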
I now have this:
<fieldType name="searchtext" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<field name="title_search" type="searchtext" indexed="true"
       stored="true"/>
I'm testing on http://localhost:8983/solr/#/bm/analysis but I'm stuck.
Also, I would think my scenario is pretty common and that lots of users have
already configured their Solr search to be flexible and powerful, so any good
search configurations would be welcome!