If you have multi-word synonyms, you could set tokenizerFactory="solr.KeywordTokenizerFactory" in the SynonymFilterFactory declaration. This assumes that the tokenizer for that field also keeps each phrase as a single token (for example, solr.KeywordTokenizerFactory instead of the standard tokenizer). If it does not, the multi-word synonym rules may never match. See the configuration below:

<analyzer>
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.TrimFilterFactory" />
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.SynonymFilterFactory" tokenizerFactory="solr.KeywordTokenizerFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false" />
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>

Then you can use synonyms like

Barack Obama,Barak Obama,Barack H. Obama,Barack Hussein Obama, Barak Hussein Obama => Barack Obama

Ravi Kiran Bhaskar

On Thu, Aug 18, 2011 at 3:21 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> How about escaping white\ space?
>
> cheers
>
>> Hmmm, why doesn't the multi word synonym syntax in your
>> synonym.txt handle this case? Or am I missing something
>> totally?
>>
>> Best
>> Erick
>>
>> On Wed, Aug 17, 2011 at 10:02 PM, Will Milspec <will.mils...@gmail.com> wrote:
>> > Hi all,
>> >
>> > This may be obvious. My question pertains to use of tokenizerFactory
>> > together with SynonymFilterFactory. Which tokenizerFactory does one use
>> > to treat "synonyms with spaces" as one token,
>> >
>> > Example these two entries are synonyms: "lms", "learning management
>> > system"
>> >
>> > index time expansion would expand "lms" to these terms
>> > "lms"
>> > "learning management system"
>> >
>> > i.e. not like this:
>> > "lms"
>> > "learning"
>> > "management"
>> > "system"
>> >
>> > Excerpt from the wiki article:
>> >
>> > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>> > <quote>
>> > The optional *tokenizerFactory* parameter names a tokenizer factory class
>> > to analyze synonyms (see
>> > https://issues.apache.org/jira/browse/SOLR-319), which can help with the
>> > synonym+stemming problem described in
>> > http://search-lucene.com/m/hg9ri2mDvGk1 .
>> > </quote>
>> >
>> > thanks,
>> >
>> > will