Thanks for your response. When I don't include the KeywordTokenizerFactory in 
the SynonymFilter definition, I get additional term values that I don't want.

e.g. synonyms.txt looks like:
simple syrup,sugar syrup,stock syrup

A document with a value containing 'simple syrup' can now be found when 
searching for just 'stock'.

So the problem I am trying to address with KeywordTokenizerFactory is to 
prevent my multi-word synonyms from being broken down into single words.
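
To illustrate the effect, here is a toy model in Python. It is only a sketch, 
not Solr code: the function names are made up for this example, and it only 
mimics how the entries on a synonyms.txt line are turned into terms when they 
are split on whitespace versus kept whole (as with KeywordTokenizerFactory). 
It assumes all synonyms on the line are expanded into the index.

```python
def parse_synonyms(line, keep_phrases_whole):
    """Split one synonyms.txt line into entries, then each entry into tokens.

    keep_phrases_whole=True models a keyword-style tokenizer (one token per
    entry); False models splitting each entry on whitespace.
    Hypothetical helper for illustration only.
    """
    entries = [entry.strip() for entry in line.split(",")]
    if keep_phrases_whole:
        return [[entry] for entry in entries]
    return [entry.split() for entry in entries]


def expanded_terms(line, keep_phrases_whole):
    """Collect every term that would be added when all synonyms are expanded."""
    terms = set()
    for tokens in parse_synonyms(line, keep_phrases_whole):
        terms.update(tokens)
    return terms


LINE = "simple syrup,sugar syrup,stock syrup"

# Entries split on whitespace: single words such as "stock" become terms.
split_terms = expanded_terms(LINE, keep_phrases_whole=False)

# Entries kept whole: only the full phrases become terms.
whole_terms = expanded_terms(LINE, keep_phrases_whole=True)

print(sorted(split_terms))
print(sorted(whole_terms))
```

In the first case "stock" ends up as a term of its own, which is why a search 
for just 'stock' finds the 'simple syrup' document. In the second case only 
the whole phrases are present.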

Thanks
Zac

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, February 05, 2012 8:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Multi word synonyms

I'm not quite sure what you're trying to do with KeywordTokenizerFactory in 
your SynonymFilter definition, but if I use the defaults, then the all-phrase 
form works just fine.

So the question is "what problem are you trying to address by using 
KeywordTokenizerFactory?"

Best
Erick

On Sun, Feb 5, 2012 at 8:21 AM, O. Klein <kl...@octoweb.nl> wrote:
> Your query analyzer will tokenize "simple syrup" into "simple" and "syrup"
> and won't match on "simple syrup" in synonyms.txt.
>
> So you have to change the query analyzer into KeywordTokenizerFactory 
> as well.
>
> It might be an idea to make a field for synonyms only with this tokenizer 
> and another field to search on, and use dismax. Never tried this though.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Multi-word-synonyms-tp3716292p3717215.html
> Sent from the Solr - User mailing list archive at Nabble.com.

