Re: about analyzer and tokenizer

2014-05-26 Thread rachun
Thank you very much for your suggestions, both of you. I will experiment further to figure out which approach best matches my case. Chun. -- View this message in context: http://lucene.472066.n3.nabble.com/about-analyzer-and-tokenizer-tp4138129p4138227.html Sent from the Solr - User mailing list archive

Re: about analyzer and tokenizer

2014-05-26 Thread Jack Krupansky
Unfortunately, Solr and Lucene do not provide a truly clean out-of-the-box solution for this obvious use case, but you can approximate it by using index-time synonyms, so that "mac book" will also be indexed as "macbook" and "macbook" will also be indexed as "mac book". Your synonyms.txt file would contain the corresponding mapping.
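
A minimal sketch of what Jack describes, assuming a Solr 4.x-era schema. The field type name `text_syn` is illustrative; the synonym mapping shown in the comment is the "mac book"/"macbook" pair from the message, and `expand="true"` makes the mapping bidirectional at index time:

```xml
<!-- schema.xml fragment (hypothetical field type name) -->
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- synonyms.txt would contain a line such as:
         mac book, macbook
         With expand="true", each variant is indexed as the other as well. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query-time analysis deliberately omits the synonym filter,
       since the expansion already happened at index time. -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Applying synonyms only at index time (rather than query time) avoids the classic multi-word synonym query-parsing pitfalls, at the cost of requiring a re-index when synonyms.txt changes.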

Re: about analyzer and tokenizer

2014-05-26 Thread Dmitry Kan
Hi Chun, You can use the edge ngram filter [1] on your tokens; it will produce all possible letter sequences within a certain (configurable) length range, e.g. ma, ac, bo, ok, mac, acb, boo, ook, book, etc. Then, when querying, both "mac" and "book" should hit in the sequence, and you should get the "macbook" hit.
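
A sketch of the kind of field type Dmitry is suggesting. Note that the example tokens in his message (ac, bo, ok, ...) include interior sequences, which is what `solr.NGramFilterFactory` produces (the edge variant, `solr.EdgeNGramFilterFactory`, emits only prefixes), so the sketch below uses the plain ngram filter; the field type name and the 2-4 gram range are illustrative assumptions:

```xml
<!-- schema.xml fragment (hypothetical field type name) -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "macbook" indexed with 2..4-grams yields ma, ac, cb, bo, oo, ok,
         mac, acb, ..., book, so queries for "mac" and "book" both match. -->
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="4"/>
  </analyzer>
  <!-- Query side is typically left un-grammed so the user's terms are
       matched as-is against the indexed grams. -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off versus the synonym approach is index size and some false positives (any 2-4 letter subsequence matches), but it needs no curated synonyms.txt.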