Re: Searching for Partial Words

2012-11-08 Thread Jack Krupansky
> Thanks Jack. In the configuration below: What are the possible values for "side"? If I understand it correctly, minGramSize=3 and side=front, will include eng* but not en*. Is t

Re: Searching for Partial Words

2012-11-08 Thread Sohail Aboobaker
Yes, that is true. We are looking for partial word matches. It seems like we can achieve this by using an edge ngram for prefixes and adding a wildcard at the end to ignore the suffix. If we set the edge ngram to 3, "eng" will match ResidentEng but not ResidentEngineer. But a search for "eng*" will matc

Re: Searching for Partial Words

2012-11-08 Thread Amit Nithian
Look at the normal ngram tokenizer. "Engine" with ngram size 3 would yield "eng", "ngi", "gin", "ine", so a search for "engi" should match. You can play around with the min/max values. Edge ngram is useful for prefix matching, but it sounds like you want intra-word matching too? ("eng" should match "Residen
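
To make the n-gram breakdown above concrete, here is a rough Python sketch of how fixed-size n-grams could be generated from a single token. It is not Solr code: the function name `ngrams` and its parameters are made up for illustration, and lowercasing is assumed to happen somewhere in the analysis chain.

```python
def ngrams(word, min_size=3, max_size=3):
    """Return every substring of `word` whose length is between
    min_size and max_size (hypothetical helper, not a Solr API)."""
    word = word.lower()  # assumption: a lowercase step runs in the analyzer
    grams = []
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` characters across the word.
        for start in range(len(word) - size + 1):
            grams.append(word[start:start + size])
    return grams


print(ngrams("Engine"))                     # ['eng', 'ngi', 'gin', 'ine']
print("eng" in ngrams("ResidentEngineer"))  # True: the gram occurs mid-word
```

Because every window position is emitted, a gram such as "eng" is produced even when it sits in the middle of a longer token, which is the intra-word matching described above. The cost is many more indexed terms per word than a prefix-only approach.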

Re: Searching for Partial Words

2012-11-06 Thread Sohail Aboobaker
Thanks Jack. In the configuration below: What are the possible values for "side"? If I understand it correctly, minGramSize=3 and side=front, will include eng* but not en*. Is this correct? So, the minGramSize is for number of characters allowed in the specified side. Does it

Re: Searching for Partial Words

2012-11-06 Thread Jack Krupansky
Add an "edge" n-gram filter (EdgeNGramFilterFactory) to your "index" analyzer. This will add all the prefixes of words to the index, so that a query of "engi" will be equivalent to but much faster than the wildcard engi*. You can specify a minimum size, such as 3 or 4 to eliminate tons of too-s
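
As a rough illustration of the prefix idea above, the following Python sketch mimics what a front-side edge n-gram filter would add to the index for one word. It is only a model of the behavior, not the Solr implementation: `edge_ngrams`, `min_size`, and `max_size` are hypothetical names (loosely mirroring minGramSize and a maximum gram size), and lowercasing is an assumption.

```python
def edge_ngrams(word, min_size=3, max_size=15):
    """Return the prefixes of `word` from min_size up to max_size
    characters (hypothetical helper, not a Solr API)."""
    word = word.lower()  # assumption: a lowercase step runs in the analyzer
    upper = min(max_size, len(word))
    return [word[:size] for size in range(min_size, upper + 1)]


# Terms that would be indexed for the word "Engineer".
index_terms = set(edge_ngrams("Engineer"))

print(sorted(index_terms))
print("engi" in index_terms)  # True: exact term lookup, no wildcard needed
print("en" in index_terms)    # False: shorter than the minimum size of 3
```

With the prefixes stored at index time, a query for "engi" becomes a plain term lookup, while anything shorter than the minimum size is simply never indexed.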