Whoa! My first bit of advice is to spend some time getting familiar with the admin>>analysis page, because I suspect you're not doing what you expect.
1> KeywordTokenizer does NOT break up the input stream, so an input of "sony xperia c price" gets tokenized as the single token "sony xperia c price", NOT the words "sony", "xperia", "c" and "price".

2> You use PatternReplace to remove the punctuation etc.

3> You use EdgeNGrams to create tokens like s, so, son, sony. But then you do NOT use EdgeNGrams in your query section. So your queries are probably not very robust. The NGrams are why your matching is odd.

At the end of all this, you have a single string that gets n-grammed, then an additional PatternReplace is done. I don't think, for instance, that you will be able to search for "xperia" and get a hit. I rather doubt that's what you want, but you know better than me.

So it looks to me like you started out using KeywordTokenizer and then added a bunch of filters to try to make your results what you expect. It's possible that the decision to use KeywordTokenizer led you down an overly-complex path.

I'd start with one of the other tokenizers that breaks things up on input, e.g. StandardTokenizer, WhitespaceTokenizer, etc., and build up the analysis chain (e.g. Filters) again, although I notice you have some CJK characters in your PatternReplace, so whitespace may not be suitable. If you are analyzing CJK text, there are tokenizers built for that.

All that said, you know your problem space waaaay better than me, so this may all be complete nonsense.....

Best,
Erick

On Sun, Feb 9, 2014 at 9:17 AM, kumar <pavan2...@gmail.com> wrote:
> Hi,
>
> Whenever user types the search query like
>
> "sony xperia c" it has to match the results like
>
> sony xperia c price
> sony xperia c reviews
> sony xperia c photos
>
> but my search query displays
>
> Sony xperia act mobiles
> sony xperia ace mobiles
> sony xperia abc mobiles
>
> Can anybody help me how to do it.
>
> My schema is like the following....
>
> <field name="my_title" type="text_full" indexed="true" stored="false"
>        multiValued="true" omitNorms="true" omitTermFreqAndPositions="true" />
>
> <fieldType name="text_full" class="solr.TextField">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement=" " replace="all"/>
>     <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*繥ǘŠ])" replacement="" replace="all"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement=" " replace="all"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*繥ǘŠ])" replacement="" replace="all"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
>     <filter class="solr.SynonymFilterFactory" ignoreCase="true" synonyms="synonyms_fsw.txt" expand="true" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Exact-matches-tp4116340.html
> Sent from the Solr - User mailing list archive at Nabble.com.
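
To make the point in the reply concrete, here is a rough Python sketch that imitates the index-time chain from the quoted schema (KeywordTokenizer, lowercase, PatternReplace, EdgeNGram, PatternReplace) and compares it with a chain that splits on whitespace first. It is only an approximation for building intuition, not Solr itself: the function names are made up for this example, the char filter and stop filter are left out, the first pattern is simplified to treat "-" as a literal character, and the garbled non-ASCII characters in the second pattern are omitted. The admin>>analysis page remains the authoritative way to see what the real chain produces.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=30):
    """Return the leading prefixes of a token, like an edge n-gram filter."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def keyword_style_chain(text):
    """Approximate the quoted index analyzer (hypothetical helper).

    The whole input stays a single token, as with KeywordTokenizer.
    """
    token = text.lower()
    # Simplified stand-in for pattern="([\.,;:-_])" (assumes '-' is literal).
    token = re.sub(r"[.,;:\-_]", " ", token)
    grams = edge_ngrams(token)
    # Simplified stand-in for the second PatternReplace: drop anything that
    # is not a word character or '*', which also removes the spaces.
    return [re.sub(r"[^\w*]", "", g) for g in grams]


def per_word_chain(text):
    """Approximate a whitespace-splitting tokenizer followed by edge n-grams."""
    grams = []
    for word in text.lower().split():
        grams.extend(edge_ngrams(word))
    return grams


if __name__ == "__main__":
    title = "sony xperia c price"
    single = keyword_style_chain(title)
    split = per_word_chain(title)
    print("keyword-style tokens:", single)
    print("per-word tokens:     ", split)
    print("'xperia' indexed (keyword-style)?", "xperia" in single)
    print("'xperia' indexed (per-word)?     ", "xperia" in split)
```

In this sketch the keyword-style chain only ever emits prefixes of the whole title (ending in "sonyxperiacprice"), so "xperia" on its own is never indexed, while the per-word chain does index it.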