Hi,

You need an analyzer that produces these five tokens for your example:

john....@gmail.com => john doe @ gmail com

If you set autoGeneratePhraseQueries="true", all three of your requirements will be satisfied. Don't use quotes in your query: use q=@gmail.com, not q="@gmail.com".

I would go with a custom tokenizer in your case, but it can be simulated using MappingCharFilter with WhitespaceTokenizer and these two mappings (in mapping.txt):

"." => " "
"@" => " @ "

<!-- charFilter + WhitespaceTokenizer -->
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

--- On Wed, 3/13/13, adfel70 <adfe...@gmail.com> wrote:

> From: adfel70 <adfe...@gmail.com>
> Subject: Re: searching exact phrase with stop word returns bad results
> To: solr-user@lucene.apache.org
> Date: Wednesday, March 13, 2013, 11:54 AM
>
> Am I the first needing this behaviour?
> Have you seen any set of tokenizer-filters for a similar requirement?
>
> Upayavira wrote
> > Exact phrase search isn't exact phrase search as you are thinking of it.
> > A phrase search for "foo bar" searches for the terms foo and bar, and
> > then checks whether they are one position apart. If punctuation has been
> > removed during analysis, it *cannot* play a part in a search of any kind.
> >
> > You may be able to achieve what you want with a PatternTokenizer rather
> > than whitespace and removing the WordDelimiterFilterFactory.
> >
> > Upayavira
> >
> > On Wed, Mar 13, 2013, at 08:41 AM, adfel70 wrote:
> >> I want the following behaviour.
> >> If "john.doe@" is indexed to the field:
> >> 1. searching 'john' or 'doe' or 'gmail.com' will retrieve the doc.
> >> 2. searching '"@gmail.com"' will retrieve the doc.
> >> 3. searching '"gmail.com@"' will not retrieve the doc.
> >>
> >> I can accomplish all of these except 3,
> >> because the word delimiter removes '@'. When I search "@gmail.com" or
> >> "gmail.com@", it's like searching "gmail.com", which causes unwanted
> >> results.
> >> This is an exact phrase search, so I would expect only docs with the
> >> exact phrase I search (including punctuation) to be retrieved.
> >>
> >> How can I achieve this?
> >>
> >> Thanks.
> >>
> >> Jack Krupansky-2 wrote
> >> > The Word Delimiter Filter will remove all punctuation characters. That
> >> > is its function.
> >> >
> >> > Maybe you should first describe in simple English what your token/term
> >> > rules are, and then it would be more clear what tokenizer and filters
> >> > would be most appropriate.
> >> >
> >> > -- Jack Krupansky
> >> >
> >> > -----Original Message-----
> >> > From: adfel70
> >> > Sent: Tuesday, March 12, 2013 3:14 AM
> >> > To: solr-user@.apache
> >> > Subject: Re: searching exact phrase with stop word returns bad results
> >> >
> >> > I see that there is no token with @.
> >> > The question is why.
> >> > This is my field type:
> >> >
> >> > <fieldtype name="email_type" class="solr.TextField"
> >> >     positionIncrementGap="100" autoGeneratePhraseQueries="false"
> >> >     omitNorms="true">
> >> >   <analyzer>
> >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >     <filter class="solr.LowerCaseFilterFactory"/>
> >> >     <filter class="solr.WordDelimiterFilterFactory"
> >> >         preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
> >> >         catenateWords="0" catenateNumbers="0" catenateAll="0"
> >> >         splitOnCaseChange="0"/>
> >> >   </analyzer>
> >> > </fieldtype>
> >> >
> >> > Any idea?
> >> >
> >> > Erick Erickson wrote
> >> >> Take a look at admin/analysis for the field in question, feed it values
> >> >> and see how they are tokenized. My guess is that the token in the index
> >> >> is abc@ (single token), which of course won't match the fragment
> >> >> "@ gmail.com" (assuming gmail.com@ is a typo)...
> >> >>
> >> >> Best
> >> >> Erick
> >> >>
> >> >> On Wed, Mar 6, 2013 at 5:43 AM, adfel70 <adfel70@> wrote:
> >> >>
> >> >>> Hi
> >> >>>
> >> >>> I have emails indexed with the default text_general fieldType.
> >> >>>
> >> >>> I find that if the email "abc@" is indexed, and I search for
> >> >>> "gmail.com@" (exact phrase search), I get a result, while I should not
> >> >>> get one.
> >> >>>
> >> >>> Any idea how to solve this?
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> --
> >> >>> View this message in context:
> >> >>> http://lucene.472066.n3.nabble.com/searching-exact-phrase-with-stop-word-returns-bad-results-tp4045180.html
> >> >>> Sent from the Solr - User mailing list archive at Nabble.com.
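
For illustration, the analysis chain suggested in the reply at the top of this message (a character mapping followed by whitespace tokenization) can be approximated outside Solr with a small Python sketch. This is not Solr code: `analyze` and `phrase_match` are hypothetical helpers, the sample address john.doe@gmail.com is an assumption inferred from the token list in the reply, and the adjacency check is only a simplified stand-in for how a phrase query compares token positions.

```python
# Hypothetical stand-in for the two mapping rules: "." => " " and "@" => " @ "
CHAR_MAP = {".": " ", "@": " @ "}


def analyze(text):
    """Apply the character mapping, then split on whitespace."""
    mapped = "".join(CHAR_MAP.get(ch, ch) for ch in text)
    return mapped.split()


def phrase_match(doc_tokens, query_tokens):
    """Return True if query_tokens appear at consecutive positions in doc_tokens."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


# Assumed sample address; the original is obfuscated in the archive.
doc = analyze("john.doe@gmail.com")

for query in ["john", "doe", "gmail.com", "@gmail.com", "gmail.com@"]:
    print(query, "->", phrase_match(doc, analyze(query)))
```

Under these assumptions the first four queries match and "gmail.com@" does not, because "@" is kept as its own token and sits before "gmail" in the indexed sequence rather than after "com".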