The Word Delimiter Filter splits tokens on punctuation characters and
removes them. That is its function.
Maybe you should first describe in plain English what your token/term rules
are; then it would be clearer which tokenizer and filters would be most
appropriate.
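As a rough illustration of the splitting described above, here is a minimal
Python sketch that approximates the whitespace tokenizer, lowercase filter,
and word delimiter filter (with preserveOriginal enabled) from the field type
quoted below. It is not the Lucene implementation: the function names are
made up for this example, the address user@example.com is hypothetical, only
ASCII letters and digits are treated as word characters, and splitting at
letter/digit boundaries is left out.

```python
import re


def whitespace_tokenize(text):
    """Split on runs of whitespace, like a whitespace tokenizer."""
    return text.split()


def word_delimiter(token, preserve_original=True):
    """Simplified stand-in for a word delimiter filter (an assumption,
    not Lucene's code): split on runs of non-alphanumeric ASCII
    characters and drop those characters."""
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    out = []
    # Keep the unsplit token as well, but only if splitting changed it.
    if preserve_original and parts != [token]:
        out.append(token)
    out.extend(parts)
    return out


def analyze(text):
    """Whitespace tokenize, lowercase, then apply the delimiter split."""
    tokens = []
    for tok in whitespace_tokenize(text):
        tokens.extend(word_delimiter(tok.lower()))
    return tokens


if __name__ == "__main__":
    # Hypothetical address, used only for illustration.
    print(analyze("User@Example.com"))
```

In this sketch the "@" only survives inside the preserved original token;
none of the split parts contains it, and it never appears as a token by
itself. The admin/analysis page remains the authoritative way to see what
Solr actually produces for a given field type.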
-- Jack Krupansky
-----Original Message-----
From: adfel70
Sent: Tuesday, March 12, 2013 3:14 AM
To: solr-user@lucene.apache.org
Subject: Re: searching exact phrase with stop word returns bad results
I see that there is no token with @.
The question is why.
This is my field type:
<fieldtype name="email_type" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="false"
           omitNorms="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="0"/>
  </analyzer>
</fieldtype>
Any idea?
Erick Erickson wrote
Take a look at admin/analysis for the field in question, feed it values and
see how they are tokenized. My guess is that the token in the index is abc@
(single token), which of course won't match the fragment "@gmail.com"
(assuming gmail.com@ is a typo)...
Best
Erick
On Wed, Mar 6, 2013 at 5:43 AM, adfel70 <adfel70@> wrote:
Hi
I have emails indexed with the default text_general fieldType.
I find that if the email "abc@" is indexed and I search for "gmail.com@"
(exact phrase search), I get a result, while I should not get one.
Any idea how to solve this?
Thanks.
--
View this message in context:
http://lucene.472066.n3.nabble.com/searching-exact-phrase-with-stop-word-returns-bad-results-tp4045180.html
Sent from the Solr - User mailing list archive at Nabble.com.