Thank you! It works very well. However, I think the field type you suggested will also index words like DOT, AT, and com.
In order to prevent these words from getting indexed, I have changed the field type to:

<fieldType name="email" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\." replacement=" DOT " replace="all" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="@" replacement=" AT " replace="all" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
  </analyzer>
</fieldType>

I have added the words dot and com to the stoplist file (at was already there). Is this correct?

--
View this message in context: http://old.nabble.com/Question-on-Tokenizing-email-address-tp27518673p27527033.html
Sent from the Solr - User mailing list archive at Nabble.com.
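
As a rough illustration of the intended effect, the filter chain above can be approximated in plain Python. This is only a sketch, not Solr's implementation. It assumes the tokenizer passes the whole address to the filters as a single token, and it approximates WordDelimiterFilterFactory as a split into runs of letters and runs of digits. The names `analyze` and `STOPWORDS`, and the sample address, are hypothetical.

```python
import re

# Hypothetical stop list: per the post, "dot" and "com" were added
# and "at" was already present in stopwords.txt.
STOPWORDS = {"at", "dot", "com"}


def analyze(token, stopwords=STOPWORDS):
    """Rough stand-in for the filter chain; not Solr's actual behavior."""
    # PatternReplaceFilterFactory steps: "." -> " DOT ", "@" -> " AT "
    text = re.sub(r"\.", " DOT ", token)
    text = re.sub(r"@", " AT ", text)
    # WordDelimiterFilterFactory, approximated as splitting into
    # runs of letters and runs of digits (assumption)
    parts = re.findall(r"[A-Za-z]+|[0-9]+", text)
    # LowerCaseFilterFactory
    parts = [p.lower() for p in parts]
    # StopFilterFactory with the stop list above
    return [p for p in parts if p not in stopwords]


print(analyze("john.smith@example.com"))  # ['john', 'smith', 'example']
```

Under these assumptions, only the name and domain parts of the address remain as indexed terms. The Analysis page in the Solr admin interface would show what the real chain produces.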