Re: Question on Tokenizing email address

2010-02-11 Thread Jan Høydahl / Cominvent
/>
> > words="stopwords.txt" enablePositionIncrements="true" />
> >
> > I have added the words "dot" and "com" to the stoplist file ("at" was already there).
> > Is this correct?

Re: Question on Tokenizing email address

2010-02-09 Thread abhishes
I have added the words "dot" and "com" to the stoplist file ("at" was already there).
Is this correct?

--
View this message in context: http://old.nabble.com/Question-on-Tokenizing-email-address-tp27518673p27527033.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Question on Tokenizing email address

2010-02-09 Thread Jan Høydahl / Cominvent
Hi,

To match 1, 2, 3, 4 below you could use a fieldtype based on TextField, with just a simple WordDelimiterFilterFactory. However, this would also match abc-def, def.alpha, xyz-com and a...@def, because all punctuation is treated the same.

To avoid this, you could do some custom handling of "-", "."
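
A minimal sketch of what such a field type might look like in schema.xml, using the word-delimiter filter (solr.WordDelimiterFilterFactory in stock Solr). The field type name text_email, the choice of whitespace tokenizer and lowercase filter, and all attribute values are assumptions for illustration, not settings given in this thread:

```xml
<!-- Hypothetical field type; the name "text_email" is an assumption. -->
<fieldType name="text_email" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Keep the whole address as one token so the filter below can split it. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Split on punctuation such as "@", "." and "-";
         preserveOriginal="1" also keeps the unsplit address as a token. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the filter treats every delimiter alike, a setup like this would still show the over-matching described above; it is best checked on the Solr admin analysis page before use.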

Question on Tokenizing email address

2010-02-09 Thread Abhishek Srivastava
Hello Everyone,

I have a field in my Solr schema which stores emails. The way I want the emails to be tokenized is like this: if the email address is abc@alpha-xyz.com, the user should be able to search on

1. abc@alpha-xyz.com (whole address)
2. abc
3. def
4. alpha-xyz

Which tokenizer should