Maybe it would be simplest to use a PatternReplaceCharFilter to eliminate
the ".jpg", and then use the StandardTokenizer, or else use the
WhitespaceTokenizer followed by the WordDelimiterFilter.
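
As a rough, untested sketch of the second option, a schema.xml field type
could look something like the following. The field type name
"filename_text" and the exact attribute choices are only assumptions for
illustration; adjust them to your schema.

```xml
<!-- Hypothetical field type; the name "filename_text" is illustrative only. -->
<fieldType name="filename_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Remove the ".jpg" extension before the text reaches the tokenizer. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\.jpg" replacement=""/>
    <!-- Split on whitespace only, so the file name stays one token here. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Split that token at the underscores, and (assumption) also keep
         the unsplit original via preserveOriginal. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Whether a query such as firstname_lastname then matches depends on how the
query parser treats the several tokens the filter produces at query time
(for example, as a phrase), so it is worth checking the result on the
Analysis page of the Solr admin UI.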
-- Jack Krupansky
-----Original Message-----
From: RL
Sent: Tuesday, October 30, 2012 3:57 AM
To: solr-user@lucene.apache.org
Subject: Tokenizer question
I could not find a solution to this in the documentation or on the mailing
list, so here's my question.
I have files following the pattern: firstname_lastname_employeenumber.jpg
I'm able to search for the single terms firstname, lastname, or
employeenumber using a solr.PatternTokenizerFactory that splits at
underscore and dot.
But now I also want to search for firstname_lastname or
lastname_employeenumber, which does not work because the underscore is
consumed as a delimiter during tokenization and is no longer part of any
indexed token.
Any suggestions on how to do that?
Thanks in advance.
RL