Re: Tokenizer question

Jack Krupansky Tue, 30 Oct 2012 13:59:52 -0700

Maybe it would be simplest to use a PatternReplaceCharFilter to eliminate the ".jpg", and then use the StandardTokenizer, or use the whitespace tokenizer and the Word Delimiter Filter.
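
As a rough illustration of what the second option is aiming for, here is a small Python sketch. It is not Solr code and does not reproduce Solr's actual analysis chain; it only emulates the idea of stripping the extension, splitting on underscores, and keeping the unsplit name as an extra token. The function names (analyze, matches) and the sample file name are made up for the example.

```python
import re


def analyze(filename):
    """Very rough emulation: char filter -> whitespace split -> underscore split."""
    # Char-filter step: drop a trailing ".jpg" (case-insensitive).
    stripped = re.sub(r"\.jpg$", "", filename, flags=re.IGNORECASE)
    tokens = []
    # Whitespace-tokenizer step: split on runs of whitespace.
    for chunk in stripped.split():
        # Word-delimiter step: split on underscores, dropping empty parts.
        parts = [p for p in chunk.split("_") if p]
        tokens.extend(parts)
        # Keep the unsplit chunk too, so the full name stays searchable.
        if len(parts) > 1:
            tokens.append(chunk)
    return tokens


def matches(query, filename):
    """True if the query's parts appear as consecutive parts of the file name."""
    # Compare only the split parts; the preserved unsplit token is ignored here.
    doc = [t for t in analyze(filename) if "_" not in t]
    q = [t for t in analyze(query) if "_" not in t]
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


if __name__ == "__main__":
    print(analyze("john_smith_12345.jpg"))
    print(matches("john_smith", "john_smith_12345.jpg"))
    print(matches("smith_12345", "john_smith_12345.jpg"))
```

The point of the sketch is that the same splitting is applied to the query, so a query such as firstname_lastname turns into two adjacent parts that can be matched in order, rather than needing the underscore itself in the index.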


-- Jack Krupansky

-----Original Message-----
From: RL

Sent: Tuesday, October 30, 2012 3:57 AM
To: [email protected]
Subject: Tokenizer question

I could not find a solution to this in the documentation or on the mailing list,
so here's my question.

I have files following the pattern: firstname_lastname_employeenumber.jpg

I'm able to search for the single terms firstname, lastname, or
employeenumber using a solr.PatternTokenizerFactory that splits at
underscore and dot.

But now I also want to search for firstname_lastname or
lastname_employeenumber, which does not work because the tokenizer splits
on the underscore, so it is no longer part of the indexed tokens.


Any suggestions how to do that?

Thanks in advance.

RL



--

View this message in context: http://lucene.472066.n3.nabble.com/Tokenizer-question-tp4016932.html
Sent from the Solr - User mailing list archive at Nabble.com.
