RE: StandardTokenizerFactory and WhitespaceTokenizerFactory

Tarala, Magesh Thu, 30 Jul 2015 18:11:39 -0700

I'm adding PatternReplaceCharFilterFactory to strip the unwanted punctuation 
characters. It looks like this works.
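
A minimal sketch of such a field type in schema.xml might look like the 
following. The name text_partnum and the punctuation set in the pattern are 
illustrative assumptions, not the actual configuration used here. Hyphens are 
deliberately left out of the pattern so part numbers stay intact.

```xml
<!-- Hypothetical field type: the name and the pattern are assumptions for illustration -->
<fieldType name="text_partnum" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run before the tokenizer: turn commas, periods, etc. into spaces -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[,.;:!?]" replacement=" "/>
    <!-- Splitting on whitespace only keeps 222-333-4444 as a single token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One caveat with this sketch: replacing every period also splits values such as 
3.5 or part numbers that contain dots, so the pattern may need to be narrowed 
for real data.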

-----Original Message-----
From: Tarala, Magesh 
Sent: Thursday, July 30, 2015 10:37 AM
To: solr-user@lucene.apache.org
Subject: RE: StandardTokenizerFactory and WhitespaceTokenizerFactory

Would using PatternReplaceCharFilterFactory to replace commas, periods, etc. 
with a space or an empty string work?

-----Original Message-----
From: Tarala, Magesh 
Sent: Thursday, July 30, 2015 10:08 AM
To: solr-user@lucene.apache.org
Subject: StandardTokenizerFactory and WhitespaceTokenizerFactory

I am indexing text that contains part numbers in various formats. They contain 
hyphens/dashes and a few other special characters.

Here's the problem: If I use StandardTokenizerFactory, the hyphens, etc. are 
stripped, so I cannot search by the part number 222-333-4444. I can only 
search for 222 or 333 or 4444.
If I use the WhitespaceTokenizerFactory instead, I can search part numbers, but 
I'm not able to search for words that are followed by punctuation such as a 
comma or period. Example: wheel,

Should I use copy fields with different tokenizers and then, at search time, 
pick the field based on the search string? Any other options?
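
The copy-field approach asked about here could be sketched roughly as follows. 
The field names body and body_ws are hypothetical, and the sketch assumes that 
field types named text_general (StandardTokenizerFactory) and text_ws 
(WhitespaceTokenizerFactory) are defined in the schema.

```xml
<!-- Hypothetical field names; text_general and text_ws are assumed to exist in the schema -->
<!-- Standard-tokenized copy: matches words such as "wheel" even when followed by a comma -->
<field name="body" type="text_general" indexed="true" stored="true"/>
<!-- Whitespace-tokenized copy: keeps 222-333-4444 as one searchable token -->
<field name="body_ws" type="text_ws" indexed="true" stored="false"/>
<!-- Index the same source text into both fields -->
<copyField source="body" dest="body_ws"/>
```

At query time, both fields could then be searched together, for example with 
defType=edismax and qf=body body_ws, instead of choosing a field per query.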
