2012 8:01 AM
Subject: Re: Question about solr.WordDelimiterFilterFactory
WordDelimiterFilterFactory will _almost_ do what you want
by setting things like catenateWords=0 and catenateNumbers=1,
_except_ that the punctuation will be removed. So:
    12.34 -> 1234
    ab,cd -> ab cd
Is that "close enough"?
Otherwise, writing a simple Filter is probably the way to go.
Best
Erick
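
For anyone weighing the custom-filter route, here is a rough sketch of the splitting rule such a filter would need to apply: runs of [0-9.,] stay together as one token, and everything else is split as usual. This is plain Java with java.util.regex only, not an actual Lucene TokenFilter. The class name NumericRunTokenizer and the method tokenize are hypothetical, and dropping trailing punctuation (so a sentence-ending "." is not glued onto a number) is an assumption, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene: it only illustrates the
// splitting rule that a custom filter or tokenizer would have to implement.
class NumericRunTokenizer {

    // Either a run of [0-9.,] that starts and ends with a digit (so "12.34"
    // stays whole and trailing punctuation is dropped; that last part is an
    // assumption), or a run of letters.
    private static final Pattern TOKEN =
            Pattern.compile("[0-9](?:[0-9.,]*[0-9])?|\\p{L}+");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [12.34, ab, cd]
        System.out.println(tokenize("12.34 ab,cd"));
    }
}
```

Unlike the WordDelimiterFilterFactory output above, the punctuation inside a number survives here, while "ab,cd" is still split into two tokens. In a real setup the same rule would have to live inside the analysis chain rather than in a standalone class.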
Hello,
I am new to Solr/Lucene. I have been tasked with indexing a large number of documents. Some
of these documents contain decimal points. I am looking for a way to index
these documents so that adjacent numeric characters (such as [0-9.,]) are
treated as a single token. For example,
12.34 => "12.34"
12,