No, all tokenizers can be used with MappingCharFilter.

Koji Sekiguchi (from mobile)

On 2010/07/06, at 0:32, Saïd Radhouani <r.steve....@gmail.com> wrote:

> Thanks Koji for the reply and for updating the wiki. As it's written now in
> the wiki, it sounds (at least to me) like MappingCharFilterFactory works only
> with WhitespaceTokenizerFactory.
>
> Did you really mean that? Because this filter also works with other
> tokenizers. For instance, in my text type, I'm using StandardTokenizerFactory
> for document processing, and WhitespaceTokenizerFactory for query processing.
>
> I also noticed that, in whatever order you put this filter in the definition
> of a field type, it's always applied (during text processing) before the
> tokenizer and all the other filters. Is there a reason for that? Is there a
> possibility to force the filter to be applied at a certain order among the
> other filters?
>
> Thanks,
> -S
>
> On Jul 5, 2010, at 4:28 PM, Koji Sekiguchi wrote:
>
>>> In the same wiki, they say that CharStreamAwareWhitespaceTokenizerFactory
>>> must be used with MappingCharFilterFactory. But, when I use this tokenizer
>>> and filter together, I get a severe error saying that the field type
>>> containing this filter and tokenizer is unknown. However, when I use this
>>> filter with StandardTokenizerFactory or WhitespaceTokenizerFactory!
>>>
>> The wiki is not correct today. Before Lucene 2.9 (and Solr 1.4),
>> Tokenizers took a Reader argument in the constructor. Since then,
>> because they can take a CharStream argument in the constructor,
>> *CharStreamAware* Tokenizers are no longer needed (all Tokenizers
>> are aware of CharStream). I'll update the wiki.
>>
>> Koji
>>
>> --
>> http://www.rondhuit.com/en/
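
For reference, the setup described in the thread (MappingCharFilterFactory with StandardTokenizerFactory at index time and WhitespaceTokenizerFactory at query time) might look roughly like the sketch below. The field type name "text_mapped" and the LowerCaseFilterFactory entries are only illustrative assumptions, and the mapping file is assumed to be the mapping-ISOLatin1Accent.txt from the Solr example configuration. A char filter works on the character stream that is handed to the tokenizer, which would explain why it always runs before the tokenizer and the token filters, wherever it is written in the analyzer definition.

```xml
<!-- Sketch only: "text_mapped" is a hypothetical field type name. -->
<fieldType name="text_mapped" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Char filters rewrite the raw character stream before tokenization. -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Token filters run after the tokenizer, in the order listed. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If a mapping has to be applied at a specific point among the token filters, a token filter would probably be the place to look rather than a char filter, since only token filters take part in that ordering.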