Re: Leaving certain tokens intact during indexing and search

Marian Steinbach Wed, 30 Nov 2011 06:42:18 -0800

Thanks for the quick response!

Are you saying that I should extend WhitespaceTokenizerFactory to create my
own? Or should I simply use it?


I ask because I guess tokenizing on whitespace alone wouldn't be enough. I
would also need to tokenize on slashes in other positions, just not within
strings matching ([A-Z]+/[0-9]+/[0-9]+).
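
To make the rule concrete, here is a minimal sketch in plain Python. It is
not Solr configuration and not taken from the Solr docs; the names PROTECTED,
TOKEN_RE and tokenize are made up for illustration. It only shows the
behavior I am after: split on whitespace and slashes, but keep strings
matching the pattern above in one piece.

```python
import re

# Identifiers that must stay intact: letters, slash, digits, slash, digits.
# (Hypothetical name; the pattern itself is the one from this message.)
PROTECTED = r"[A-Z]+/[0-9]+/[0-9]+"

# Alternation order matters: try the protected pattern first, and only
# otherwise take a run of characters that are neither whitespace nor slashes.
TOKEN_RE = re.compile(rf"{PROTECTED}|[^\s/]+")


def tokenize(text):
    """Return the tokens of `text`, leaving protected identifiers unsplit."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("see AB/123/45 in docs/specs"))
    # prints ['see', 'AB/123/45', 'in', 'docs', 'specs']
```

The slash in "docs/specs" acts as a separator, while the slashes inside
"AB/123/45" survive because the protected alternative matches first.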

Marian


2011/11/30 Erick Erickson <erickerick...@gmail.com>

> There are about a zillion tokenizers; for what you're describing,
> WhitespaceTokenizerFactory is a good candidate.
>
> See: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
> for a partial list, and it has links to the authoritative docs.
>
> Best
> Erick
>
>
