What you've shown would be handled by WhitespaceTokenizer, but you'd have to
prevent any filters later in the analysis chain from stripping the parens.
If you have to handle things like
blah ( stuff )
WhitespaceTokenizer wouldn't work, because the parens would come out as
separate tokens.

PatternTokenizerFactory might work for you, see:
http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternTokenizerFactory.html
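
As a rough sketch of the kind of regex you might hand to a pattern-based
tokenizer, here is a plain java.util.regex example that splits on whitespace
only when it is immediately followed by "(". The class name, the method name
and the pattern itself are my own assumptions for illustration, not anything
taken from Solr. It does not lowercase anything (that would be a separate
filter), and you would still need to check how the pattern behaves once it is
configured on PatternTokenizerFactory in your schema.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitBeforeParen {
    // Hypothetical pattern: one or more whitespace characters that are
    // immediately followed by "(" (the lookahead keeps the paren in the token).
    private static final Pattern BEFORE_PAREN = Pattern.compile("\\s+(?=\\()");

    // Splits the input wherever the pattern matches; all other whitespace is
    // left untouched.
    static List<String> split(String input) {
        return Arrays.asList(BEFORE_PAREN.split(input));
    }

    public static void main(String[] args) {
        System.out.println(split("spurs (London)"));        // [spurs, (London)]
        System.out.println(split("Internationale (milan)")); // [Internationale, (milan)]
        System.out.println(split("blah ( stuff )"));         // [blah, ( stuff )]
    }
}
```

Note that with this pattern "blah ( stuff )" keeps "( stuff )" together as one
token, which may or may not be what you want for that case.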

Best
Erick

On Tue, Apr 12, 2011 at 6:02 AM, roySolr <royrutten1...@gmail.com> wrote:

> Hello,
>
> I want to split my string when it contains "(". Example:
>
> spurs (London)
> Internationale (milan)
>
> to
>
> spurs
> (london)
> Internationale
> (milan)
>
> What tokenizer can i use to fix this problem?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Split-token-tp2810772p2810772.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
