Re: ICUTokenizer ArrayIndexOutOfBounds

2012-10-17 Thread Robert Muir
Calling reset() is a mandatory part of the consumer lifecycle before calling incrementToken(); see: https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/analysis/TokenStream.html A lot of people don't consume these correctly; that's why these tokenizers now try to throw exceptions if you do i

ICUTokenizer ArrayIndexOutOfBounds

2012-10-17 Thread Shane Perry
Hi, I've been playing around with using the ICUTokenizer from 4.0.0. Using the code below, I was receiving an ArrayIndexOutOfBounds exception on the call to tokenizer.incrementToken(). Looking at the ICUTokenizer source, I can see why this is occurring (usableLength defaults to -1).
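
For reference, the consumer lifecycle described in the reply (reset(), then an incrementToken() loop, then end() and close()) can be sketched in plain Java. This is only a toy model: ToyTokenStream, its term() accessor, and LifecycleSketch are hypothetical names made up for illustration and are not Lucene classes, and the -1 sentinel is merely an analogy for an uninitialized field, not a copy of ICUTokenizer's internals.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token stream; NOT a Lucene class. */
final class ToyTokenStream implements AutoCloseable {
    private final String input;
    private String[] words;   // stays null until reset() is called
    private int next = -1;    // -1 marks "reset() has not been called yet"
    private String term;      // the current token, valid after incrementToken() returns true

    ToyTokenStream(String input) {
        this.input = input;
    }

    /** Prepares the stream for consumption; must be called before incrementToken(). */
    void reset() {
        String trimmed = input.trim();
        words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        next = 0;
    }

    /** Advances to the next token; fails loudly if the stream was never reset. */
    boolean incrementToken() {
        if (next < 0) {
            throw new IllegalStateException("reset() must be called before incrementToken()");
        }
        if (next >= words.length) {
            return false;
        }
        term = words[next++];
        return true;
    }

    String term() {
        return term;
    }

    /** End-of-stream hook; nothing to do in this toy model. */
    void end() {
    }

    /** Releases state so the stream cannot be consumed again without reset(). */
    @Override
    public void close() {
        words = null;
        next = -1;
    }
}

final class LifecycleSketch {
    /** Consumes a stream in the documented order: reset, incrementToken loop, end, close. */
    static List<String> consume(String text) {
        List<String> tokens = new ArrayList<>();
        try (ToyTokenStream stream = new ToyTokenStream(text)) {
            stream.reset();
            while (stream.incrementToken()) {
                tokens.add(stream.term());
            }
            stream.end();
        } // close() runs here via try-with-resources
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(consume("reset is mandatory"));
    }
}
```

With the real ICUTokenizer the call order would be the same; the term text would be read through a CharTermAttribute obtained from addAttribute(CharTermAttribute.class) rather than through a term() method.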