The shingle filter may be using a space as the separator when it joins tokens into the shingles it generates. The Analysis page in the admin UI is your friend.
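If it helps to see what the suggester is complaining about, here is a rough Python sketch that decodes the token bytes from the error message quoted below and splits them at the 0x20 byte. It assumes, as Rick says, that 0x20 (ASCII space) is the separator in play; the variable names are mine and not anything from Solr or Lucene.

```python
# Bytes copied from the "token=[...]" part of the error message in this thread.
token_hex = "30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72"

# Assumption: the separator byte is 0x20 (ASCII space), per the thread.
SEPARATOR = 0x20

# "expected max ngram size=5" from the same error message.
max_ngram_size = 5

# bytes.fromhex() skips the spaces between the hex pairs.
raw = bytes.fromhex(token_hex)

# Break the buffer at the separator byte and decode each piece as ASCII.
grams = [part.decode("ascii") for part in raw.split(bytes([SEPARATOR]))]

print(grams)                        # ['0', '0', '2', 'r', 'allen', 'r']
print(len(grams))                   # 6
print(len(grams) > max_ngram_size)  # True
```

So the single token reaching the suggester already looks like six grams joined by spaces, one more than the configured maximum of 5, which is consistent with a shingle filter sitting in the analysis chain.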

On Jul 24, 2017 2:45 PM, "Angel Todorov" <attodo...@gmail.com> wrote:

> Hi Rick,
>
> Yep, that's really weird, because I am using the StandardTokenizerFactory,
> which is supposed to remove whitespace. Also tried the
> WhitespaceTokenizerFactory. I'll have a look at other analyzers or if
> nothing works maybe implement my own.
>
> I am using a Shingle filter right after the StandardTokenizer, not sure if
> that has anything to do with it.
>
> Thanks,
> Angel
>
> On Tue, Jul 25, 2017 at 12:09 AM Rick Leir <rl...@leirtech.com> wrote:
>
> > Angel,
> > The 20 byte is an ASCII space character, which is a separator in most
> > contexts. Breaking the buffer at spaces, you can see 6 non-space tokens.
> >
> > Have a look at your analysis chain and see why you are getting this.
> > Cheers -- Rick
> >
> > On July 24, 2017 4:27:00 PM EDT, Angel Todorov <attodo...@gmail.com>
> > wrote:
> > >Hi guys,
> > >
> > >I am trying to setup the FreeTextSuggester/ Lookup Factory in a
> > >suggester definition in SOLR. Unfortunately while the index is
> > >building, I am encountering the following errors:
> > >
> > >*"msg":"tokens must not contain separator byte; got token=[30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than expected max ngram size=5","trace":"java.lang.IllegalArgumentException: tokens must not contain separator byte; got token=[30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than expected max ngram size=5\r\n\tat org.apache.lucene.search.suggest.analyzing.FreeTextSuggester.build(FreeTextSuggester.java:362)\r\n\tat
> > >*
> > >
> > >I've also opened the following issue, because i don't think it's
> > >right not to handle this exception:
> > >
> > >https://issues.apache.org/jira/browse/SOLR-11139
> > >
> > >But my question is about the error in general - why is it occurring?
> > >I only have English text, nothing special.
> > >
> > >Thanks,
> > >Angel
> >
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com