Hi Sergey, I've opened an issue to add a maxTokenLength param to the StandardTokenizerFactory configuration:
https://issues.apache.org/jira/browse/SOLR-2188

I'll work on it this weekend.

Are you using Solr 1.4.1? I ask because you mention Lucene 2.9.3. I'm not sure there will ever be a Solr 1.4.2 release, so I plan on targeting the SOLR-2188 fix at Solr 3.1 and 4.0.

I'm also not sure why your Lucene hack didn't give you the results you wanted - is it possible you have other Lucene jars in your Solr classpath?

Steve

> -----Original Message-----
> From: Sergey Bartunov [mailto:sbos....@gmail.com]
> Sent: Friday, October 22, 2010 12:08 PM
> To: solr-user@lucene.apache.org
> Subject: How to index long words with StandardTokenizerFactory?
>
> I'm trying to force Solr to index words longer than 255 characters
> (this constant is DEFAULT_MAX_TOKEN_LENGTH in Lucene's
> StandardAnalyzer.java) using StandardTokenizerFactory as a 'filter'
> tag in the schema configuration XML. Specifying the maxTokenLength
> attribute doesn't work.
>
> I tried a dirty hack: I downloaded the lucene-core-2.9.3 source,
> changed DEFAULT_MAX_TOKEN_LENGTH to 1000000, built it into a jar,
> and replaced the original lucene-core jar in Solr's /lib. But it
> seems to have had no effect.
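For readers following this thread later: once a maxTokenLength parameter is exposed on the factory (the point of SOLR-2188), the limit should be adjustable directly in schema.xml rather than by patching Lucene. A sketch of what that configuration might look like - the maxTokenLength attribute is the proposed addition, not something current releases support, and the field type name is made up for illustration:

```xml
<!-- Hypothetical schema.xml fragment. maxTokenLength on the tokenizer is
     the attribute proposed in SOLR-2188; it does not exist in Solr 1.4.x. -->
<fieldType name="text_long" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="1000000"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note also that StandardTokenizerFactory belongs in a &lt;tokenizer&gt; element, not a &lt;filter&gt; element, which may be part of why the attribute appeared to be silently ignored.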
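It may help to be clear about what the limit actually does: StandardTokenizer silently *drops* tokens longer than its max token length rather than truncating them, which is why over-long words never reach the index at all. A minimal plain-Java sketch of that observable behaviour - this is not Lucene's code, the tokenize helper is illustrative and splits only on whitespace, whereas the real StandardTokenizer uses full grammar rules:

```java
import java.util.ArrayList;
import java.util.List;

public class MaxTokenLengthDemo {
    // Lucene's default (DEFAULT_MAX_TOKEN_LENGTH in StandardAnalyzer).
    static final int DEFAULT_MAX_TOKEN_LENGTH = 255;

    // Whitespace tokenizer that discards (not truncates) tokens over the
    // limit, mimicking StandardTokenizer's handling of oversized tokens.
    static List<String> tokenize(String text, int maxTokenLength) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty() && t.length() <= maxTokenLength) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String longWord = "x".repeat(300);          // longer than 255
        String text = "short " + longWord + " words";

        // Default limit: the 300-char word is dropped entirely.
        System.out.println(tokenize(text, DEFAULT_MAX_TOKEN_LENGTH).size()); // 2
        // Raised limit: the long word survives.
        System.out.println(tokenize(text, 1_000_000).size());                // 3
    }
}
```

At the Lucene level the knob already exists (StandardAnalyzer has a setMaxTokenLength setter); what SOLR-2188 adds is a way to reach it from the factory configuration.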
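On the classpath question: in a typical Solr 1.4 install the solr.war bundles its own lucene-core jar under WEB-INF/lib, so a patched jar dropped into the Solr home /lib directory can be shadowed by the unpatched copy inside the war. A quick way to spot duplicates - the directory layout below is a mock of such an install, not anyone's real paths:

```shell
# Mock layout: a patched jar in lib/ alongside the stock jar the war ships.
SANDBOX=$(mktemp -d)
mkdir -p "$SANDBOX/lib" "$SANDBOX/webapps/solr/WEB-INF/lib"
touch "$SANDBOX/lib/lucene-core-2.9.3-patched.jar"
touch "$SANDBOX/webapps/solr/WEB-INF/lib/lucene-core-2.9.3.jar"

# Both jars turn up; the unpatched one inside the war can win at runtime.
find "$SANDBOX" -name 'lucene-core*.jar'
rm -rf "$SANDBOX"
```

Running the same find against a real install directory will show every lucene-core jar the webapp could be loading.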