Re: My new lemmatizer interfers with the highlighter

2014-12-16 Thread Erlend Garåsen
Thanks Ahmet, I think I have solved the problem, but I didn't replace the line you suggested. Instead I added the createToken method with AttributeSource.State as a parameter and overrode the reset method. I cannot reproduce the problem anymore. BTW, what's the purpose of AttributeSource.St
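A note for readers following the thread: one plausible reason overriding reset() helps is that Lucene reuses analysis components across inputs, so anything a filter has buffered must be discarded when a new input starts. The sketch below is a plain-Java model of that idea. It is not the Lucene API and not Erlend's actual code; the class BufferingFilterModel, its reset(List<String>) and next() methods, and the lemmasOf lookup (with the sample word "biler") are hypothetical names made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for a stacking token filter. NOT the Lucene API.
final class BufferingFilterModel {
    // Lemmas produced for the last term that have not been emitted yet.
    private final Deque<String> pending = new ArrayDeque<>();
    private Iterator<String> input = Collections.emptyIterator();

    // Hypothetical dictionary lookup: only "biler" has a lemma here.
    private static List<String> lemmasOf(String term) {
        return term.equals("biler")
                ? Arrays.asList("bil")
                : Collections.<String>emptyList();
    }

    // Starts a new input. Clearing 'pending' is the important part:
    // without it, lemmas buffered for the previous input would be
    // emitted at the start of this one.
    void reset(List<String> terms) {
        pending.clear();
        input = terms.iterator();
    }

    // Returns the next token, or null when the input is exhausted.
    String next() {
        if (!pending.isEmpty()) {
            return pending.pop();
        }
        if (!input.hasNext()) {
            return null;
        }
        String term = input.next();
        pending.addAll(lemmasOf(term));
        return term;
    }

    public static void main(String[] args) {
        BufferingFilterModel filter = new BufferingFilterModel();
        filter.reset(Arrays.asList("biler"));
        System.out.println(filter.next()); // consumer stops early; a lemma is still buffered
        filter.reset(Arrays.asList("fort"));
        for (String t = filter.next(); t != null; t = filter.next()) {
            System.out.println(t);
        }
    }
}
```

If the pending.clear() call is removed from reset, the second input in main starts with the leftover lemma from the first one, which is the kind of stale state that could confuse a consumer that re-analyzes text.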

Re: My new lemmatizer interfers with the highlighter

2014-12-15 Thread Ahmet Arslan
Hi Erlend, I have written a similar token filter. Please see: https://github.com/iorixxx/lucene-solr-analysis-turkish/blob/master/src/main/java/org/apache/lucene/analysis/tr/Zemberek2DeasciifyFilterFactory.java
Replace
    final String[] values = stemmer.stem(tokenTerm);
with
    stack = stemmer.s

Re: My new lemmatizer interfers with the highlighter

2014-12-15 Thread Michael Sokolov
Well I think your first step should be finding a reproducible test case and encoding it as a unit test. But I suspect ultimately the fix will be something to do with positionIncrement ...
-Mike
On 12/15/2014 09:08 AM, Erlend Garåsen wrote:
On 15.12.14 14:11, Michael Sokolov wrote:
I'm not s

Re: My new lemmatizer interfers with the highlighter

2014-12-15 Thread Erlend Garåsen
On 15.12.14 14:11, Michael Sokolov wrote:
I'm not sure, but is it necessary to set positionIncAttr to 1 when there are *not* any lemmas found? I think the usual pattern is to call clearAttributes() at the start of incrementToken
It is set to 0 only if there are stems/lemmas found:
if (!terms.i
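To make the position-increment discussion concrete, here is a small plain-Java model of how increments translate into token positions. It is only a sketch under assumptions: it assumes the original term is emitted with increment 1 and each lemma is stacked on it with increment 0, which may not match the actual filter. PositionModel, Tok, expand and positions are hypothetical names, the sample words are made up, and none of this is the Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of position increments. NOT the Lucene API.
final class PositionModel {

    // A term paired with its position increment (hypothetical helper type).
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    // Assumed scheme: the original term advances the position by 1,
    // and every lemma is stacked on the same position with increment 0.
    static List<Tok> expand(String term, List<String> lemmas) {
        List<Tok> out = new ArrayList<>();
        out.add(new Tok(term, 1));
        for (String lemma : lemmas) {
            out.add(new Tok(lemma, 0));
        }
        return out;
    }

    // Absolute position of each token: the running sum of increments,
    // starting at -1 so that the first token lands at position 0.
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> result = new ArrayList<>();
        int current = -1;
        for (Tok t : toks) {
            current += t.posInc;
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tok> stream = new ArrayList<>();
        stream.addAll(expand("biler", Arrays.asList("bil")));
        stream.addAll(expand("fort", Collections.<String>emptyList()));
        List<Integer> pos = positions(stream);
        for (int i = 0; i < stream.size(); i++) {
            System.out.println(stream.get(i).term + " @ " + pos.get(i));
        }
    }
}
```

In this model a lemma shares the position of the word it came from, and a word without lemmas simply moves on to the next position. If an increment is left at 0 (or at 1) for the wrong token, every later position shifts, which is why Mike's hunch about positionIncrement is worth checking in a unit test.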

Re: My new lemmatizer interfers with the highlighter

2014-12-15 Thread Michael Sokolov
I'm not sure, but is it necessary to set positionIncAttr to 1 when there are *not* any lemmas found? I think the usual pattern is to call clearAttributes() at the start of incrementToken
-Mike
On 12/15/14 7:38 AM, Erlend Garåsen wrote:
I have written a dictionary-based lemmatizer for Universi