Re: My new lemmatizer interferes with the highlighter

Ahmet Arslan Mon, 15 Dec 2014 07:17:44 -0800

Hi Erlend,

I have written a similar token filter. Please see:


https://github.com/iorixxx/lucene-solr-analysis-turkish/blob/master/src/main/java/org/apache/lucene/analysis/tr/Zemberek2DeasciifyFilterFactory.java

Replace

final String[] values = stemmer.stem(tokenTerm);

with 

stack = stemmer.stem(tokenTerm);

Ahmet




On Monday, December 15, 2014 4:53 PM, Michael Sokolov 
<msoko...@safaribooksonline.com> wrote:
Well, I think your first step should be to find a reproducible test case
and encode it as a unit test. But I suspect the fix will ultimately
have something to do with positionIncrement ...

-Mike


On 12/15/2014 09:08 AM, Erlend Garåsen wrote:
> On 15.12.14 14:11, Michael Sokolov wrote:
>> I'm not sure, but is it necessary to set positionIncAttr to 1 when there
>> are *not* any lemmas found?  I think the usual pattern is to call
>> clearAttributes() at the start of incrementToken
>
> It is set to 0 only if there are stems/lemmas found:
> if (!terms.isEmpty()) {
>   positionAttr.setPositionIncrement(0);
>
> The terms list will only contain entries if there are lemmas found.
>
> But maybe I should empty this list before I return true, just like this?
>
> if (!terms.isEmpty()) {
>   termAtt.setEmpty().append(terms.poll());
>   positionAttr.setPositionIncrement(0);
>   terms.clear();
>   return true;
> } else if ...
>
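
To make the question about terms.clear() concrete, here is a minimal sketch in plain Java of the "stacked token" pattern discussed in this thread. It is a model only and does not use the Lucene API: Token, StackedLemmaModel, and the lemma map are hypothetical names invented for illustration. It assumes the original token is emitted first with a position increment of 1, and each lemma follows with an increment of 0, one per call. Under that assumption, clearing the queue after the first poll() would drop any remaining lemmas.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a term plus its position increment (not a Lucene class).
final class Token {
    final String term;
    final int positionIncrement;

    Token(String term, int positionIncrement) {
        this.term = term;
        this.positionIncrement = positionIncrement;
    }

    @Override
    public String toString() {
        return term + "/" + positionIncrement;
    }
}

// Hypothetical model of a filter that stacks lemmas on the original token.
final class StackedLemmaModel {
    private final Deque<String> input;
    private final Map<String, List<String>> lemmas;
    private final Deque<String> pending = new ArrayDeque<>();

    StackedLemmaModel(List<String> words, Map<String, List<String>> lemmas) {
        this.input = new ArrayDeque<>(words);
        this.lemmas = lemmas;
    }

    // Mirrors the shape of incrementToken(): one token per call, null at end of input.
    Token next() {
        if (!pending.isEmpty()) {
            // Emit exactly one queued lemma at the same position as the original.
            // The queue is not cleared here, so later lemmas are still emitted.
            return new Token(pending.poll(), 0);
        }
        String word = input.poll();
        if (word == null) {
            return null;
        }
        // Queue the lemmas (possibly none) and emit the original with increment 1.
        pending.addAll(lemmas.getOrDefault(word, List.of()));
        return new Token(word, 1);
    }

    // Drains the model and returns each token rendered as "term/increment".
    static List<String> run(List<String> words, Map<String, List<String>> lemmas) {
        StackedLemmaModel model = new StackedLemmaModel(words, lemmas);
        List<String> out = new ArrayList<>();
        for (Token t = model.next(); t != null; t = model.next()) {
            out.add(t.toString());
        }
        return out;
    }
}
```

With the toy input ["leaves", "fall"] and the made-up lemma map {"leaves": ["leaf", "leave"]}, run(...) yields leaves/1, leaf/0, leave/0, fall/1. In an actual Lucene TokenFilter the stacked tokens would typically also need the original token's offsets, which is commonly handled with captureState() and restoreState(); that part is not modeled here.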
