Re: My new lemmatizer interferes with the highlighter

Erlend Garåsen Tue, 16 Dec 2014 07:13:09 -0800


Thanks Ahmet,

I think I have solved the problem, but I didn't replace the line you suggested. Instead I added the createToken method with AttributeSource.State as a parameter and overrode the reset method. I cannot reproduce the problem anymore.

BTW, what's the purpose of AttributeSource.State? Perhaps that alone has solved the problem.


Erlend
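
On the AttributeSource.State question: in Lucene, captureState() returns an AttributeSource.State, which is a snapshot of all current attribute values, and restoreState(state) copies that snapshot back. A filter that injects extra tokens usually restores the original token's state before changing the term, so the injected token keeps the original offsets. Whether that is what fixed the highlighter here is only a guess. The sketch below is a plain-Java model of the idea, not the Lucene API: the Attrs class, its capture/restore/clear methods, and the analyze helper are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Plain-Java model (NOT Lucene): why a stacked lemma should restore the captured state. */
final class StateSketch {

    /** Hypothetical stand-in for the attributes a token filter shares with its input. */
    static final class Attrs {
        String term = "";
        int startOffset;
        int endOffset;
        int posInc = 1;

        /** Plays the role of captureState(): snapshot every attribute value. */
        Attrs capture() {
            Attrs s = new Attrs();
            s.term = term;
            s.startOffset = startOffset;
            s.endOffset = endOffset;
            s.posInc = posInc;
            return s;
        }

        /** Plays the role of restoreState(state): copy the snapshot back. */
        void restore(Attrs s) {
            term = s.term;
            startOffset = s.startOffset;
            endOffset = s.endOffset;
            posInc = s.posInc;
        }

        /** Plays the role of clearAttributes(): reset everything to defaults. */
        void clear() {
            term = "";
            startOffset = 0;
            endOffset = 0;
            posInc = 1;
        }

        @Override
        public String toString() {
            return term + "[" + startOffset + "," + endOffset + "]+" + posInc;
        }
    }

    /** Emits each word, then its lemmas at the same position (increment 0). */
    static List<String> analyze(String text, Map<String, List<String>> lemmas, boolean restoreState) {
        List<String> out = new ArrayList<>();
        Attrs attrs = new Attrs();
        Deque<String> pending = new ArrayDeque<>();
        int pos = 0;
        for (String word : text.split(" ")) {
            // The original token, with offsets pointing into the source text.
            attrs.clear();
            attrs.term = word;
            attrs.startOffset = pos;
            attrs.endOffset = pos + word.length();
            out.add(attrs.toString());
            pos += word.length() + 1;

            // Snapshot the original token, then emit its lemmas one by one.
            Attrs saved = attrs.capture();
            pending.addAll(lemmas.getOrDefault(word, List.of()));
            while (!pending.isEmpty()) {
                attrs.clear();
                if (restoreState) {
                    attrs.restore(saved); // lemma inherits the original offsets
                }
                attrs.term = pending.poll();
                attrs.posInc = 0;         // same position as the original token
                out.add(attrs.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> lemmas = Map.of("cats", List.of("cat"));
        System.out.println(analyze("two cats", lemmas, true));
        System.out.println(analyze("two cats", lemmas, false));
    }
}
```

With the restore step the lemma comes out as cat[4,8]+0, covering the same characters as "cats". Without it the lemma comes out as cat[0,0]+0, and anything that relies on offsets, such as a highlighter, would point at the wrong text.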

On 15.12.14 16:13, Ahmet Arslan wrote:

Hi Erlend,

I have written a similar token filter. Please see:

https://github.com/iorixxx/lucene-solr-analysis-turkish/blob/master/src/main/java/org/apache/lucene/analysis/tr/Zemberek2DeasciifyFilterFactory.java

replace

final String[] values = stemmer.stem(tokenTerm);

with

stack = stemmer.stem(tokenTerm);

Ahmet




On Monday, December 15, 2014 4:53 PM, Michael Sokolov 
<msoko...@safaribooksonline.com> wrote:
Well I think your first step should be finding a reproducible test case
and encoding it as a unit test.  But I suspect ultimately the fix will
be something to do with positionIncrement ...

-Mike


On 12/15/2014 09:08 AM, Erlend Garåsen wrote:

On 15.12.14 14:11, Michael Sokolov wrote:

I'm not sure, but is it necessary to set positionIncAttr to 1 when there
are *not* any lemmas found?  I think the usual pattern is to call
clearAttributes() at the start of incrementToken


It is set to 0 only if there are stems/lemmas found:
if (!terms.isEmpty()) {
   positionAttr.setPositionIncrement(0);

The terms list will only contain entries if there are lemmas found.

But maybe I should empty this list before I return true, just like this?

if (!terms.isEmpty()) {
   termAtt.setEmpty().append(terms.poll());
   positionAttr.setPositionIncrement(0);
   terms.clear();
   return true;
} else if ...
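
One thing to watch with the terms.clear() idea: if terms is a java.util.Queue (as the poll() call suggests) and a word can have more than one lemma, clearing the queue right after the first poll() discards the remaining lemmas. A minimal plain-Java sketch of the difference follows. The class and method names are hypothetical, and each loop iteration stands in for one incrementToken() call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Plain-Java sketch (no Lucene): effect of clearing the lemma queue after poll(). */
final class LemmaQueueSketch {

    /** Returns the lemmas that would actually be emitted as tokens. */
    static List<String> drain(List<String> lemmas, boolean clearAfterPoll) {
        Deque<String> terms = new ArrayDeque<>(lemmas);
        List<String> emitted = new ArrayList<>();
        // Each iteration models one incrementToken() call that finds terms non-empty.
        while (!terms.isEmpty()) {
            emitted.add(terms.poll());
            if (clearAfterPoll) {
                terms.clear(); // drops every lemma that has not been emitted yet
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(drain(List.of("be", "being"), false)); // [be, being]
        System.out.println(drain(List.of("be", "being"), true));  // [be]
    }
}
```

Since poll() already removes the element it returns, the queue empties by itself after the last lemma, so the extra clear() should not be needed.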
