Thanks for the insight. You're right, of course, regarding the score
calculation. I'll think about it. There are certain cases where the
search is obviously bad to a human and could be cleaned up, but it's
not easy to write rules for that.
--Ere
On 9.2.2017 at 18.37, Walter Underwood wrote:
1.
t, so i never tried it and just overridden the similarity in
> place.
>
> M.
>
> -Original message-
>> From:Alexandre Rafalovitch
>> Sent: Thursday 9th February 2017 18:00
>> To: solr-user
>> Subject: Re: Removing duplicate terms from query
>>
Would omitTermFreqAndPositions help here? Though that's probably
overkill, as it disables phrase searches too. I am not sure if it is
possible to do omitTermFreqAndPositions=true omitPositions=false to
just skip frequencies.
Regards,
Alex.
http://www.solr-start.com/ - Resources for Sol
1. I don’t think this is a good idea. It means that a search for “hey hey hey”
won’t score that document higher.
2. Maybe you want to change how tf is calculated. Ignore multiple occurrences
of a word.
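
To illustrate point 2, here is a minimal sketch of the arithmetic only, not
Solr or Lucene code. It assumes a classic TF-IDF style term frequency of
sqrt(freq); the function names classic_tf and binary_tf are made up for this
example. With a binary tf, a term that occurs twice in a field contributes no
more than a term that occurs once.

```python
import math


def classic_tf(freq):
    """Assumed classic TF-IDF style term frequency: sqrt of the raw count."""
    return math.sqrt(freq)


def binary_tf(freq):
    """Hypothetical alternative: ignore repeats, only record presence."""
    return 1.0 if freq > 0 else 0.0


# A title that contains the same term twice versus once.
for freq in (1, 2):
    print(freq, round(classic_tf(freq), 3), binary_tf(freq))
```

In a real deployment this would be done by overriding tf in a custom
similarity class, as mentioned elsewhere in the thread.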
I ran into this with the movie title “New York, New York” at Netflix. It isn’t
twice as much
Thanks Emir.
I was thinking of something very simple like doing what
RemoveDuplicatesTokenFilter does but ignoring positions. It would of
course still be possible to have the same term multiple times, but at
least the adjacent ones could be deduplicated. The reason I'm not too
eager to do it
Hi Ere,
I don't think that there is such a filter. Implementing such a filter
would require looking backward, which violates the streaming approach of
token filters and leads to unpredictable memory usage.
I would do it as part of a query preprocessor, and not necessarily as
part of Solr.
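
A minimal sketch of such a preprocessor, running in the client before the
query is sent to Solr. The function name dedupe_terms and its adjacent_only
flag are hypothetical. It assumes plain whitespace-separated terms compared
case-insensitively, and it does not handle quoted phrases, boolean operators
or field prefixes.

```python
def dedupe_terms(query, adjacent_only=False):
    """Drop repeated terms from a whitespace-separated query string.

    With adjacent_only=True only immediate repeats are removed, so the
    same term may still appear more than once further apart.
    """
    seen = set()
    kept = []
    for term in query.split():
        key = term.lower()  # assumption: case-insensitive comparison
        if adjacent_only:
            if kept and kept[-1].lower() == key:
                continue
        elif key in seen:
            continue
        seen.add(key)
        kept.append(term)
    return " ".join(kept)


print(dedupe_terms("hey hey hey"))                           # hey
print(dedupe_terms("New York New York"))                     # New York
print(dedupe_terms("New York New York", adjacent_only=True)) # New York New York
```

The adjacent_only mode only needs to remember the previous term, so it keeps
memory use bounded at the cost of missing repeats that are further apart.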
HTH,
Emir
On 09.02.