Hi Ere,
I don't think there is such a filter. Implementing one would require
looking backward over earlier tokens, which violates the streaming
approach of token filters and leads to unpredictable memory usage.
I would do it as part of a query preprocessor, and not necessarily as
part of Solr.
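A minimal sketch of such a preprocessing step, in Python, assuming a plain whitespace-separated query with no phrases, field prefixes, or boolean operators. The function name dedupe_query_terms is hypothetical, and the lowercasing and punctuation stripping are only a stand-in for whatever the real analysis chain does:

```python
import string


def dedupe_query_terms(query: str) -> str:
    """Drop repeated terms from a simple non-phrase query.

    Keeps the first occurrence of each term and preserves term order.
    """
    seen = set()
    kept = []
    for raw in query.split():
        # Assumed normalization: ignore surrounding punctuation and case.
        key = raw.strip(string.punctuation).lower()
        if not key:
            continue  # token was punctuation only, e.g. ";"
        if key in seen:
            continue  # already present earlier in the query
        seen.add(key)
        kept.append(key)
    return " ".join(kept)


print(dedupe_query_terms("term term term"))
print(dedupe_query_terms("Hey, hey, hey ; Shivers of pleasure"))
```

The cleaned string would then be sent to Solr in place of the raw user input. Memory use is bounded by the length of one query, which is why this is easier to do before the query reaches the token filter chain.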
HTH,
Emir
On 09.02.2017 12:24, Ere Maijala wrote:
Hi,
I just noticed that although we use RemoveDuplicatesTokenFilter at
query time, it takes term positions into account and so does nothing
when the query is e.g. 'term term term'. As far as I can see the term
positions make no difference in a simple non-phrase search. Is there a
built-in way to deal with this? I know I can write a filter to do
this, but I feel like this would be something quite basic to do for
the query. And I don't think it's even anything too weird for normal
users to do. Just consider e.g. searching for music by title:
Hey, hey, hey ; Shivers of pleasure
I also verified that, at least according to debugQuery=true and
anecdotal evidence, the search really slows down if you repeat the same
term enough.
--Ere
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/