[jira] [Commented] (LUCENE-9574) Add a token filter to drop tokens based on flags.

Gus Heck (Jira) Thu, 08 Oct 2020 09:57:14 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210333#comment-17210333
 ]


Gus Heck commented on LUCENE-9574:
----------------------------------

One interesting corner case came up when the first token in the stream matched 
the flags but had already had a synonym added. The synonym of course had 
position increment 0, so dropping the token caused complaints about the first 
token not having a position increment > 0. I could think of no way to reach 
forward in the stream and adjust the synonym token to account for the dropping 
of its parent. So the workaround I came up with was, when the first token in 
the stream is being dropped, to replace it instead with a random token that 
will effectively never match anything and is thus invisible. I'm not crazy 
about it and would like to ask why the restriction on position increment is 
there... it feels like for some reason downstream code expects token positions 
to start at 1 instead of zero or something? Open to suggestions for a 
better solution too.
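
For illustration only (this is not the code attached to the issue): one 
possible alternative to a placeholder token is to carry the position increment 
of each dropped token forward onto the next surviving token, which is roughly 
what Lucene's FilteringTokenFilter does with skipped positions. The sketch 
below models that idea in plain Java with no Lucene dependency. The names Tok, 
DropFlaggedTokens and filter are hypothetical, and the flag value 2 is an 
arbitrary example.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the corner case; none of these names are Lucene API.
final class DropFlaggedTokens {

    // Hypothetical stand-in for a token with its attributes.
    static final class Tok {
        final String term;
        final int posInc; // position increment relative to the previous token
        final int flags;  // bit flags set by some upstream filter

        Tok(String term, int posInc, int flags) {
            this.term = term;
            this.posInc = posInc;
            this.flags = flags;
        }
    }

    // Drops every token that has all bits of `mask` set. The increments of
    // dropped tokens are accumulated and added to the next kept token, so a
    // synonym (increment 0) of a dropped first token ends up with increment 1.
    static List<Tok> filter(List<Tok> in, int mask) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (Tok t : in) {
            if ((t.flags & mask) == mask) {
                skipped += t.posInc;           // remember the dropped position
            } else {
                out.add(new Tok(t.term, t.posInc + skipped, t.flags));
                skipped = 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> in = new ArrayList<>();
        in.add(new Tok("quick", 1, 2)); // first token, flagged for removal
        in.add(new Tok("fast", 0, 0));  // synonym stacked on "quick"
        in.add(new Tok("fox", 1, 0));

        StringBuilder sb = new StringBuilder();
        for (Tok t : filter(in, 2)) {
            sb.append(t.term).append("(+").append(t.posInc).append(") ");
        }
        System.out.println(sb.toString().trim()); // prints: fast(+1) fox(+1)
    }
}
```

In this model the surviving synonym keeps the absolute position its parent 
had, so the first emitted token has a position increment of 1. Whether this 
fits the actual filter depends on how it is implemented, which the comment 
does not show.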

> Add a token filter to drop tokens based on flags.
> -------------------------------------------------
>
>                 Key: LUCENE-9574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9574
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Gus Heck
>            Assignee: Gus Heck
>            Priority: Major
>
> (Breaking this off of SOLR-14597 for independent review)
> A filter that tests flags on tokens vs a bitmask and drops tokens that have 
> all specified flags.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
