I hope this is the right list for this question. I'm not sure whether this is a bug or whether I'm doing something wrong.
We're running some text containing emojis through this filter. If I'm reading the code right, when it finds U+203C (:bangbang:, DOUBLE EXCLAMATION MARK) it replaces it with the ASCII characters !!. But if it's a "fully qualified" emoji, the input also includes U+FE0F right after it, which is the zero-width VARIATION SELECTOR-16.

The issue we are running into is that the emoji is replaced with !! like it should be, but directly after the ASCII !! the variation selector is left hanging, because it isn't matched or changed into anything. This causes some weird behavior down the line in other filters that try to strip off punctuation: for some reason the result doesn't seem to be detected as punctuation anymore.

Ultimately we are trying to get down to an array of meaningful tokens from the content, but certain emojis are making it all the way through the filters. We aren't sure why the ones that are ASCII-folded make it through, while the ones that aren't are filtered out like normal.

Thanks,
Jarett
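
P.S. In case it helps, here is a minimal Python sketch of what we think is happening. It is not the filter's actual code: `fold_double_exclamation` is a hypothetical stand-in that only maps U+203C to !!, and `strip_variation_selectors` is one workaround we are considering (dropping U+FE00 through U+FE0F before folding), not something the filter is known to provide.

```python
import unicodedata

# Assumption: we only care about the basic variation selectors U+FE00..U+FE0F.
VARIATION_SELECTORS = {chr(cp) for cp in range(0xFE00, 0xFE10)}


def fold_double_exclamation(text: str) -> str:
    """Hypothetical stand-in for the folding step: maps U+203C to '!!' only."""
    return text.replace("\u203C", "!!")


def strip_variation_selectors(text: str) -> str:
    """Possible workaround: drop variation selectors before folding."""
    return "".join(ch for ch in text if ch not in VARIATION_SELECTORS)


# A fully qualified "bangbang" emoji: U+203C followed by U+FE0F.
raw = "wow\u203C\uFE0F"

# Folding alone leaves the variation selector dangling after the '!!'.
folded = fold_double_exclamation(raw)
leftover = folded[-1]
print(repr(folded))
print(unicodedata.name(leftover), unicodedata.category(leftover))

# The leftover is a nonspacing mark, not punctuation, which would explain
# why a punctuation-stripping step does not treat it like the '!' before it.
print(unicodedata.category("!"))

# Stripping the selector first gives plain ASCII punctuation.
cleaned = fold_double_exclamation(strip_variation_selectors(raw))
print(repr(cleaned))
```

If this matches what the filter does, the leftover character has Unicode general category Mn rather than one of the P* punctuation categories, so our guess is that the later punctuation filters skip it for that reason.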