I was thinking about that, but there are legitimately different words that
differ only by a repeated consonant. My primary school teacher lost hair
over getting us to learn the difference between desert and dessert.
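
To make the collision concrete, here is a minimal Python sketch of what the
proposed filter chain would do to a single token. It assumes that lowercasing
followed by the same "(.)\1+" pattern behaves like LowerCaseFilterFactory plus
PatternReplaceFilterFactory on one token; the collapse_repeats name is mine,
not anything in Solr.

```python
import re

# Same pattern as in the quoted field type: any character followed by
# one or more copies of itself.
_REPEATS = re.compile(r"(.)\1+")


def collapse_repeats(token: str) -> str:
    """Lowercase the token, then collapse each run of a repeated character to one."""
    return _REPEATS.sub(r"\1", token.lower())


for word in ("Yes", "yyyyYyyyyyyeeEssSsssss", "desert", "dessert"):
    print(word, "->", collapse_repeats(word))
```

Both desert and dessert come out as "desert" here, which is the case I'm
worried about.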

Maybe we need something that can borrow the boosting behaviour of fuzzy
query - match the exact term, but also the neighbors with a slight deboost,
so that if the main term exists those others won't show up.
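
One way this might be approximated with existing pieces is to index the text
twice and weight the two fields differently at query time. A rough Python
sketch of the request parameters follows, assuming an edismax query and two
hypothetical fields: text_exact (normal analysis) and text_norepeat (the
collapsing analysis). The field names and boost values are made up for
illustration. Note that this only ranks the collapsed matches lower; it does
not hide them when an exact match exists.

```python
from urllib.parse import urlencode


def build_params(user_query: str,
                 exact_field: str = "text_exact",        # hypothetical field name
                 collapsed_field: str = "text_norepeat",  # hypothetical field name
                 exact_boost: float = 2.0,                # illustrative value
                 collapsed_boost: float = 0.5) -> dict:   # illustrative value
    """Build edismax parameters that favour the exact field over the collapsed one."""
    return {
        "defType": "edismax",
        "q": user_query,
        # qf lists the fields to search, each with its own boost.
        "qf": f"{exact_field}^{exact_boost} {collapsed_field}^{collapsed_boost}",
    }


# Print the query string that would be appended to a /select request.
print(urlencode(build_params("dessert")))
```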

On Thu, Oct 8, 2020 at 5:46 PM Andy Webb <andywebb1...@gmail.com> wrote:

> How about something like this?
>
> {
>     "add-field-type": [
>         {
>             "name": "norepeat",
>             "class": "solr.TextField",
>             "analyzer": {
>                 "tokenizer": {
>                     "class": "solr.StandardTokenizerFactory"
>                 },
>                 "filters": [
>                     {
>                         "class": "solr.LowerCaseFilterFactory"
>                     },
>                     {
>                         "class": "solr.PatternReplaceFilterFactory",
>                         "pattern": "(.)\\1+",
>                         "replacement": "$1"
>                     }
>                 ]
>             }
>         }
>     ]
> }
>
> This finds a match...
>
> http://localhost:8983/solr/#/norepeat/analysis?analysis.fieldvalue=Yes&analysis.query=yyyyYyyyyyyeeEssSsssss&analysis.fieldtype=norepeat
>
> Andy
>
>
>
> On Thu, 8 Oct 2020 at 23:02, Mike Drob <md...@mdrob.com> wrote:
>
> > I'm looking for a way to transform words with repeated letters into the
> > same token - does something like this exist out of the box? Do our
> > stemmers support it?
> >
> > For example, say I would want all of these terms to return the same
> > search results:
> >
> > YES
> > YESSS
> > YYYEEESSS
> > YYEESSSS[...]S
> >
> > I don't know how long a user would hold down the S key at the end to
> > capture their level of excitement, and I don't want to manually define
> > synonyms for every length.
> >
> > I'm pretty sure that I don't want PhoneticFilter here, maybe
> > PatternReplace? Not a huge fan of how that one is configured, and I think
> > I'd have to set up a bunch of patterns inline for it?
> >
> > Mike
> >
>
