Re: Folding Repeated Letters

2020-10-09 Thread Walter Underwood
Actually, helping the humans to use proper spelling is a good approach. Include a spelling correction step (non-optional) for user-generated content and spelling suggestions for queries. Completion/suggestion is another way to guide people to properly spelled words that exist in your index. I agr

Re: Folding Repeated Letters

2020-10-09 Thread Alexandre Rafalovitch
Are there that many of those words? Because even if you deal with , there is still yas! Maybe you just have regexp synonyms? (ye+s+) Good luck, 413x On Thu., Oct. 8, 2020, 6:02 p.m. Mike Drob, wrote: > I'm looking for a way to transform words with repeated letters into the > sam
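
For illustration, the "(ye+s+)" idea can be tried outside Solr with a few lines of Python. This is only a sketch of the regular expression itself, not of any Solr synonym feature; the function name `normalize_yes` is made up for the example.

```python
import re

# Pattern from the "(ye+s+)" suggestion: one "y", one or more "e",
# one or more "s". Case-insensitive is an assumption of this sketch.
YES_PATTERN = re.compile(r"ye+s+", re.IGNORECASE)


def normalize_yes(token: str) -> str:
    """Map any elongated spelling of "yes" to "yes"; leave other tokens alone."""
    # fullmatch() requires the whole token to fit the pattern, so a word
    # such as "yesterday" is not rewritten.
    return "yes" if YES_PATTERN.fullmatch(token) else token


if __name__ == "__main__":
    for word in ["yes", "yeeesss", "YESSS", "yas", "yesterday"]:
        print(word, "->", normalize_yes(word))
```

As the message points out, a variant such as "yas" falls outside the pattern and passes through unchanged, so each such family of spellings would need its own rule.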

Re: Folding Repeated Letters

2020-10-09 Thread Erick Erickson
Anything you do will be wrong ;). I suppose you could kick out words that weren’t in some dictionary and accumulate a list of words not in the dictionary and just deal with them “somehow”, but that’s labor-intensive since you then have to deal with proper names and the like. Sometimes you can g
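
A rough Python sketch of the dictionary idea described above, assuming a tiny in-memory word list. The names `DICTIONARY`, `collapse_runs`, and `fold_with_dictionary` are hypothetical, and a real setup would need a far larger dictionary.

```python
import re

# Hypothetical word list; a real one would be loaded from a file.
DICTIONARY = {"yes", "good", "desert", "dessert"}


def collapse_runs(word: str) -> str:
    """Collapse every run of a repeated character to a single character."""
    return re.sub(r"(.)\1+", r"\1", word)


def fold_with_dictionary(tokens, dictionary=DICTIONARY):
    """Return (folded_tokens, unknown_tokens).

    Known words are kept as they are. Unknown words are collapsed and kept
    in collapsed form only if the result is a known word; otherwise they are
    left alone and added to the list that someone has to review by hand.
    """
    folded, unknown = [], []
    for token in tokens:
        lower = token.lower()
        if lower in dictionary:
            folded.append(lower)
            continue
        candidate = collapse_runs(lower)
        if candidate in dictionary:
            folded.append(candidate)
        else:
            folded.append(lower)
            unknown.append(lower)
    return folded, unknown


if __name__ == "__main__":
    print(fold_with_dictionary(["yesss", "dessert", "goood", "Erick"]))
```

In this sketch "goood" collapses to "god", misses the word list, and ends up in the unknown list next to the proper name "Erick", which is the labor-intensive part the message warns about.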

Re: Folding Repeated Letters

2020-10-08 Thread Mike Drob
I was thinking about that, but there are words that are legitimately different with repeated consonants. My primary school teacher lost hair over getting us to learn the difference between desert and dessert. Maybe we need something that can borrow the boosting behaviour of fuzzy query - match the
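
The message is cut off here, so the following Python sketch is only one possible reading of the idea: an exact match scores higher than a match that only agrees after repeated letters are folded. The function names and the boost values 2.0 and 1.0 are assumptions for the example, not Solr behaviour.

```python
import re


def fold(word: str) -> str:
    """Lowercase the word and collapse runs of repeated characters."""
    return re.sub(r"(.)\1+", r"\1", word.lower())


def score(query: str, term: str,
          exact_boost: float = 2.0, folded_boost: float = 1.0) -> float:
    """Score an indexed term against a query (hypothetical boost values).

    An exact (case-insensitive) match gets the higher boost, a match that
    only agrees after folding gets the lower one, and anything else gets 0.
    """
    if query.lower() == term.lower():
        return exact_boost
    if fold(query) == fold(term):
        return folded_boost
    return 0.0


if __name__ == "__main__":
    for q, t in [("dessert", "dessert"), ("desert", "dessert"), ("yesss", "yes")]:
        print(q, t, score(q, t))
```

With this ranking, a query for "desert" still finds "dessert" through the folded form, but documents containing "desert" itself sort first.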

Re: Folding Repeated Letters

2020-10-08 Thread Andy Webb
How about something like this? { "add-field-type": [ { "name": "norepeat", "class": "solr.TextField", "analyzer": { "tokenizer": { "class": "solr.StandardTokenizerFactory" }, "filter