msfroh commented on PR #14350: URL: https://github.com/apache/lucene/pull/14350#issuecomment-2730390727
Instead of a boolean flag, what if we define an interface that specifies the folding rules? It could have two methods: one that folds input characters to a canonical representation (before sorting), and one that expands a canonical character to the set of characters that should be matched. We could ship ASCII and Turkish implementations to start. If someone has a Romanian corpus that mixes characters with and without diacritics, they could strip diacritics on input and expand them again for matching. (That would effectively combine lowercasing and ASCII folding.)
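
To make the two-method idea concrete, here is a minimal sketch, assuming folding works on one code point at a time. The names `CaseFolding`, `AsciiFolding`, and `TurkishFolding` are hypothetical and are not part of Lucene's API or of this PR; the real signatures would depend on how the automaton code consumes the rules.

```java
/** Hypothetical interface for pluggable folding rules (illustration only). */
interface CaseFolding {
  /** Folds an input code point to its canonical form; applied before sorting. */
  int fold(int codePoint);

  /** Expands a canonical code point to every code point that should match it, in ascending order. */
  int[] expand(int canonical);
}

/** ASCII rules: 'A'..'Z' fold to 'a'..'z'; every other code point is left unchanged. */
final class AsciiFolding implements CaseFolding {
  @Override
  public int fold(int codePoint) {
    return (codePoint >= 'A' && codePoint <= 'Z') ? codePoint + ('a' - 'A') : codePoint;
  }

  @Override
  public int[] expand(int canonical) {
    if (canonical >= 'a' && canonical <= 'z') {
      // Uppercase ASCII sorts before lowercase, so the result is ascending.
      return new int[] {canonical - ('a' - 'A'), canonical};
    }
    return new int[] {canonical};
  }
}

/** Turkish rules: I (U+0049) pairs with dotless U+0131, and U+0130 pairs with i (U+0069). */
final class TurkishFolding implements CaseFolding {
  private static final int DOTLESS_SMALL_I = 0x0131;
  private static final int DOTTED_CAPITAL_I = 0x0130;

  // Assumption: all letters other than the two i-pairs follow the ASCII rules.
  private final CaseFolding ascii = new AsciiFolding();

  @Override
  public int fold(int codePoint) {
    if (codePoint == 'I') {
      return DOTLESS_SMALL_I;
    }
    if (codePoint == DOTTED_CAPITAL_I) {
      return 'i';
    }
    return ascii.fold(codePoint);
  }

  @Override
  public int[] expand(int canonical) {
    if (canonical == DOTLESS_SMALL_I) {
      return new int[] {'I', DOTLESS_SMALL_I};
    }
    if (canonical == 'i') {
      return new int[] {'i', DOTTED_CAPITAL_I};
    }
    return ascii.expand(canonical);
  }
}
```

Under this sketch, a Romanian implementation would fold both case and diacritics in `fold` and list the accented variants in `expand`; that is the "strip on input, expand for matching" case described above.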