benwtrent commented on issue #14429: URL: https://github.com/apache/lucene/issues/14429#issuecomment-2786821619
@mikemccand OK, I gathered more info:

- Modern OpenJDK (22.0.1)
- Modern Linux

So the rest of the system doesn't seem very exotic. However, the data being ingested might contain various pieces of Turkish Unicode. Digging around the analyzers, I didn't find any special handling, so it's all using the StandardAnalyzer with no additional normalization.

I wonder if we are just hitting the dreaded Turkish "i" Unicode issue.
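For readers unfamiliar with the Turkish "i" problem mentioned above, here is a minimal sketch of the general behavior in plain Java. It is only an illustration of locale-sensitive case mapping, not a diagnosis of this issue, and it does not touch any Lucene code; the class name `TurkishCaseDemo` and the helper `codePoints` are hypothetical names made up for this example.

```java
import java.util.Locale;

// Hypothetical demo class: shows how JDK case mapping treats dotted/dotless "i".
public class TurkishCaseDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Render a string as a space-separated list of code points, e.g. "U+0069 U+0307".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format(Locale.ROOT, "U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Locale-independent lowercasing: 'I' becomes the ASCII 'i'.
        System.out.println(codePoints("I".toLowerCase(Locale.ROOT)));      // U+0069

        // Turkish rules: 'I' lowercases to dotless i (U+0131).
        System.out.println(codePoints("I".toLowerCase(TURKISH)));          // U+0131

        // Turkish rules: 'i' uppercases to dotted capital I (U+0130).
        System.out.println(codePoints("i".toUpperCase(TURKISH)));          // U+0130

        // Outside Turkish rules, U+0130 lowercases to 'i' plus a combining dot,
        // so the string grows from one code point to two.
        System.out.println(codePoints("\u0130".toLowerCase(Locale.ROOT))); // U+0069 U+0307
    }
}
```

The last case is the one that tends to surprise: the same input text can produce different terms, and even a different number of characters, depending on which case-mapping rules are applied.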