rmuir commented on issue #14235: URL: https://github.com/apache/lucene/issues/14235#issuecomment-2658111400
The last one was like this, too: https://github.com/apache/lucene/pull/14079. I think people often just have trouble counting, and that's why we keep seeing errors around the counts, even with logic that tries to be lenient when they are wrong.

See the comments in the test data of #14239 for my opinion on a different solution: I think we are too strict with the counts and could ignore them altogether. It is unclear to me that they are even needed to parse the file. Surely the hunspell C code isn't barfing on these, and that's why we have this struggle.

So one way to tone down the noise could be to change the parser to not use `for` loops driven by these counts, but instead e.g. a `while` loop, and ignore the counts entirely. It would be a bigger change but might save us hassles. Currently any PR to analysis/ puts us on the front lines of parsing the latest dictionaries :)
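To illustrate the idea (a minimal sketch, not Lucene's actual `Dictionary` code; the class and method names here are hypothetical): instead of trusting the declared rule count in an affix header like `SFX A Y 2` and looping exactly that many times, the parser could consume continuation lines while they still belong to the same flag's block, making a wrong count harmless.

```java
import java.util.ArrayList;
import java.util.List;

public class CountTolerantParser {
    // Hypothetical sketch: parse one SFX block starting at `start`,
    // ignoring the declared count in the header line.
    static List<String> parseSfxBlock(List<String> lines, int start) {
        String[] header = lines.get(start).trim().split("\\s+");
        String flag = header[1]; // e.g. "A"; header[3], the count, is never read

        List<String> rules = new ArrayList<>();
        int i = start + 1;
        // while-loop keyed on content, not on the (possibly wrong) count:
        // keep consuming lines as long as they continue this flag's block
        while (i < lines.size() && lines.get(i).startsWith("SFX " + flag + " ")) {
            rules.add(lines.get(i));
            i++;
        }
        return rules;
    }
}
```

With this approach, a dictionary whose header claims 2 rules but actually lists 3 parses all 3 without any warning or leniency logic.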