uschindler commented on issue #9231: URL: https://github.com/apache/lucene/issues/9231#issuecomment-1634114514
Hi, thanks @MartinDemberger, the PR looks good. You also added tests and a hyphenation XML file, although I have not looked closely into the internals of what you are actually doing. I think it should be fine to merge this into head, but I'd like to get another look from @rmuir, who was one of the committers working on that TokenFilter. If this also fixes the problems with my dictionary and the configuration presented in its repository (https://github.com/uschindler/german-decompounder), I am more than happy.

To be clear: apart from the reordering of tokens, the new features don't introduce any backwards compatibility issues? From what I understood, it only removes useless tokens, and the order of tokens at the same position does not matter. So somebody with an index that was created with the older version of that filter won't see any serious issues; some imprecise matches may just no longer be returned (because either the token is no longer in new documents of the index or the generated query no longer contains the token). So it would only return fewer matches, but no wrong matches. To me this looks fine.
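The compatibility argument above can be checked on a toy model. The sketch below is not Lucene code: the token sets are invented for illustration (they are not real output of the filter), and it assumes a query matches a document when the two share at least one token (OR semantics). Under those assumptions, if the new filter only emits a subset of the old tokens, any mix of old/new index and old/new query can only return a subset of the old matches.

```python
# Toy model only: these token sets are hypothetical, not real filter output.
# Each "document" contains a single word and is identified by that word.
OLD_TOKENS = {
    "fussballpumpe": {"fussballpumpe", "fuss", "fussball", "ball", "pumpe"},
    "ballkleid": {"ballkleid", "ball", "kleid"},
}
# Assumption: the new filter emits a subset of what the old filter emitted.
NEW_TOKENS = {
    "fussballpumpe": {"fussballpumpe", "fussball", "pumpe"},
    "ballkleid": {"ballkleid", "ball", "kleid"},
}


def search(index_tokens, query_tokens):
    """Return ids of documents sharing at least one token with the query."""
    return {doc for doc, tokens in index_tokens.items() if tokens & query_tokens}


def compare(word):
    """Search for `word` with every old/new combination of index and query."""
    baseline = search(OLD_TOKENS, OLD_TOKENS[word])
    mixed = {
        "old index, new query": search(OLD_TOKENS, NEW_TOKENS[word]),
        "new index, old query": search(NEW_TOKENS, OLD_TOKENS[word]),
        "new index, new query": search(NEW_TOKENS, NEW_TOKENS[word]),
    }
    return baseline, mixed


if __name__ == "__main__":
    baseline, mixed = compare("fussballpumpe")
    print("old index, old query:", sorted(baseline))
    for label, hits in mixed.items():
        # Every combination is a subset of the old result: fewer matches, no new ones.
        print(label + ":", sorted(hits), "subset of baseline:", hits <= baseline)
```

In this toy data the old query for "fussballpumpe" also hits "ballkleid" through the shared token "ball", which is the kind of imprecise match that disappears once the query no longer contains that token.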