rmuir commented on PR #14193: URL: https://github.com/apache/lucene/pull/14193#issuecomment-2636033017
a few more notes:

* Maybe we should deprecate `union(Automaton, Automaton)` and leave only `union(List<Automaton>)`. The former has proven trappy, so let's guide developers to do it the faster way. The deprecation warning itself can be useful here to explain why. Nothing outside of tests uses this signature anymore.
* With a way to create a character class node, we could, if we wanted, give it an additional boolean: `@param complement True if the character class should be negated`. It would not help performance, but it would let us clean up the API side. The problematic `complement()` is needed and awesome for negated character classes, but the general algorithm is not sustainable and requires determinization. It is a crazy trap: `complement()` on one of these simple character classes is the perfect algorithm: `totalize()` to one dead state, then nuke the previous accept state. I definitely wrote the code for just this flag and deleted it several times. The point would be to remove the general-purpose `complement()` trap elsewhere, which is not efficient at all in the general case.
* The PR needs more tests, but I already fought the existing toString-based ones to exhaustion. At least the tests work and confirm it does what we expect.
* Warning: there may be dead code I missed that we could remove. We should look at fixing the RegExp class so that it does not make everything package-private, because that makes it impossible for our tooling to detect dead code. If it used `private` correctly, tools would tell us when a function is unused by anything.
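To make the `complement` flag idea a bit more concrete, here is a rough standalone sketch. It is not Lucene code and not what the PR does: the class `CharClassSketch` and the method `effectiveRanges` are hypothetical names, and it assumes a character class is given as sorted, non-overlapping inclusive code point ranges. The point is only that negating a simple character class can be done on the ranges themselves, with no determinization involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class CharClassSketch {

  /**
   * Returns the ranges a character class node would match.
   *
   * @param ranges sorted, non-overlapping inclusive [start, end] code point pairs (assumption)
   * @param complement true if the character class should be negated
   */
  static int[][] effectiveRanges(int[][] ranges, boolean complement) {
    if (!complement) {
      return ranges;
    }
    List<int[]> out = new ArrayList<>();
    int next = 0; // lowest code point not yet known to be inside the original class
    for (int[] r : ranges) {
      if (r[0] > next) {
        // gap before this range belongs to the negated class
        out.add(new int[] {next, r[0] - 1});
      }
      next = Math.max(next, r[1] + 1);
    }
    if (next <= Character.MAX_CODE_POINT) {
      // everything after the last range up to U+10FFFF
      out.add(new int[] {next, Character.MAX_CODE_POINT});
    }
    return out.toArray(new int[0][]);
  }

  public static void main(String[] args) {
    // [^a-z] expressed as ranges over all code points
    int[][] negated = effectiveRanges(new int[][] {{'a', 'z'}}, true);
    System.out.println(Arrays.deepToString(negated));
  }
}
```

A node built this way would still match exactly one character, so the general-purpose `complement()` would not have to be called for this case at all.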
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org