Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-05 Thread via GitHub
mikemccand commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1745429485 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true;

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-05 Thread via GitHub
rmuir merged PR #13707: URL: https://github.com/apache/lucene/pull/13707 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apach

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2329472492 I updated the javadocs: "The automaton must be deterministic, or this method may return false." Previous implementation was: "The automaton must be minimal, or this method may return fa

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743879304 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true;

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
rmuir commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743877792 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true; }

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743876694 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true;

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2325299722 ok, the tests have found all the fun with empty cases and I think it is ready for review. for background, this problem is similar to `Operations.isEmpty()`: https://github.com/apach

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2325282540 heh that empty case got found in a different way by the randomized check. it isn't enough to do `a.getNumStates() > 0`, for the situation where all states are dead states :) w

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2325272520 iterating again, I changed the code here to @dweiss solution and added tests based on his examples. it runs in linear time and space (bitset), and should work generally for any DFA,

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2325097210 OK I took a third option here in the latest commit. Javadoc is unchanged, we just detect the kind of "total" automaton being created by RegExp, too, without involving any scary algor

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2325046919 I feel like with this code there are only two options: * keep `isTotal` O(1) , to prevent traps and problems. it does document that the input must be minimized. Maybe it is enough to fix

[PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-02 Thread via GitHub
rmuir opened a new pull request, #13707: URL: https://github.com/apache/lucene/pull/13707 Operations.isTotal currently returns `false` unless the DFA is minimal. This makes the method almost useless and we definitely don't want to encourage minimization just to make such a check. C