jpountz commented on issue #12957: URL: https://github.com/apache/lucene/issues/12957#issuecomment-1864407118
Oh I see, I created binary automata, but the API implicitly treats automata as UTF32 automata, so you need to tell it explicitly that it's a binary automaton. And something like that should fix the problem?

```java
diff --git a/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java b/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
index a555ce40001..f899b331b92 100644
--- a/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
+++ b/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
@@ -2318,7 +2318,7 @@ public final class CheckIndex implements Closeable {
         startTerm = new BytesRef();
         checkTermsIntersect(terms, automaton, startTerm);
 
-        automaton = Automata.makeAnyBinary();
+        automaton = Automata.makeNonEmptyBinary();
         startTerm = new BytesRef(new byte[] {'l'});
         checkTermsIntersect(terms, automaton, startTerm);
 
@@ -2369,8 +2369,8 @@ public final class CheckIndex implements Closeable {
       throws IOException {
     TermsEnum allTerms = terms.iterator();
     automaton = Operations.determinize(automaton, Operations.DEFAULT_DETERMINIZE_WORK_LIMIT);
-    CompiledAutomaton compiledAutomaton = new CompiledAutomaton(automaton);
-    ByteRunAutomaton runAutomaton = new ByteRunAutomaton(automaton);
+    CompiledAutomaton compiledAutomaton = new CompiledAutomaton(automaton, false, true, true);
+    ByteRunAutomaton runAutomaton = new ByteRunAutomaton(automaton, true);
     TermsEnum filteredTerms = terms.intersect(compiledAutomaton, startTerm);
     BytesRef term;
     if (startTerm != null) {
```

(I had to change the automaton so that it's still considered of type "normal" and not "all")

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
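For readers unfamiliar with the binary vs. UTF32 distinction mentioned in the comment, here is a rough, Lucene-free sketch of why the interpretation of transition labels matters. It is not Lucene's implementation: the class `LabelInterpretationDemo` and its methods are hypothetical names made up for illustration, and the "automaton" is reduced to a single transition accepting any label in `[0x00, 0xFF]`. Read as bytes, that transition matches the one-byte term `0xE9`; read as Unicode code points, it matches the UTF-8 encoding of U+00E9 (two bytes) instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical toy model, not Lucene code: one transition whose label range is [0x00, 0xFF].
public class LabelInterpretationDemo {
  static final int MIN_LABEL = 0x00;
  static final int MAX_LABEL = 0xFF;

  /** Binary reading: the label is matched against one raw byte of the term. */
  public static boolean acceptsAsBinary(byte[] term) {
    if (term.length != 1) {
      return false;
    }
    int label = term[0] & 0xFF; // treat the byte as unsigned
    return label >= MIN_LABEL && label <= MAX_LABEL;
  }

  /** Code-point reading: the label is one Unicode code point and the term is its UTF-8 encoding. */
  public static boolean acceptsAsCodePoints(byte[] term) {
    String decoded;
    try {
      decoded = StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(term))
          .toString();
    } catch (CharacterCodingException e) {
      return false; // the bytes are not valid UTF-8, so no code point sequence matches
    }
    if (decoded.codePointCount(0, decoded.length()) != 1) {
      return false;
    }
    int label = decoded.codePointAt(0);
    return label >= MIN_LABEL && label <= MAX_LABEL;
  }

  public static void main(String[] args) {
    byte[] rawByte = {(byte) 0xE9};                            // a single arbitrary byte
    byte[] utf8 = "\u00E9".getBytes(StandardCharsets.UTF_8);   // bytes C3 A9
    System.out.println("raw byte, binary reading:     " + acceptsAsBinary(rawByte));
    System.out.println("raw byte, code-point reading: " + acceptsAsCodePoints(rawByte));
    System.out.println("UTF-8 term, binary reading:     " + acceptsAsBinary(utf8));
    System.out.println("UTF-8 term, code-point reading: " + acceptsAsCodePoints(utf8));
  }
}
```

The two readings agree on ASCII terms such as `{'l'}` and diverge only for labels above `0x7F`, which is presumably why an automaton built with the binary factory methods has to be flagged as binary when it is compiled, as the extra `true` arguments in the diff do.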
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org