jpountz commented on issue #12957:
URL: https://github.com/apache/lucene/issues/12957#issuecomment-1864407118

   Oh I see: I created binary automata, but the API implicitly treats automata 
as UTF32 automata, so you need to tell it explicitly that it's a binary 
automaton. Something like this should fix the problem?
   
   ```diff
   diff --git a/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java b/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
   index a555ce40001..f899b331b92 100644
   --- a/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
   +++ b/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
   @@ -2318,7 +2318,7 @@ public final class CheckIndex implements Closeable {
            startTerm = new BytesRef();
            checkTermsIntersect(terms, automaton, startTerm);
    
   -        automaton = Automata.makeAnyBinary();
   +        automaton = Automata.makeNonEmptyBinary();
            startTerm = new BytesRef(new byte[] {'l'});
            checkTermsIntersect(terms, automaton, startTerm);
    
   @@ -2369,8 +2369,8 @@ public final class CheckIndex implements Closeable {
          throws IOException {
        TermsEnum allTerms = terms.iterator();
        automaton = Operations.determinize(automaton, Operations.DEFAULT_DETERMINIZE_WORK_LIMIT);
   -    CompiledAutomaton compiledAutomaton = new CompiledAutomaton(automaton);
   -    ByteRunAutomaton runAutomaton = new ByteRunAutomaton(automaton);
   +    CompiledAutomaton compiledAutomaton = new CompiledAutomaton(automaton, false, true, true);
   +    ByteRunAutomaton runAutomaton = new ByteRunAutomaton(automaton, true);
        TermsEnum filteredTerms = terms.intersect(compiledAutomaton, startTerm);
        BytesRef term;
        if (startTerm != null) {
   ```
   
   (I had to change the automaton from `makeAnyBinary()` to 
`makeNonEmptyBinary()` so that it's still considered of type "normal" and 
not "all".)

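   To illustrate why the binary flag matters, here is a minimal sketch in plain 
JDK Java (it does not use Lucene). It assumes that, when an automaton is not 
flagged as binary, its transition labels are read as Unicode code points and 
converted to UTF-8 bytes before being matched against terms. The class and 
method names are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class BinaryVsUtf32Labels {

  /** Bytes a single label would match if the label is read as a Unicode code point. */
  public static byte[] bytesWhenLabelIsCodePoint(int label) {
    return new String(Character.toChars(label)).getBytes(StandardCharsets.UTF_8);
  }

  /** Bytes a single label would match if the label is read as a raw byte value. */
  public static byte[] bytesWhenLabelIsByte(int label) {
    return new byte[] {(byte) label};
  }

  /** Formats bytes as space-separated upper-case hex. */
  public static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02X ", b & 0xFF));
    }
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    // 'l' is ASCII, so both readings agree; 0xE9 is not, so they diverge.
    int[] labels = {'l', 0xE9};
    for (int label : labels) {
      System.out.printf(
          "label 0x%02X: as code point -> [%s], as byte -> [%s]%n",
          label,
          hex(bytesWhenLabelIsCodePoint(label)),
          hex(bytesWhenLabelIsByte(label)));
    }
  }
}
```

   For ASCII labels such as `'l'` the two readings coincide, which is why the 
mismatch can go unnoticed. For a label such as 0xE9, the code point reading 
yields the two bytes C3 A9, so a term containing the single byte E9 would no 
longer be accepted.
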

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

