markharwood commented on a change in pull request #1541:
URL: https://github.com/apache/lucene-solr/pull/1541#discussion_r444284423



##########
File path: lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java
##########
@@ -743,6 +792,30 @@ private Automaton toAutomatonInternal(Map<String,Automaton> automata,
     }
     return a;
   }
+  private Automaton toCaseInsensitiveChar(int codepoint, int maxDeterminizedStates) {
+    Automaton case1 = Automata.makeChar(codepoint);
+    int altCase = Character.isLowerCase(codepoint) ? Character.toUpperCase(codepoint) : Character.toLowerCase(codepoint);
+    Automaton result;
+    if (altCase != codepoint) {
+      result = Operations.union(case1, Automata.makeChar(altCase));
+      result = MinimizationOperations.minimize(result, maxDeterminizedStates);
+    } else {
+      result = case1;
+    }
+    return result;
+  }

Review comment:
   An alternative would be an overhaul of RegExp:
   * introducing a Builder class for the parser, with named properties for settings
   * separating the RegExp parser logic from the parsed objects (currently they are the same class)
   * separating the rendering functions (toString, toAutomaton, toStringTree) from the parsed objects

   I'm not sure we're at the tipping point where all of that would make sense.
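
   To make the three points above concrete, here is a rough sketch of what such a split might look like. Every name in it (`RegExpSketch`, `ParserBuilder`, `Parser`, `Node`, `StringRenderer`) is hypothetical and not existing Lucene API. The parser only understands literal characters, and the renderer emits a string rather than an Automaton, so this shows the shape of the design rather than a proposed implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Lucene.
final class RegExpSketch {

  /** Parsed objects: structure only, no parsing or rendering logic. */
  interface Node {}

  static final class CharNode implements Node {
    final int codepoint;
    CharNode(int codepoint) { this.codepoint = codepoint; }
  }

  static final class ConcatNode implements Node {
    final List<Node> children;
    ConcatNode(List<Node> children) {
      this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }
  }

  /** Parser settings as named properties rather than extra method arguments. */
  static final class ParserBuilder {
    private boolean caseInsensitive = false;
    private int maxDeterminizedStates = 10000; // arbitrary default for this sketch

    ParserBuilder caseInsensitive(boolean value) {
      this.caseInsensitive = value;
      return this;
    }

    ParserBuilder maxDeterminizedStates(int value) {
      this.maxDeterminizedStates = value;
      return this;
    }

    Parser build() {
      return new Parser(caseInsensitive, maxDeterminizedStates);
    }
  }

  /** Parser logic, separate from the nodes it produces. */
  static final class Parser {
    final boolean caseInsensitive;
    final int maxDeterminizedStates;

    Parser(boolean caseInsensitive, int maxDeterminizedStates) {
      this.caseInsensitive = caseInsensitive;
      this.maxDeterminizedStates = maxDeterminizedStates;
    }

    /** Toy grammar: every codepoint in the pattern is treated as a literal. */
    Node parse(String pattern) {
      List<Node> children = new ArrayList<>();
      pattern.codePoints().forEach(cp -> children.add(new CharNode(cp)));
      return new ConcatNode(children);
    }
  }

  /** One rendering function, kept outside the parsed objects. */
  static final class StringRenderer {
    static String render(Node node, boolean caseInsensitive) {
      StringBuilder sb = new StringBuilder();
      append(node, caseInsensitive, sb);
      return sb.toString();
    }

    private static void append(Node node, boolean caseInsensitive, StringBuilder sb) {
      if (node instanceof ConcatNode) {
        for (Node child : ((ConcatNode) node).children) {
          append(child, caseInsensitive, sb);
        }
      } else if (node instanceof CharNode) {
        int cp = ((CharNode) node).codepoint;
        // Same alternate-case rule as toCaseInsensitiveChar in the diff above.
        int alt = Character.isLowerCase(cp) ? Character.toUpperCase(cp) : Character.toLowerCase(cp);
        if (caseInsensitive && alt != cp) {
          sb.append('[').appendCodePoint(cp).appendCodePoint(alt).append(']');
        } else {
          sb.appendCodePoint(cp);
        }
      }
    }
  }
}
```

   With this layout a caller would write something like `new RegExpSketch.ParserBuilder().caseInsensitive(true).build().parse("a1")` and hand the resulting node to whichever renderer it needs, so adding a new setting or a new output form would not touch the node classes.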




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



