rmuir commented on a change in pull request #1281: LUCENE-9245: Optimize 
AutomatonTermsEnum memory and automaton Operations.getCommonPrefixBytesRef.
URL: https://github.com/apache/lucene-solr/pull/1281#discussion_r383289307
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/index/AutomatonTermsEnum.java
 ##########
 @@ -54,18 +56,20 @@
   private final boolean finite;
   // array of sorted transitions for each state, indexed by state number
   private final Automaton automaton;
-  // for path tracking: each long records gen when we last
+  // for path tracking: each short records gen when we last
   // visited the state; we use gens to avoid having to clear
-  private final long[] visited;
 
 Review comment:
   Visited-state tracking is only needed when the automaton accepts an infinite 
language: we use it for loop detection. I think before we get too fancy with 
how we clear it, we should first stop being stupid about it?
   
   It is wasteful to do this when `finite == true` (example: fuzzy query), 
because we will never even look for a loop. It's just that the current code 
unconditionally records the states it visited.
   
   I think first, in the ctor when `finite == true`, `visited[]` can be 
initialized to `null` or `new long[0]` or something, and we change this line:
   ```
   visited[state] = curGen;
   ```
   to something like this:
   ```
   if (!finite)
     visited[state] = curGen;
   ```
   
   I agree we should separately avoid tracking 64 bits per state when only 1 is 
needed. But before optimizing the storage, let's first avoid doing this work at 
all for cases like complex fuzzy queries?
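   
   A minimal, self-contained sketch of that idea, outside of the real class. `VisitedTracker` and its method names are hypothetical and do not exist in Lucene; only `finite`, `visited`, and `curGen` mirror the fields discussed above. It assumes the generation counter is advanced before each new path, and that starting it at 1 is acceptable so a zeroed array never reads as "visited":
   ```java
   /** Hypothetical helper (not a Lucene class): records visited states only when loops are possible. */
   final class VisitedTracker {
     private final boolean finite;
     // null when the automaton is finite: no loop detection, so no per-state storage
     private final long[] visited;
     // starts at 1 so the zero-initialized array never matches the current generation
     private long curGen = 1;

     VisitedTracker(int numStates, boolean finite) {
       this.finite = finite;
       this.visited = finite ? null : new long[numStates];
     }

     /** Begins a new path; earlier marks become stale without clearing the array. */
     void nextGen() {
       curGen++;
     }

     /** Records the state for loop detection; a no-op for finite automata. */
     void markVisited(int state) {
       if (!finite) {
         visited[state] = curGen;
       }
     }

     /** Returns true only if the state was marked in the current generation. */
     boolean isVisited(int state) {
       return !finite && visited[state] == curGen;
     }
   }
   ```
   With this shape the finite case allocates nothing per state and `markVisited` reduces to a single branch, so shrinking the storage for the infinite case can be handled as a separate change.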

