rmuir commented on a change in pull request #1281: LUCENE-9245: Optimize
AutomatonTermsEnum memory and automaton Operations.getCommonPrefixBytesRef.
URL: https://github.com/apache/lucene-solr/pull/1281#discussion_r383289307
##########
File path: lucene/core/src/java/org/apache/lucene/index/AutomatonTermsEnum.java
##########
@@ -54,18 +56,20 @@
private final boolean finite;
// array of sorted transitions for each state, indexed by state number
private final Automaton automaton;
- // for path tracking: each long records gen when we last
+ // for path tracking: each short records gen when we last
// visited the state; we use gens to avoid having to clear
- private final long[] visited;
Review comment:
Visited-state tracking is only needed when the automaton accepts an infinite
language. We use it for loop detection. I think before we get too fancy with
how we clear it, we should first stop being stupid about it?
So it is wasteful that we do this bookkeeping when `finite == true` (example:
fuzzy query), because we will never even look for a loop. It's just that the
current code unconditionally records every state it visits.
I think first, in the ctor when `finite == true`, `visited[]` can be
initialized to `null` or `new long[0]` or something, and we change this line:
```
visited[state] = curGen;
```
to something like this:
```
if (!finite) {
  visited[state] = curGen;
}
```
I agree we should separately avoid tracking 64 bits per state when only 1 bit is
needed. But before optimizing the storage, let's first avoid doing this work at
all for cases like complex fuzzy queries?
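To make the suggestion concrete, here is a minimal standalone sketch of the idea. It is not the actual `AutomatonTermsEnum` code: the class `VisitedTrackingSketch` and its method names are hypothetical, and only `finite`, `visited`, and `curGen` are taken from the diff above. It assumes that a finite automaton never consults `visited[]`, so the array can be left unallocated.

```java
/**
 * Hypothetical stand-in for the visited-state bookkeeping discussed above.
 * This is an illustration only, not Lucene's AutomatonTermsEnum.
 */
final class VisitedTrackingSketch {
    private final boolean finite;
    // Left null for finite automata: loop detection never runs, so nothing reads it.
    private final long[] visited;
    // Starts at 1 so that zero-initialized entries mean "never visited".
    private long curGen = 1;

    VisitedTrackingSketch(int numStates, boolean finite) {
        this.finite = finite;
        this.visited = finite ? null : new long[numStates];
    }

    /** Starts a new traversal; bumping the generation avoids clearing the array. */
    void nextGeneration() {
        curGen++;
    }

    /** Records a visit only when loop detection can actually use it. */
    void markVisited(int state) {
        if (!finite) {
            visited[state] = curGen;
        }
    }

    /** True if the state was recorded in the current generation; always false when finite. */
    boolean seenInCurrentGeneration(int state) {
        return !finite && visited[state] == curGen;
    }

    /** Number of states that have tracking storage allocated. */
    int trackedStates() {
        return visited == null ? 0 : visited.length;
    }
}
```

With this shape, the finite case pays neither the per-state memory nor the per-visit write, and the question of how many bits to store per state only matters for the infinite case.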