[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552909#comment-17552909 ]
Tomoko Uchida commented on LUCENE-10610:
----------------------------------------
I may be missing the point entirely, so correct me if I'm wrong, but would it
make sense to hide all methods that change the internal state of
{{Automaton}} from outside the package, making it effectively immutable to
external callers? {{Automaton.Builder}} appears to have all the methods needed
to build an Automaton instance, so I wonder whether exposing only the builder
class would be sufficient. It feels a bit awkward that a built Automaton
instance can still be modified after the builder has "finished" it.
If we make it immutable, we could safely attach a (pre-computed or lazily
computed) hash value to it and, in return, remove {{hashCode()}} and
{{equals()}} from the {{CompiledAutomaton}} and {{RunAutomaton}} classes.
Rather than keeping them in the classes that run the actual matching
operations, I guess it would be more natural to have them in {{Automaton}}, as
with the {{Query}} class, since it is the prototype and the highest-level
interface?
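A minimal sketch of the idea, assuming a hypothetical {{TinyAutomaton}} class that is only a stand-in and not Lucene's actual {{Automaton}} API: the instance can be created only through its builder, its state is copied at {{finish()}}, and the hash is computed once. Note that this compares transitions structurally; it does not test whether two automatons accept the same language.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration only; this is NOT Lucene's Automaton.
final class TinyAutomaton {
  // Flattened (source, dest, min, max) tuples; never modified after construction.
  private final int[] transitions;
  // Safe to cache because the instance cannot change once built.
  private final int hash;

  // Private constructor: instances can only come from the Builder.
  private TinyAutomaton(int[] transitions) {
    this.transitions = transitions;
    this.hash = Arrays.hashCode(transitions);
  }

  @Override
  public int hashCode() {
    return hash;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof TinyAutomaton)) return false;
    TinyAutomaton other = (TinyAutomaton) o;
    // Cheap hash comparison first, then the full structural comparison.
    return hash == other.hash && Arrays.equals(transitions, other.transitions);
  }

  static final class Builder {
    private int[] buf = new int[8];
    private int len = 0;

    Builder addTransition(int source, int dest, int min, int max) {
      if (len + 4 > buf.length) {
        buf = Arrays.copyOf(buf, buf.length * 2);
      }
      buf[len++] = source;
      buf[len++] = dest;
      buf[len++] = min;
      buf[len++] = max;
      return this;
    }

    // Copies the buffer so later builder calls cannot alter the built instance.
    TinyAutomaton finish() {
      return new TinyAutomaton(Arrays.copyOf(buf, len));
    }
  }
}
```

With this shape, callers outside the class have no way to mutate a finished instance, so the cached hash cannot go stale.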
> RunAutomaton#hashCode() can easily cause hash collision for different
> Automatons
> --------------------------------------------------------------------------------
>
> Key: LUCENE-10610
> URL: https://issues.apache.org/jira/browse/LUCENE-10610
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Tomoko Uchida
> Priority: Minor
>
> Current RunAutomaton#hashCode() is:
> {code:java}
> @Override
> public int hashCode() {
>   final int prime = 31;
>   int result = 1;
>   result = prime * result + alphabetSize;
>   result = prime * result + points.length;
>   result = prime * result + size;
>   return result;
> }
> {code}
> Since it does not take the contents of the {{points}} array into account, it
> returns the same value for different automatons whenever their alphabet size,
> number of points, and number of states are the same.
> For example, this test code passes.
> {code:java}
> public void testHashCode() throws IOException {
>   PrefixQuery q1 = new PrefixQuery(new Term("field", "aba"));
>   PrefixQuery q2 = new PrefixQuery(new Term("field", "fee"));
>   assert q1.compiled.runAutomaton.hashCode() ==
>       q2.compiled.runAutomaton.hashCode();
> }
> {code}
> I suspect this is a bug?
> Note that I don't think it is a serious one: all callers of this
> {{hashCode()}} take additional information into account when calculating
> their own hash values, so there seems to be no substantial impact on
> higher-level APIs.
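The collision described above can be reproduced in isolation. The sketch below uses a hypothetical {{PointsHashSketch}} class that mirrors only the three fields the quoted method reads; the field values in any usage are made up for illustration. {{contentAwareHash()}} shows one possible alternative that mixes in the array contents via {{Arrays.hashCode}}; it is not necessarily how the issue should be resolved in Lucene.

```java
import java.util.Arrays;

// Hypothetical stand-in mirroring only the fields used by the quoted hashCode().
final class PointsHashSketch {
  final int alphabetSize;
  final int size;
  final int[] points;

  PointsHashSketch(int alphabetSize, int size, int[] points) {
    this.alphabetSize = alphabetSize;
    this.size = size;
    this.points = points.clone(); // defensive copy
  }

  // Same arithmetic as the quoted method: only the array length is hashed.
  int lengthOnlyHash() {
    final int prime = 31;
    int result = 1;
    result = prime * result + alphabetSize;
    result = prime * result + points.length;
    result = prime * result + size;
    return result;
  }

  // One possible alternative (an assumption): hash the array contents as well.
  int contentAwareHash() {
    final int prime = 31;
    int result = 1;
    result = prime * result + alphabetSize;
    result = prime * result + Arrays.hashCode(points);
    result = prime * result + size;
    return result;
  }
}
```

Two instances with equal sizes but different {{points}} contents, for example {{new PointsHashSketch(3, 4, new int[] {0, 'a', 'b'})}} and {{new PointsHashSketch(3, 4, new int[] {0, 'f', 'g'})}}, get the same {{lengthOnlyHash()}} but different {{contentAwareHash()}} values.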
--
This message was sent by Atlassian Jira
(v8.20.7#820007)