[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553142#comment-17553142 ]
Tomoko Uchida edited comment on LUCENE-10610 at 6/11/22 7:13 PM:
-----------------------------------------------------------------

I've been thinking about these three options, for what it's worth.

1. Do nothing
 * it's not a bug, or at least it's harmless
2. Fix RunAutomaton#hashCode()
 * we can quickly fix it to take all elements of the array into account
3. Rewrite AutomatonQuery and its sub-classes not to use RunAutomaton#hashCode()
 * this could help avoid future confusion for other newcomers (someone like me)


was (Author: tomoko uchida):
I've been thinking about those three options, for what it's worth.

1. Do nothing
 * it's not a bug, at least it's harmless
2. Fix RunAutomaton#hashCode()
 * we can quickly fix to take account of all elements of the array
3. Fix AutomatonQuery and its sub-classes to not to use RunAutomaton#hashCode()
 * this could help to avoid future confusion for other newcomers (someone like me)

> RunAutomaton#hashCode() can easily cause hash collision for different Automatons
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-10610
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10610
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Tomoko Uchida
>            Priority: Minor
>
> Current RunAutomaton#hashCode() is:
> {code:java}
>   @Override
>   public int hashCode() {
>     final int prime = 31;
>     int result = 1;
>     result = prime * result + alphabetSize;
>     result = prime * result + points.length;
>     result = prime * result + size;
>     return result;
>   }
> {code}
> Since it does not take account of the contents of the {{points}} array, this returns the same value for different automatons when their alphabet size and state size are the same.
> For example, this test code passes.
> {code:java}
>   public void testHashCode() throws IOException {
>     PrefixQuery q1 = new PrefixQuery(new Term("field", "aba"));
>     PrefixQuery q2 = new PrefixQuery(new Term("field", "fee"));
>     assert q1.compiled.runAutomaton.hashCode() == q2.compiled.runAutomaton.hashCode();
>   }
> {code}
> I suspect this is a bug?
> Note that I think it's not a serious one; all callers of this {{hashCode()}} take account of additional information when calculating their own hash value, so it seems there is no substantial impact on higher-level APIs.
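
To make option 2 concrete, here is a minimal, self-contained sketch of the collision and of one possible fix. It does not use Lucene at all: {{HashCollisionSketch}} and its nested {{Sketch}} class are hypothetical stand-ins that carry only the three fields the current hash reads, and the {{points}} values are made up for illustration. {{contentHash()}} is just one way the array contents could be mixed in (via {{Arrays.hashCode}}); it is not necessarily what a real patch would look like.

```java
import java.util.Arrays;

public class HashCollisionSketch {

    // Hypothetical stand-in for RunAutomaton: only the fields the hash uses.
    static final class Sketch {
        final int alphabetSize;
        final int size;
        final int[] points;

        Sketch(int alphabetSize, int size, int[] points) {
            this.alphabetSize = alphabetSize;
            this.size = size;
            this.points = points;
        }

        // Same arithmetic as the hashCode() quoted in the issue:
        // only the length of points is used, never its contents.
        int currentHash() {
            final int prime = 31;
            int result = 1;
            result = prime * result + alphabetSize;
            result = prime * result + points.length;
            result = prime * result + size;
            return result;
        }

        // Assumed variant for option 2: mix in the array contents.
        int contentHash() {
            final int prime = 31;
            int result = 1;
            result = prime * result + alphabetSize;
            result = prime * result + Arrays.hashCode(points);
            result = prime * result + size;
            return result;
        }
    }

    public static void main(String[] args) {
        // Same alphabet size, same state count, same number of points,
        // but different point values (illustrative numbers only).
        Sketch a = new Sketch(256, 4, new int[] {0, 97, 98, 99});
        Sketch b = new Sketch(256, 4, new int[] {0, 101, 102, 103});

        System.out.println("current hashes equal: " + (a.currentHash() == b.currentHash()));
        System.out.println("content hashes equal: " + (a.contentHash() == b.contentHash()));
    }
}
```

With these inputs the two objects share a value under the current formula and get different values once the contents of {{points}} are included. A real fix would presumably also need to keep {{hashCode()}} consistent with whatever {{equals()}} compares.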