donnerpeter commented on pull request #2457: URL: https://github.com/apache/lucene-solr/pull/2457#issuecomment-791466479
Thanks for the suggestion, but it works on code points as well, which I'd prefer to leave out for now.

On Fri, 5 Mar 2021 at 15:35, Robert Muir <notificati...@github.com> wrote:

> *@rmuir* commented on this pull request.
> ------------------------------
>
> In lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/TrigramAutomaton.java
> <https://github.com/apache/lucene-solr/pull/2457#discussion_r588339682>:
>
> > +    Automaton.Builder builder = new Automaton.Builder(s1.length() * N, s1.length() * N);
> > +    int initialState = builder.createState();
> > +
> > +    for (int start = 0; start < s1.length(); start++) {
> > +      int limit = Math.min(s1.length(), start + N);
> > +      for (int end = start + 1; end <= limit; end++) {
> > +        substringCounts.merge(s1.substring(start, end), 1, Integer::sum);
> > +      }
> > +
> > +      int state = initialState;
> > +      for (int i = start; i < limit; i++) {
> > +        int next = builder.createState();
> > +        builder.addTransition(state, next, s1.charAt(i));
> > +        state = next;
> > +      }
> > +    }
>
> fyi this automaton seems to be just a set of strings, i'm not sure if
> DaciukMihovAutomatonBuilder is helpful here, because i don't know the
> threshold where it starts to matter. but it is an alternative to building
> NFA and then determinizing it
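For readers following the thread, here is a minimal, standalone sketch of the substring-counting loop from the quoted diff, using only the JDK. It is an illustration, not the code in the PR: the class name `SubstringCountSketch` and the method `countByChars` are hypothetical, and the automaton-building half of the loop is left out. It also shows why the char vs. code point distinction raised in the reply matters: `String.length()` and `charAt` operate on UTF-16 code units, so a supplementary character takes up two positions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Lucene or the PR.
final class SubstringCountSketch {

    // Counts every substring of length 1..n (measured in UTF-16 chars),
    // mirroring the nested start/end loop in the quoted diff.
    static Map<String, Integer> countByChars(String s, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int start = 0; start < s.length(); start++) {
            int limit = Math.min(s.length(), start + n);
            for (int end = start + 1; end <= limit; end++) {
                counts.merge(s.substring(start, end), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countByChars("abab", 3);
        System.out.println(counts.get("ab"));  // 2
        System.out.println(counts.get("aba")); // 1

        // 'a' followed by U+1F600, which is a surrogate pair in UTF-16.
        String s = "a\uD83D\uDE00";
        System.out.println(s.length());                      // 3 chars
        System.out.println(s.codePointCount(0, s.length())); // 2 code points
    }
}
```

A char-based loop like this one will therefore also count lone surrogate halves as substrings, whereas a code-point-based builder would treat the pair as a single symbol.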