jpountz commented on code in PR #12249: URL: https://github.com/apache/lucene/pull/12249#discussion_r1218211772
########## lucene/core/src/test/org/apache/lucene/util/graph/TestGraphTokenStreamFiniteStrings.java: ########## @@ -660,4 +663,27 @@ public void testMultipleSidePathsWithGaps() throws Exception { it.next(), new String[] {"king", "alfred", "saxons", "ruled"}, new int[] {1, 1, 3, 1}); assertFalse(it.hasNext()); } + + public void testLongTokenStreamStackOverflowError() throws Exception { + + ArrayList<Token> tokens = + new ArrayList<Token>() { + { + add(token("fast", 1, 1)); + add(token("wi", 1, 1)); + add(token("wifi", 0, 2)); + add(token("fi", 1, 1)); + } + }; + + // Add in too many tokens to get a high depth graph + for (int i = 0; i < 1024 * 10; i++) { + tokens.add(token("network", 1, 1)); + } + + TokenStream ts = new CannedTokenStream(tokens.toArray(new Token[0])); + GraphTokenStreamFiniteStrings graph = new GraphTokenStreamFiniteStrings(ts); + + assertThrows(IllegalArgumentException.class, () -> graph.articulationPoints()); Review Comment: nit: we prefer method refs to lambdas whenever possible ```suggestion assertThrows(IllegalArgumentException.class, graph::articulationPoints); ``` ########## lucene/core/src/test/org/apache/lucene/util/graph/TestGraphTokenStreamFiniteStrings.java: ########## @@ -660,4 +661,27 @@ public void testMultipleSidePathsWithGaps() throws Exception { it.next(), new String[] {"king", "alfred", "saxons", "ruled"}, new int[] {1, 1, 3, 1}); assertFalse(it.hasNext()); } + + public void testLongTokenStreamStackOverflowError() throws Exception { + ArrayList<Token> tokens = + new ArrayList<Token>() { + { + add(token("turbo", 1, 1)); + add(token("fast", 0, 2)); + add(token("charged", 1, 1)); + add(token("wi", 1, 1)); + add(token("wifi", 0, 2)); + add(token("fi", 1, 1)); + } + }; + + // Add in too many tokens to get a high depth graph + for (int i = 0; i < 1024 * 10; i++) { Review Comment: @cfournie I think that Erik's point is that your test would not catch an issue where the exception only gets thrown with a depth of 4000 or more, so changing the number of iterations to 1024+1 would help make sure that the exception gets thrown as early as we expect. I would prefer changing the number of iterations to 1024+1 as well. ########## lucene/CHANGES.txt: ########## @@ -80,6 +80,8 @@ Bug Fixes * GITHUB#12220: Hunspell: disallow hidden title-case entries from compound middle/end +* LUCENE-10181: Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth. (Chris Fournier) Review Comment: Can you move it to the 9.7 section instead of 10.0? This feels like a change that could go in a minor. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org