[GitHub] [lucene] uschindler commented on a diff in pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

via GitHub Wed, 19 Jul 2023 03:48:26 -0700


uschindler commented on code in PR #12437:
URL: https://github.com/apache/lucene/pull/12437#discussion_r1267297167



##########
lucene/analysis/common/src/test/org/apache/lucene/analysis/compound/TestHyphenationCompoundWordTokenFilterFactory.java:
##########
@@ -47,6 +47,33 @@ public void testHyphenationWithDictionary() throws Exception {
         new int[] {1, 1, 1, 1, 1, 1, 1, 1, 0, 0});
   }
 
+  /**
+   * just tests that the two no configuration options are correctly processed tests for the
+   * functionality are part of {@link TestCompoundWordTokenFilter}
+   */
+  public void testLucene8183() throws Exception {
+    Reader reader = new StringReader("basketballkurv");
+    TokenStream stream = new MockTokenizer(MockTokenizer.WHITESPACE, false);
+    ((Tokenizer) stream).setReader(reader);
+    stream =
+        tokenFilterFactory(
+                "HyphenationCompoundWord",
+                "hyphenator",
+                "da_UTF8.xml",

Review Comment:
   The problem with the German dictionary is the license. So we should not add 
the originals from FOP. If the Danish one works for the test all is fine (I 
think it is already only a subset of Danish).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


