janhoy commented on a change in pull request #84:
URL: https://github.com/apache/lucene/pull/84#discussion_r613595880



##########
File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.java
##########
@@ -33,14 +34,45 @@
  * <p>blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej but not blabarsyltetoj räksmörgås ==
  * ræksmørgås == ræksmörgaos == raeksmoergaas but not raksmorgas
  *
+ * <p>You can choose which of the foldings to apply (aa, ao, ae, oe, oo) through a parameter.
+ *
  * @see ScandinavianFoldingFilter
  */
 public final class ScandinavianNormalizationFilter extends TokenFilter {
 
+  /**
+   * Create the filter with default folding rules, backward compatible with all earlier versions
+   *
+   * @param input the TokenStream
+   */
   public ScandinavianNormalizationFilter(TokenStream input) {
     super(input);
+    this.foldings = ALL_FOLDINGS;
   }
 
+  /**
+   * Create the filter using custom folding rules.
+   *
+   * @param input the TokenStream
+   * @param foldings a Set of Foldings to apply (i.e. AE, OE, AA, AO, OO)
+   */
+  public ScandinavianNormalizationFilter(TokenStream input, Set<Foldings> foldings) {

Review comment:
       We can get similar usability in the Lucene API by adding helper constants:
   ```java
   public static final Set<Foldings> ALL_FOLDINGS = Set.of(AA, AO, OO, AE, OE);
   public static final Set<Foldings> NORWEGIAN_FOLDINGS = Set.of(AE, OE, AA);
   public static final Set<Foldings> DANISH_FOLDINGS = NORWEGIAN_FOLDINGS;
   public static final Set<Foldings> SWEDISH_FOLDINGS = ALL_FOLDINGS;
   ```
   In the factory, that could translate to a "language" parameter with predefined settings.
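
A minimal sketch of how such a "language" parameter might resolve to one of the predefined sets. Everything here is an assumption for illustration: the `FoldingPresets` class, the `forLanguage` method, and the `no`/`da`/`sv` language codes are hypothetical and not part of the pull request, and the `Foldings` enum is a stand-in for the one the PR defines. The per-language defaults are simply the ones proposed above and are still open for discussion.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Stand-in for the Foldings enum introduced by the pull request (assumed shape).
enum Foldings { AA, AO, AE, OE, OO }

// Hypothetical helper, not part of Lucene: maps a language code to a preset.
final class FoldingPresets {
  static final Set<Foldings> ALL_FOLDINGS =
      Set.of(Foldings.AA, Foldings.AO, Foldings.OO, Foldings.AE, Foldings.OE);
  static final Set<Foldings> NORWEGIAN_FOLDINGS =
      Set.of(Foldings.AE, Foldings.OE, Foldings.AA);
  static final Set<Foldings> DANISH_FOLDINGS = NORWEGIAN_FOLDINGS;
  static final Set<Foldings> SWEDISH_FOLDINGS = ALL_FOLDINGS;

  // Assumed language codes; a real factory could choose different keys.
  private static final Map<String, Set<Foldings>> BY_LANGUAGE = Map.of(
      "no", NORWEGIAN_FOLDINGS,
      "da", DANISH_FOLDINGS,
      "sv", SWEDISH_FOLDINGS);

  private FoldingPresets() {}

  // Returns the preset for a language code, or all foldings when none is given,
  // which keeps the backward-compatible default.
  static Set<Foldings> forLanguage(String language) {
    if (language == null) {
      return ALL_FOLDINGS;
    }
    Set<Foldings> preset = BY_LANGUAGE.get(language.toLowerCase(Locale.ROOT));
    if (preset == null) {
      throw new IllegalArgumentException("Unknown language: " + language);
    }
    return preset;
  }
}
```

A factory could then pass `FoldingPresets.forLanguage(lang)` as the second argument of the two-argument constructor, and fall back to the single-argument constructor when no language is configured.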
   
   I'm not opposed to thin wrapper filters for each language, but I'd like some feedback from other Scandinavian users on what those should default to.



