[ 
https://issues.apache.org/jira/browse/LUCENE-9929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-9929:
--------------------------------
    Description: 
The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and oo. 
However, not all five make sense for each of Norwegian, Swedish and Danish. 
Implement a separate Norwegian variant, based on the Scandinavian filter:
{code:java}
<filter class="solr.NorwegianNormalizationFilterFactory"/>
{code}
This would have the same rules as ScandinavianNormalizationFilter except it 
would not fold oo->ø and ao->å.
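
To make the difference between the two rule sets concrete, here is a minimal sketch in plain Java. It is not the Lucene implementation: the class FoldingSketch, the Folding enum and the fold method are hypothetical names used only for illustration, and the sketch covers only lowercase input and the five digraphs named above. It assumes a simple left-to-right scan that replaces an enabled digraph with its single-letter target.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative only: hypothetical names, not part of Lucene or Solr.
public class FoldingSketch {

    /** The five digraph foldings named in the issue description. */
    enum Folding {
        AA("aa", '\u00e5'), // aa -> a with ring above
        AO("ao", '\u00e5'), // ao -> a with ring above
        AE("ae", '\u00e6'), // ae -> ae ligature
        OE("oe", '\u00f8'), // oe -> o with stroke
        OO("oo", '\u00f8'); // oo -> o with stroke

        final String digraph;
        final char target;

        Folding(String digraph, char target) {
            this.digraph = digraph;
            this.target = target;
        }
    }

    /** All five foldings, as the Scandinavian filter is described above. */
    static final Set<Folding> SCANDINAVIAN = EnumSet.allOf(Folding.class);

    /** The proposed Norwegian subset: everything except oo and ao. */
    static final Set<Folding> NORWEGIAN =
            EnumSet.of(Folding.AA, Folding.AE, Folding.OE);

    /** Scans left to right and replaces each enabled digraph with its target. */
    static String fold(String token, Set<Folding> enabled) {
        StringBuilder out = new StringBuilder(token.length());
        int i = 0;
        while (i < token.length()) {
            Folding match = null;
            for (Folding f : enabled) {
                if (token.startsWith(f.digraph, i)) {
                    match = f;
                    break;
                }
            }
            if (match != null) {
                out.append(match.target);
                i += 2; // consume both characters of the digraph
            } else {
                out.append(token.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("blaabaer", NORWEGIAN));
        System.out.println(fold("kaos", SCANDINAVIAN));
        System.out.println(fold("kaos", NORWEGIAN));
    }
}
```

Under these assumptions, "blaabaer" folds to "blåbær" with either rule set, while "kaos" becomes "kås" with all five foldings enabled and stays "kaos" with the Norwegian subset.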

  was:
The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and oo. 
However, not all five make sense for each of Norwegian, Swedish and Danish. 
Implement an optional configuration option where users can select which of them 
to apply. I.e. for Norwegian, a user would then configure (in Solr):
{code:java}
<filter class="solr.ScandinavianNormalizationFilterFactory" foldings="ae,oe,aa"/>
{code}
This would activate foldings for ae->æ, oe->ø, aa->å, but not oo->ø and ao->å.

The default will be to activate all five as before, so it will be backward 
compatible.


> NorwegianNormalizationFilter
> ----------------------------
>
>                 Key: LUCENE-9929
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9929
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and 
> oo. However, not all five make sense for each of Norwegian, Swedish and 
> Danish. Implement a separate Norwegian variant, based on the Scandinavian 
> filter:
> {code:java}
> <filter class="solr.NorwegianNormalizationFilterFactory"/>
> {code}
> This would have the same rules as ScandinavianNormalizationFilter except it 
> would not fold oo->ø and ao->å.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
