[I] Question for nori analyzer behavior change [lucene]

via GitHub Fri, 09 May 2025 20:31:46 -0700


Kwanghyuk-Kim opened a new issue, #14637:
URL: https://github.com/apache/lucene/issues/14637


   ### Description
   
   Hello Maintainers of the Nori Analyzer,
   
   Recently, we upgraded Elasticsearch from v6.8 to v8.1 and installed the 
Nori analyzer plugin that corresponds to Elasticsearch v8.1.
   
   Through some tests, we noticed that the tokenization behavior of the 
Nori analyzer had changed.
   
   For example, with the Nori analyzer of Elasticsearch v6.8, the input A7B 
is kept as a single token, A7B.
   With Elasticsearch v8.1, however, the same input is split into three 
tokens: A, 7, B.
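
   A minimal sketch of how the difference could be reproduced on each cluster, assuming the `_analyze` API is called with the `nori_tokenizer` from the analysis-nori plugin. The node address `http://localhost:9200` and the helper names `build_analyze_request` and `token_texts` are hypothetical and only for illustration.

```python
import json
from urllib import request


def build_analyze_request(base_url, text, tokenizer="nori_tokenizer"):
    """Build (but do not send) a POST request for the _analyze API."""
    body = json.dumps({"tokenizer": tokenizer, "text": text}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def token_texts(response_json):
    """Extract the token strings from a parsed _analyze response."""
    return [t["token"] for t in response_json.get("tokens", [])]


# Usage against a running node (hypothetical address, not executed here):
# with request.urlopen(build_analyze_request("http://localhost:9200", "A7B")) as resp:
#     print(token_texts(json.load(resp)))
```

   Running the same request against the v6.8 and v8.1 clusters and comparing the two token lists shows the change described above.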
   
   So, I would like to ask the following questions.
   
   1) What is the reason or background for this change?
       What are its benefits?
   
   2) Is there a way to configure tokenization so that it is the same as, 
or similar to, that of the Nori analyzer in Elasticsearch v6.8?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

