[ 
https://issues.apache.org/jira/browse/LUCENE-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999146#comment-16999146
 ] 

Michael Sokolov commented on LUCENE-8596:
-----------------------------------------

That looks like a real problem we should fix. There's no way to include a 
token containing a "#" in a user dictionary. However, it's problematic since 
any change here will not be backwards-compatible.

We can just change the behavior to respect comments only at the beginning of 
the line, and document the breaking change. Some users may get bitten when they 
upgrade (if they have been using trailing, end-of-line comments in their 
dictionaries).

Or, we can introduce some API change to support both styles of commenting. This 
seems overly complex for a pretty small edge case though: I favor fixing the 
behavior with a breaking change plus documentation in CHANGES. If there's no 
objection, I'll merge this and add a note to CHANGES.
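
To make the difference between the two behaviors concrete, here is a minimal 
sketch. It is not the actual UserDictionary code: the class name, the method 
names, and the sample dictionary entry are hypothetical, and the only thing 
taken from the issue is the pair of regular expressions (the unanchored one 
that strips from any "#", and the anchored one proposed by the reporter).

```java
public class CommentStripSketch {

    // Behavior the issue reports: everything from the first '#' to the end
    // of the line is removed, wherever the '#' appears.
    static String stripAnywhere(String line) {
        return line.replaceAll("#.*$", "");
    }

    // Proposed behavior: a line is treated as a comment only when '#' is
    // its first character; any other '#' is left alone.
    static String stripLeadingOnly(String line) {
        return line.replaceAll("^#.*$", "");
    }

    public static void main(String[] args) {
        // Hypothetical entry whose surface form contains '#'.
        String entry = "C#,C#,shi-sha-pu,custom-noun";
        String comment = "# a full-line comment";
        String trailing = "foo,foo,fu,custom-noun # trailing comment";

        System.out.println("[" + stripAnywhere(entry) + "]");       // prints "[C]"
        System.out.println("[" + stripLeadingOnly(entry) + "]");    // entry unchanged
        System.out.println("[" + stripLeadingOnly(comment) + "]");  // prints "[]"
        // With the anchored pattern, a trailing comment is no longer removed;
        // this is the backwards-incompatible part.
        System.out.println("[" + stripLeadingOnly(trailing) + "]");
    }
}
```

The last call shows where existing dictionaries could be affected: a line 
that relied on a trailing comment would be passed through unchanged.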

> The replacement of comments is a bug, in "UserDictionary.java"
> --------------------------------------------------------------
>
>                 Key: LUCENE-8596
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8596
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: miyaharas
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/dict/UserDictionary.java#L68]
>  
> Hi,
> I think this is a bug.
> I think the following is correct:
> {code:java}
> line = line.replaceAll("^#.*$", "");
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

