[ 
https://issues.apache.org/jira/browse/LUCENE-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443972#comment-17443972
 ] 

Robert Muir commented on LUCENE-9921:
-------------------------------------

I agree, let's not rush this in close to a release. Note that we don't need to 
upgrade this jar for the 37 new Unicode 14 emoji to work: they will already be 
tokenized/tagged correctly, because of the way Unicode preallocates code points 
for future emoji:

See test:
https://github.com/apache/lucene/blob/main/lucene/analysis/icu/src/test/org/apache/lucene/analysis/icu/segmentation/TestICUTokenizer.java#L506-L519

See list:
https://s.apache.org/pqnnc



> Can ICU regeneration tasks treat icu version as input?
> ------------------------------------------------------
>
>                 Key: LUCENE-9921
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9921
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>
> ICU 69 was released, so I was playing with the upgrade just to test it out 
> and to test our regeneration.
> Running {{gradlew regenerate}} naively wasn't helpful: the regeneration tasks 
> were SKIPPED by the build.
> So I'm curious whether the ICU version can be treated as an "input" to these 
> tasks, such that if it changes, the tasks know the generated output is stale?
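
To illustrate the idea in the question above, here is a minimal sketch in plain Java of an up-to-date check that folds a tool version into the fingerprint of a task's inputs. It is not Lucene's build code and not Gradle's implementation; the class name `RegenerateCheck`, the file name `rules.txt`, and its contents are hypothetical. In Gradle itself, a value like this would typically be declared on the task with `inputs.property(...)`; how Lucene's build wires that up is not shown in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: decides whether generated output is stale by hashing
// the ICU version together with the contents of the input files.
class RegenerateCheck {

  /** SHA-256 fingerprint over the ICU version plus every (name, content) input pair. */
  static String fingerprint(String icuVersion, Map<String, String> inputs) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      // The version is hashed like any other input, so changing it changes the result.
      update(digest, "icu.version");
      update(digest, icuVersion);
      // TreeMap gives a stable iteration order regardless of the caller's map type.
      for (Map.Entry<String, String> entry : new TreeMap<>(inputs).entrySet()) {
        update(digest, entry.getKey());
        update(digest, entry.getValue());
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException ex) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(ex);
    }
  }

  /** Feeds a length-prefixed string so that adjacent fields cannot run together. */
  private static void update(MessageDigest digest, String value) {
    byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
    digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
    digest.update(bytes);
  }

  /** The output is stale when the current fingerprint differs from the stored one. */
  static boolean isStale(String storedFingerprint, String icuVersion, Map<String, String> inputs) {
    return !fingerprint(icuVersion, inputs).equals(storedFingerprint);
  }

  public static void main(String[] args) {
    Map<String, String> inputs = Map.of("rules.txt", "hypothetical rule file contents");
    String stored = fingerprint("68.2", inputs);

    System.out.println(isStale(stored, "68.2", inputs)); // false: nothing changed
    System.out.println(isStale(stored, "69.1", inputs)); // true: only the version changed
  }
}
```

With the version left out of the fingerprint, the second check would also report the output as current, which matches the SKIPPED behaviour described in the issue.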



