[ https://issues.apache.org/jira/browse/LUCENE-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453276#comment-17453276 ]

ASF subversion and git services commented on LUCENE-10243:
----------------------------------------------------------

Commit eff5430e5877d84a6a0754f2b2f2aa0befeb7291 in lucene's branch refs/heads/branch_9x from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=eff5430 ]

LUCENE-10243: increase unicode versions of tokenizers to 12.1 (#465)

* Bump %unicode 9 -> %unicode 12.1 for the 3 unicode grammars
* regenerate emoji conformance tests for unicode 12.1
* modify wordbreak conformance tests to use emoji data (which replaces old crazy E_base etc properties)
* regenerate wordbreak conformance tests
* Simplify grammar files and word-break conformance test generator, now that full-width numbers are WordBreak=Numeric
* Use jflex emoji properties rather than ICU-generated ones


> increase unicode versions of tokenizers to unicode 12.1
> -------------------------------------------------------
>
>                 Key: LUCENE-10243
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10243
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Followup from LUCENE-10239
> Bump the Unicode version of these tokenizers from Unicode 9 to 12.1, which is
> the most recent supported by the jflex release.