[ https://issues.apache.org/jira/browse/LUCENE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206784#comment-17206784 ]

ASF subversion and git services commented on LUCENE-9557:
---------------------------------------------------------

Commit 7eee4fd1026355a3e420bf02a7578eee6826bc97 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7eee4fd ]

LUCENE-9557: regeneration should use python3, not python2

Running the regeneration with python2 changes the DFA, whereas python3
regenerates the sources exactly as they exist today. Also, we don't want
to depend on EOL python2.


> gradle regeneration of HTMLCharacterEntities.jflex should not use python2
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-9557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9557
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9557.patch
>
>
> I thought we had cleaned out all the python2 usage, but there is one straggler left.
> Currently this is set to run with python2, but it should be using python3. 
> Python3 will generate the exact same sources that are present in master 
> today. But if you run it with python2 (as currently configured) it generates 
> a slightly different grammar:
> {noformat}
> --- a/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
> +++ b/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
> @@ -60,7 +60,7 @@ CharacterEntities = ( "AElig" | "Aacute" | "Acirc" | "Agrave" | "Alpha"
>                      | "times" | "trade" | "uArr" | "uacute" | "uarr" | "ucirc"
>                      | "ugrave" | "uml" | "upsih" | "upsilon" | "uuml"
>                      | "weierp" | "xi" | "yacute" | "yen" | "yuml" | "zeta"
> -                    | "zwj" | "zwnj" )
> +('                    | "zwj" | "zwnj"', ')')
> {noformat}
> This then cascades: HTMLStripCharFilter.java is also regenerated
> differently, with a different DFA.
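
The shape of the changed line in the diff above looks like what a
parenthesized two-argument print produces under each interpreter. The
sketch below is only an illustration of that language difference. It
assumes the generator emits the closing line with something like
print(line, ')'), which this message does not confirm, and the helper
names emit_py3 and emit_py2_style are hypothetical.

```python
import io
from contextlib import redirect_stdout


def emit_py3(line: str, closer: str) -> str:
    """Capture what Python 3's print(line, closer) writes.

    print is a function in Python 3, so the two arguments are written
    separated by a single space.
    """
    buf = io.StringIO()
    with redirect_stdout(buf):
        print(line, closer)
    return buf.getvalue().rstrip("\n")


def emit_py2_style(line: str, closer: str) -> str:
    """Emulate Python 2's print statement given `(line, closer)`.

    Without `from __future__ import print_function`, Python 2 treats the
    parentheses as a tuple and prints that tuple's repr.
    """
    return repr((line, closer))


# Assumed final grammar line, copied from the diff in the issue description.
last_line = '                    | "zwj" | "zwnj"'

print(emit_py3(last_line, ")"))
print(emit_py2_style(last_line, ")"))
```

If the assumption holds, the first call reproduces the line currently in
master and the second reproduces the tuple-style line from the python2
run.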



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

