[ 
https://issues.apache.org/jira/browse/LUCENE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206746#comment-17206746
 ] 

Uwe Schindler commented on LUCENE-9557:
---------------------------------------

I think the difference dates from the early stages of the Gradle build. The 
Ant build only called "python", but on most systems (Ubuntu, Solaris) that 
defaults to Python 2.7. Maybe macOS is different. I strongly prefer to say 
"python3", with the version number. Maybe add a forbidden source patterns 
rule: disallow "python" without a digit. 😜
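
To make the "disallow python without a digit" idea concrete, here is a rough 
sketch in Python. It is not the project's actual source-pattern check; the 
names BARE_PYTHON and find_bare_python are made up for illustration, and a 
real rule would presumably live in the build's own validation task.

```python
import re

# Hypothetical rule: flag the bare word "python" when it is not followed by
# a version digit. "python3" and "python2.7" contain no word boundary right
# after "python", so they do not match.
BARE_PYTHON = re.compile(r"\bpython\b")


def find_bare_python(text):
    """Return (line_number, line) pairs that mention an unversioned python."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if BARE_PYTHON.search(line)
    ]


if __name__ == "__main__":
    sample = 'executable "python"\nexecutable "python3"\n'
    for number, line in find_bare_python(sample):
        print(f"line {number}: unversioned python: {line}")
```

The match is case-sensitive on purpose, so prose that mentions "Python" as a 
language is left alone and only lowercase executable names are flagged.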

> gradle regeneration of HTMLCharacterEntities.jflex should not use python2
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-9557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9557
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9557.patch
>
>
> I thought we had cleaned out the python2, but we got one straggler left.
> Currently this is set to run with python2, but it should be using python3. 
> Python3 will generate the exact same sources that are present in master 
> today. But if you run it with python2 (as currently configured) it generates 
> a slightly different grammar:
> {noformat}
> --- a/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
> +++ b/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
> @@ -60,7 +60,7 @@ CharacterEntities = ( "AElig" | "Aacute" | "Acirc" | "Agrave" | "Alpha"
>                      | "times" | "trade" | "uArr" | "uacute" | "uarr" | "ucirc"
>                      | "ugrave" | "uml" | "upsih" | "upsilon" | "uuml"
>                      | "weierp" | "xi" | "yacute" | "yen" | "yuml" | "zeta"
> -                    | "zwj" | "zwnj" )
> +('                    | "zwj" | "zwnj"', ')')
> {noformat}
> This then cascades and causes HTMLStripCharFilter.java to be regenerated 
> differently too with a different DFA.
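
The changed line in the diff above looks like a Python tuple repr, which is 
what Python 2 prints when a print statement is given parenthesized, 
comma-separated arguments. The sketch below reproduces both outputs from 
Python 3. It assumes the generator script makes a call shaped like 
print(line, ')'), which is a guess from the diff and not taken from the 
script itself; the helper names are hypothetical.

```python
import io
from contextlib import redirect_stdout

# The last grammar line, without the closing parenthesis (taken from the diff).
LINE = '                    | "zwj" | "zwnj"'


def python3_output(first, second):
    """Capture what print(first, second) writes under Python 3."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        print(first, second)  # arguments are joined with a single space
    return buffer.getvalue().rstrip("\n")


def python2_output(first, second):
    """Simulate Python 2's `print (first, second)` statement.

    Python 2 parses the parentheses as a tuple literal and prints the
    tuple's repr instead of the two strings.
    """
    return repr((first, second))


if __name__ == "__main__":
    print(python3_output(LINE, ")"))
    print(python2_output(LINE, ")"))
```

Under these assumptions the first call yields the line that is in master and 
the second yields the tuple-shaped line from the "+" side of the diff.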


