[ https://issues.apache.org/jira/browse/LUCENE-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17563382#comment-17563382 ]

Uwe Schindler edited comment on LUCENE-10642 at 7/6/22 6:36 PM:
----------------------------------------------------------------

bq. From the user perspective, is it non-intuitive why the character classes should be denoted with two slashes

That's only in Java code (the usual stupidness) and possibly JSON. The problem is that if you write "\n", the Java compiler creates a newline out of it and there's never a \n in the regular expression. Actually, it is a problem if you can't write {{\\n}}, as this would be seen by the parser as \n.


was (Author: thetaphi):
bq. From the user perspective, is it non-intuitive why the character classes should be denoted with two slashes

That's only in Java code (the usual stupidness) and possibly JSON. The problem is that if you write "\n", the Java compiler creates a newline out of it and there's never a \n in the regular expression. Actually, it is a problem if you can't write \\n, as this would be seen by the parser as \n.

> Regexp query: escape sequences are treated as character classes
> ---------------------------------------------------------------
>
>                 Key: LUCENE-10642
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10642
>             Project: Lucene - Core
>          Issue Type: Bug
>   Affects Versions: 9.0, 9.1, 9.2, 9.3
>            Reporter: Andriy Redko
>            Priority: Major
>
> An interesting issue has been reported to the OpenSearch project [1], which
> has been caused by [2], [3]. In a nutshell, the regression is causing escape
> sequences (like \n, \r, \t, ...) to be treated as character classes
> (specifically,
> [https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#bs]).
> The problematic function is RegExp::matchPredefinedCharacterClass, which does
> not consider characters that denote an escaped construct.
> Simple test to reproduce, which fails with
> IllegalArgumentException("invalid character class"):
>
> {noformat}
> public class TestRegexpQuery extends LuceneTestCase {
>   public void testEscapeSequences() throws IOException {
>     // "\\n" in Java source is the two-character regex \n
>     assertEquals(1, regexQueryNrHits("\\n"));
>     assertEquals(1, regexQueryNrHits("[\\n]"));
>   }
> }
> {noformat}
>
> [1] [https://github.com/opensearch-project/OpenSearch/issues/3781]
> [2]
> [https://github.com/apache/lucene/commit/1efce5444dd40142c55c5a3a30eeebc7b86796c3]
> [3]
> [https://github.com/apache/lucene/commit/819e668ce2fcfcf86b652a191cdbe0fad0a8ffce]
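
To illustrate the point in the comment about Java source escaping, here is a minimal sketch. It uses java.util.regex.Pattern from the JDK as a stand-in, not Lucene's RegExp, so it only shows what the Java compiler does to the string literal and says nothing about Lucene's parser. The class name EscapeDemo and the helper matchesNewline are made up for this example.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper: true if the given regex source matches a single
    // newline character. Uses java.util.regex, NOT Lucene's RegExp.
    static boolean matchesNewline(String regexSource) {
        return Pattern.matches(regexSource, "\n");
    }

    public static void main(String[] args) {
        // One backslash in source: the compiler emits a single newline
        // character, so the regex engine never sees a backslash.
        String oneSlash = "\n";
        // Two backslashes in source: the string holds '\' followed by 'n',
        // which the regex engine then interprets as the escape \n.
        String twoSlashes = "\\n";

        System.out.println(oneSlash.length());            // 1
        System.out.println(twoSlashes.length());          // 2
        System.out.println(twoSlashes.charAt(0) == '\\'); // true

        // With java.util.regex both forms end up matching a newline: the
        // first as a literal character, the second as an escape sequence.
        System.out.println(matchesNewline(oneSlash));     // true
        System.out.println(matchesNewline(twoSlashes));   // true
    }
}
```

The double backslash is therefore only an artifact of writing the pattern as a Java string literal. A pattern read from user input or a file would contain the single backslash directly.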