benchaplin opened a new pull request, #13887:
URL: https://github.com/apache/lucene/pull/13887

   ### Description
   
   [This issue](https://github.com/apache/lucene/issues/13234) raises a 
question about the QueryParser's ability to handle escaped brackets in a range 
query's terms.
   
   ```
   queryParser.parse( "[ a\\[i\\] TO b\\[i\\] ]" );
   
   /* 
   org.apache.lucene.queryparser.classic.ParseException: Cannot parse '[a\[i\] TO b\[i\]]': Encountered " "]" "] "" at line 1, column 6.
   Was expecting:
       "TO" ...
    */
   ```
   
   I discovered a workaround highlighted in [a previous 
PR](https://github.com/apache/lucene/pull/13323). However, I think that escaped 
special characters (notably: closing brackets and spaces) should be allowed in 
a range term, so I've updated the JavaCC token definition.
   
   ### Backwards Compatibility
   
   There are two JavaCC changes:  
   1. **Addition of `| "\\" ~[]`** - because `|` only adds an alternative, 
this cannot cause anything that parsed before to fail. 
   2. **Addition of `"\\"` in the negation set: `~[ "\\", " ", "]", "}" ]`** - 
if the `"\\"` is followed by another character, it is now matched by change 1. 
If it is followed by EOF, the query already threw a `ParseException` before 
this change ("Term can not end with escape character.").
   
   By this logic, nothing that was parsed before the changes will fail to parse 
after the changes. The one break in backwards compatibility is the message of 
the `ParseException` in some cases, for example:
   
   `queryParser.parse("[\\ TO abc]");` 
   - Before: `org.apache.lucene.queryparser.classic.ParseException: Cannot 
parse '[\ TO abc]': Term can not end with escape character.`
   - After: `org.apache.lucene.queryparser.classic.ParseException: Cannot parse 
'[\ TO abc]': Encountered " <RANGE_GOOP> "\\] "" at line 1, column 6.
   Was expecting:
       "TO" ...`
     - This is expected: escaped spaces are now allowed, so the first term 
consumed is `\ TO` (that is, `" TO"` once unescaped), and the parser is still 
waiting for the `TO` keyword.
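
To make the token-level effect concrete, here is a rough sketch that approximates the old and new range-term rules with `java.util.regex`. It is not the generated JavaCC lexer: the class name `RangeGoopSketch`, the helper `firstToken`, and the assumption that the rule is a one-or-more repetition of the character sets quoted above are all illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: regex approximations of the range-term token rule,
// not the lexer that JavaCC generates from QueryParser.jj.
class RangeGoopSketch {
    // Old rule (assumed shape): one or more chars other than ' ', ']' or '}'.
    static final Pattern OLD_GOOP = Pattern.compile("[^ \\]}]+");

    // New rule (assumed shape): backslash joins the excluded set, and a
    // backslash followed by any single char is accepted as one unit.
    static final Pattern NEW_GOOP =
        Pattern.compile("(?:[^\\\\ \\]}]|\\\\.)+", Pattern.DOTALL);

    // Returns the token matched at the start of the input, or "" if none.
    static String firstToken(Pattern rule, String input) {
        Matcher m = rule.matcher(input);
        return m.lookingAt() ? m.group() : "";
    }

    public static void main(String[] args) {
        String range = "a\\[i\\] TO b\\[i\\]"; // the text a\[i\] TO b\[i\]
        System.out.println(firstToken(OLD_GOOP, range)); // a\[i\   (stops at ']')
        System.out.println(firstToken(NEW_GOOP, range)); // a\[i\]  (escaped ']' kept)

        String escapedSpace = "\\ TO abc]";              // the text \ TO abc]
        System.out.println(firstToken(NEW_GOOP, escapedSpace)); // \ TO
    }
}
```

Under these assumptions, the old rule cuts the first term off at the escaped `]`, while the new rule keeps `a\[i\]` whole. The last call shows the escaped-space case: the first token swallows `\ TO`, which matches the changed error message described above.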
   
   ---
   
   ### Question: other QueryParsers
   
   In previous changes to QueryParser.jj, I noticed authors also changed 
[StandardSyntaxParser.jj](https://github.com/apache/lucene/blob/main/lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/standard/parser/StandardSyntaxParser.jj)
 and Solr's 
[QueryParser.jj](https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/parser/QueryParser.jj).
 I imagine I'd want to do the same here, but wanted to check with maintainers 
first.

