rmuir commented on PR #14493:
URL: https://github.com/apache/lucene/pull/14493#issuecomment-2804210293

   > What would be the right way to get the original input string then, please? Or do you just declare that `RegExp` does not support that?
   
   It doesn't have a method to support this. If the feature is really needed, I would be in favor of doing it the obvious way, if possible: just hold a reference to the original string before any parsing.
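
A minimal sketch of that idea, assuming a small wrapper class outside of `RegExp`. `SourcedPattern` and its methods are hypothetical names, not Lucene API; the parser is passed in as a function so the snippet runs without Lucene on the classpath.

```java
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical wrapper (not part of Lucene): keeps the exact input string
 * next to whatever the parser produced from it.
 */
final class SourcedPattern<T> {
  private final String source; // original input, captured before any parsing
  private final T parsed;      // parser output, e.g. a Lucene RegExp

  private SourcedPattern(String source, T parsed) {
    this.source = source;
    this.parsed = parsed;
  }

  /** Captures {@code source} first, then hands it to the parser. */
  static <T> SourcedPattern<T> parse(String source, Function<String, T> parser) {
    Objects.requireNonNull(source, "source");
    return new SourcedPattern<>(source, parser.apply(source));
  }

  /** Returns the input exactly as the caller wrote it. */
  String source() {
    return source;
  }

  /** Returns the parsed form. */
  T parsed() {
    return parsed;
  }
}
```

With Lucene available, this would presumably be used as `SourcedPattern.parse("\\s", RegExp::new)`, and `source()` would return the string exactly as given, independent of what `toString()` prints.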
   
   > Also, it's not just about readability - even as a human, if I see `regex.toString() == "\\s"`, before this fix, I don't know whether it is `s` character or `whitespace` class.
   
   This PR makes it better, though. My personal opinion: the entire `.toString()` should be rewritten from scratch. It was never properly designed to be user-friendly in any way. I still plan to merge your incremental improvement, though.
   
   For testing purposes we rely on two things:
   - Correct parse: mostly checked via `toStringTree()`, which returns a structured-ish representation of the parse. See TestRegExpParsing.java for an example.
   - Correct behavior: mostly checked via the `Automaton` classes, which are unambiguous.
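
To illustrate why a behavior check is unambiguous where a printed form is not, here is a rough sketch. It uses `java.util.regex.Pattern` purely as a stand-in for Lucene's `RegExp` and `Automaton` classes (an assumption made so the snippet is runnable with the JDK alone; the two engines differ in syntax in places). `BehaviorCheck` and `accepts` are hypothetical names.

```java
import java.util.regex.Pattern;

final class BehaviorCheck {
  /** Returns true if the pattern accepts the whole input. */
  static boolean accepts(String regex, String input) {
    return Pattern.compile(regex).matcher(input).matches();
  }

  public static void main(String[] args) {
    // The whitespace class accepts a space but rejects the letter 's'.
    System.out.println(accepts("\\s", " ")); // prints "true"
    System.out.println(accepts("\\s", "s")); // prints "false"
    // The literal 's' accepts only the letter itself.
    System.out.println(accepts("s", "s"));   // prints "true"
  }
}
```

Asserting on accepted and rejected inputs pins down which of the two meanings was parsed, without depending on how the expression is rendered as a string.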
   
   Assertions on `toString()` are mostly just legacy, and there are no guarantees about the format of that method AFAIK, so we can change it to be better.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

