rmuir commented on PR #14493:
URL: https://github.com/apache/lucene/pull/14493#issuecomment-2804210293
> What would be the right way to get the original input string then, please? Or do you just declare that `RegExp` does not support that?

It doesn't have a method to support this. If the feature is really needed, I would be in favor of doing this in the obvious way if possible: just holding a ref to the original string before any parsing.

> Also, it's not just about readability - even as a human, if I see `regex.toString() == "\\s"`, before this fix, I don't know whether it is `s` character or `whitespace` class. This PR makes it better though.

Personally, my opinion: the entire `.toString()` should be rewritten from scratch. It was never properly designed to be user-friendly in any way. I still plan to merge your incremental improvement though.

For testing purposes we rely on two things:
- Correct Parse: mostly this is done via `toStringTree()`, which returns a structured-ish format of the representation. See TestRegExpParsing.java as an example.
- Correct Behavior: mostly this is done via the `Automaton` classes, which is unambiguous.

Assertions on the `toString()` output are mostly just legacy, and there are no guarantees about the format of that method AFAIK, so we can change it to be better.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org