Mark Harwood created LUCENE-9371:
------------------------------------
Summary: Make RegExp internal state more visible to support more
rendering formats
Key: LUCENE-9371
URL: https://issues.apache.org/jira/browse/LUCENE-9371
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Reporter: Mark Harwood
Assignee: Mark Harwood
This is a proposal to open up read-only access to the internal state of RegExp
objects.
The RegExp parser provides a useful parsed object model for regular
expressions. Today it offers three rendering functions:
1) To Automaton (for query execution)
2) To string (for machine-readable regular expressions)
3) To StringTree (for debug purposes)
There are at least two other rendering functions that would be useful:
a) To "Explain" format (like the plain-English descriptions used in [regex
debugging tools|https://regex101.com/r/2DUzac/1])
b) To Query (queries used to accelerate regex searches by providing an
approximation of the search terms and [hitting an ngram
index|https://github.com/wikimedia/search-extra/blob/master/docs/source_regex.md])
To support these and other renderings/transformations it would be useful to
open read-only access to the fields held in RegExp objects - either by
making them public final fields or by offering getter methods. This would free
the RegExp class from having to support every possible transformation itself.
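As a rough illustration of what read-only access would enable, here is a minimal sketch of two renderers that live entirely outside the parsed object model. It does not use Lucene's RegExp class: the Node type, its Kind enum and the field names are hypothetical stand-ins, and requiredChars is only a much-simplified placeholder for the ngram extraction mentioned in (b).

```java
import java.util.Set;
import java.util.TreeSet;

final class RegExpRenderSketch {

  /** Hypothetical node kinds; the real RegExp class has its own, larger set. */
  enum Kind { CONCAT, UNION, REPEAT, CHAR }

  /** Hypothetical stand-in for a parsed regex node exposing public final fields. */
  static final class Node {
    public final Kind kind;
    public final Node left;   // first operand, null for CHAR
    public final Node right;  // second operand, null for CHAR and REPEAT
    public final char c;      // literal character, meaningful only for CHAR

    private Node(Kind kind, Node left, Node right, char c) {
      this.kind = kind;
      this.left = left;
      this.right = right;
      this.c = c;
    }

    static Node ch(char c) { return new Node(Kind.CHAR, null, null, c); }
    static Node concat(Node a, Node b) { return new Node(Kind.CONCAT, a, b, '\0'); }
    static Node union(Node a, Node b) { return new Node(Kind.UNION, a, b, '\0'); }
    static Node repeat(Node a) { return new Node(Kind.REPEAT, a, null, '\0'); }
  }

  /** "Explain"-style rendering that only reads the node's public fields. */
  static String explain(Node n) {
    switch (n.kind) {
      case CHAR:
        return "the character '" + n.c + "'";
      case CONCAT:
        return explain(n.left) + " followed by " + explain(n.right);
      case UNION:
        return "either (" + explain(n.left) + ") or (" + explain(n.right) + ")";
      case REPEAT:
        return "zero or more of (" + explain(n.left) + ")";
      default:
        throw new IllegalArgumentException("unknown kind: " + n.kind);
    }
  }

  /** Characters every match must contain: a crude approximation usable as a pre-filter. */
  static Set<Character> requiredChars(Node n) {
    Set<Character> out = new TreeSet<>();
    switch (n.kind) {
      case CHAR:
        out.add(n.c);
        break;
      case CONCAT: // both sides must match, so both sides' requirements apply
        out.addAll(requiredChars(n.left));
        out.addAll(requiredChars(n.right));
        break;
      case UNION: // only characters required by both alternatives are guaranteed
        out.addAll(requiredChars(n.left));
        out.retainAll(requiredChars(n.right));
        break;
      case REPEAT: // zero repetitions are allowed, so nothing is required
        break;
      default:
        throw new IllegalArgumentException("unknown kind: " + n.kind);
    }
    return out;
  }

  public static void main(String[] args) {
    // Hand-built tree for the expression a(b|c)*
    Node tree = Node.concat(Node.ch('a'),
        Node.repeat(Node.union(Node.ch('b'), Node.ch('c'))));
    System.out.println(explain(tree));
    System.out.println(requiredChars(tree));
  }
}
```

Neither renderer needs any change to the node class, which is the point of the proposal: new output formats can be added by callers without growing RegExp itself.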
--
This message was sent by Atlassian Jira
(v8.3.4#803005)