> On 24.11.2017 at 20:13, Xiaodi Wu via swift-evolution
> <[email protected]> wrote:
>
>> On Thu, Nov 23, 2017 at 5:33 PM, Chris Lattner <[email protected]> wrote:
>>> On Nov 23, 2017, at 10:35 AM, Xiaodi Wu via swift-evolution
>>> <[email protected]> wrote:
>>> This proposed addition addresses a known pain point, to be sure, but I
>>> think it has many implications for the future direction of the language and
>>> I'd like to explore them here.
>>
>> Thanks for writing this up Xiaodi,
>>
>>> We should certainly move any discussion about regex literals into its own
>>> thread, but to make it clear that I'm not simply suggesting that we
>>> implement something in Swift 10 instead of addressing a known pain point
>>> now, here's a sketch of how Swift 5 could make meaningful progress:
>>>
>>> - Teach the lexer about basic /pattern/flag syntax.
>>> - Add an `ExpressibleByRegularExpressionLiteral`, where the initializer
>>>   would be something like `init(regularExpressionLiteralPattern: String,
>>>   flags: RegularExpressionFlags)` where RegularExpressionFlags would be an
>>>   OptionSet type.
>>> - Add conformance to `ExpressibleByRegularExpressionLiteral` to
>>>   `NSRegularExpression`.
>>> - Have no default `RegularExpressionLiteralType` for now so that, in the
>>>   future, we can discuss and design a Swift standard library regular
>>>   expression type, which is justifiable because we've baked in language
>>>   support for the literal. This can be postponed.
>>
>> This approach could make sense, but it makes a couple of assumptions that
>> I’m not certain are the right way to go (to be clear, I’m not certain that
>> they’re wrong either!).
>>
>> Things I’d like to carefully consider:
>>
>> 1) We could make the compiler parse and validate regex literals at compile
>> time:
>>
>> a) this allows the compiler to emit diagnostics (with fixits!) on malformed
>> literals.
>>
>> b) When the compiler knows the grammar of the regex, it can precompile the
>> regex into a DFA table or static executable code, rather than runtime
>> compiling into a bytecode.
>>
>> c) however, the compiler can’t parse the literal unless it knows the dialect
>> it corresponds to. While we could parameterize this somehow (e.g. as a
>> requirement in ExpressibleByRegularExpressionLiteral), if we weren’t bound
>> by backwards compatibility, we would just keep things simple and say “there
>> is one and only one grammar”. I’d argue that having exactly one grammar
>> supported by the // syntax is also *better* for users, rather than saying
>> “it depends on what library you’re passing the regex into”.
>
> I think we've circled back to a topic that we've discussed here before. I do
> agree that having more of this validation at compile time would improve the
> experience. However, I can see a few drawbacks to the _compiler_ doing the
> validation:
>
> - In the absence of a `constexpr`-like facility, supporting runtime
>   expressions would mean we'd be writing the same code twice, once in C++ for
>   compile-time validation of literal expressions and another time in Swift for
>   runtime expressions.
>
> - As seen in these discussions about string literals where users want to copy
>   and paste text and have it "just work," supporting only one dialect in regex
>   literals will inevitably lead users to ask for other types of regex literals
>   for each individual flavor of regex they encounter.
>
> Just like ExpressibleByDictionaryLiteral doesn't deduplicate keys, leaving
> that to Dictionary, I think regex literals are better off not validating
> literal expressions (or, maybe, doing only the barest sanity check), leaving
> the rest to concrete regex types. As you point out with validation of integer
> overflows during constant folding, we could get enough compile-time
> validation even without teaching the compiler itself how to validate the
> literal.
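
To make the "leave validation to the concrete type" option concrete, here is a minimal sketch of the protocol shape quoted above. Only the protocol name, the initializer label and the `RegularExpressionFlags` name come from the thread; the flag cases and the `FoundationRegex` wrapper are hypothetical, and no /pattern/flag literal syntax exists, so the initializer is called by hand. The wrapper stands in for a direct `NSRegularExpression` conformance purely to keep the sketch short.

```swift
import Foundation

// Hypothetical flag set; the thread only says it "would be an OptionSet type".
struct RegularExpressionFlags: OptionSet {
    let rawValue: Int
    static let caseInsensitive = RegularExpressionFlags(rawValue: 1 << 0)
    static let multiline = RegularExpressionFlags(rawValue: 1 << 1)
}

// Protocol shape as sketched in the quoted proposal (not a shipping API).
protocol ExpressibleByRegularExpressionLiteral {
    init(regularExpressionLiteralPattern: String, flags: RegularExpressionFlags)
}

// Hypothetical conforming type. The pattern is validated by the concrete
// engine at run time, not by the compiler.
struct FoundationRegex: ExpressibleByRegularExpressionLiteral {
    // nil when NSRegularExpression rejects the pattern.
    let underlying: NSRegularExpression?

    init(regularExpressionLiteralPattern pattern: String, flags: RegularExpressionFlags) {
        var options: NSRegularExpression.Options = []
        if flags.contains(.caseInsensitive) { options.insert(.caseInsensitive) }
        if flags.contains(.multiline) { options.insert(.anchorsMatchLines) }
        underlying = try? NSRegularExpression(pattern: pattern, options: options)
    }

    // True when the pattern is valid and matches somewhere in `text`.
    func matches(_ text: String) -> Bool {
        guard let regex = underlying else { return false }
        let range = NSRange(text.startIndex..., in: text)
        return regex.firstMatch(in: text, options: [], range: range) != nil
    }
}

// What the compiler would presumably generate for a literal like /^[a-z]+$/i.
let word = FoundationRegex(regularExpressionLiteralPattern: "^[a-z]+$",
                           flags: [.caseInsensitive])
print(word.matches("Swift"))
```

In this sketch a malformed pattern such as "(" only surfaces as a nil `underlying` at run time, which is exactly the cost that compile-time validation in point 1) would avoid.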
>
>> 2) I’d like to explore the idea of making // syntax be *patterns* instead of
>> simply literals. As a pattern, it should be possible to bind submatches
>> directly into variable declarations, eliminating the need to count parens in
>> matches or other gross things. Here is strawman syntax with a dumb example:
>>
>>     if case /([a-zA-Z]+: let firstName) ([a-zA-Z]+: let lastName)/ =
>>         getSomeString() {
>>         print(firstName, lastName)
>>     }
>
> This is an interesting idea. But is it significantly more usable than the
> same type having a collection of named matches using the usual Perl syntax?
>
>     if case /(?<firstName>[a-zA-Z]+) (?<lastName>[a-zA-Z]+)/ = getSomeString() {
>         print(Regex.captured["firstName"], Regex.captured["lastName"])
>     }
>
Definitely. Not only is it much more readable, it is also much safer, because
the compiler will tell you when a typo produces an undefined name. Furthermore,
as Chris suggested, this can be extended to extract types other than strings
directly and in a type-safe way (which should be extensible to user-defined
types conforming to a specific protocol).

>> 3) I see regex string matching as the dual to string interpolation. We
>> already provide the ability for types to specify a default way to print
>> themselves, and it would be great to have default regex’s associated with
>> many types, so you can just say “match an Int here” instead of having to
>> match [0-9]+ and then do a failable conversion to Int outside the regex.
>>
>>
>> 4) I’d like to consider some of the advances that Perl 6 added to its regex
>> grammar. Everyone knows that modern regex’s aren’t actually regular anyway,
>> so it begs the question of how far to take it. If nothing else, I
>> appreciate the freeform structure supported (including inline comments)
>> which make them more readable.
>
> Sounds like we want multiline regex literals :)

Absolutely.

-Thorsten

>
>> We should also support a dynamic regex engine as well, because there are
>> sometimes reasons to runtime construct regex’s. This could be handled by
>> having the Regex type support a conversion from String or something,
>> orthogonal to the language support for regex literals/patterns.
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
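
For comparison, the string-keyed named-capture style can be approximated today with Foundation, which shows the safety gap mentioned above. This is only a sketch: `capture(named:in:pattern:)` is a hypothetical helper, and it assumes a platform where `NSTextCheckingResult.range(withName:)` is available.

```swift
import Foundation

// Hypothetical helper: returns the text captured by the named group, or nil
// when the pattern is invalid, nothing matches, or the group did not capture.
func capture(named name: String, in text: String, pattern: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return nil
    }
    let whole = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          let range = Range(match.range(withName: name), in: text) else {
        return nil
    }
    return String(text[range])
}

let pattern = "(?<firstName>[a-zA-Z]+) (?<lastName>[a-zA-Z]+)"
let first = capture(named: "firstName", in: "Chris Lattner", pattern: pattern)

// A misspelled group name is just another string, so this compiles without
// any diagnostic; the mistake can only show up at run time.
let typo = capture(named: "fristName", in: "Chris Lattner", pattern: pattern)
```

With pattern-bound variables such as `let firstName`, the same misspelling would be an ordinary "unresolved identifier" compile error.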
