> On 24.11.2017 at 20:13, Xiaodi Wu via swift-evolution 
> <[email protected]> wrote:
> 
>> On Thu, Nov 23, 2017 at 5:33 PM, Chris Lattner <[email protected]> wrote:
>>> On Nov 23, 2017, at 10:35 AM, Xiaodi Wu via swift-evolution 
>>> <[email protected]> wrote:
>>> This proposed addition addresses a known pain point, to be sure, but I 
>>> think it has many implications for the future direction of the language and 
>>> I'd like to explore them here.
>> 
>> Thanks for writing this up Xiaodi,
>> 
>>> We should certainly move any discussion about regex literals into its own 
>>> thread, but to make it clear that I'm not simply suggesting that we 
>>> implement something in Swift 10 instead of addressing a known pain point 
>>> now, here's a sketch of how Swift 5 could make meaningful progress:
>>> 
>>> - Teach the lexer about basic /pattern/flag syntax.
>>> - Add an `ExpressibleByRegularExpressionLiteral`, where the initializer 
>>> would be something like `init(regularExpressionLiteralPattern: String, 
>>> flags: RegularExpressionFlags)` where RegularExpressionFlags would be an 
>>> OptionSet type.
>>> - Add conformance to `ExpressibleByRegularExpressionLiteral` to 
>>> `NSRegularExpression`.
>>> - Have no default `RegularExpressionLiteralType` for now so that, in the 
>>> future, we can discuss and design a Swift standard library regular 
>>> expression type, which is justifiable because we've baked in language 
>>> support for the literal. This can be postponed.
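For concreteness, here is a minimal sketch of what the protocol and flag set
in the list above might look like. Only the protocol name, the initializer
labels and the `RegularExpressionFlags` name come from the sketch; the
individual flags, their mapping to `NSRegularExpression.Options`, the
`FoundationRegex` wrapper and its `matches(_:)` helper are assumptions made
for illustration. A wrapper is used because, as far as I can tell, an
initializer requirement cannot be retrofitted onto a non-final class such as
`NSRegularExpression` from an extension.

```swift
import Foundation

// Hypothetical flag set; the flag names are assumptions, not part of the sketch.
struct RegularExpressionFlags: OptionSet {
    let rawValue: Int
    static let caseInsensitive = RegularExpressionFlags(rawValue: 1 << 0)
    static let multiline = RegularExpressionFlags(rawValue: 1 << 1)
}

// Hypothetical protocol mirroring the initializer described in the list above.
protocol ExpressibleByRegularExpressionLiteral {
    init(regularExpressionLiteralPattern: String, flags: RegularExpressionFlags)
}

// Assumed wrapper type standing in for a direct NSRegularExpression conformance.
struct FoundationRegex: ExpressibleByRegularExpressionLiteral {
    // nil when Foundation rejects the pattern; the lexer would not have checked it.
    let regex: NSRegularExpression?

    init(regularExpressionLiteralPattern pattern: String, flags: RegularExpressionFlags) {
        var options: NSRegularExpression.Options = []
        if flags.contains(.caseInsensitive) { options.insert(.caseInsensitive) }
        if flags.contains(.multiline) { options.insert(.anchorsMatchLines) }
        regex = try? NSRegularExpression(pattern: pattern, options: options)
    }

    // Returns true if the pattern matches anywhere in `input`.
    func matches(_ input: String) -> Bool {
        guard let regex = regex else { return false }
        let whole = NSRange(input.startIndex..., in: input)
        return regex.firstMatch(in: input, options: [], range: whole) != nil
    }
}

// Roughly what the compiler might emit for a literal such as /^swift$/i.
let literal = FoundationRegex(regularExpressionLiteralPattern: "^swift$",
                              flags: [.caseInsensitive])
```

With this shape the lexer only needs to split the literal into a pattern
string and a flag set; everything else is left to the conforming type.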
>> 
>> This approach could make sense, but it makes a couple of assumptions that 
>> I’m not certain are the right way to go (to be clear, I’m not certain that 
>> they’re wrong either!).
>> 
>> Things I’d like to carefully consider:
>> 
>> 1) We could make the compiler parse and validate regex literals at compile 
>> time:
>> 
>> a) this allows the compiler to emit diagnostics (with fixits!) on malformed 
>> literals.  
>> 
>> b) When the compiler knows the grammar of the regex, it can precompile the 
>> regex into a DFA table or static executable code, rather than runtime 
>> compiling into a bytecode.
>> 
>> c) however, the compiler can’t parse the literal unless it knows the dialect 
>> it corresponds to.  While we could parameterize this somehow (e.g. as a 
>> requirement in ExpressibleByRegularExpressionLiteral), if we weren’t bound 
>> by backwards compatibility, we would just keep things simple and say “there 
>> is one and only one grammar”.  I’d argue that having exactly one grammar 
>> supported by the // syntax is also *better* for users, rather than saying 
>> “it depends on what library you’re passing the regex into”.
> 
> I think we've circled back to a topic that we've discussed here before. I do 
> agree that having more of this validation at compile time would improve the 
> experience. However, I can see a few drawbacks to the _compiler_ doing the 
> validation:
> 
> - In the absence of a `constexpr`-like facility, supporting runtime 
> expressions would mean we'd be writing the same code twice, once in C++ for 
> compile-time validation of literal expressions and another time in Swift for 
> runtime expressions.
> 
> - As seen in these discussions about string literals where users want to copy 
> and paste text and have it "just work," supporting only one dialect in regex 
> literals will inevitably lead users to ask for other types of regex literals 
> for each individual flavor of regex they encounter.
> 
> Just like ExpressibleByDictionaryLiteral doesn't deduplicate keys, leaving 
> that to Dictionary, I think regex literals are better off not validating 
> literal expressions (or, maybe, doing only the barest sanity check), leaving 
> the rest to concrete regex types. As you point out with validation of integer 
> overflows during constant folding, we could get enough compile-time 
> validation even without teaching the compiler itself how to validate the 
> literal.
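As a rough illustration of leaving validation to the concrete type: today
`NSRegularExpression` already rejects malformed patterns when it is
initialized, so a literal that merely forwards its pattern string would
surface the error there. The `validationError(forPattern:)` helper below is
a hypothetical name used only for this sketch.

```swift
import Foundation

// Hypothetical helper: asks the concrete regex type to validate the pattern
// and returns a description of the failure, or nil if the pattern is accepted.
func validationError(forPattern pattern: String) -> String? {
    do {
        _ = try NSRegularExpression(pattern: pattern, options: [])
        return nil
    } catch {
        return error.localizedDescription
    }
}

let wellFormed = validationError(forPattern: "[a-z]+")   // accepted
let malformed = validationError(forPattern: "([a-z]+")   // unbalanced parenthesis
```

The check happens at run time here; whether it could move to compile time
without the compiler knowing the grammar depends on the constant-folding
machinery mentioned above.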
> 
>> 2) I’d like to explore the idea of making // syntax be *patterns* instead of 
>> simply literals.  As a pattern, it should be possible to bind submatches 
>> directly into variable declarations, eliminating the need to count parens in 
>> matches or other gross things.  Here is strawman syntax with a dumb example:
>> 
>> if case /([a-zA-Z]+: let firstName) ([a-zA-Z]+: let lastName)/ = 
>> getSomeString() {
>>    print(firstName, lastName)
>> }
> 
> This is an interesting idea. But is it significantly more usable than the 
> same type having a collection of named matches using the usual Perl syntax?
> 
>   if case /(?<firstName>[a-zA-Z]+) (?<lastName>[a-zA-Z]+)/ = getSomeString() {
>     print(Regex.captured["firstName"], Regex.captured["lastName"])
>   }
> 

Definitely. Not only is it much more readable, it is also much safer, because 
the compiler will report a mistyped name as undefined. Furthermore, as 
Chris suggested, this can be extended to extract types other than strings 
directly and in a type-safe way (which should be extensible to user-defined 
types conforming to a specific protocol).
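To make the safety argument concrete, here is a small sketch of what the
string-keyed approach looks like with today's `NSRegularExpression`. The
`captures(of:named:in:)` helper, the `Person` type and `parsePerson(_:)` are
hypothetical and exist only for illustration; the helper simply maps the
given names onto numbered capture groups.

```swift
import Foundation

// Hypothetical helper: runs `pattern` against `input` and returns the captures
// keyed by `names`, where names[i] labels capture group i + 1.
func captures(of pattern: String, named names: [String], in input: String) -> [String: String]? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else { return nil }
    let whole = NSRange(input.startIndex..., in: input)
    guard let match = regex.firstMatch(in: input, options: [], range: whole) else { return nil }
    var result: [String: String] = [:]
    for (offset, name) in names.enumerated() {
        guard offset + 1 < match.numberOfRanges,
              let range = Range(match.range(at: offset + 1), in: input) else { continue }
        result[name] = String(input[range])
    }
    return result
}

// Hypothetical model type for the example.
struct Person {
    let firstName: String
    let age: Int
}

// Every lookup is keyed by a string, and the Int conversion is a separate,
// failable step outside the regex.
func parsePerson(_ input: String) -> Person? {
    guard let found = captures(of: "([a-zA-Z]+) ([0-9]+)",
                               named: ["firstName", "age"], in: input),
          let firstName = found["firstName"],
          let ageText = found["age"],
          let age = Int(ageText) else { return nil }
    return Person(firstName: firstName, age: age)
}

// A mistyped key compiles without complaint and only yields nil at run time.
let typo = captures(of: "([a-zA-Z]+) ([0-9]+)",
                    named: ["firstName", "age"], in: "Ada 36")?["fristName"]
```

With pattern bindings as in Chris's strawman, `firstName` and `age` would be
ordinary local variables, so the typo would be a compile-time error and the
conversion to `Int` could be part of the match itself.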


>> 3) I see regex string matching as the dual to string interpolation.  We 
>> already provide the ability for types to specify a default way to print 
>> themselves, and it would be great to have default regex’s associated with 
>> many types, so you can just say “match an Int here” instead of having to 
>> match [0-9]+ and then do a failable conversion to Int outside the regex.
>> 
>> 
>> 4) I’d like to consider some of the advances that Perl 6 added to its regex 
>> grammar.  Everyone knows that modern regex’s aren’t actually regular anyway, 
>> so it begs the question of how far to take it.  If nothing else, I 
>> appreciate the freeform structure supported (including inline comments) 
>> which make them more readable.
> 
> Sounds like we want multiline regex literals :)

Absolutely.

-Thorsten 


> 
>> We should also support a dynamic regex engine as well, because there are 
>> sometimes reasons to runtime construct regex’s.  This could be handled by 
>> having the Regex type support a conversion from String or something, 
>> orthogonal to the language support for regex literals/patterns.
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution