[swift-evolution] Resolving identifier vs. operator debates

Ethan Tira-Thompson via swift-evolution Wed, 04 Oct 2017 16:45:14 -0700

Forked from the discussion in “A path forward on rationalizing unicode 
identifiers and operators”, where it was suggested to put this in a new thread.



Background:

Swift partitions the character set into operators and identifiers to aid in 
efficient parsing.  This has the unfortunate side effect that the language spec 
shoulders the burden of classifying the thousands of Unicode characters, 
and it must do so universally across all users and contexts.

There are many characters with ambiguous usage, such as denoting the transpose 
of matrix A as Aᵀ.  The notation is specifically using a superscript T, but 
this is also fundamentally a Latin letter and the Unicode code point is found 
in the phonetic extensions block, not the math symbols block.  In general, many 
symbols could refer either to an action (operator), or the result of that 
action (identifier), or have disparate domain-specific meanings.  Should the 
language spec really be in the business of deciding the ‘right’ use of each 
character like this?
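
As a quick check of the classification described above, here is a minimal 
Swift sketch.  It assumes the superscript T in question is U+1D40 (MODIFIER 
LETTER CAPITAL T) and that the Phonetic Extensions block spans U+1D00 to 
U+1D7F.  The helper name isInPhoneticExtensions is made up for illustration.

```swift
// The superscript T used for transpose, assumed to be U+1D40.
let superscriptT: Unicode.Scalar = "ᵀ"

// Phonetic Extensions block: U+1D00 through U+1D7F.
let phoneticExtensions: ClosedRange<UInt32> = 0x1D00...0x1D7F

// Hypothetical helper, named here only for illustration.
func isInPhoneticExtensions(_ scalar: Unicode.Scalar) -> Bool {
    return phoneticExtensions.contains(scalar.value)
}

print(String(superscriptT.value, radix: 16, uppercase: true))  // 1D40
print(isInPhoneticExtensions(superscriptT))                    // true

// Under today's rules the character is an identifier character, so
// `Aᵀ` below is one identifier, not `A` followed by a postfix operator.
let Aᵀ = "a single identifier"
print(Aᵀ)
```

So a user who wants to write Aᵀ as "A, transposed" cannot do so today; the 
character can only ever be part of a name.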

I also assert that much of the bad reputation of custom operators comes from 
languages which have limited operator character sets, which forces developers 
to overload standard operators with surprising effects, instead of choosing a 
symbol which is both unique and better recognized for the task at hand.  
Allowing developers to choose apt operator symbols is akin to encouraging 
descriptive identifiers.  Writing good code is all about making these choices 
appropriately, and that requires context, which only the end developer has.

To be clear, this will most likely be relegated to niche applications serving 
domain experts.  As established below, the default behavior is to opt out of 
exotic operator choices.  But for a user who wishes to opt in, it is better 
to give them the right tools for the purpose.


Goals:

1. Performance: file-local operator decisions (don’t require loading all the 
imports first)
2. Maintenance: improve operator auditing/discoverability
3. Functionality: let users write what they want without lobbying this list
4. Well defined: aid in resolving conflicts between modules


Pitch:

Enable users to “import” specific operator symbols on a per-file basis, 
updating the operator set used for parsing that file.

In the simplest form this would look like:
        import operator ᵀ

This is only needed for “non-standard” operators.  But by providing this escape 
hatch, we can conservatively restrict the “standard” operators to a smaller, 
well-known set and avoid a lot of debate without sacrificing expressibility.

When this import is encountered, any matching operator declarations are made 
available simply because the character is interpreted as an operator.  (i.e. 
all modules’ operators are loaded as normal, but the compiler can only make 
the connection in files that opt in to interpreting that character as an 
operator.)  Conversely, conflicting module identifiers become inaccessible 
following such an import, and hopefully a good API would supply less exotic 
alternative interfaces for both cases.  In the worst case, the user could 
write an extension in a new file with the complementary character choice and 
remap offending operators/identifiers as they see fit.
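
The per-file behavior described above can be sketched as a toy model.  This is 
not compiler code: FileLexerContext, TokenKind, and the small standardOperators 
set are hypothetical names chosen for illustration, and anything that is not an 
operator character is simply treated as identifier text.

```swift
// Toy model of the pitch: each file carries its own operator character set.
// All names here are hypothetical; none of this is real compiler API.

enum TokenKind: Equatable {
    case identifierCharacter
    case operatorCharacter
}

struct FileLexerContext {
    // A deliberately small stand-in for the "standard" operator set.
    static let standardOperators: Set<Character> = [
        "+", "-", "*", "/", "=", "<", ">", "!", "&", "|", "^", "~", "%", "?"
    ]

    // Characters opted in via `import operator` in this file only.
    private(set) var importedOperators: Set<Character> = []

    // Models encountering `import operator X` in the file.
    mutating func importOperator(_ character: Character) {
        importedOperators.insert(character)
    }

    // Classification depends only on this file's own imports, so no
    // other module has to be loaded to decide how to tokenize.
    func classify(_ character: Character) -> TokenKind {
        if FileLexerContext.standardOperators.contains(character)
            || importedOperators.contains(character) {
            return .operatorCharacter
        }
        return .identifierCharacter
    }
}

let fileA = FileLexerContext()   // a file without the import
var fileB = FileLexerContext()   // a file containing `import operator ᵀ`
fileB.importOperator("ᵀ")

print(fileA.classify("ᵀ"))  // identifierCharacter
print(fileB.classify("ᵀ"))  // operatorCharacter
```

The same character is classified differently in the two files, which is the 
whole point: the decision is local to the file and visible in its import list.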

Regarding operator declarations, one could suggest that the declaration itself 
could update the operator character set for that file.  However, I suggest 
always requiring the import operator statement (for non-standard operators) 
partly to surface guidance when a choice of operator will require explicit 
imports from other files.  This also reduces potential for obfuscation by 
operators with visually similar representation, as the import list would draw 
attention to this chicanery.


Advanced Pitch:

The pitch above provides the “minimum viable product”, but we might like to 
take this a little further and make it module-specific:
        import matrixlib (operators: [ᵀ,·,⊗])

Again, only “non-standard” operators need to be listed; the “standard” 
operators would import the same as today.  But now as readers we can see where 
special operators are coming from, and potentially filter competing 
declarations from different modules.  I also like that an operator family can 
be listed on a single line rather than potentially a dozen lines covering 
various combinations.  A module vendor can concisely document its operator list 
and make it easy to maintain and discover.

This syntax mimics a module “init” call, which could be a powerful concept for 
future extensions.  For example, we could introduce “standardOperators: false” 
to disable the automatic import of standard operator overloads, which some 
users might appreciate regardless of character set issues.  (e.g. users could 
select between conflicting standard operators in different modules, or just 
have peace of mind that there are no surprises.)

I anticipate this form would take a bit more work to implement, as Swift would 
need to filter the visibility of operators per module based on the 
declarations in the current file.  However, these two versions can work 
together.  The first form provides a global import across modules, and the 
module-specific form can be added later.
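
The per-module filtering could be modeled roughly as follows.  This is only a 
sketch under stated assumptions: OperatorDeclaration, visibleOperators, and 
the competing module "otherlib" are hypothetical, and a real implementation 
would live inside the compiler's name lookup rather than in user code.

```swift
// Toy model of the module-specific form. All names are hypothetical.

struct OperatorDeclaration: Hashable {
    let module: String
    let symbol: Character
}

// Declarations that might have been loaded from all imported modules.
let loaded: [OperatorDeclaration] = [
    OperatorDeclaration(module: "matrixlib", symbol: "ᵀ"),
    OperatorDeclaration(module: "matrixlib", symbol: "·"),
    OperatorDeclaration(module: "matrixlib", symbol: "⊗"),
    OperatorDeclaration(module: "otherlib", symbol: "⊗"),
]

// Keeps only declarations whose module explicitly listed the symbol in
// this file's import, e.g. `import matrixlib (operators: [ᵀ,·,⊗])`.
func visibleOperators(
    in declarations: [OperatorDeclaration],
    imports: [String: Set<Character>]
) -> [OperatorDeclaration] {
    return declarations.filter { declaration in
        imports[declaration.module]?.contains(declaration.symbol) ?? false
    }
}

let fileImports: [String: Set<Character>] = ["matrixlib": ["ᵀ", "·", "⊗"]]
let visible = visibleOperators(in: loaded, imports: fileImports)
print(visible.map { "\($0.module).\($0.symbol)" })
```

Here the competing ⊗ from "otherlib" is filtered out because this file never 
listed it, which is how conflicting declarations could be resolved per file.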


What do people think?

Thanks,
 -Ethan
