Re: [swift-evolution] [Draft] scope-based submodules

Brent Royal-Gordon via swift-evolution Tue, 28 Feb 2017 22:56:55 -0800

> On Feb 24, 2017, at 11:34 AM, Matthew Johnson via swift-evolution 
> <[email protected]> wrote:
> 
> Scope-based submodules
> 
>       • Proposal: SE-NNNN
>       • Authors: Matthew Johnson
>       • Review Manager: TBD
>       • Status: Awaiting review


Well, this is certainly comprehensive! Sorry about the delay in answering; I've 
been hosting a house guest and haven't had a lot of free time.

> The primary goal of this proposal is to introduce a unit of encapsulation 
> within a module that is larger than a file as a means of adding explicit 
> structure to a large program. All other goals are subordinate to this goal 
> and should be considered in light of it. 

I agree with this as the primary goal of a submodule system.

> Some other goals of this proposal are:
> 
>       • Submodules should help us to manage and understand the internal 
> dependencies of a large, complex system.
>       • Submodules should be able to collaborate with peer submodules without 
> necessarily being exposed to the rest of the module.
>       • A module should not be required to expose its internal submodule 
> structure to users when symbols are exported.
>       • It should be possible to extract a submodule from existing code with 
> minimal friction. The only difficulty should be breaking any circular 
> dependencies.

One goal I don't see mentioned here is "segment the API surface exposed to 
importing code". The `UIGestureRecognizerSubclass` use case has been thoroughly 
discussed, but I think there are probably a lot of cases where there are two 
"sides" to an API and it'd often be helpful to hide one unless it's needed. 
`URLProtocol` and `URLProtocolClient` come to mind; the many weird little 
classes and symbols related to `NSAtomicStore` and `NSIncrementalStore` might 
be another.

(I'm not necessarily suggesting that the Foundation and Core Data overlays 
should move these into submodules—I'm suggesting that, if they were implemented 
in a Swift with a submodule feature, they would be candidates for submodule 
encapsulation.)

> Submodule names form a hierarchical path:
> 
>       • The fully qualified name of the submodule specified by 
> Submodule.InnerSubmodule is: MyModuleName.Submodule.InnerSubmodule.
>       • In this example, InnerSubmodule is a child of Submodule.
>       • A submodule may not have the same name as any of its ancestors. This 
> follows the rule used by types.

Does being in a nested submodule have any semantic effect, or is it just a 
naming trick?

> Submodules may not be extended. They form strictly nested scopes.
> 
>       • The only way to place code in a submodule is with a submodule 
> declaration at the top of a file.
>       • All code in a file exists in a single submodule.

I'm a big supporter of the 1-to-N submodule-to-file approach.

> There are several other ways to specify which submodule the top-level scope 
> of a file is in. All of these alternatives share a crucial problem: you can’t 
> tell what submodule your code is in by looking at the file. 
> 
> The alternatives are:
> 
>       • Use a manifest file. This would be painful to maintain.
>       • Use file system paths. This is too tightly coupled to physical 
> organization. Appendix A discusses file system independence in more detail.
>       • Leave this up to the build system. This makes it more difficult for a 
> module to support multiple build systems.

I'm going to push back on this a little. I don't like the top-of-file 
`submodule` declaration for several reasons:

        1. Declarations in a Swift file are almost always order-independent. 
There certainly aren't any that must be the first thing in the file to be valid.

        2. Swift usually keeps configuration stuff out of source files so you 
can copy and paste snippets of code or whole files around with minimum fuss. 
Putting `submodule` declarations in files means that developers would need to 
open and modify those files if they wanted to copy them to a different project. 
(It's worth noting that your own goal of making it easy to extract submodules 
into separate modules is undermined by submodule declarations inside files.)

        3. However you're organizing your source code—whether in the file 
system, an IDE project, or whatever else—it's very likely that you will end up 
organizing files by submodule. That means either information about submodules 
will have to be specified twice—once in a canonical declaration and again in 
source file organization—and kept in sync, or IDEs and tooling will have to 
interpret the `submodule` declarations in source files and reflect that 
information in their UIs.

        4. Your cited reason for rejecting build system-based approaches is 
that "This makes it more difficult for a module to support multiple build 
systems", but Swift has this same problem in *many* other parts of its design. 
For instance, module names and dependencies are build system concerns, despite 
the fact that this makes it harder to support multiple build systems. I can 
only conclude that supporting multiple build systems with a single code base 
is, in the long term, a non-goal, presumably because the Xcode/SwiftPM story 
will be improved in some way.
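
Point 1 is easy to check in current Swift. A minimal sketch, using hypothetical 
type names that are not from the proposal: a type or function can be used 
before the line that declares it, so nothing here depends on position in the 
file.

```swift
// Hypothetical types, for illustration only.
// `Invoice` refers to `LineItem` before `LineItem` is declared.
struct Invoice {
    var items: [LineItem]

    // Sum of all line totals, using a helper declared further down.
    var total: Int {
        return items.reduce(0) { $0 + lineTotal($1) }
    }
}

// Declared after its first use above; the compiler does not care.
func lineTotal(_ item: LineItem) -> Int {
    return item.unitPrice * item.quantity
}

struct LineItem {
    var unitPrice: Int
    var quantity: Int
}
```

A top-of-file `submodule` statement would be an exception to this property.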

I'm still a fan of build-system-based approaches because I think they're better 
about these issues. The only way that they're worse is that—as you note—it may 
not be clear which submodule a particular file is in. But I think this is 
basically a UI problem for editors.

> Top-level export
> 
> All export statements consist of an access modifier, the export keyword, and 
> a submodule name:
> 
> open export ChildSubmodule

Is the access control keyword mandatory?

If we do indeed use `submodule` statements, could we attach the attributes to 
them, rather than having a separate statement in a different file?

What does it mean if a `public` or `open` symbol is in a submodule which is not 
`export`ed?

>       • A submodule may be published under a different external name using 
> the export as NewName syntax*.

What's the use case for this feature?

>       • @implicit causes symbols from the submodule to be implicitly imported 
> when the module is imported.
>       • @inline causes the symbols from the submodule to appear as if they 
> had been declared directly within the top-level submodule.

So if you write `@implicit public export Bar` in module `Foo`, then writing 
`import Foo` also imports `Foo.Bar.Baz` *as* `Foo.Bar.Baz`, whereas `@inline 
public export Bar` copies `Foo.Bar.Baz` into `Foo`, so it imports as `Foo.Baz`?

What's the use case for supporting both of these behaviors?
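
To make the two spellings concrete, here is a rough emulation in current Swift, 
which has no submodules. Caseless enums stand in for the module `Foo` and the 
submodule `Bar`, and a typealias stands in for what `@inline` might do. This is 
my reading of the proposal's intent, not its implementation.

```swift
// Caseless enums used as namespaces; they stand in for a module and a submodule.
enum Foo {
    enum Bar {
        struct Baz {
            var value: Int
        }
    }

    // Emulates the `@inline` reading: `Baz` also appears directly under `Foo`.
    // Without this alias, only the `@implicit`-style name `Foo.Bar.Baz` exists.
    typealias Baz = Bar.Baz
}

let viaSubmodule = Foo.Bar.Baz(value: 1)   // the `@implicit` spelling
let viaTopLevel = Foo.Baz(value: 1)        // the `@inline` spelling
```

Both names refer to the same type; only the path a client writes differs.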

> Exports within the module
> 
> A submodule may bound the maximum visibility of any of its descendent 
> submodules by explicitly exporting it:

I'm not sure how valuable this feature is in this kind of submodule design.

***

To avoid being coy, here's the export control model that *I* think would make 
the most sense for this general class of submodule system designs:

1. A submodule with `public` or `open` symbols is importable from outside the 
module. There is no need to separately mark the submodule as importable.

2. Normally, references within the module to submodule symbols need to be 
prefixed with the submodule name. (That is, in top-level `Foo` code, you need 
to write `Bar.Baz` to access `Foo.Bar.Baz`). As a convenience, you can import a 
submodule, which makes the submodule's symbols available to that file as though 
they were top-level module symbols.

3. When you import a submodule, you can mark it with `@exported`; this 
indicates that the symbols in that submodule should be aliased and, if `public` 
or `open`, re-exported to other modules.

4. There are no special facilities for renaming submodules or implicitly 
importing submodules.
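
Points 2 and 3 can be approximated in today's Swift, again with a caseless enum 
standing in for the submodule `Bar`. The aliases below are only an analogy for 
what `import Bar` and `@exported import Bar` would do under this model; the 
member names are hypothetical.

```swift
// Stands in for submodule `Bar` inside module `Foo`.
public enum Bar {
    public struct Baz {
        public var name: String
        public init(name: String) { self.name = name }
    }
}

// Point 2: without an import, code must write the prefixed name.
let prefixed = Bar.Baz(name: "prefixed")

// Point 2 (convenience): `import Bar` would act roughly like this file-local
// alias, letting this file drop the prefix.
private typealias Baz = Bar.Baz

func makeUnprefixedName() -> String {
    let baz = Baz(name: "unprefixed")   // no `Bar.` prefix needed in this file
    return baz.name
}

// Point 3: `@exported import Bar` would act more like declaring the alias
// `public`, so that other modules could also reach the type without the prefix:
// public typealias Baz = Bar.Baz
```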

> Importing submodules
> 
> Submodules are imported in exactly the same way as an external module by 
> using an import statement.

Okay, but what exactly does importing *do*? Set up un-prefixed private aliases 
for the submodule's internal-and-up APIs?

> There are a few additional details that are not applicable for external 
> modules:
> 
>       • Circular imports are not allowed.

Why not? In this design, all submodules are evaluated at once, so I'm not sure 
why circular imports would be a problem.
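
For comparison, declarations inside a single Swift module can already refer to 
each other cyclically, because the module is type-checked as a whole. A minimal 
sketch, with two caseless enums standing in for hypothetical submodules `Even` 
and `Odd`; whether submodule imports would be checked the same way is an 
assumption on my part.

```swift
// Two namespaces that depend on each other, standing in for two submodules.
enum Even {
    // Refers to `Odd`, which is declared below.
    static func check(_ n: UInt) -> Bool {
        return n == 0 ? true : Odd.check(n - 1)
    }
}

enum Odd {
    // Refers back to `Even`, completing the cycle.
    static func check(_ n: UInt) -> Bool {
        return n == 0 ? false : Even.check(n - 1)
    }
}
```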

> // `Grandparent` and all of its descendents can see `Child1` (fully qualified: `Grandparent.Parent.Child1`)
> // This reads: `Child1` is scoped to `Grandparent`.
> 
> scoped(Grandparent) export Child1
> 
> // `Child2` is visible throughout the module but may not be exported for use by clients.
> // This reads: `Child2` is scoped to the module.
> 
> scoped(module) export Child2
> 
> With parameterization, scoped has the power to specify all access levels that 
> Swift has today:
> 
> `scoped`                                      == `private` (Swift 3)
> `scoped(file)`                                == `private` (Swift 2 & 4?) == `fileprivate` (Swift 3)
> `scoped(submodule)`                           == `internal`
> `scoped(public) scoped(internal, inherit)`*   == `public`
> `scoped(public)`                              == `open`
> 
> The parameterization of scoped also allows us to reference other scopes that 
> we cannot in today’s system, specifically extensions: scoped(extension) and 
> outer types: scoped(TypeName).

What is the purpose of creating more verbose aliases for existing access 
levels? I can't think of one, which means that these are redundant.

And if we remove them as redundant, the remaining access control levels look 
like:

        scoped
        private
        scoped(TypeName)
        internal
        scoped(SomeModule)
        scoped(module)
        scoped(extension)
        public
        open

There's just no logic to the use of the `scoped` keyword here—it doesn't really 
mean anything other than "we didn't want to assign a keyword to this access 
level".

***

I think we need to go back to first principles here. The reason to introduce a 
new access level is that we believe that a submodule is a large enough unit of 
code that it will simultaneously need to encapsulate some of its implementation 
details from other submodules, *and* have some of its own implementation 
details encapsulated from the rest of the submodule. Thus, we need at least 
three access levels within a submodule: one that exposes an API to other 
submodules, one that exposes an API throughout a submodule, and one that 
exposes it to only part of a submodule.

What we do *not* need is a way to allow access only from certain other named 
submodules. The goal is to separate external and internal interfaces, not to 
micromanage who can access what.

Basically, that means we need one of two things. Keeping all existing keywords 
the same (i.e., not removing either `private` or `fileprivate`) and using 
`semi-` as a placeholder, we want to either have:

        private: surrounding scope
        fileprivate: surrounding file
        semi-internal: surrounding submodule
        internal: surrounding module
        public: all modules (no subclassing)
        open: all modules (with subclassing)

Or:

        private: surrounding scope
        fileprivate: surrounding file
        internal: surrounding submodule
        semi-public: surrounding module
        public: all modules (no subclassing)
        open: all modules (with subclassing)

The difference between the two is that, with `semi-internal` below `internal`, 
submodule APIs are exposed by default to other submodules; with `semi-public` 
above `internal`, submodule APIs are encapsulated by default from other 
submodules.

I think encapsulating by default is the right decision, so we want the 
`semi-public` design. But there's also a second reason to use that design: We 
can anticipate another use case for it. The library resilience design document 
discusses the idea of "resilience domains"—groups of libraries whose versions 
are always matched, and which therefore don't need to use resilient 
representations of each others' data structures—and the idea of having "SPIs", 
basically APIs that are only public to certain clients. I think these ideas 
could be conflated, so that a semi-public API would be available both to other 
submodules in the module and to other libraries in your resilience domain, and 
that this feature could be used to expose SPIs.

So, that leaves an important question: what the hell do you call this thing? My 
best suggestions are `confidential` and `privileged`; in the context of 
information, these are both used to describe information which *is* shared, but 
only within a select group. (Think, for instance, of attorney-client privilege: 
You can share this information with your lawyer, but not with anyone else.)

So in short, I suggest adding a single access level to the existing system:

        private
        fileprivate
        internal
        confidential/privileged
        public
        open

This is orthogonal to any other simplification of the access control system, 
like removing `private` or `fileprivate`.
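
For reference, the five existing levels can all be written in current Swift, 
and the new level would sit between `internal` and `public`. In this sketch the 
class and member names are hypothetical, and the suggested level appears only 
in a comment because no such keyword exists.

```swift
open class Document {                       // open: all modules, subclassable
    public var title: String                // public: all modules
    // confidential var revisionID: Int     // suggested: the whole module (and
    //                                      // perhaps its resilience domain);
    //                                      // not valid Swift
    internal var wordCount: Int = 0         // internal: the module today; the
                                            // surrounding submodule in this design
    fileprivate var cache: [String] = []    // fileprivate: this file
    private var isDirty = false             // private: the enclosing declaration

    public init(title: String) {
        self.title = title
    }

    // Public entry point that touches the more restricted members.
    public func append(_ word: String) {
        cache.append(word)
        wordCount += 1
        isDirty = true
    }
}
```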

> Appendix A: file system independence

I think we need to decide: Is a translation unit of some sort—whether it's a 
physical on-disk file or some simulacrum like a database record or just a 
separate string—something intrinsic to Swift? I think it should be; it 
simplifies a lot of parts of the language that would otherwise require nesting 
and explicit scoping.

If translation units are an implicit part of Swift, then this section is not 
really necessary. If translation units aren't, then we need to rethink a lot of 
things that are already built in.

-- 
Brent Royal-Gordon
Architechies

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
