Hey everyone, I've been thinking for a while now about the advantages of
nillable types, especially for basic types, and I'm interested in making an
official language proposal. I'd love to know what the rest of the community
thinks about it, and if this discussion has been had before (I found some
vaguely similar past proposals but they were all specifically about making
pointers easier to use, which is not the same thing).
Thanks,
Dan
# Proposal: Nillable types
## Summary
I propose that for any type T, the type `T | nil` should also be a valid
type with the following properties:
1. The zero value of a `T | nil` is `nil`.
2. Any `T` or `nil` may be used as a value of `T | nil`.
3. It is a compile-time error to use a `T | nil` as a `T` without first
checking that it is non-nil.
So basically, you should be able to write code like:
```go
func Foo(x int32 | nil) {
	if x != nil {
		fmt.Println("X squared is", x*x)
	} else {
		fmt.Println("X is undefined")
	}
}
```
## Motivation
There are loads of examples out there (I can dig up a bunch if it would be
helpful) where Go programs need values that can take some "unset"
value that is distinct from the zero value of that type. Examples include
configuration objects, command line flags, and quite a lot of large
structured data types (protobuf used to have every field be a pointer for
this reason, though in v3 they got rid of this in their Go implementation
because it was so frustrating to use, and now Go programs by default can't
distinguish unset values from zero values in protobufs).
### Alternatives and why they're bad
There are three common ways I've seen of dealing with this in Go, and they
all have serious limitations:
#### Pointers
The variable `var x *int32` allows you to distinguish between `x == nil`
and `*x == 0`. However, using a pointer has three major drawbacks:
1. No compile-time protection
Go does not force developers to put nil guards on pointers, so nothing will
stop you from dereferencing `x` without checking if it's nil. I've lost
count of the number of times I've seen code that panics in corner cases
because someone assumed that some pointer would always be set.
2. Passed values are mutable
If I want a function that can take an integer or an unset marker, and I
define it as `func foo(x *int32)`, then users of my library just have to
trust me that calling `foo(&some_var)` won't change the value of
`some_var`; this breaks an important encapsulation boundary.
3. No literals
You can't take the address of a literal, so instead of simply writing e.g.
`x = 3`, you have to either use a temporary variable:
```go
tmp := int32(3) // a bare 3 would make tmp an int, not an int32
x = &tmp
```
or write a helper function that does this for you. Numerous libraries have
implemented these helper functions - for example, both
https://pkg.go.dev/github.com/openconfig/ygot/ygot and
https://pkg.go.dev/github.com/golang/protobuf/proto define helpers named
`Bool`, `Float32`, `Float64`, `Int`, `Int32`, etc. just so that you can
write
```go
x = proto.Int32(3)
```
But this makes code a lot more cumbersome to read, and also requires the
developer to restate the type of `x` every time (as compared to `x = 3`,
where Go will infer that `3` means `int32(3)` based on the type of `x`).
[Sourcegraph finds more than 10k results for functions that just take an
int and return a pointer to
it](https://sourcegraph.com/search?q=context:global+/func+Int%5C%28%5Cw%2B+int%5C%29+%5C*int/+lang:go&patternType=keyword&sm=0)
so this is coming up A LOT.
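To make the three drawbacks concrete, here is a minimal sketch in today's Go. The names `ptr`, `describe`, and `clobber` are made up for illustration and don't come from any of the libraries mentioned above; the generic helper assumes Go 1.18 or later.

```go
package main

import "fmt"

// ptr is a hypothetical generic helper (Go 1.18+) that returns a pointer to
// a copy of v. It can stand in for per-type helpers such as proto.Int32, but
// the caller often still has to restate the type, e.g. ptr(int32(3)).
func ptr[T any](v T) *T {
	return &v
}

// describe guards against nil by hand. The compiler would accept this
// function just as happily without the check, and it would then panic on nil.
func describe(x *int32) string {
	if x == nil {
		return "unset"
	}
	return fmt.Sprintf("set to %d", *x)
}

// clobber illustrates drawback 2: a callee that receives a pointer can
// modify the caller's variable, and nothing in the signature rules that out.
func clobber(x *int32) {
	if x != nil {
		*x = 42
	}
}

func main() {
	var unset *int32
	fmt.Println(describe(unset))         // unset
	fmt.Println(describe(ptr(int32(0)))) // set to 0

	v := int32(3)
	clobber(&v)
	fmt.Println(v) // 42
}
```

Note that `ptr(3)` on its own would produce a `*int`, not a `*int32`, which is the "restate the type" problem again.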
#### Sentinel values
Some code uses explicit "impossible" values to indicate unset. For example,
a non-negative value might have type `int` and be set to -1 to indicate that
it is unset. However, this fails the criterion that the zero value for the
type should be unset. It also requires that every function using this value
check for -1 before using the value, and the compiler cannot enforce that
this check has been made.
Furthermore, this requires you to use a broader type than the type you
actually care about, which may be impossible (e.g. if the value can be any
float) or extremely unwieldy (e.g. if you have to use an integer to
represent a bool).
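As a rough illustration of the zero-value problem, consider the sketch below. `Config`, `unsetRetries`, and `retries` are hypothetical names invented for this example, not taken from any real library.

```go
package main

import "fmt"

// unsetRetries is a hypothetical sentinel meaning "no retry count configured".
const unsetRetries = -1

// Config is a hypothetical configuration struct. Retries is logically
// non-negative, but it has to be a signed int so that it can hold -1.
type Config struct {
	Retries int
}

// retries returns the configured value and whether it was set. Every reader
// of Retries has to repeat this check; the compiler does not require it.
func retries(c Config) (int, bool) {
	if c.Retries == unsetRetries {
		return 0, false
	}
	return c.Retries, true
}

func main() {
	var zero Config // zero value: Retries == 0, which reads as "set to 0"
	n, ok := retries(zero)
	fmt.Println(n, ok) // 0 true

	n, ok = retries(Config{Retries: unsetRetries})
	fmt.Println(n, ok) // 0 false
}
```

The zero-valued `Config` comes out as "set to 0", so anyone who wants "unset" has to remember to initialize the field to -1 explicitly.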
#### An additional bool
You can also approximate this by using a struct like
```go
struct {
	X      int32
	IsZero bool
}
```
The zero value for this struct has `IsZero=false`, so you can use that to
determine that `X` is not explicitly 0, but is in fact unset. However, this
is confusing (what does it mean if IsZero is true but X is not 0?) and
awkward (you have to remember to set IsZero any time you set X to 0, and to
check IsZero any time you want to read X), and again, the compiler will not
complain if you fail to do these.
An example of using BOTH a sentinel value AND an additional bool in Go's
own standard library is:
https://github.com/golang/go/blob/68d3a9e417344c11426f158c7a6f3197a0890ff1/src/crypto/x509/x509.go#L724
The `MaxPathLen` value is considered "unset" if it's set to -1 OR if it's
set to 0 and the bool `MaxPathLenZero` is false.
This is necessary if you want to be able to mark it unset in a single line
(`cert.MaxPathLen=-1`) but also have it be unset on any zero-valued (i.e.
uninitialized) certificate. But as a consequence, every single use of
MaxPathLen has to be guarded by multiple checks and any attempt to set it
has to be careful about setting the 0 indicator as well; forgetting to take
both possibilities into account would break your handling of X.509
certificates (which could even be a security issue, if you're rolling your
own certificate handler instead of using an existing one).
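The double check described above looks roughly like the sketch below. To keep it self-contained, `pathConstraint` is a hypothetical stand-in that only mirrors the two relevant fields of `x509.Certificate`, and `maxPathLen` and `setMaxPathLen` are helper names I made up; they are not part of `crypto/x509`.

```go
package main

import "fmt"

// pathConstraint is a hypothetical stand-in mirroring the two fields that
// crypto/x509.Certificate uses to encode the path length constraint.
type pathConstraint struct {
	MaxPathLen     int
	MaxPathLenZero bool
}

// maxPathLen returns the constraint and whether it is set, applying both
// "unset" encodings: the -1 sentinel and the zero value without the flag.
func maxPathLen(c pathConstraint) (int, bool) {
	if c.MaxPathLen == -1 {
		return 0, false // explicit sentinel
	}
	if c.MaxPathLen == 0 && !c.MaxPathLenZero {
		return 0, false // zero value of the struct
	}
	return c.MaxPathLen, true
}

// setMaxPathLen keeps the two fields consistent when writing a value.
func setMaxPathLen(c *pathConstraint, n int) {
	c.MaxPathLen = n
	c.MaxPathLenZero = n == 0
}

func main() {
	var c pathConstraint
	fmt.Println(maxPathLen(c)) // 0 false

	setMaxPathLen(&c, 0)
	fmt.Println(maxPathLen(c)) // 0 true

	c.MaxPathLen = -1
	fmt.Println(maxPathLen(c)) // 0 false
}
```

Any code that reads or writes the fields directly instead of going through helpers like these has to reproduce the same logic by hand.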
## Nillable Types
If Go supported nillable types, an example like `MaxPathLen` would be
written simply as
`MaxPathLen uint32 | nil`. The zero value would be nil (unset), setting it
to a specific value would be easy (`c.MaxPathLen=5`), and setting it to nil
would also be easy (`c.MaxPathLen=nil`).
Moreover, the compiler could enforce at compile time that a developer can't
forget the possibility of nil.
### Syntax
The syntax would simply be that any type can have `| nil` appended to it to
make a new, nillable type. For the sake of sanity, it would be reasonable
to generate syntax errors on redundant constructs like `int | nil | nil`.
This syntax is (in my opinion) more readable than some other languages'
syntaxes for the same (e.g. `int? x` in C#) and more concise than most
other languages (e.g. `x: typing.Optional[int]` in Python or
`x :: Maybe Int` in Haskell).
Types would be checked by type assertions:
```go
func foo2(x int | nil) {
	if i, ok := x.(int); ok {
		fmt.Println("x squared is", i*i)
	} else {
		fmt.Println("No value for x")
	}
}
```
As with other type assertions, you may omit `ok` if you're sure that the
value will match, but it will panic if you're wrong:
```go
func foo3(x int | nil) {
	if x != nil {
		fmt.Println("x squared is", x.(int)*x.(int))
	} else {
		fmt.Println("No value for x")
	}
}
```
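For comparison, this is how the two forms of type assertion already behave in today's Go on an ordinary interface value. The sketch uses `any` as a loose stand-in for `int | nil` (it is much weaker, since callers can pass any type at all), and `square` and `mustSquare` are hypothetical names.

```go
package main

import "fmt"

// square uses an ordinary interface value to stand in for "int or nil".
// Unlike the proposed int | nil, the compiler lets callers pass any type.
func square(x any) (int, bool) {
	i, ok := x.(int) // comma-ok form: never panics
	if !ok {
		return 0, false
	}
	return i * i, true
}

// mustSquare uses the single-value form, which panics if x does not hold
// an int (including when x is nil).
func mustSquare(x any) int {
	i := x.(int)
	return i * i
}

func main() {
	fmt.Println(square(4))   // 16 true
	fmt.Println(square(nil)) // 0 false

	defer func() {
		fmt.Println("recovered:", recover() != nil) // recovered: true
	}()
	mustSquare(nil) // panics: type assertion on a nil interface value
}
```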
### Nice-to-have: Implicit type guards
Ideally, the compiler would also infer simple type guards so that we
wouldn't need intermediate variables or unchecked type assertions:
```go
func foo(x int | nil) {
	if x != nil {
		fmt.Println("x squared is", x*x)
	} else {
		fmt.Println("No value for x")
	}
}
```
i.e. the compiler would infer that `x` cannot be nil inside the `if` block
and therefore must be an int. Obviously this is impossible to do for
arbitrary expressions, but typecheckers in numerous other languages (e.g.
`mypy` and TypeScript) do recognize simple checks like `if x is not None`
as type guards; I have no idea whether it would be difficult to add this to
Go.
### Expressed in terms of pointers
You could think of `T | nil` as being like a `*T` except with easier syntax
for using it and not passed by reference; you could implement it solely in
terms of AST transformations if you wanted to by having the following
statements correspond to each other:
`var x int32 | nil` -> `var _x *int32`
`x = nil` -> `_x = nil`
`x = 3` -> `var _tmp int32 = 3; _x = &_tmp`
`y, ok := x.(int32)` -> `var y int32; var ok bool; if _x == nil { ok = false } else { y = *_x; ok = true }`
`y := x.(int32)` -> `y := *_x`
`foo(x)` -> `if _x == nil { foo(nil) } else { _tmp := *_x; foo(&_tmp) }`
I doubt this would be the most efficient way to actually implement this
feature (I am not an expert on the internal workings of the Go compiler), but
the fact that it *could* be written this way makes me think it would not be
difficult to add to the language.
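Written out by hand as ordinary Go, the rewrite rules above would look something like the sketch below. This is only an illustration of the correspondence, not a claim about how the compiler would do it; `foo` and `callFoo` are hypothetical, and the comments show the nillable-type statement each fragment stands for.

```go
package main

import "fmt"

// foo plays the role of a function declared as func foo(x int32 | nil).
// After the rewrite it receives a pointer to a private copy of the value.
func foo(_x *int32) string {
	if _x == nil {
		return "nil"
	}
	*_x++ // mutating the copy cannot affect the caller's value
	return fmt.Sprint(*_x)
}

// callFoo is the rewritten form of the call foo(x): the value is copied
// before its address is passed, so x is not passed by reference.
func callFoo(_x *int32) string {
	if _x == nil {
		return foo(nil)
	}
	_tmp := *_x
	return foo(&_tmp)
}

func main() {
	var _x *int32 // var x int32 | nil

	// y, ok := x.(int32)
	var y int32
	var ok bool
	if _x != nil {
		y, ok = *_x, true
	}
	fmt.Println(y, ok) // 0 false

	// x = 3
	var _tmp int32 = 3
	_x = &_tmp

	fmt.Println(callFoo(_x)) // 4
	fmt.Println(*_x)         // 3
}
```

The last two lines show the point of the copy in `callFoo`: the callee sees and changes its own value, while the caller's `x` stays 3.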
### Other implications
This change would be entirely backward-compatible - no existing code would
contain the `T | nil` syntax, so nothing would change in the compilation of
any existing code.
I don't think this would make the language any harder to learn - the syntax
for using it is the same as the syntax for other type assertions, and the
use of `|` for union types is A) pretty common in other languages, and B)
under discussion as a more general Go feature in the sum types proposal
(https://github.com/golang/go/issues/57644), which this syntax fits nicely
with. Moreover, it would make a lot of code more readable: the `x509.go`
example from earlier has 16 lines of comments around `MaxPathLen` and
`MaxPathLenZero` just to explain how they interact, and additional comments
when they're used explaining again how they work; none of that would be
necessary if it were a single value-or-nil.