(Looking for some feedback on real-world code usage. Please read to the end,
then if you can experiment with the code you work on and report back, I'd
appreciate it!)
Switch expressions, from a type checking perspective, are basically
generalizations of conditional expressions: instead of 2 operands to check, we
have n.
A reasonable expectation is that, if I rewrite my conditional expression as a
switch expression, it will behave the same:
test ? foo() : bar()
is equivalent to
switch (test) { case true -> foo(); case false -> bar(); }
So, as a starting point, the typing rules for switches should be the same as
the typing rules for conditionals, but generalized to an arbitrary number of
results.
(The "results" of a switch expression are all expressions appearing after a
'->' or a 'break'.)
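As a small illustration of what counts as a "result", here is a sketch. It assumes the syntax that later shipped in Java 14, where a value-bearing 'break' is spelled 'yield'; the class and method names are made up for the example.

```java
// Hypothetical example: each expression marked "result" below is one of the
// switch expression's results in the sense used above.
class SwitchResults {
    static String label(int k) {
        return switch (k) {
            case 0 -> "zero";            // result: "zero"
            case 1 -> {
                String s = "one";
                yield s;                 // result: s (assumes Java 14+ 'yield')
            }
            default -> "many";           // result: "many"
        };
    }

    public static void main(String[] args) {
        System.out.println(label(0));
        System.out.println(label(1));
        System.out.println(label(7));
    }
}
```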
Conditional expressions and switch expressions are typically used as poly
expressions (in a context that has a target type). But that won't always be the
case. One notable usage that doesn't have a target type is an initializer for
'var': "var x = ...". So they are sometimes poly expressions, sometimes
standalone.
Conditional expression typing is driven by an ad hoc categorization scheme
which looks at the result expressions and tries to predict whether they will
all have type boolean/Boolean, primitive/boxed number, or something else/a mix
("tries to predict" because in some cases we can't type-check the expression
until we've completed the categorization).
In the numeric case, we then identify the narrowest primitive type that can
contain the results.
In the other/mixed case, we then type check by pushing down a target type, or,
if none is available, producing a reference type from the lub operation.
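To make the categories concrete, here is a minimal sketch of how existing conditional expressions are typed. The class and method names are invented for illustration, and the 'var' case assumes Java 10 or later.

```java
// Hypothetical demo of the conditional categories described above.
class ConditionalCategories {
    // Numeric case, primitive operands: int and long promote to long,
    // so the value is boxed to Long even though the target is Object.
    static Object primitiveNumeric(boolean test) {
        int i = 1;
        long l = 2L;
        return test ? i : l;
    }

    // Numeric case, boxed operands: Integer and Double are unboxed and
    // promoted to double, so the result is a Double.
    static Object boxedNumeric(boolean test) {
        Integer ibox = 1;
        Double dbox = 2.0;
        return test ? ibox : dbox;
    }

    // Other/mixed case: the target type Object is pushed down to each
    // operand, so the int literal is simply boxed to Integer.
    static Object mixed(boolean test) {
        return test ? 1 : "one";
    }

    // No target type: 'var' makes the conditional standalone; the numeric
    // rule still applies and v has type long.
    static Object standalone(boolean test) {
        var v = test ? 1 : 2L;
        return v;
    }

    public static void main(String[] args) {
        System.out.println(primitiveNumeric(true).getClass().getSimpleName()); // Long
        System.out.println(boxedNumeric(true).getClass().getSimpleName());     // Double
        System.out.println(mixed(true).getClass().getSimpleName());            // Integer
        System.out.println(standalone(true).getClass().getSimpleName());       // Long
    }
}
```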
A couple of observations:
- The primitive vs. reference choice is meaningful, because the primitive and
reference type hierarchies are different (e.g., int can be widened to long, but
Integer can't be widened to Long). Preferring primitive typing where possible
seems like the right choice.
- The ad hoc categorization is a bit of a mess. It's complex and imperfect.
What people probably expect is that, where a target type is available, that's
what the compiler will use—but the compiler ignores the target type in the
primitive cases.
Why? Well, in Java 8, when we introduced target typing of conditionals, we
identified some incompatibilities that would occur if we changed the handling
of primitives, and we didn't want to be disruptive.
Some examples:
Boolean x = test ? z : zbox; // specified: can NPE; target typing: no null check
Integer x = test ? s : i; // specified: ok;
                          // target typing: can't convert short->Integer
Number x = test ? s : i; // specified: box to Integer;
                         // target typing: box to Short or Integer
double d = test ? l : f; // specified: long->float loses precision;
                         // target typing: long->double better precision
m(test ? z : zbox); // specified: prefers m(boolean);
                    // target typing: m(boolean) and m(Boolean) are ambiguous
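The currently specified behavior in these examples can be observed with a small program. This is only a sketch: the class name, the helper methods, and the concrete values chosen for z, zbox, s, i, l, and f are assumptions made for the demonstration.

```java
// Hypothetical demo of the specified (non-target-typed) behavior above.
class PrimitiveConditionals {
    static String m(boolean b) { return "m(boolean)"; }
    static String m(Boolean b) { return "m(Boolean)"; }

    // The conditional has type boolean, so zbox is unboxed; a null zbox
    // throws NullPointerException when the false branch is taken.
    static boolean unboxingThrows(boolean test) {
        boolean z = true;
        Boolean zbox = null;
        try {
            Boolean x = test ? z : zbox;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // short and int promote to int, so the Number is always an Integer.
    static Number boxedNumber(boolean test) {
        short s = 1;
        int i = 2;
        Number x = test ? s : i;
        return x;
    }

    // long and float promote to float before widening to double, so
    // 2^24 + 1 (not representable as a float) loses its low bit.
    static double throughFloat(boolean test) {
        long l = (1L << 24) + 1;
        float f = 0f;
        double d = test ? l : f;
        return d;
    }

    // The argument is a standalone boolean expression, so m(boolean) wins.
    static String overload(boolean test) {
        boolean z = true;
        Boolean zbox = Boolean.FALSE;
        return m(test ? z : zbox);
    }

    public static void main(String[] args) {
        System.out.println(unboxingThrows(false));                        // true
        System.out.println(boxedNumber(true).getClass().getSimpleName()); // Integer
        System.out.println(throughFloat(true));                           // 1.6777216E7
        System.out.println(overload(true));                               // m(boolean)
    }
}
```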
At this point, we've got a choice:
A) Fully mimic the conditional behavior in switch expressions
B) Do target typing (when available) for all switch expressions, diverging from
conditionals
C) Do target typing (when available) for all switches and conditionals,
accepting the incompatibilities
(A) sacrifices simplicity. (B) sacrifices consistency. (C) sacrifices
compatibility.
General thoughts on simplicity (is the current behavior hard to understand?)
and consistency (is it bad if the conditional/switch refactoring leads to
subtly different typing?) are welcome.
And we could use some clarification on just how significant the compatibility
costs of (C) are. With that in mind, here's a javac patch:
http://cr.openjdk.java.net/~dlsmith/logPrimitiveConditionals.patch
A javac built with this patch supports an option that will output diagnostics
wherever conditionals at risk of incompatible change are detected:
javac -XDlogPrimitiveConditionals Foo.java
If you're able to build OpenJDK with this patch and run it on some real-world
code, I'd appreciate any insights about what you find.
—Dan