If we have
var x = new Foo();
...
x = new Bar();
we compute the type of x based solely on the type of the initializer. So the above code is equivalent to:
Foo x = new Foo();
...
x = new Bar();
If Bar is a subtype of Foo, this code is fine; otherwise it is a type error -- just like before. Local variable type inference is purely local; we compute the type of x from its initializer, and thereafter, it is as if it were manifestly typed.
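A minimal sketch of this rule, assuming a Java 10 or later compiler. Foo and Bar are the placeholder types from the text; the name() and reassign() methods are hypothetical and exist only to make the result observable.

```java
class InitializerTyping {
    static class Foo {
        String name() { return "Foo"; }
    }

    static class Bar extends Foo {
        @Override
        String name() { return "Bar"; }
    }

    static String reassign() {
        var x = new Foo();  // the static type of x is Foo, taken from the initializer
        x = new Bar();      // fine: Bar is a subtype of Foo
        // x = "text";      // would not compile: String is not a subtype of Foo
        return x.name();    // dispatches on the runtime object, which is now a Bar
    }

    public static void main(String[] args) {
        System.out.println(reassign());
    }
}
```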
It would be possible to attempt to infer a type compatible with all assignments, say by taking the least upper bound of all types to which x were assigned, but this is likely to cause more problems than it solves. Such types are often complex and surprising (such as intersection types like Comparable & Serializable), and would lead to "action at a distance" errors. The simple scheme outlined here seems far preferable.
ArrayList, can you infer List instead, as we probably would have written by hand?
That would be confusing type inference with mind reading.
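One way to observe that the inferred type is exactly the initializer's type is through overload resolution. The sketch below assumes Java 10 or later; the which() overloads and the class name are hypothetical helpers, not part of the feature.

```java
import java.util.ArrayList;
import java.util.List;

class InferredNotWidened {
    static String which(List<String> list) { return "List"; }
    static String which(ArrayList<String> list) { return "ArrayList"; }

    static String inferred() {
        var names = new ArrayList<String>();     // static type: ArrayList<String>
        return which(names);                     // the more specific overload is chosen
    }

    static String manifest() {
        List<String> names = new ArrayList<>();  // static type: List<String>, chosen by hand
        return which(names);                     // only the List overload applies
    }

    public static void main(String[] args) {
        System.out.println(inferred() + " vs " + manifest());
    }
}
```

If the wider type is what you want, writing the manifest type remains the way to say so.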
We agree it would be nice to be able to write:
var ints = { 1, 2, 3 }
The way the feature works, though, is: we derive the type of the variable by treating the initializer as a standalone expression, and deriving its type. However, array initializers, like lambdas and method refs, are poly expressions -- they need a target type in order to compute their type. So they are rejected.
Could we make this work? We probably could. But it would add a lot of complexity to the feature, for the benefit of a mostly corner case. We prefer to err on the side of simplicity here.
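A short sketch of the distinction, assuming Java 10 or later. The rejected forms are shown as comments; the class and method names are hypothetical.

```java
import java.util.function.IntUnaryOperator;

class PolyInitializers {
    static int sumOfIncremented() {
        // var ints = { 1, 2, 3 };          // rejected: an array initializer needs a target type
        var ints = new int[] { 1, 2, 3 };   // accepted: this expression has the standalone type int[]

        // var inc = i -> i + 1;            // rejected: a lambda needs a target type
        IntUnaryOperator inc = i -> i + 1;  // the manifest type supplies the target

        int total = 0;
        for (var i : ints) {                // var also works for the loop variable
            total += inc.applyAsInt(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfIncremented());
    }
}
```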
Yes.
If you say:
var x = new ArrayList<>()
then you're asking the compiler to infer both the type argument to ArrayList and the type of x. But you've not provided enough type information for the compiler to do a good job.
In most cases, you'll get an informative compiler error telling you that you're asking for your mind to be read. In some cases, we'll fall back to inferring Object, as we currently do with:
Object o = new ArrayList<>() // always inferred ArrayList<Object> here
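A sketch of the fallback case, assuming the behaviour of a Java 10 or later compiler, where var combined with a bare diamond is accepted and the element type falls back to Object. Supplying the type argument on the right-hand side gives the compiler the information it was missing. The class and variable names are hypothetical.

```java
import java.util.ArrayList;

class DiamondWithVar {
    static int demo() {
        var loose = new ArrayList<>();         // inferred as ArrayList<Object>
        loose.add("text");
        loose.add(42);                         // anything goes: the element type is Object

        var strict = new ArrayList<String>();  // inferred as ArrayList<String>
        strict.add("text");
        // strict.add(42);                     // would not compile: int is not a String

        return loose.size() + strict.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```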
It doesn't; this is an error.
Yes.
Yes.
var and as variable names, package names, or method names?
No.
var and as a class name?
Yes. We consider this to be an acceptable source incompatibility, because such names violate the long-standing convention that class names should start with an uppercase letter, and they are likely to be quite rare.
No! Variables are still statically typed, as they have always been. What's happening here is that you have the opportunity to let the compiler figure out the type, rather than typing it out -- but the variable still has a fixed static type.
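A small sketch, assuming Java 10 or later, showing that the inferred type is fixed at the declaration and governs everything that follows. The names are hypothetical.

```java
class StillStaticallyTyped {
    static int halfOfCount() {
        var count = 7;        // static type int
        // count = "seven";   // would not compile: the type of count is already fixed
        return count / 2;     // integer division, because count is an int
    }

    static double halfOfRatio() {
        var ratio = 7.0;      // static type double
        return ratio / 2;     // floating-point division, because ratio is a double
    }

    public static void main(String[] args) {
        System.out.println(halfOfCount() + " " + halfOfRatio());
    }
}
```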
Yes. But good developers will also be able to use this feature to write clear and readable code with less ceremony.
When used properly, no; poor readability is likely to stem from elsewhere, such as poorly chosen variable names. It is easy to point to code like:
var x = y.getFoo()
as evidence of "see, it's unreadable, you have no idea what 'x' is." But the readability problem here is that 'x' is just a poorly chosen variable name, not the lack of manifest type.
While every situation may be different, we believe that, with judicious use of this feature in well-written code, readability is actually enhanced. Consider a block of locals:
UserModelHandle userDB = broker.findUserDB();
List<User> users = db.getUsers();
Map<User, Address> addressesByUser = db.getAddresses();
The most important thing on each line is the variable name -- because this describes the role of the variable in the current program. And the variable names are not so easy to visually pick out from the above code -- they're stuck in the middle of each line, and at a different place on each line.
This feature moves the most important thing to be front-and-center in the reader's view. With inferred types:
var userDB = broker.findUserDB();
var users = db.getUsers();
var addressesByUser = db.getAddresses();
the true intent of this code pops out much more readily; the variable names are (almost) front and center. The lack of manifest types is not an impediment, because we've chosen good variable names.
Another aspect in which this feature could improve readability is that users frequently construct complex nested and chained expressions, not because this is the most readable way to write the code, but because the overhead of declaring additional temporary variables seems burdensome. By reducing this overhead, implementation patterns will likely re-equilibrate to a less artificially condensed form, enhancing readability.
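As an illustration of that trade-off, here is the same computation written once as a single chained expression and once with named temporaries. This is a sketch assuming Java 10 or later; the method names and the word-list example are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class NamedTemporaries {
    // Condensed form: one chained expression, no intermediate names.
    static String condensed(List<String> words) {
        return words.stream().filter(w -> !w.isEmpty()).map(String::toUpperCase)
                .sorted().collect(Collectors.joining(","));
    }

    // Same pipeline with named steps; var keeps each declaration short.
    static String spelledOut(List<String> words) {
        var nonEmpty = words.stream().filter(w -> !w.isEmpty());
        var upperCased = nonEmpty.map(String::toUpperCase);
        var sorted = upperCased.sorted();
        return sorted.collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        var words = List.of("b", "", "a");
        System.out.println(condensed(words) + " / " + spelledOut(words));
    }
}
```

Without var, each temporary in spelledOut would need a Stream<String> declaration, which is the ceremony that tends to push people toward the condensed form.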
Type inference is constraint solving. The main choices a language designer gets to make in designing a type inference algorithm are where to get the constraints, and when (over what scope) to solve. We could, if we chose, let all assignments to a variable contribute constraints, and solve globally; while this produces nice sharp types (though sometimes surprising types) when it works, it produces extremely confusing error messages when it doesn't, and means that a change anywhere in a program could affect things far away in the program.
For example, if we used all assignments to constrain the type of an implicitly typed variable, then given:
var x = "";
...
x = anInteger;
the compiler might compute the type of x by taking the least upper bound of String and Integer. (You might think this would be Object, but it is really something more like Object&Serializable&Comparable, if not more complicated.)
Action-at-a-distance refers to the fact that a use of x several hundred lines away could change the type of x, and cause confusing errors nowhere near where the change occurred.
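The least upper bound of String and Integer can already be observed in Java through generic method inference, which gives a feel for the kind of type a global scheme would produce. The sketch below assumes Java 10 or later; the names are hypothetical, and the exact members of the intersection depend on the JDK version.

```java
import java.io.Serializable;
import java.util.List;

class LeastUpperBound {
    static String demo() {
        // The element type is inferred as the least upper bound of String and Integer,
        // an intersection type that cannot be written down in source.
        var mixed = List.of("text", 42);

        // Both assignments compile without casts, which shows the element type
        // is more specific than plain Object.
        Serializable s = mixed.get(0);
        Comparable<?> c = mixed.get(1);

        return s + ":" + c;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```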
int x[] convention?
In part, this would be asking for half-inference; "I want x to be an array with a given rank, but please infer the component type." Valid, but that's basically a different feature. Besides, this convention is, at this point, mostly vestigial.
Field and method declarations are part of a class's interface contract, and can be referenced from other classes (meaning their type descriptors will be copied into other classfiles, and dynamically linked by descriptor match at runtime). This means that small changes to the program implementation could silently turn into binary incompatibilities if the type changes subtly and the client is not recompiled.
The operating theory here is that local variables are implementation details -- they are part of the method implementation, not part of the class's interface contract. They cannot be referenced from other compilation units or other methods. Therefore, a lower degree of ceremony is needed than when specifying interface contracts across compilation units.
Yes, we could choose to include these; this would be a tradeoff of broader applicability for greater complexity. We're inclined to keep it simple.
The JDK is pretty large, but yes, we already did this over a corpus of ~100 popular OSS projects including Eclipse, NetBeans, and many Apache projects. We got essentially the same numbers.
Left-hand diamond is a viable feature, and has pros and cons. It helps for generic types, but doesn't help at all for long type names (AbstractBeanProviderFactory), which means it would not be an either/or choice, but an either/both choice.
We expect tooling will evolve to help here. In languages that have this feature, IDEs will show you the type as you hover over the variable name -- most Java IDEs already do this. (We are also considering providing a javac option to produce a source file that shows all inferred types (lambda formals, generic method type parameters, diamond constructor args, inferred locals) which would be useful in debugging.)
No.
No; there is no runtime component to this feature.
In any realistic case, no. (Even with a manifest type, the compiler still has to synthesize the type of the initializer, and perform a subtyping check against the manifest type.)
No; local variables do not appear in Javadoc.
There are no classfile changes mandated by this feature.