FAQ for JEP 286 (Local Variable Type Inference)

Inference

How do we infer the type of a variable that has multiple assignments?

If we have

 var x = new Foo();
 ...
 x = new Bar();

we compute the type of x based solely on the type of the initializer. So the above code is equivalent to:

 Foo x = new Foo();
 ...
 x = new Bar();

If Bar is a subtype of Foo this code is fine, otherwise it is a type error -- just like before. Local variable type inference is purely local; we compute the type of x from its initializer, and thereafter, it is as if it were manifestly typed.

It would be possible to attempt to infer a type compatible with all assignments, say by taking the least upper bound of all the types assigned to x, but this is likely to cause more problems than it solves. Such types are often complex and surprising (such as intersection types like Comparable & Serializable), and would lead to "action at a distance" errors. The simple scheme outlined here seems far preferable.
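
The initializer-only rule can be observed in a minimal sketch, assuming a Java 10 or later compiler. Foo, Bar, and the describe overloads are hypothetical names used only for illustration; overload resolution is used because it depends on the static type of x, not on the object x refers to at runtime.

```java
// Hypothetical classes for illustration; Bar is a subtype of Foo.
class Foo {}
class Bar extends Foo {}

class InitializerOnlyDemo {
    // Overloads are selected at compile time from the static type of the argument.
    static String describe(Foo f) { return "Foo"; }
    static String describe(Bar b) { return "Bar"; }

    static String run() {
        var x = new Foo();   // static type of x is Foo, taken from the initializer
        x = new Bar();       // allowed, because Bar is a subtype of Foo
        return describe(x);  // selects describe(Foo): x is still statically a Foo
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The later assignment changes the object x refers to, but not the type that was inferred for x.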

If the type of the initializer is ArrayList, can you infer List instead, as we probably would have written by hand?

That would be confusing type inference with mind reading.
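
A short sketch of the difference, assuming a Java 10 or later compiler (the class and variable names are illustrative only). The call to ensureCapacity, which is declared on ArrayList but not on List, compiles only because the inferred type is ArrayList:

```java
import java.util.ArrayList;
import java.util.List;

class InferredListDemo {
    static int run() {
        // Inferred as ArrayList<String>, exactly the type of the initializer.
        var names = new ArrayList<String>();
        names.ensureCapacity(16); // compiles only because names is an ArrayList, not a List
        names.add("duke");

        // To get the interface type, write it manifestly.
        List<String> asList = new ArrayList<>();
        asList.add("duke");
        // asList.ensureCapacity(16); // would not compile: List declares no such method

        return names.size() + asList.size();
    }
}
```

If the interface type matters, a manifest type remains available.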

Why is it not possible to use var when the initializer is an array initializer?

We agree it would be nice to be able to write:

 var ints = { 1, 2, 3 }

The way the feature works, though, is that we derive the type of the variable by treating the initializer as a standalone expression and computing its type. However, array initializers, like lambdas and method references, are poly expressions -- they need a target type in order to compute their type. So they are rejected.

Could we make this work? We probably could. But it would add a lot of complexity to the feature, for the benefit of a mostly corner case. We prefer to err on the side of simplicity here.
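
In the meantime, an array creation expression carries its own type and so works as an initializer. A minimal sketch, assuming a Java 10 or later compiler (the class name is illustrative only):

```java
class ArrayInitializerDemo {
    static int sum() {
        // var ints = { 1, 2, 3 };         // rejected: a bare array initializer has no standalone type
        var ints = new int[] { 1, 2, 3 };  // accepted: an array creation expression has type int[]
        int total = 0;
        for (int i : ints) {
            total += i;
        }
        return total;
    }
}
```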

Will this work for the index variable of a for loop or foreach loop?

Yes.
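
For example, a minimal sketch of both loop forms, assuming a Java 10 or later compiler (the class and variable names are illustrative only):

```java
import java.util.List;

class LoopVariableDemo {
    static String run() {
        var out = new StringBuilder();
        // Index variable of a basic for loop: inferred as int from the initializer 0.
        for (var i = 0; i < 3; i++) {
            out.append(i);
        }
        // Variable of an enhanced for loop: inferred as String from the element type.
        for (var word : List.of("a", "b")) {
            out.append(word.toUpperCase());
        }
        return out.toString();
    }
}
```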

What happens if we ask for inference on both sides?

If you say:

 var x = new ArrayList<>()

then you're asking the compiler to infer both the type argument to ArrayList and the type of x. But you've not provided enough type information for the compiler to do a good job.

In most cases, you'll get an informative compiler error telling you that you're asking for your mind to be read. In some cases, we'll fall back to inferring Object, as we currently do with:

 Object o = new ArrayList<>()  // always inferred ArrayList<Object> here
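
As an illustration of the fallback, the sketch below assumes a Java 10 or later compiler, where this particular combination is accepted and the missing type argument is inferred as Object. The class and variable names are illustrative only.

```java
import java.util.ArrayList;

class DiamondWithVarDemo {
    static int run() {
        // No target type and no explicit type argument: inferred as ArrayList<Object>.
        var anything = new ArrayList<>();
        anything.add("text");
        anything.add(42);                 // accepted, since the element type is Object
        Object first = anything.get(0);   // elements come back as Object

        // Supplying the type argument on the right gives the intended type.
        var names = new ArrayList<String>();
        names.add("duke");
        String name = names.get(0);       // no cast needed

        return anything.size() + name.length();
    }
}
```

Writing the type argument on the right-hand side is usually what was meant.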

How does it work with variables that don't have initializers?

It doesn't; this is an error.

Can I mix explicit and implicit types in the same program?

Yes.

Can I say "final var" to enforce immutability?

Yes.
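
A minimal sketch, assuming a Java 10 or later compiler (the names are illustrative only). As with any final local, this prevents reassignment of the variable; it does not make the referenced object immutable.

```java
class FinalVarDemo {
    static int run() {
        final var limit = 10;   // inferred as int, and cannot be reassigned
        // limit = 20;          // would not compile: limit is final

        var counter = 0;        // not final, so reassignment is allowed
        counter = counter + limit;
        return counter;
    }
}
```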

Compatibility

Will this interfere with existing uses of var as a variable name, package name, or method name?

No.
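
A minimal sketch of existing-style code that keeps compiling, assuming a Java 10 or later compiler (the class and its members are hypothetical):

```java
class VarAsIdentifierDemo {
    // A method named var is still legal.
    static int var() {
        return 1;
    }

    static int run() {
        int var = 2;           // a manifestly typed local variable named var
        var other = var + 1;   // the first var requests inference, the second is the variable
        return other + var();  // call to the method named var
    }
}
```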

Will this interfere with existing uses of var as a class name?

Yes. We consider this to be an acceptable source incompatibility, because such class names violate the long-standing naming convention that class names should start with an uppercase letter, and they are likely to be quite rare.

General

Is this dynamic typing?

No! Variables are still statically typed, as they have always been. What's happening here is that you have the opportunity to let the compiler figure out the type, rather than typing it out -- but the variable still has a fixed static type.
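
A minimal sketch, assuming a Java 10 or later compiler (the names are illustrative only). The inferred type is fixed at compile time and drives later typing decisions, such as the use of int division here:

```java
class StaticTypeDemo {
    static int run() {
        var count = 7;          // inferred as int at compile time
        // count = "seven";     // would not compile: a String is not an int
        var half = count / 2;   // int division, so half is the int 3
        return half;
    }
}
```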

Won't bad developers misuse this feature to write terrible code?

Yes. But good developers will also be able to use this feature to write clear and readable code with less ceremony.

Doesn't this reduce readability?

When used properly, no; poor readability is likely to stem from elsewhere, such as poorly chosen variable names. It is easy to point to code like:

 var x = y.getFoo()

as evidence of "see, it's unreadable, you have no idea what 'x' is." But the readability problem here is that 'x' is just a poorly chosen variable name, not the lack of a manifest type.

While every situation may be different, we believe that, with judicious use of this feature in well-written code, readability is actually enhanced. Consider a block of locals:

 UserModelHandle userDB = broker.findUserDB();
 List<User> users = db.getUsers();
 Map<User, Address> addressesByUser = db.getAddresses();

The most important thing on each line is the variable name -- because this describes the role of the variable in the current program. And the variable names are not so easy to visually pick out from the above code -- they're stuck in the middle of each line, and at a different place on each line.

This feature moves the most important thing to be front-and-center in the reader's view. With inferred types:

 var userDB = broker.findUserDB();
 var users = db.getUsers();
 var addressesByUser = db.getAddresses();

the true intent of this code pops out much more readily; the variable names are (almost) front and center. The lack of manifest types is not an impediment, because we've chosen good variable names.

Another way in which this feature could improve readability is that users frequently construct complex nested and chained expressions, not because this is the most readable way to write the code, but because the overhead of declaring additional temporary variables seems burdensome. By reducing this overhead, implementation patterns will likely re-equilibrate to a less artificially condensed form, enhancing readability.
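
To make this concrete, here is a sketch of the same computation written both ways, assuming a Java 10 or later compiler. The ORDERS data and the method names are hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.Map;

class TemporariesDemo {
    // Hypothetical data: order totals keyed by customer name.
    static final Map<String, List<Integer>> ORDERS =
            Map.of("ann", List.of(10, 20, 30), "bob", List.of(5));

    // Condensed form: everything in one chained expression.
    static double condensed(String customer) {
        return ORDERS.getOrDefault(customer, List.of()).stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
    }

    // Same computation with named temporaries; var keeps the declarations short.
    static double stepwise(String customer) {
        var totals = ORDERS.getOrDefault(customer, List.of());      // List<Integer>
        var amounts = totals.stream().mapToInt(Integer::intValue);  // IntStream
        var average = amounts.average();                            // OptionalDouble
        return average.orElse(0.0);
    }
}
```

Both methods return the same result; the second names each intermediate step without spelling out List<Integer>, IntStream, and OptionalDouble.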

What do you mean by "action at a distance"?

Type inference is constraint solving. The main choices a language designer gets to make in designing a type inference algorithm are where to get the constraints, and when (over what scope) to solve. We could, if we chose, let all assignments to a variable contribute constraints, and solve globally; while this produces nice sharp types (though sometimes surprising types) when it works, it produces extremely confusing error messages when it doesn't, and means that a change anywhere in a program could affect things far away in the program.

For example, if all assignments were used to constrain the type of an implicitly typed variable, and we had:

 var x = "";
 ...
 x = anInteger;

the compiler might compute the type of x by taking the least upper bound of String and Integer. (You might think this would be Object, but it is really something more like Object & Serializable & Comparable, if not more complicated.)

Action-at-a-distance refers to the fact that a use of x several hundred lines away could change the type of x, and cause confusing errors nowhere near where the change occurred.
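
The shape of such a least upper bound can already be seen in Java when a single generic method call is given both a String and an Integer. The sketch below assumes a Java 10 or later compiler; pick is a hypothetical helper, not part of any library.

```java
import java.io.Serializable;

class LeastUpperBoundDemo {
    // Hypothetical helper: T must be a common supertype of both arguments.
    static <T> T pick(boolean first, T a, T b) {
        return first ? a : b;
    }

    static boolean run() {
        // T is inferred as the least upper bound of String and Integer,
        // an intersection type that cannot be written in source.
        var x = pick(true, "", 42);

        // Both assignments compile without a cast, so the inferred type
        // is more specific than plain Object.
        Serializable s = x;
        Comparable<?> c = x;
        return s.equals("") && c.equals("");
    }
}
```

Here the surprising type is confined to one expression; under a solve-over-all-assignments scheme, every assignment to x anywhere in the method could reshape it.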

Why don't we infer types for variables that use the C-style int x[] convention?

In part, this would be asking for half-inference; "I want x to be an array with a given rank, but please infer the component type." Valid, but that's basically a different feature. Besides, this convention is, at this point, mostly vestigial.

Why exclude field declarations?

Field and method declarations are part of a class's interface contract, and can be referenced from other classes (meaning their type descriptors will be copied into other classfiles, and dynamically linked by descriptor match at runtime). This means that small changes to the program implementation could silently turn into binary incompatibilities if the type changes subtly and the client is not recompiled.

The operating theory here is that local variables are implementation details -- they are part of the method implementation, not part of the class's interface contract. They cannot be referenced from other compilation units or other methods. Therefore, a lower degree of ceremony is needed than when specifying interface contracts across compilation units.

But you could allow inference on private fields or private method return types, as these are not part of the public API. Why not?

Yes, we could choose to include these; this would be a tradeoff of broader applicability for greater complexity. We're inclined to keep it simple.

You ran the prototype over the JDK as a corpus, and gathered statistics. Can you run it over a larger source base?

The JDK is pretty large, but yes, we already did this over a corpus of ~100 popular OSS projects including Eclipse, NetBeans, and many Apache projects. We got essentially the same numbers.

Why not do "left hand diamond" instead?

Left-hand diamond is a viable feature, and has pros and cons. It helps for generic types, but doesn't help at all for long type names (AbstractBeanProviderFactory), which means it would not be an either/or choice, but an either/both choice.

Tooling

How do we debug programs with inferred types?

We expect tooling will evolve to help here. In languages that have this feature, IDEs will show you the type as you hover over the variable name -- most Java IDEs already do this. We are also considering providing a javac option to produce a source file that shows all inferred types (lambda formals, generic method type parameters, diamond constructor arguments, inferred locals), which would be useful in debugging.

Does this affect the runtime or classfile?

No.

Will this affect runtime performance?

No; there is no runtime component to this feature.

Will this affect compiler performance?

In any realistic case, no. (Even with a manifest type, the compiler still has to synthesize the type of the initializer, and perform a subtyping check against the manifest type.)

Does this affect Javadoc?

No; local variables do not appear in Javadoc.

Will it require targeting the latest class file?

There are no classfile changes mandated by this feature.