Lambda Specification, Part G: Type Inference

Navigation: Overview - Part A - Part B - Part C - Part D - Part E - Part F - Part G - Part H - Part J
Sections: 18 - 18.1 - 18.1.1 - 18.1.2 - 18.1.3 - 18.2 - 18.2.1 - 18.2.1.1 - 18.2.1.2 - 18.2.2 - 18.2.3 - 18.2.4 - 18.2.5 - 18.3 - 18.4 - 18.5 - 18.5.1 - 18.5.2 - 18.5.3 - 18.5.4 - 4.10.4 - 15.12.2.7 - 15.12.2.8
Version 0.9.3. Copyright © 2014 Oracle America, Inc. Legal Notice.

Summary

A variety of compile-time analyses require reasoning about types that are not yet known. Principal among these are generic method applicability testing and generic method invocation type inference. In general, we refer to the process of reasoning about unknown types as type inference.

At a high level, type inference consists of reducing compatibility assertions about expressions or types, called constraint formulas, to a set of bounds on inference variables. For example, given inference variable α, it might be determined that the constraint formula String[] <: α[] reduces to the lower bound String <: α. As bounds are inferred, they are incorporated into a bound set. Ultimately, this bound set is resolved to produce an instantiation for each of the inference variables.

To infer a generic method's applicability, we assert that the arguments to the method are compatible with their corresponding formal parameter types, and reduce these constraints to a bound set. We then check that there exists a resolution of the resulting bounds.

To infer the type of a generic method invocation, we re-use the inferred bound set from method applicability testing, augmenting it with the results of asserting the compatibility of the method's return type with its target type. If any invocation arguments are implicitly-typed lambda expressions or inexact method references, we resolve their targeted parameter types as necessary in order to allow their compatibility constraints to be safely reduced. We also assert that any exceptions thrown by lambda expression bodies are contained by the throws clauses of their target function types. After reduction, we resolve the inference variables and use them as type arguments to determine the invocation type of the method.

Type inference is also used to determine: i) a target functional interface type for a lambda expression that is assigned to a wildcard-parameterized functional interface; and ii) whether an applicable method is more specific than another applicable, generic method.

The following details of reduction are especially noteworthy:

When a generic method invocation or diamond constructor invocation appears as the argument to another invocation, the target type of the nested invocation is the targeted formal parameter type. If the target type contains inference variables, the inference variables and bounds from the nested invocation are "lifted" into the outer inference analysis, and dependencies between the outer and inner inference variables are inferred. Ultimately, information from an outermost invocation can be "pushed down" to nested invocations via these dependencies.

When a lambda expression appears as the argument to a generic invocation, and its target type includes inference variables, we proceed by:

Thus, lambda bodies can influence the instantiation of inference variables appearing in the targeted function type's return type, and lambda parameter types, if explicit, can influence the instantiation of inference variables appearing in the targeted function type's parameter types.

A method reference appearing as the argument to a generic invocation is handled in much the same way.

18 Type Inference [New]

A variety of compile-time analyses require reasoning about types that are not yet known. Principal among these are generic method applicability testing (18.5.1) and generic method invocation type inference (18.5.2). In general, we refer to the process of reasoning about unknown types as type inference.

At a high level, type inference can be decomposed into three processes:

These processes interact closely: reduction can trigger incorporation; incorporation may lead to further reduction; and resolution may cause further incorporation.

We elaborate in the following sections.

Discussion and motivation:

  1. This section introduces a new chapter to replace the current specification's definition of type argument inference in 15.12.2.7 and 15.12.2.8.

    In comparison to JLS 7, important changes to inference include:

    • Adding support for lambda expressions and method references as method invocation arguments.
    • Generalizing to define inference in terms of poly expressions, which may not have well-defined types until after inference is complete. This has the notable effect of improving inference for nested generic method and diamond constructor invocations.
    • Describing how inference is used to handle wildcard-parameterized functional interface target types and most-specific method analysis.
    • Clarifying the distinction between invocation applicability testing (which involves only the invocation arguments) and invocation type inference (which incorporates a target type).
    • Delaying resolution of all inference variables, even those with lower bounds, until invocation type inference, in order to get better results.
    • Improving inference behavior for interdependent (or self-dependent) variables.
    • Eliminating bugs and potential sources of confusion. This revision more carefully and precisely handles the distinction between specific conversion contexts and subtyping, and describes reduction by paralleling the corresponding non-inference relations. Where there are intentional departures from the non-inference relations, these are explicitly identified as such.
    • Laying a foundation for future evolution: enhancements to or new applications of inference will be easier to integrate into the specification.
  2. When reading this material for the first time, section 18.1 is useful for establishing the proper vocabulary; but then it may be useful for context to skip ahead to 18.5 and save the details of the algorithm (18.2, 18.3, and 18.4) for last.

18.1 Concepts and Notation [New]

This section defines inference variables, constraint formulas, and bounds, as the terms will be used throughout this chapter. It also presents the notation we will use.

18.1.1 Inference Variables [New]

Inference variables are meta-variables for types—that is, they are special names that allow us to reason about types abstractly. To distinguish them from type variables, we represent inference variables with Greek letters, principally α.

We'll loosely use the term "type" in this chapter to include type-like syntax that contains inference variables. We use the term proper type to exclude such "types" that mention inference variables. Assertions that involve inference variables are assertions about every proper type that can be produced by replacing each inference variable with a proper type.

18.1.2 Constraint Formulas [New]

Constraint formulas are assertions of compatibility or subtyping that may involve inference variables. The formulas may take one of the following forms: [jsr335-18.1.2-10]

Examples of constraint formulas:

18.1.3 Bounds [New]

During the inference process, a set of bounds on inference variables is maintained. A single bound has one of the following forms: [jsr335-18.1.3-10]

A bound is satisfied by an inference variable substitution if, after applying the substitution, the assertion is true. (The bound false can never be satisfied.) [jsr335-18.1.3-20]

Some bounds relate an inference variable to a proper type. Let T be a proper type. Given a bound of the form α = T or T = α, we say T as an instantiation of α. Similarly, given a bound of the form α <: T, we say T is a proper upper bound of α, and given a bound of the form T <: α, we say T is a proper lower bound of α. [jsr335-18.1.3-30]

Other bounds relate two inference variables, or an inference variable to a type that contains inference variables. Such bounds, of the form S = T or S <: T, are called dependencies. [jsr335-18.1.3-40]

A bound of the form G<α1, ..., αn> = capture(G<A1, ..., An>) indicates that α1, ..., αn are placeholders for the results of capture. This is necessary because capture conversion can only be performed on a proper type, and the inference variables in A1, ..., An may not yet be resolved.

A bound of the form throws α is purely informational: it directs resolution to optimize the instantiation of α so that, if possible, it is not a checked exception.

An important intermediate result of inference is a bound set. It is sometimes convenient to refer to an empty bound set with the literal true; this is merely out of convenience, and the two are interchangeable.

Examples of bound sets:

When inference begins, a bound set is typically generated from a list of type parameter declarations P1, ..., Pp and associated inference variables α1, ..., αp. Such a bound set is constructed as follows. For each l (1 ≤ l ≤ p): [jsr335-18.1.4-80]

Discussion and motivation:

The strategy described here of decomposing a TypeBound into individual types separated by &, allowing some components to be treated as proper bounds and others as dependencies, is a slight enhancement to the JLS 7 behavior, which treats the bound as a monolithic intersection type.

18.2 Reduction [New]

Reduction is the process by which a set of constraint formulas (18.1.2) is simplified to produce a bound set (18.1.3).

Each constraint formula is considered in turn. A "current" bound set is initially empty. The rules in this section outline how the formula is reduced to one or both of:

Reduction completes when no further constraint formulas remain to be reduced.

The results of a reduction step are always soundness-preserving: if an inference variable instantiation satisfies the reduced constraints and bounds, it will also satisfy the original constraint. On the other hand, reduction is not completeness-preserving: there may exist inference variable instantiations that satisfy the original constraint but do not satisfy a reduced constraint or bound. This is due to inherent limitations of the algorithm, along with a desire to avoid undue complexity. One effect is that there are expressions for which type argument inference fails to find a solution, but that can be well-typed if the programmer explicitly inserts appropriate types.

Discussion and motivation:

The reduction rules defined in JLS 7 are neither soundness- nor completeness-preserving. The reduction rules described here improve upon this by guaranteeing soundness and reducing the set of invocations which inference is unable to handle. The rules are also, of course, expanded to support the new language features.

Unsoundness in JLS 7 arises, among other things, out of the fact that false and true results are treated interchangeably—neither has any impact on the current bound set. It is not a fundamental type safety problem, because the results of inference are checked for validity before an expression is considered well-typed. But it has the unfortunate effect of covering up bugs: if the spec misses an important constraint, there's not a hard line that says that's "wrong." The claim of soundness here raises the bar: if the rules miss a constraint that is necessary to guarantee that the initial constraint formula is true, that is a failing of the rules, and it must be corrected to make the spec internally consistent.

In addition, a soundness-preserving approach avoids the potential confusion that could arise if, say, inference unsoundly ignored certain difficult-to-analyze lambda expression arguments. (The effect might be that inference gets "lucky" in guessing valid type arguments in one scenario, only to fail when a seemingly unrelated adjustment changes the results.)

18.2.1 Expression Compatibility Constraints [New]

A constraint formula of the form ⟨Expression → T⟩ is reduced as follows: [jsr335-18.2.1-10]

Discussion and motivation:

  1. By treating nested generic method invocations as poly expressions, we improve the behavior of inference for nested invocations. For example, the following is illegal in Java 7 but legal in Java 8:
    ProcessBuilder b = new ProcessBuilder(Collections.emptyList());
    // Constructor expects a List<String>

    When both the outer and the nested invocation require inference, the problem is more difficult. For example:

    List<String> ls = new ArrayList<>(Collections.emptyList());

    Our approach is to "lift" the bounds inferred for the nested invocation (simply { α <: Object } in the case of emptyList here) into the outer inference process (in this case, trying to infer β where the constructor is for type ArrayList<β>). We also infer dependencies between the nested inference variables and the outer inference variables (the constraint List<α>Collection<β> would reduce to the dependency α = β). In this way, resolution of the inference variables in the nested invocation can wait until additional information can be inferred from the outer invocation (based on the assignment target, β = String).

  2. An alternative strategy for handling nested invocations was considered and rejected: in certain cases, the inference variables of the outer call are resolved before considering the nested call; typing of the nested call is delayed until a proper target type can be provided. In other cases, the nested call is treated as a standalone invocation.

    This has the advantage of avoiding the extra complexity involved with reduction producing new inference variables and dependencies. But there are two problems with this approach:

    • It is not powerful enough to handle situations in which information from both the outer and inner calls needs to be synthesized in order to choose an appropriate inference variable instantiation.
    • There's not an obvious rule for choosing whether the outer or the inner call should be resolved first, and so the resulting behavior is very ad hoc.

18.2.1.1 Lambda Expression Compatibility [New]

A constraint formula of the form ⟨LambdaExpression → T⟩, where T mentions at least one inference variable, is reduced as follows, following the compatibility rules outlined in 15.27.3. [jsr335-18.2.1.2-10]

If T is not a functional interface type (9.8), the constraint reduces to false. [jsr335-18.2.1.2-14]

Otherwise, a ground target type, T', must be derived from T, as described in 15.27.3. If 18.5.3 is used to derive a functional interface parameterization, then the test that F<A'1, ..., A'm> is a subtype of F<A1, ..., Am> is not performed (this is instead asserted with a constraint formula below). If no valid type can be found, the constraint reduces to false. [jsr335-18.2.1.2-20]

Otherwise, the congruence of the LambdaExpression with the function type of T' is asserted as follows: [jsr335-18.2.1.2-30]

Discussion and motivation:

  1. The key piece of information to derive from a compatibility constraint involving a lambda expression is the set of bounds on inference variables appearing in the target function type's return type. This is crucial, because functional interfaces are often generic, and many methods operating on these types are generic, too.

    In the simplest case, a lambda expression may simply provide a lower bound for an inference variable:

    <T> List<T> makeThree(Factory<T> factory) { ... }
    String s = makeThree(() -> "abc").get(2);
    

    In more complex cases, a result expression may be a poly expression—perhaps even another lambda expression—and so the inference variable might be passed through multiple constraint formulas with different target types before a bound is produced.

    Most of the work described in this section precedes assertions about the result expressions; its purpose is to derive the lambda expression's function type, and to check for expressions that are clearly disqualified from compatibility.

    We do not attempt to produce bounds on inference variables that appear in the target function type's throws clause. This is because exception containment is not part of compatibility (15.27.3)—in particular, it must not influence method applicability (18.5.1). However, we do get bounds on these variables later, because invocation type inference (18.5.2) produces exception containment constraint formulas (18.2.5).

  2. If the target type is an inference variable, or if the target type's parameter types contain inference variables, we produce false. During invocation type inference (18.5.2), extra substitutions are performed in order to instantiate these inference variables, thus avoiding this scenario. (In other words, reduction will, in practice, never be "invoked" with a target type of one of these forms.)
  3. The result expressions of a lambda expression are required to be compatible in an assignment context with the target return type (see 15.27.3). Consider a targeted return type, R. If R is a proper type (e.g., Byte derived from Function<α,Byte>), assignability is easy enough to test, and we do so. If R is not a proper type (e.g., α derived from Function<String,α>), we make the simplifying assumption that loose invocation compatibility will be sufficient.

    The difference between assignment compatibility and loose invocation compatibility is that assignment allows narrowing of constant expressions (e.g., Byte b = 100). So this simplifying assumption is not completeness-preserving: given target return type α and an integer literal return expression, 100, it is possible that α could be instantiated to Byte; however, reduction will not produce such a bound.

18.2.1.2 Method Reference Compatibility [New]

A constraint formula of the form ⟨MethodReference → T⟩, where T mentions at least one inference variable, is reduced as follows, following the compatibility rules outlined in 15.13.1. [jsr335-18.2.1.3-10]

If T is not a functional interface type (9.8), or if T is a functional interface type but does not have a function type, the constraint reduces to false. [jsr335-18.2.1.3-12]

Otherwise, if there does not exist a potentially-applicable method for the method reference when targeting T (15.13.1), the constraint reduces to false. [jsr335-18.2.1.3-20]

Otherwise, there are two cases. If the method reference is exact (15.13.1), then let P1, ..., Pn be the parameter types of the function type of T; let P'1, ..., P'k be the parameter types of the potentially-applicable method. The compatibility constraint reduces to a new set of constraints, as follows: [jsr335-18.2.1.3-30]

If, instead, the method reference is inexact (15.13.1), then if one or more of the function type's parameter types is not a proper type, the constraint reduces to false. [jsr335-18.2.1.3-35]

Otherwise, a search for a compile-time declaration is performed, as defined in 15.13.1. If there does not exist a compile-time declaration for the method reference, the constraint reduces to false. [jsr335-18.2.1.3-16]

If a compile-time declaration can be found then there are three cases: [jsr335-18.2.1.3-40]

Discussion and motivation:

The strategy used to determine a return type for a generic referenced method follows the same pattern as for generic method invocations (18.2.1). This may involve "lifting" bounds into the outer context and inferring dependencies between the two sets of inference variables.

18.2.2 Type Compatibility Constraints [New]

A constraint formula of the form ⟨S → T⟩ is reduced as follows: [jsr335-18.2.2-10]

The fourth and fifth cases are considered uses of unchecked conversion (5.1.9). These, along with any use of unchecked conversion in the first case, may result in unchecked warnings, and may influence a method's invocation type (15.12.2.6). [jsr335-18.2.2-12]

Discussion and motivation:

  1. Boxing T to T' is not completeness-preserving: for example, if T were long, S might be instantiated to Integer, which is not a subtype of Long, but could be unboxed and then widened to long. We avoid this problem in most cases by giving special treatment to inference-variable return types that we know are already constrained to be certain boxed primitive types. See 18.5.2.
  2. Similarly, the treatment of unchecked conversion sacrifices completeness in cases in which T is not a parameterized type (for example, if T is an inference variable). It is not usually clear in such situations whether the unchecked conversion is necessary or not. Since unchecked conversions introduce unchecked warnings, inference prefers to avoid them unless it is clearly necessary.

18.2.3 Subtyping Constraints [New]

A constraint formula of the form ⟨S <: T⟩ is reduced as follows: [jsr335-18.2.3-10]

A constraint formula of the form ⟨S <= T⟩, where S and T are type arguments (4.5.1), is reduced as follows: [jsr335-18.2.3-20]

18.2.4 Type Equality Constraints [New]

A constraint formula of the form ⟨S = T⟩, where S and T are types, is reduced as follows: [jsr335-18.2.4-10]

A constraint formula of the form ⟨S = T⟩, where S and T are type arguments (4.5.1), is reduced as follows: [jsr335-18.2.4-20]

Discussion and motivation:

Note that we do not address intersection types here, because it is impossible for reduction to encounter an intersection type that is not a proper type.

18.2.5 Checked Exception Constraints [New]

A constraint formula of the form LambdaExpression →throws T is reduced as follows: [jsr335-18.2.5-14]

A constraint formula of the form MethodReference →throws T is reduced as follows: [jsr335-18.2.5-15]

Discussion and motivation:

  1. Constraints on checked exceptions are handled separately from constraints on return types, because return type compatibility influences applicability of methods (18.5.1), while exceptions only influence the invocation type after overload resolution is complete (18.5.2). This could be simplified by including exception compatibility in the definition of lambda expression compatibility (15.27.3), but this would lead to possibly-surprising cases in which exceptions that can be thrown by an explicitly-typed lambda body change overload resolution.
  2. The exceptions thrown by a lambda body cannot be determined until i) the parameter types of the lambda are known, and ii) the target type of result expressions in the body is known. (The second requirement is to account for generic method invocations in which, for example, the same type parameter appears in the return type and the throws clause.) Hence, we require both of these, as derived from the target type T, to be proper types.

    One consequence is that lambda expressions returned from other lambda expressions cannot generate constraints from their thrown exceptions. These constraints can only be generated from top-level lambda expressions.

  3. The handling of the case in which more than one inference variable appears in a function type's throws clause is not completeness-preserving. Either variable may, on its own, satisfy the constraint that each checked exception be declared, but we can't be sure which one is intended. So, for predictability, we constrain them both.

18.3 Incorporation [New]

Whenever a bound set contains two or more bounds, it is possible that new bounds can be inferred based on the assertions of the original bounds. As bound sets are constructed and grown during inference, new bounds are incorporated by identifying complementary pairs of bounds and inferring further information in this way. [jsr335-18.3-10]

Specifically, for any pair of bounds in a bound set matching one of the following rules, a new constraint formula is inferred. In the rules, S and T are inference variables or types, and U is a proper type. For conciseness, a bound of the form α = T may also match a bound of the form T = α. [jsr335-18.3-12]

If two bounds have the form α <: S and α <: T, and if for some generic class or interface, G, there exists a supertype (4.10) of S of the form G<S1, ..., Sn> and a supertype of T of the form G<T1, ..., Tn>, then for all i, 1 ≤ i ≤ n, if Si and Ti are types (not wildcards), the constraint ⟨Si = Ti is implied. [jsr335-18.3-14]

Additionally, when a bound set contains a bound of the form G<α1, ..., αn> = capture(G<A1, ..., An>), further bounds or constraints may be inferred. Let P1, ..., Pn represent the type parameters of G, B1, ..., Bn represent the bounds (4.4) of these type parameters, and θ represent the substitution [P1:=α1, ..., Pn:=αn]. Let R be a type that is not an inference variable (but is not necessarily a proper type). Then for all i (1 ≤ i ≤ n): [jsr335-18.3-20]

The resulting constraint formulas are then reduced, and any new bounds are added to the bound set. This may trigger further incorporation; ultimately, the set will reach a fixed point and no further bounds can be inferred. [jsr335-18.3-14]

A bound set that is fully incorporated, if it does not contain the bound false, has the following properties:

Discussion and motivation:

The assertion that incorporation reaches a fixed point oversimplifies the matter slightly. Building on the work of Kennedy and Pierce, On Decidability of Nominal Subtyping with Variance, this property can be proven by making the argument that the set of types that may appear in the bound set is finite. The argument relies on two assumptions:

The specification does not currently guarantee these properties (it is vague about the handling of wildcards when reducing subtyping constraints, and does not detect expansive inheritance paths), but is likely to do so in a future version.

(Notably, this is not a new problem: the existing subtyping algorithm is similarly at risk of non-termination.)

18.4 Resolution [New]

Given a bound set that does not contain the bound false, a subset of the inference variables mentioned by the bound set may be resolved. This means that a satisfactory instantiation may be added to the set for each inference variable, until all the requested variables have instantiations. Dependencies in the bound set may require that the variables be resolved in a particular order, or that additional variables be resolved. [jsr335-18.4-10]

Consider a bound of one of the following forms, where T is either an inference variable β or a type that mentions β: [jsr335-18.4-22]

Given such a bound, if α appears on the left-hand side of another bound of the form G<..., α, ...> = capture(G<...>), then β depends on the resolution of α. Otherwise, α depends on the resolution of β. [jsr335-18.4-22]

An inference variable α appearing on the left-hand side of a bound of the form G<..., α, ...> = capture(G<...>) depends on the resolution of every other inference variable mentioned in this bound (on both sides of the = sign). [jsr335-18.4-20]

An inference variable α also depends on the resolution of itself, as well as the resolution of an inference variable β if there exists an inference variable γ such that α depends on the resolution of γ and γ depends on the resolution of β. [jsr335-18.4-24]

Given a set of inference variables to resolve, let V be the union of this set and all variables upon which the resolution of at least one variable in this set depends. [jsr335-18.4-30]

If every variable in V has an instantiation, then the resolution procedure terminates. Otherwise, let { α1, ..., αn } be a non-empty subset of uninstantiated variables in V such that i) for all i, 1 ≤ i ≤ n, if αi depends on the resolution of a variable β, then either β has an instantiation or there is some j such that β = αj; and ii) there exists no non-empty proper subset of { α1, ..., αn } with this property. [jsr335-18.4-32]

Then resolution proceeds by generating an instantiation for each of α1, ..., αn. [jsr335-18.4-34]

If, for all i (1 ≤ i ≤ n), the bound set does not contain a bound of the form G<..., αi, ...> = capture(G<...>), we first define candidate instantiations, T1, ..., Tn. For all i (1 ≤ i ≤ n), define Ti as follows: [jsr335-18.4-40]

If incorporating the bounds α1 = T1, ..., αn = Tn with the current bound set produces a set that does not include false, then this is the new bound set; resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above. [jsr335-18.4-42]

Otherwise, a second attempt is made to instantiate { α1, ..., αn }, as follows: [jsr335-18.4-50]

If the resulting type variables Z1, ..., Zn do not have well-formed bounds (that is, a lower bound is not a subtype of an upper bound), then resolution fails. Otherwise, the new bound set is produced by removing all bounds of the form G<..., αi, ...> = capture(G<...>) (for all i, 1 ≤ i ≤ n) and incorporating α1 = Z1, ..., αn = Zn. If the result contains the bound false, resolution fails. Otherwise, resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above. [jsr335-18.4-52]

Discussion and motivation:

The first attempt at instantiating an inference variable derives the instantiation from that variable's bounds. Sometimes, however, complex dependencies mean that the result is not within the variable's bounds. In case of failure, a second attempt is performed, analogous to capture conversion (5.1.10): fresh type variables are introduced, with bounds derived from the bounds of the inference variables. (Note that the lower bounds of these "capture" variables are computed using only proper types: this is important in order to avoid attempts to perform typing computations on uninstantiated type variables.)

18.5 Uses of Inference [New]

Using the inference processes defined above, the following analyses are performed at compile-time.

18.5.1 Invocation Applicability Inference [New]

Given a method invocation that provides no explicit type arguments, the following process, modeled after the applicability rules in 15.12.2.2, 15.12.2.3, & 15.12.2.4, determines whether a potentially-applicable generic method m is applicable: [jsr335-18.5.1-10]

Discussion and motivation:

Consider the following method invocation and assignment:

List<Number> ln = Arrays.asList(1, 2.0);

A most-specific applicable method for the invocation must be identified as described in 15.12. The only potentially-applicable method (15.12.2.1) is declared as follows:

public static <T> List<T> asList(T... a)

Trivially (because of its arity), this method is neither applicable by strict invocation (15.12.2.2) nor applicable by loose invocation (15.12.2.3). But since there are no other candidates, in a third phase the method is checked for applicability by variable-arity invocation.

The initial bound set, B, is a trivial upper bound for a single inference variable, α:

{ α <: Object }

The initial constraint formula set is as follows:

{ ⟨1 → α⟩, ⟨2.0 → α⟩ }

These are reduced to a new bound set, B1:

{ α <: Object, Integer <: α, Double <: α }

Then, to test whether the method is applicable, we attempt to resolve these bounds. We succeed, producing the rather complex instantiation

α = Number & Comparable<? extends Number & Comparable<?>>
We have thus demonstrated that the method is applicable; since no other candidates exist, it is the most-specific applicable method. The type of the method invocation, and its compatibility with the target type in the assignment, is not determined until further inference can occur, as described in the next section.

18.5.2 Invocation Type Inference [New]

Consider a method invocation that provides no explicit type arguments, and a corresponding most-specific applicable generic method m. The following process is used to infer the invocation type (15.12.2.6) of the chosen method. [jsr335-18.5.2-10]

Invocation type inference may require carefully sequencing the reduction of constraint formulas of the forms ⟨Expression → T⟩, ⟨LambdaExpression →throws T⟩, and ⟨MethodReference →throws T⟩. To facilitate this sequencing, the input variables of these constraints are defined as follows: [jsr335-18.5.2-20]

The output variables of these constraints are all inference variables mentioned by the type on the right-hand side of the constraint, T, that are not input variables. [jsr335-18.5.2-22]

Discussion and motivation:

  1. It is important to note that two "rounds" of inference are involved in finding the type of a method invocation. This is necessary to allow a target type to influence the type of the invocation without allowing it to influence the choice of an applicable method. The first round produces a bound set and tests that a resolution exists, but does not commit to that resolution. The second round reduces additional constraints and then performs a second resolution, this time "for real."

    The distinction between these two rounds of inference was confused in previous iterations of the JLS (compare the discussion of inference in 15.12.2.2 vs. 15.12.2.6; also see 15.12.2.8, which makes ambiguous use of the term "now").

  2. Let's revisit the example from the previous section:
    List<Number> ln = Arrays.asList(1, 2.0);

    The most-specific applicable method is declared as follows:

    public static <T> List<T> asList(T... a)

    In order to complete type-checking of the method invocation, we must determine whether it is compatible with its target type, List<Number>.

    The bound set used to demonstrate applicability in the previous section, B2, was:

    { α <: Object, Integer<: α , Double <: α }

    The new constraint formula set is as follows:

    { ⟨List<α>List<Number>⟩ }

    This compatibility constraint produces an equality bound for α, which is included in the new bound set, B3:

    { α <: Object, Integer<: α , Double <: α, α = Number }

    These bounds are trivially resolved:

    α = Number

    Finally, we perform a substitution on the declared return type of asList to determine that the invocation expression has type List<Number>; clearly, this is compatible with the target type.

    Note that this strategy for inference is different than JLS 7, which would have instantiated α based on its lower bounds (before even considering the invocation's target type), as we did in the previous section. This would result in a type error, since the resulting type is not a subtype of List<Number>.

  3. The use of capture when comparing a method result type to its target type is new, and allows for more precise inference.
  4. Under various special circumstances, based on the bounds appearing in B2, we eagerly resolve an inference variable that appears as the return type of the invocation. This is to avoid unfortunate situations in which the usual constraint, ⟨R θ → T⟩, is not completeness-preserving. It is, unfortunately, possible that by eagerly resolving the variable, we are unable to make use of bounds that would be inferred later. It is also possible that, in some cases, bounds that will later be inferred from the invocation arguments (such as implicitly-typed lambda expressions) would have caused a different outcome if they had been present in B2. Despite these limitations, the strategy allows for reasonable outcomes in typical use cases, and preserves backwards-compatibility with the JLS 7 algorithm.

18.5.3 Functional Interface Parameterization Inference [New]

Where a lambda expression with explicit parameter types P1, ..., Pn targets a wildcard-parameterized functional interface, F<A1, ..., Am>, a parameterization of F may be derived as the ground target type of the expression, according to the following process. [jsr335-18.5.3-20]

Discussion and motivation:

  1. In order to determine the function type of a wildcard-parameterized functional interface, we have to "instantiate" the wildcard type arguments with specific types. The "default" approach is to simply replace the wildcards with their bounds, as described in 9.8. But this produces spurious errors in cases in which a lambda expression has explicit parameter types that do not correspond to the wildcard bounds. For example:
    Predicate<? super Integer> p = (Number n) -> n.equals(23);

    The lambda expression in this example is a Predicate<Number>, which is a subtype of Predicate<? super Integer> but not Predicate<Integer>. The analysis in this section is used to infer that Number is an appropriate choice for the type argument to Predicate.

  2. The analysis here, while described in terms of general type inference, is intentionally quite simple. The only constraints are equality constraints, which means that reduction amounts to simple pattern matching.

    A more powerful strategy might also infer constraints from the body of the lambda expression. But, given possible interactions with inference for surrounding and/or nested generic method invocations, this would introduce a lot of extra complexity.

18.5.4 More Specific Method Inference [New]

When testing that one applicable method is more specific than another (15.12.2.5), where the second method is generic, it is necessary to test whether some instantiation of the second method's type parameters can be inferred to make the first method more specific than the second. [jsr335-18.5.4-10]

Let m1 be the first method and m2 be the second. Let exp1, ..., expk be the argument expressions of the corresponding invocation. [jsr335-18.5.4-12]

Where m2 has type parameters P1, ..., Pp, let α1, ..., αp be inference variables. Let θ be the substitution [P1:=α1, ..., Pp:=αp]. [jsr335-18.5.4-14]

If m1 and m2 are applicable by strict or loose invocation (15.12.2.2, 15.12.2.3), then let S1, ..., Sk be the parameter types of m1, and let T1, ..., Tk be the result of θ applied to the parameter types of m2. If m1 and m2 are applicable by variable arity invocation (15.12.2.4), then let S1, ..., Sk be the first k variable-arity parameter types of m1, and let T1, ..., Tk be the result of θ applied to the first k variable-arity parameter types of m2. [jsr335-18.5.4-16]

Note that no substitution is applied to S1, ..., Sk; even if m1 is generic, the type parameters of m1 are treated as type variables, not inference variables.

First, an initial bound set B is constructed from the declared bounds of P1, ..., Pp, as described in 18.1.3. [jsr335-18.5.4-20]

Second, for all i, 1 ≤ i ≤ k, a set of constraint formulas or bounds is generated. [jsr335-18.5.4-30]

If Si and Ti are proper types, the result is true if Si is more specific than Ti for expi (15.12.2.5), and false otherwise. [jsr335-18.5.4-32]

Otherwise, if Si is a functional interface type and Ti is a parameterization of functional interface, I, and none of the following is true: [jsr335-18.5.4-33]

Then let U1, ..., Uk and R1 be the parameter types and return type, respectively, of the function type of the capture of Si, and let V1, ..., Vk and R2 be the parameter types and return type, respectively, of the function type of Ti. The following constraints or bounds are generated: [jsr335-18.5.4-34]

Otherwise, the constraint ⟨Si <: Ti is generated. [jsr335-18.5.4-36]

Third, if m2 is applicable by variable arity invocation and has k+1 parameters, then where Sk+1 is the k+1th variable-arity parameter type of m1 and Tk+1 is θ applied to the k+1th variable-arity parameter type of m2, the constraint ⟨Sk+1 <: Tk+1 is generated. [jsr335-18.5.4-40]

Fourth, the generated bounds and constraint formulas are reduced and incorporated with B to produce a bound set B'. [jsr335-18.5.4-50]

If B' does not contain the bound false, and the inference variables in B' can be resolved, then m1 is more specific than m2. Otherwise, m1 is not more specific than m2. [jsr335-18.5.4-52]

Discussion and motivation:

The use of inference to compare methods during the more-specific test is not new, but it has been "lifted" out of 15.12.2.5 into this new section. The special treatment of functional interfaces is new, mirroring the changes to 15.12.2.5 for non-generic methods.

4.10.4 Least Upper Bound of Reference Types [New]

The least upper bound, or lub, of a set of reference types is a shared supertype that is more specific than any other shared supertype (that is, no other shared supertype is a subtype of the least upper bound). This type, lub(U1, ..., Uk), is determined as follows. [jsr335-4.10.4-10]

If k = 1, then the lub is the type itself: lub(U) = U. [jsr335-4.10.4-20]

Otherwise, for each Ui (1 ≤ i ≤ k), we write ST(Ui) for the set of supertypes of Ui, and define the erased supertype set of Ui: [jls-15.12.2.7-430]

EST(Ui) = { V | W in ST(Ui) and V = |W| } where |W| is the erasure of W. [jls-15.12.2.7-440]

The reason for computing the set of erased supertypes is to deal with situations where the set of types includes several distinct invocations of a generic type declaration.

For example, given List<String> and List<Object>, simply intersecting the sets ST(List<String>) = { List<String>, Collection<String>, Object } and ST(List<Object>) = { List<Object>, Collection<Object>, Object } would yield a set { Object } , and we would have lost track of the fact that T can safely be assumed to be a List.

In contrast, intersecting EST(List<String>) = { List, Collection, Object } and EST(List<Object>) = { List, Collection, Object } yields { List, Collection, Object } , which will eventually enable us to produce List<?> as described below.

The erased candidate set for a set of types, EC, is the intersection of all the sets EST(Ui) for each Ui in U1, ..., Uk. [jls-15.12.2.7-450]

The minimal erased candidate set for U1, ..., Uk is: [jls-15.12.2.7-460]

MEC = { V | V in EC, and for all W ≠ V in EC, it is not the case that W <: V } [jls-15.12.2.7-470]

Because we are seeking to infer more precise types, we wish to filter out any candidates that are supertypes of other candidates. This is what computing MEC accomplishes.

In our running example, we had EC = { List, Collection, Object } , and now MEC = { List } .

The next step will be to recover type arguments for the erased types in MEC.

...

The definition of the lci and lcta functions from 15.12.2.7 in JLS 7 should be inserted here.

...

Finally, we define the least upper bound based on all the elements of the minimal erased candidate set of U1, ..., Uk. If any of these elements are generic, we use the CandidateInvocation function to recover the type argument information.

Define Candidate(W) = CandidateInvocation(W) if W is generic, W otherwise. [jls-15.12.2.7-530]

Then lub(U1, ..., Uk) is Candidate(W1) & ... & Candidate(Wr), where Wi (1 ≤ i ≤ r) are the elements of MEC. [jls-15.12.2.7-540]

Strictly speaking, the lub function defined here only approximates a least upper bound. Formally, there may exist some other type T such that all of U1, ..., Uk are subtypes of T and T is a subtype of lub(U1, ..., Uk). However, a Java compiler's implementation of lub must be consistent with the above definition. [jsr335-4.10.4-90]

It is possible that the lub computation yields an infinite type. This is permissible, and a Java compiler must recognize such situations and represent them appropriately using cyclic data structures. [jls-15.12.2.7-600]

The possibility of an infinite type stems from the recursive calls to lub. Readers familiar with recursive types should note that an infinite type is not the same as a recursive type.

Discussion and motivation:

This section is almost entirely lifted from 15.12.2.7, with only minor changes to eliminate references to type inference (such as the inference variable Tj).

15.12.2.7 Inferring Type Arguments Based on Actual Arguments [Modified]

Compare JLS 15.12.2.7

This section is deleted; it has been replaced by Chapter 18, principally 18.2 and 18.5.1.

15.12.2.8 Inferring Unresolved Type Arguments [Modified]

Compare JLS 15.12.2.8

This section is deleted; it has been replaced by Chapter 18, principally 18.3, 18.4, and 18.5.2.