Navigation: Overview - Part A - Part B - Part C - Part D - Part E - Part F - Part G - Part H - Part J
Sections: 18 - 18.1 - 18.1.1 - 18.1.2 - 18.1.3 - 18.2 - 18.2.1 - 18.2.1.1 - 18.2.1.2 - 18.2.2 - 18.2.3 - 18.2.4 - 18.2.5 - 18.3 - 18.4 - 18.5 - 18.5.1 - 18.5.2 - 18.5.3 - 18.5.4 - 4.10.4 - 15.12.2.7 - 15.12.2.8
Version 0.9.3. Copyright © 2014 Oracle America, Inc. Legal Notice.
A variety of compile-time analyses require reasoning about types that are not yet known. Principal among these are generic method applicability testing and generic method invocation type inference. In general, we refer to the process of reasoning about unknown types as type inference.
At a high level, type inference consists of reducing compatibility assertions about expressions or types, called constraint formulas, to a set of bounds on inference variables. For example, given inference variable α, it might be determined that the constraint formula ⟨String[]
<: α[]
⟩ reduces to the lower bound String
<: α. As bounds are inferred, they are incorporated into a bound set. Ultimately, this bound set is resolved to produce an instantiation for each of the inference variables.
To infer a generic method's applicability, we assert that the arguments to the method are compatible with their corresponding formal parameter types, and reduce these constraints to a bound set. We then check that there exists a resolution of the resulting bounds.
To infer the type of a generic method invocation, we re-use the inferred bound set from method applicability testing, augmenting it with the results of asserting the compatibility of the method's return type with its target type. If any invocation arguments are implicitly-typed lambda expressions or inexact method references, we resolve their targeted parameter types as necessary in order to allow their compatibility constraints to be safely reduced. We also assert that any exceptions thrown by lambda expression bodies are contained by the throws
clauses of their target function types. After reduction, we resolve the inference variables and use them as type arguments to determine the invocation type of the method.
Type inference is also used to determine: i) a target functional interface type for a lambda expression that is assigned to a wildcard-parameterized functional interface; and ii) whether an applicable method is more specific than another applicable, generic method.
The following details of reduction are especially noteworthy:
When a generic method invocation or diamond constructor invocation appears as the argument to another invocation, the target type of the nested invocation is the targeted formal parameter type. If the target type contains inference variables, the inference variables and bounds from the nested invocation are "lifted" into the outer inference analysis, and dependencies between the outer and inner inference variables are inferred. Ultimately, information from an outermost invocation can be "pushed down" to nested invocations via these dependencies.
When a lambda expression appears as the argument to a generic invocation, and its target type includes inference variables, we proceed by:
Thus, lambda bodies can influence the instantiation of inference variables appearing in the targeted function type's return type, and lambda parameter types, if explicit, can influence the instantiation of inference variables appearing in the targeted function type's parameter types.
A method reference appearing as the argument to a generic invocation is handled in much the same way.
A variety of compile-time analyses require reasoning about types that are not yet known. Principal among these are generic method applicability testing (18.5.1) and generic method invocation type inference (18.5.2). In general, we refer to the process of reasoning about unknown types as type inference.
At a high level, type inference can be decomposed into three processes:
These processes interact closely: reduction can trigger incorporation; incorporation may lead to further reduction; and resolution may cause further incorporation.
We elaborate in the following sections.
Discussion and motivation:
This section introduces a new chapter to replace the current specification's definition of type argument inference in 15.12.2.7 and 15.12.2.8.In comparison to JLS 7, important changes to inference include:
- Adding support for lambda expressions and method references as method invocation arguments.
- Generalizing to define inference in terms of poly expressions, which may not have well-defined types until after inference is complete. This has the notable effect of improving inference for nested generic method and diamond constructor invocations.
- Describing how inference is used to handle wildcard-parameterized functional interface target types and most-specific method analysis.
- Clarifying the distinction between invocation applicability testing (which involves only the invocation arguments) and invocation type inference (which incorporates a target type).
- Delaying resolution of all inference variables, even those with lower bounds, until invocation type inference, in order to get better results.
- Improving inference behavior for interdependent (or self-dependent) variables.
- Eliminating bugs and potential sources of confusion. This revision more carefully and precisely handles the distinction between specific conversion contexts and subtyping, and describes reduction by paralleling the corresponding non-inference relations. Where there are intentional departures from the non-inference relations, these are explicitly identified as such.
- Laying a foundation for future evolution: enhancements to or new applications of inference will be easier to integrate into the specification.
This section defines inference variables, constraint formulas, and bounds, as the terms will be used throughout this chapter. It also presents the notation we will use.
Inference variables are meta-variables for types—that is, they are special names that allow us to reason about types abstractly. To distinguish them from type variables, we represent inference variables with Greek letters, principally α.
We'll loosely use the term "type" in this chapter to include type-like syntax that contains inference variables. We use the term proper type to exclude such "types" that mention inference variables. Assertions that involve inference variables are assertions about every proper type that can be produced by replacing each inference variable with a proper type.
Constraint formulas are assertions of compatibility or subtyping that may involve inference variables. The formulas may take one of the following forms: [jsr335-18.1.2-10]
throws
clause of the function type derived from T. [jsr335-18.1.2-10-F]
throws
clause of the function type of T. [jsr335-18.1.2-10-F]
Examples of constraint formulas:
- From
Collections.singleton("hi")
, we have the constraint⟨Through reduction, this will become the constraint"hi"
→ α⟩⟨String
<: α⟩- From
Arrays.asList(1, 2.0)
, we have the constraints⟨Through reduction, these will become the constraints1
→ α⟩
⟨2.0
→ α⟩⟨and thenint
→ α⟩
⟨double
→ α⟩⟨Integer
<: α⟩
⟨Double
<: α⟩- From the target type of the constructor invocation
List<Thread> lt = new ArrayList<>()
, we have the constraint⟨Through reduction, this eventually becomes the constraintArrayList<
α>
→List<Thread>
⟩⟨α <=and thenThread
⟩⟨α =Thread
⟩
During the inference process, a set of bounds on inference variables is maintained. A single bound has one of the following forms: [jsr335-18.1.3-10]
<
α_{1}, ..., α_{n}>
= capture(G<
A_{1}, ..., A_{n}>
): The variables α_{1}, ..., α_{n} represent the result of capture conversion (5.1.10) applied to G<
A_{1}, ..., A_{n}>
(where A_{1}, ..., A_{n} may be types or wildcards and may mention inference variables). [jsr335-18.1.3-10-D]
throws
α: The inference variable α appears in a throws
clause. [jsr335-18.1.3-10-E]
A bound is satisfied by an inference variable substitution if, after applying the substitution, the assertion is true. (The bound false can never be satisfied.) [jsr335-18.1.3-20]
Some bounds relate an inference variable to a proper type. Let T be a proper type. Given a bound of the form α = T or T = α, we say T as an instantiation of α. Similarly, given a bound of the form α <: T, we say T is a proper upper bound of α, and given a bound of the form T <: α, we say T is a proper lower bound of α. [jsr335-18.1.3-30]
Other bounds relate two inference variables, or an inference variable to a type that contains inference variables. Such bounds, of the form S = T or S <: T, are called dependencies. [jsr335-18.1.3-40]
A bound of the form G<
α_{1}, ..., α_{n}>
= capture(G<
A_{1}, ..., A_{n}>
) indicates that α_{1}, ..., α_{n} are placeholders for the results of capture. This is necessary because capture conversion can only be performed on a proper type, and the inference variables in A_{1}, ..., A_{n} may not yet be resolved.
A bound of the form throws
α is purely informational: it directs resolution to optimize the instantiation of α so that, if possible, it is not a checked exception.
An important intermediate result of inference is a bound set. It is sometimes convenient to refer to an empty bound set with the literal true; this is merely out of convenience, and the two are interchangeable.
Examples of bound sets:
- { α =
String
} contains a single bound, instantiating α asString
.- {
Integer
<: α,Double
<: α, α <:Object
} describes two proper lower bounds and one proper upper bound for α.- { α <:
Iterable<?>
, β <:Object
, α <:List<
β>
} describes a proper upper bound for each of α and β, along with a dependency between them.- { } contains no bounds nor dependencies, and can be referred to as true.
- { false } expresses the fact that no satisfactory instantiation exists.
When inference begins, a bound set is typically generated from a list of type parameter declarations P_{1}, ..., P_{p} and associated inference variables α_{1}, ..., α_{p}. Such a bound set is constructed as follows. For each l (1 ≤ l ≤ p): [jsr335-18.1.4-80]
Object
appears in the set. [jsr335-18.1.4-80-A]
&
in the TypeBound, the bound α_{l} <: T [P_{1}:=α_{1}, ..., P_{p}:=α_{p}] appears in the set; if this results in no proper upper bounds for α_{l} (only dependencies), then the bound α_{l} <: Object
also appears in the set. [jsr335-18.1.4-80-B]
Discussion and motivation:
The strategy described here of decomposing a TypeBound into individual types separated by&
, allowing some components to be treated as proper bounds and others as dependencies, is a slight enhancement to the JLS 7 behavior, which treats the bound as a monolithic intersection type.
Reduction is the process by which a set of constraint formulas (18.1.2) is simplified to produce a bound set (18.1.3).
Each constraint formula is considered in turn. A "current" bound set is initially empty. The rules in this section outline how the formula is reduced to one or both of:
Reduction completes when no further constraint formulas remain to be reduced.
The results of a reduction step are always soundness-preserving: if an inference variable instantiation satisfies the reduced constraints and bounds, it will also satisfy the original constraint. On the other hand, reduction is not completeness-preserving: there may exist inference variable instantiations that satisfy the original constraint but do not satisfy a reduced constraint or bound. This is due to inherent limitations of the algorithm, along with a desire to avoid undue complexity. One effect is that there are expressions for which type argument inference fails to find a solution, but that can be well-typed if the programmer explicitly inserts appropriate types.
Discussion and motivation:
The reduction rules defined in JLS 7 are neither soundness- nor completeness-preserving. The reduction rules described here improve upon this by guaranteeing soundness and reducing the set of invocations which inference is unable to handle. The rules are also, of course, expanded to support the new language features.
Unsoundness in JLS 7 arises, among other things, out of the fact that false and true results are treated interchangeably—neither has any impact on the current bound set. It is not a fundamental type safety problem, because the results of inference are checked for validity before an expression is considered well-typed. But it has the unfortunate effect of covering up bugs: if the spec misses an important constraint, there's not a hard line that says that's "wrong." The claim of soundness here raises the bar: if the rules miss a constraint that is necessary to guarantee that the initial constraint formula is true, that is a failing of the rules, and it must be corrected to make the spec internally consistent.
In addition, a soundness-preserving approach avoids the potential confusion that could arise if, say, inference unsoundly ignored certain difficult-to-analyze lambda expression arguments. (The effect might be that inference gets "lucky" in guessing valid type arguments in one scenario, only to fail when a seemingly unrelated adjustment changes the results.)
A constraint formula of the form ⟨Expression → T⟩ is reduced as follows: [jsr335-18.2.1-10]
(
Expression' )
, the constraint reduces to ⟨Expression' → T⟩. [jsr335-18.2.1-10-C1]
This bound set may contain new inference variables, as well as dependencies between these new variables and the inference variables in T.
?
Expression_{2} :
Expression_{3}, the constraint reduces to two constraint formulas, ⟨Expression_{2} → T⟩ and ⟨Expression_{3} → T⟩. [jsr335-18.2.1-10-C3]
Discussion and motivation:
By treating nested generic method invocations as poly expressions, we improve the behavior of inference for nested invocations. For example, the following is illegal in Java 7 but legal in Java 8:ProcessBuilder b = new ProcessBuilder(Collections.emptyList()); // Constructor expects a List<String>When both the outer and the nested invocation require inference, the problem is more difficult. For example:
List<String> ls = new ArrayList<>(Collections.emptyList());Our approach is to "lift" the bounds inferred for the nested invocation (simply { α <:
Object
} in the case ofemptyList
here) into the outer inference process (in this case, trying to infer β where the constructor is for typeArrayList<
β>
). We also infer dependencies between the nested inference variables and the outer inference variables (the constraint ⟨List<
α>
→Collection<
β>
⟩ would reduce to the dependency α = β). In this way, resolution of the inference variables in the nested invocation can wait until additional information can be inferred from the outer invocation (based on the assignment target, β =String
). An alternative strategy for handling nested invocations was considered and rejected: in certain cases, the inference variables of the outer call are resolved before considering the nested call; typing of the nested call is delayed until a proper target type can be provided. In other cases, the nested call is treated as a standalone invocation.This has the advantage of avoiding the extra complexity involved with reduction producing new inference variables and dependencies. But there are two problems with this approach:
- It is not powerful enough to handle situations in which information from both the outer and inner calls needs to be synthesized in order to choose an appropriate inference variable instantiation.
- There's not an obvious rule for choosing whether the outer or the inner call should be resolved first, and so the resulting behavior is very ad hoc.
A constraint formula of the form ⟨LambdaExpression → T⟩, where T mentions at least one inference variable, is reduced as follows, following the compatibility rules outlined in 15.27.3. [jsr335-18.2.1.2-10]
If T is not a functional interface type (9.8), the constraint reduces to false. [jsr335-18.2.1.2-14]
Otherwise, a ground target type, T', must be derived from T, as described in 15.27.3. If 18.5.3 is used to derive a functional interface parameterization, then the test that F<
A'_{1}, ..., A'_{m}>
is a subtype of F<
A_{1}, ..., A_{m}>
is not performed (this is instead asserted with a constraint formula below). If no valid type can be found, the constraint reduces to false. [jsr335-18.2.1.2-20]
Otherwise, the congruence of the LambdaExpression with the function type of T' is asserted as follows: [jsr335-18.2.1.2-30]
void
and the lambda body is neither a statement expression nor a void-compatible block, the constraint reduces to false. [jsr335-18.2.1.2-30-D]
void
and the lambda body is a block that is not value-compatible, the constraint reduces to false. [jsr335-18.2.1.2-30-E]
void
) type R, assume the lambda's parameter types are the same as the function type's parameter types. Then: [jsr335-18.2.1.2-30-F2]
Discussion and motivation:
The key piece of information to derive from a compatibility constraint involving a lambda expression is the set of bounds on inference variables appearing in the target function type's return type. This is crucial, because functional interfaces are often generic, and many methods operating on these types are generic, too.In the simplest case, a lambda expression may simply provide a lower bound for an inference variable:
<T> List<T> makeThree(Factory<T> factory) { ... } String s = makeThree(() -> "abc").get(2);In more complex cases, a result expression may be a poly expression—perhaps even another lambda expression—and so the inference variable might be passed through multiple constraint formulas with different target types before a bound is produced.
Most of the work described in this section precedes assertions about the result expressions; its purpose is to derive the lambda expression's function type, and to check for expressions that are clearly disqualified from compatibility.
We do not attempt to produce bounds on inference variables that appear in the target function type's
throws
clause. This is because exception containment is not part of compatibility (15.27.3)—in particular, it must not influence method applicability (18.5.1). However, we do get bounds on these variables later, because invocation type inference (18.5.2) produces exception containment constraint formulas (18.2.5). If the target type is an inference variable, or if the target type's parameter types contain inference variables, we produce false. During invocation type inference (18.5.2), extra substitutions are performed in order to instantiate these inference variables, thus avoiding this scenario. (In other words, reduction will, in practice, never be "invoked" with a target type of one of these forms.) The result expressions of a lambda expression are required to be compatible in an assignment context with the target return type (see 15.27.3). Consider a targeted return type, R. If R is a proper type (e.g.,Byte
derived fromFunction<α,Byte>
), assignability is easy enough to test, and we do so. If R is not a proper type (e.g., α derived fromFunction<String,α>
), we make the simplifying assumption that loose invocation compatibility will be sufficient.The difference between assignment compatibility and loose invocation compatibility is that assignment allows narrowing of constant expressions (e.g.,
Byte b = 100
). So this simplifying assumption is not completeness-preserving: given target return type α and an integer literal return expression,100
, it is possible that α could be instantiated toByte
; however, reduction will not produce such a bound.
A constraint formula of the form ⟨MethodReference → T⟩, where T mentions at least one inference variable, is reduced as follows, following the compatibility rules outlined in 15.13.1. [jsr335-18.2.1.3-10]
If T is not a functional interface type (9.8), or if T is a functional interface type but does not have a function type, the constraint reduces to false. [jsr335-18.2.1.3-12]
Otherwise, if there does not exist a potentially-applicable method for the method reference when targeting T (15.13.1), the constraint reduces to false. [jsr335-18.2.1.3-20]
Otherwise, there are two cases. If the method reference is exact (15.13.1), then let P_{1}, ..., P_{n} be the parameter types of the function type of T; let P'_{1}, ..., P'_{k} be the parameter types of the potentially-applicable method. The compatibility constraint reduces to a new set of constraints, as follows: [jsr335-18.2.1.3-30]
void
, let R be this type. Then if the return type of the potentially-applicable compile-time declaration is void
, the constraint reduces to false. Otherwise, where R' is the result of applying capture conversion (5.1.10) to the return type of the potentially-applicable compile-time declaration, ⟨R' → R⟩. [jsr335-18.2.1.3-30-C]
If, instead, the method reference is inexact (15.13.1), then if one or more of the function type's parameter types is not a proper type, the constraint reduces to false. [jsr335-18.2.1.3-35]
Otherwise, a search for a compile-time declaration is performed, as defined in 15.13.1. If there does not exist a compile-time declaration for the method reference, the constraint reduces to false. [jsr335-18.2.1.3-16]
If a compile-time declaration can be found then there are three cases: [jsr335-18.2.1.3-40]
void
, the constraint reduces to is true. [jsr335-18.2.1.3-40-A]
This bound set may contain new inference variables, as well as dependencies between these new variables and the inference variables in T.
void
, the constraint reduces to false; otherwise, the constraint reduces to ⟨R' → R⟩. [jsr335-18.2.1.3-40-B]
Discussion and motivation:
The strategy used to determine a return type for a generic referenced method follows the same pattern as for generic method invocations (18.2.1). This may involve "lifting" bounds into the outer context and inferring dependencies between the two sets of inference variables.
A constraint formula of the form ⟨S → T⟩ is reduced as follows: [jsr335-18.2.2-10]
<
T_{1}, ..., T_{n}>
, and there exists no type of the form G<
...>
that is a supertype (4.10) of S, but the raw type G is a supertype of S, then the constraint reduces to true. [jsr335-18.2.2-10-E]
<
T_{1}, ..., T_{n}>
[]
^{k}, and there exists no type of the form G<
...>
[]
^{k} that is a supertype of S, but the raw type G[]
^{k} is a supertype of S, then the constraint reduces to true. (The notation []
^{k} indicates an array type of k dimensions.) [jsr335-18.2.2-10-F]
The fourth and fifth cases are considered uses of unchecked conversion (5.1.9). These, along with any use of unchecked conversion in the first case, may result in unchecked warnings, and may influence a method's invocation type (15.12.2.6). [jsr335-18.2.2-12]
Discussion and motivation:
Boxing T to T' is not completeness-preserving: for example, if T werelong
, S might be instantiated toInteger
, which is not a subtype ofLong
, but could be unboxed and then widened tolong
. We avoid this problem in most cases by giving special treatment to inference-variable return types that we know are already constrained to be certain boxed primitive types. See 18.5.2. Similarly, the treatment of unchecked conversion sacrifices completeness in cases in which T is not a parameterized type (for example, if T is an inference variable). It is not usually clear in such situations whether the unchecked conversion is necessary or not. Since unchecked conversions introduce unchecked warnings, inference prefers to avoid them unless it is clearly necessary.
A constraint formula of the form ⟨S <: T⟩ is reduced as follows: [jsr335-18.2.3-10]
[]
, then among the supertypes of S that are array types, a most-specific type is identified: S'[]
(this may be S itself). If no such array type exists, the constraint reduces to false. Otherwise: [jsr335-18.2.3-10-E3]
&
... &
I_{n}, the constraint reduces to the following new constraints: for all i, 1 ≤ i ≤ n, ⟨S <: I_{i}⟩. [jsr335-18.2.3-10-E5]
A constraint formula of the form ⟨S <= T⟩, where S and T are type arguments (4.5.1), is reduced as follows: [jsr335-18.2.3-20]
?
, the constraint reduces to true. [jsr335-18.2.3-20-B]
? extends
T': [jsr335-18.2.3-20-C]
?
, the constraint reduces to ⟨Object
<: T'⟩. [jsr335-18.2.3-20-C4]
? extends
S', the constraint reduces to ⟨S' <: T'⟩. [jsr335-18.2.3-20-C2]
? super
S', the constraint reduces to ⟨Object
= T'⟩. [jsr335-18.2.3-20-C3]
? super
T': [jsr335-18.2.3-20-D]
? super
S', the constraint reduces to ⟨T' <: S'⟩. [jsr335-18.2.3-20-D2]
A constraint formula of the form ⟨S = T⟩, where S and T are types, is reduced as follows: [jsr335-18.2.4-10]
[]
and T'[]
, the constraint reduces to ⟨S' = T'⟩. [jsr335-18.2.4-10-E]
A constraint formula of the form ⟨S = T⟩, where S and T are type arguments (4.5.1), is reduced as follows: [jsr335-18.2.4-20]
?
and T has the form ?
, the constraint reduces to true. [jsr335-18.2.4-20-B]
?
and T has the form ? extends
T', the constraint reduces to ⟨Object
= T'⟩. [jsr335-18.2.4-20-F]
? extends
S' and T has the form ?
, the constraint reduces to ⟨S' = Object
⟩. [jsr335-18.2.4-20-G]
? extends
S' and T has the form ? extends
T', the constraint reduces to ⟨S' = T'⟩. [jsr335-18.2.4-20-C]
? super
S' and T has the form ? super
T', the constraint reduces to ⟨S' = T'⟩. [jsr335-18.2.4-20-D]
Discussion and motivation:
Note that we do not address intersection types here, because it is impossible for reduction to encounter an intersection type that is not a proper type.
A constraint formula of the form LambdaExpression →_{throws} T is reduced as follows: [jsr335-18.2.5-14]
void
nor a proper type, the constraint reduces to false. [jsr335-18.2.5-14-C2]
throws
clause that are not proper types. If the LambdaExpression is implicitly-typed, let the expression's parameter types be the function type's parameter types; if the lambda body is a poly expression or a block containing a poly result expression, let the targeted return type by the function type's return type. Then let E'_{1}, ..., E'_{m} be the checked exceptions that the lambda body can throw (11.2). [jsr335-18.2.5-14-D]
throws
clause consists only of proper types), then if there exists some i (1 ≤ i ≤ m) such that E'_{i} is not a subtype of any proper type in the throws
clause, the constraint reduces to false; otherwise, the constraint reduces to true. [jsr335-18.2.5-14-E1]
throws
clause, then the constraints include, for all j (1 ≤ j ≤ n), ⟨E'_{i} <: E_{j}⟩. In addition, for all j (1 ≤ j ≤ n), the constraint also reduces to the bound throws
E_{j}. [jsr335-18.2.5-14-E2]
A constraint formula of the form MethodReference →_{throws} T is reduced as follows: [jsr335-18.2.5-15]
void
nor a proper type, the constraint reduces to false. [jsr335-18.2.5-15-C]
throws
clause that are not proper types; and let E'_{1}, ..., E'_{m} be the checked exceptions in the throws
clause of the invocation type of the method reference's compile-time declaration (15.13.2) (derived based on the target function type's parameter types and return type). [jsr335-18.2.5-15-D]
throws
clause consists only of proper types), then if there exists some i (1 ≤ i ≤ m) such that E'_{i} is not a subtype of any proper type in the throws
clause, the constraint reduces to false; otherwise, the constraint reduces to true. [jsr335-18.2.5-15-E1]
throws
clause, then the constraints include, for all j (1 ≤ j ≤ n), ⟨E'_{i} <: E_{j}⟩. In addition, for all j (1 ≤ j ≤ n), the constraint also reduces to the bound throws
E_{j}. [jsr335-18.2.5-15-E2]
Discussion and motivation:
Constraints on checked exceptions are handled separately from constraints on return types, because return type compatibility influences applicability of methods (18.5.1), while exceptions only influence the invocation type after overload resolution is complete (18.5.2). This could be simplified by including exception compatibility in the definition of lambda expression compatibility (15.27.3), but this would lead to possibly-surprising cases in which exceptions that can be thrown by an explicitly-typed lambda body change overload resolution. The exceptions thrown by a lambda body cannot be determined until i) the parameter types of the lambda are known, and ii) the target type of result expressions in the body is known. (The second requirement is to account for generic method invocations in which, for example, the same type parameter appears in the return type and thethrows
clause.) Hence, we require both of these, as derived from the target type T, to be proper types.One consequence is that lambda expressions returned from other lambda expressions cannot generate constraints from their thrown exceptions. These constraints can only be generated from top-level lambda expressions.
The handling of the case in which more than one inference variable appears in a function type'sthrows
clause is not completeness-preserving. Either variable may, on its own, satisfy the constraint that each checked exception be declared, but we can't be sure which one is intended. So, for predictability, we constrain them both.
Whenever a bound set contains two or more bounds, it is possible that new bounds can be inferred based on the assertions of the original bounds. As bound sets are constructed and grown during inference, new bounds are incorporated by identifying complementary pairs of bounds and inferring further information in this way. [jsr335-18.3-10]
Specifically, for any pair of bounds in a bound set matching one of the following rules, a new constraint formula is inferred. In the rules, S and T are inference variables or types, and U is a proper type. For conciseness, a bound of the form α = T may also match a bound of the form T = α. [jsr335-18.3-12]
If two bounds have the form α <: S and α <: T, and if for some generic class or interface, G, there exists a supertype (4.10) of S of the form G<
S_{1}, ..., S_{n}>
and a supertype of T of the form G<
T_{1}, ..., T_{n}>
, then for all i, 1 ≤ i ≤ n, if S_{i} and T_{i} are types (not wildcards), the constraint ⟨S_{i} = T_{i}⟩ is implied. [jsr335-18.3-14]
Additionally, when a bound set contains a bound of the form G<
α_{1}, ..., α_{n}>
= capture(G<
A_{1}, ..., A_{n}>
), further bounds or constraints may be inferred. Let P_{1}, ..., P_{n} represent the type parameters of G, B_{1}, ..., B_{n} represent the bounds (4.4) of these type parameters, and θ represent the substitution [P_{1}:=α_{1}, ..., P_{n}:=α_{n}]. Let R be a type that is not an inference variable (but is not necessarily a proper type). Then for all i (1 ≤ i ≤ n): [jsr335-18.3-20]
?
: [jsr335-18.3-20-B]
? extends
T: [jsr335-18.3-20-C]
Object
, α_{i} <: R implies ⟨T <: R⟩ [jsr335-18.3-20-C2]
Object
, α_{i} <: R implies ⟨B_{i} θ <: R⟩ [jsr335-18.3-20-C2]
? super
T: [jsr335-18.3-20-D]
The resulting constraint formulas are then reduced, and any new bounds are added to the bound set. This may trigger further incorporation; ultimately, the set will reach a fixed point and no further bounds can be inferred. [jsr335-18.3-14]
A bound set that is fully incorporated, if it does not contain the bound false, has the following properties:
Discussion and motivation:
The assertion that incorporation reaches a fixed point oversimplifies the matter slightly. Building on the work of Kennedy and Pierce, On Decidability of Nominal Subtyping with Variance, this property can be proven by making the argument that the set of types that may appear in the bound set is finite. The argument relies on two assumptions:
- New capture variables are not generated when reducing subtyping constraints (18.2.3).
- Expansive inheritance paths are not pursued.
The specification does not currently guarantee these properties (it is vague about the handling of wildcards when reducing subtyping constraints, and does not detect expansive inheritance paths), but is likely to do so in a future version.
(Notably, this is not a new problem: the existing subtyping algorithm is similarly at risk of non-termination.)
Given a bound set that does not contain the bound false, a subset of the inference variables mentioned by the bound set may be resolved. This means that a satisfactory instantiation may be added to the set for each inference variable, until all the requested variables have instantiations. Dependencies in the bound set may require that the variables be resolved in a particular order, or that additional variables be resolved. [jsr335-18.4-10]
Consider a bound of one of the following forms, where T is either an inference variable β or a type that mentions β: [jsr335-18.4-22]
Given such a bound, if α appears on the left-hand side of another bound of the form G<
..., α, ...>
= capture(G<
...>
), then β depends on the resolution of α. Otherwise, α depends on the resolution of β. [jsr335-18.4-22]
An inference variable α appearing on the left-hand side of a bound of the form G<
..., α, ...>
= capture(G<
...>
) depends on the resolution of every other inference variable mentioned in this bound (on both sides of the = sign). [jsr335-18.4-20]
An inference variable α also depends on the resolution of itself, as well as the resolution of an inference variable β if there exists an inference variable γ such that α depends on the resolution of γ and γ depends on the resolution of β. [jsr335-18.4-24]
Given a set of inference variables to resolve, let V be the union of this set and all variables upon which the resolution of at least one variable in this set depends. [jsr335-18.4-30]
If every variable in V has an instantiation, then the resolution procedure terminates. Otherwise, let { α_{1}, ..., α_{n} } be a non-empty subset of uninstantiated variables in V such that i) for all i, 1 ≤ i ≤ n, if α_{i} depends on the resolution of a variable β, then either β has an instantiation or there is some j such that β = α_{j}; and ii) there exists no non-empty proper subset of { α_{1}, ..., α_{n} } with this property. [jsr335-18.4-32]
Then resolution proceeds by generating an instantiation for each of α_{1}, ..., α_{n}. [jsr335-18.4-34]
If, for all i (1 ≤ i ≤ n), the bound set does not contain a bound of the form G<
..., α_{i}, ...>
= capture(G<
...>
), we first define candidate instantiations, T_{1}, ..., T_{n}. For all i (1 ≤ i ≤ n), define T_{i} as follows: [jsr335-18.4-40]
throws
α_{i}, and the proper upper bounds of α_{i} are, at most, Exception
, Throwable
, and Object
, then T_{i} = RuntimeException
. [jsr335-18.4-40-B]
If incorporating the bounds α_{1} = T_{1}, ..., α_{n} = T_{n} with the current bound set produces a set that does not include false, then this is the new bound set; resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above. [jsr335-18.4-42]
Otherwise, a second attempt is made to instantiate { α_{1}, ..., α_{n} }, as follows: [jsr335-18.4-50]
If the resulting type variables Z_{1}, ..., Z_{n} do not have well-formed bounds (that is, a lower bound is not a subtype of an upper bound), then resolution fails. Otherwise, the new bound set is produced by removing all bounds of the form G<
..., α_{i}, ...>
= capture(G<
...>
) (for all i, 1 ≤ i ≤ n) and incorporating α_{1} = Z_{1}, ..., α_{n} = Z_{n}. If the result contains the bound false, resolution fails. Otherwise, resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above. [jsr335-18.4-52]
Discussion and motivation:
The first attempt at instantiating an inference variable derives the instantiation from that variable's bounds. Sometimes, however, complex dependencies mean that the result is not within the variable's bounds. In case of failure, a second attempt is performed, analogous to capture conversion (5.1.10): fresh type variables are introduced, with bounds derived from the bounds of the inference variables. (Note that the lower bounds of these "capture" variables are computed using only proper types: this is important in order to avoid attempts to perform typing computations on uninstantiated type variables.)
Using the inference processes defined above, the following analyses are performed at compile-time.
Given a method invocation that provides no explicit type arguments, the following process, modeled after the applicability rules in 15.12.2.2, 15.12.2.3, & 15.12.2.4, determines whether a potentially-applicable generic method m is applicable: [jsr335-18.5.1-10]
throws
clause of m, then the bound throws
α_{i} is incorporated with B_{0}, producing the set B_{1}. [jsr335-18.5.1-10-F]
Discussion and motivation:
Consider the following method invocation and assignment:
List<Number> ln = Arrays.asList(1, 2.0);A most-specific applicable method for the invocation must be identified as described in 15.12. The only potentially-applicable method (15.12.2.1) is declared as follows:
public static <T> List<T> asList(T... a)Trivially (because of its arity), this method is neither applicable by strict invocation (15.12.2.2) nor applicable by loose invocation (15.12.2.3). But since there are no other candidates, in a third phase the method is checked for applicability by variable-arity invocation.
The initial bound set, B, is a trivial upper bound for a single inference variable, α:
{ α <:Object
}The initial constraint formula set is as follows:
{ ⟨1
→ α⟩, ⟨2.0
→ α⟩ }These are reduced to a new bound set, B_{1}:
{ α <:Object
,Integer
<: α,Double
<: α }Then, to test whether the method is applicable, we attempt to resolve these bounds. We succeed, producing the rather complex instantiation
α =We have thus demonstrated that the method is applicable; since no other candidates exist, it is the most-specific applicable method. The type of the method invocation, and its compatibility with the target type in the assignment, is not determined until further inference can occur, as described in the next section.Number & Comparable<? extends Number & Comparable<?>>
Consider a method invocation that provides no explicit type arguments, and a corresponding most-specific applicable generic method m. The following process is used to infer the invocation type (15.12.2.6) of the chosen method. [jsr335-18.5.2-10]
<
A_{1}, ..., A_{n}>
, and one of A_{1}, ..., A_{n} is a wildcard, then, for fresh inference variables β_{1}, ..., β_{n}, the constraint formula ⟨G<
β_{1}, ..., β_{n}>
→ T⟩ is reduced and incorporated, along with the bound G<
β_{1}, ..., β_{n}>
= capture(G<
A_{1}, ..., A_{n}>
), with B_{2}. [jsr335-18.5.2-10-C14]
<
...>
that is a supertype of S, but the raw type G is a supertype of S. [jsr335-18.5.2-10-C17]
Invocation type inference may require carefully sequencing the reduction of constraint formulas of the forms ⟨Expression → T⟩, ⟨LambdaExpression →_{throws} T⟩, and ⟨MethodReference →_{throws} T⟩. To facilitate this sequencing, the input variables of these constraints are defined as follows: [jsr335-18.5.2-20]
void
, then for each result expression, Expression', in the lambda body (or for the body itself if it is an expression), the input variables of ⟨Expression' → R⟩. [jsr335-18.5.2-20-A2]
?
Expression_{2} :
Expression_{3}, the input variables are the input variables of ⟨Expression_{2} → T⟩ and ⟨Expression_{3} → T⟩. [jsr335-18.5.2-20-F]
The output variables of these constraints are all inference variables mentioned by the type on the right-hand side of the constraint, T, that are not input variables. [jsr335-18.5.2-22]
Discussion and motivation:
It is important to note that two "rounds" of inference are involved in finding the type of a method invocation. This is necessary to allow a target type to influence the type of the invocation without allowing it to influence the choice of an applicable method. The first round produces a bound set and tests that a resolution exists, but does not commit to that resolution. The second round reduces additional constraints and then performs a second resolution, this time "for real."The distinction between these two rounds of inference was confused in previous iterations of the JLS (compare the discussion of inference in 15.12.2.2 vs. 15.12.2.6; also see 15.12.2.8, which makes ambiguous use of the term "now").
Let's revisit the example from the previous section:List<Number> ln = Arrays.asList(1, 2.0);The most-specific applicable method is declared as follows:
public static <T> List<T> asList(T... a)In order to complete type-checking of the method invocation, we must determine whether it is compatible with its target type,
List<Number>
.The bound set used to demonstrate applicability in the previous section, B_{2}, was:
{ α <:Object
,Integer
<: α ,Double
<: α }The new constraint formula set is as follows:
{ ⟨List<α>
→List<Number>
⟩ }This compatibility constraint produces an equality bound for α, which is included in the new bound set, B_{3}:
{ α <:Object
,Integer
<: α ,Double
<: α, α =Number
}These bounds are trivially resolved:
α =Number
Finally, we perform a substitution on the declared return type of
asList
to determine that the invocation expression has typeList<Number>
; clearly, this is compatible with the target type.Note that this strategy for inference is different than JLS 7, which would have instantiated α based on its lower bounds (before even considering the invocation's target type), as we did in the previous section. This would result in a type error, since the resulting type is not a subtype of
List<Number>
. The use of capture when comparing a method result type to its target type is new, and allows for more precise inference. Under various special circumstances, based on the bounds appearing in B_{2}, we eagerly resolve an inference variable that appears as the return type of the invocation. This is to avoid unfortunate situations in which the usual constraint, ⟨R θ → T⟩, is not completeness-preserving. It is, unfortunately, possible that by eagerly resolving the variable, we are unable to make use of bounds that would be inferred later. It is also possible that, in some cases, bounds that will later be inferred from the invocation arguments (such as implicitly-typed lambda expressions) would have caused a different outcome if they had been present in B_{2}. Despite these limitations, the strategy allows for reasonable outcomes in typical use cases, and preserves backwards-compatibility with the JLS 7 algorithm.
Where a lambda expression with explicit parameter types P_{1}, ..., P_{n} targets a wildcard-parameterized functional interface, F<
A_{1}, ..., A_{m}>
, a parameterization of F may be derived as the ground target type of the expression, according to the following process. [jsr335-18.5.3-20]
<
α_{1}, ..., α_{m}>
, where α_{1}, ..., α_{m} are new inference variables. If n ≠ k, no valid parameterization exists. [jsr335-18.5.3-20-B]
<
A'_{1}, ..., A'_{m}>
, is constructed as follows, for 1 ≤ i ≤ m: [jsr335-18.5.3-20-E]
<
A'_{1}, ..., A'_{m}>
is not a well-formed type (that is, the type arguments are not within their bounds), or if F<
A'_{1}, ..., A'_{m}>
is not a subtype of F<
A_{1}, ..., A_{m}>
, no valid parameterization exists. [jsr335-18.5.3-20-F]
<
A'_{1}, ..., A'_{m}>
, if all the type arguments are types, or the non-wildcard parameterization (9.8) of F<
A'_{1}, ..., A'_{m}>
if one or more type arguments are still wildcards. [jsr335-18.5.3-20-F1]
Discussion and motivation:
In order to determine the function type of a wildcard-parameterized functional interface, we have to "instantiate" the wildcard type arguments with specific types. The "default" approach is to simply replace the wildcards with their bounds, as described in 9.8. But this produces spurious errors in cases in which a lambda expression has explicit parameter types that do not correspond to the wildcard bounds. For example:Predicate<? super Integer> p = (Number n) -> n.equals(23);The lambda expression in this example is a
Predicate<Number>
, which is a subtype ofPredicate<? super Integer>
but notPredicate<Integer>
. The analysis in this section is used to infer thatNumber
is an appropriate choice for the type argument toPredicate
. The analysis here, while described in terms of general type inference, is intentionally quite simple. The only constraints are equality constraints, which means that reduction amounts to simple pattern matching.A more powerful strategy might also infer constraints from the body of the lambda expression. But, given possible interactions with inference for surrounding and/or nested generic method invocations, this would introduce a lot of extra complexity.
When testing that one applicable method is more specific than another (15.12.2.5), where the second method is generic, it is necessary to test whether some instantiation of the second method's type parameters can be inferred to make the first method more specific than the second. [jsr335-18.5.4-10]
Let m_{1} be the first method and m_{2} be the second. Let exp_{1}, ..., exp_{k} be the argument expressions of the corresponding invocation. [jsr335-18.5.4-12]
Where m_{2} has type parameters P_{1}, ..., P_{p}, let α_{1}, ..., α_{p} be inference variables. Let θ be the substitution [P_{1}:=α_{1}, ..., P_{p}:=α_{p}]. [jsr335-18.5.4-14]
If m_{1} and m_{2} are applicable by strict or loose invocation (15.12.2.2, 15.12.2.3), then let S_{1}, ..., S_{k} be the parameter types of m_{1}, and let T_{1}, ..., T_{k} be the result of θ applied to the parameter types of m_{2}. If m_{1} and m_{2} are applicable by variable arity invocation (15.12.2.4), then let S_{1}, ..., S_{k} be the first k variable-arity parameter types of m_{1}, and let T_{1}, ..., T_{k} be the result of θ applied to the first k variable-arity parameter types of m_{2}. [jsr335-18.5.4-16]
Note that no substitution is applied to S_{1}, ..., S_{k}; even if m_{1} is generic, the type parameters of m_{1} are treated as type variables, not inference variables.
First, an initial bound set B is constructed from the declared bounds of P_{1}, ..., P_{p}, as described in 18.1.3. [jsr335-18.5.4-20]
Second, for all i, 1 ≤ i ≤ k, a set of constraint formulas or bounds is generated. [jsr335-18.5.4-30]
If S_{i} and T_{i} are proper types, the result is true if S_{i} is more specific than T_{i} for exp_{i} (15.12.2.5), and false otherwise. [jsr335-18.5.4-32]
Otherwise, if S_{i} is a functional interface type and T_{i} is a parameterization of functional interface, I, and none of the following is true: [jsr335-18.5.4-33]
Then let U_{1}, ..., U_{k} and R_{1} be the parameter types and return type, respectively, of the function type of the capture of S_{i}, and let V_{1}, ..., V_{k} and R_{2} be the parameter types and return type, respectively, of the function type of T_{i}. The following constraints or bounds are generated: [jsr335-18.5.4-34]
void
, true. [jsr335-18.5.4-34-A1]
void
, true. [jsr335-18.5.4-34-B1]
Otherwise, the constraint ⟨S_{i} <: T_{i}⟩ is generated. [jsr335-18.5.4-36]
Third, if m_{2} is applicable by variable arity invocation and has k+1 parameters, then where S_{k+1} is the k+1th variable-arity parameter type of m_{1} and T_{k+1} is θ applied to the k+1th variable-arity parameter type of m_{2}, the constraint ⟨S_{k+1} <: T_{k+1}⟩ is generated. [jsr335-18.5.4-40]
Fourth, the generated bounds and constraint formulas are reduced and incorporated with B to produce a bound set B'. [jsr335-18.5.4-50]
If B' does not contain the bound false, and the inference variables in B' can be resolved, then m_{1} is more specific than m_{2}. Otherwise, m_{1} is not more specific than m_{2}. [jsr335-18.5.4-52]
The least upper bound, or lub, of a set of reference types is a shared supertype that is more specific than any other shared supertype (that is, no other shared supertype is a subtype of the least upper bound). This type, lub(U_{1}, ..., U_{k}), is determined as follows. [jsr335-4.10.4-10]
If k = 1, then the lub is the type itself: lub(U) = U. [jsr335-4.10.4-20]
Otherwise, for each U_{i} (1 ≤ i ≤ k), we write ST(U_{i}) for the set of supertypes of U_{i}, and define the erased supertype set of U_{i}: [jls-15.12.2.7-430]
EST(U_{i}) = { V | W in ST(U_{i}) and V = |W| } where |W| is the erasure of W. [jls-15.12.2.7-440]
The reason for computing the set of erased supertypes is to deal with situations where the set of types includes several distinct invocations of a generic type declaration.
For example, given List<String>
and List<Object>
, simply intersecting the sets ST(List<String>
) = { List<String>
, Collection<String>
, Object
} and ST(List<Object>
) = { List<Object>
, Collection<Object>
, Object
} would yield a set { Object
} , and we would have lost track of the fact that T can safely be assumed to be a List
.
In contrast, intersecting EST(List<String>
) = { List
, Collection
, Object
} and EST(List<Object>
) = { List
, Collection
, Object
} yields { List
, Collection
, Object
} , which will eventually enable us to produce List<?>
as described below.
The erased candidate set for a set of types, EC, is the intersection of all the sets EST(U_{i}) for each U_{i} in U_{1}, ..., U_{k}. [jls-15.12.2.7-450]
The minimal erased candidate set for U_{1}, ..., U_{k} is: [jls-15.12.2.7-460]
MEC = { V | V in EC, and for all W ≠ V in EC, it is not the case that W <: V } [jls-15.12.2.7-470]
Because we are seeking to infer more precise types, we wish to filter out any candidates that are supertypes of other candidates. This is what computing MEC accomplishes.
In our running example, we had EC = { List
, Collection
, Object
} , and now MEC = { List
} .
The next step will be to recover type arguments for the erased types in MEC.
...
The definition of the lci and lcta functions from 15.12.2.7 in JLS 7 should be inserted here.
...
Finally, we define the least upper bound based on all the elements of the minimal erased candidate set of U_{1}, ..., U_{k}. If any of these elements are generic, we use the CandidateInvocation function to recover the type argument information.
Define Candidate(W) = CandidateInvocation(W) if W is generic, W otherwise. [jls-15.12.2.7-530]
Then lub(U_{1}, ..., U_{k}) is Candidate(W_{1}) &
... &
Candidate(W_{r}), where W_{i} (1 ≤ i ≤ r) are the elements of MEC. [jls-15.12.2.7-540]
Strictly speaking, the lub function defined here only approximates a least upper bound. Formally, there may exist some other type T such that all of U_{1}, ..., U_{k} are subtypes of T and T is a subtype of lub(U_{1}, ..., U_{k}). However, a Java compiler's implementation of lub must be consistent with the above definition. [jsr335-4.10.4-90]
It is possible that the lub computation yields an infinite type. This is permissible, and a Java compiler must recognize such situations and represent them appropriately using cyclic data structures. [jls-15.12.2.7-600]
The possibility of an infinite type stems from the recursive calls to lub. Readers familiar with recursive types should note that an infinite type is not the same as a recursive type.
Discussion and motivation:
This section is almost entirely lifted from 15.12.2.7, with only minor changes to eliminate references to type inference (such as the inference variable T_{j}).
Compare JLS 15.12.2.7
Compare JLS 15.12.2.8