State of the Specialization

December 2014: Proto Edition

Brian Goetz

This is an informal sketch of proposed enhancements to the Java Language (and secondarily, to the Java Virtual Machine) to support generics over primitives (and eventually, value types). It refines the previous iteration posted in July 2014. This is an early draft, informed by the first round of prototyping efforts.

Background

One of the early compromises with Java Generics is that generic type variables can only be instantiated with reference types, not primitive types. It has always been an irritant that one cannot generify over primitive types without boxing, which forces users to give up performance in order to get the benefit of code abstraction. With the addition of value types, as is currently under investigation, this restriction starts to become untenable. (The choice to introduce nominal rather than structural function types in Java 8 was also arguably influenced by this restriction.)

In this document, we will illustrate specialization of generics over primitives; these examples extend naturally to value types.

Parametric polymorphism

Parametric polymorphism always entails tradeoffs between code footprint, abstraction, and specificity, and different languages have chosen different tradeoffs.

At one end of the spectrum, C++ creates a specialized class for each instantiation of a template, and different specializations have no typing relationship with each other. This is a heterogeneous translation, where different instantiations of a parametric type in the source language correspond to different runtime classes. This provides a high degree of specificity, to the point where expressions such as a+b can be interpreted relative to the behavior of + on the instantiated types of a and b, but entails a large code footprint as well as a loss of abstraction -- one cannot declare a variable whose type is the equivalent of Foo<?> in Java. (C++ does have template methods, so it retains behavioral parametricity but not data parametricity.)

At the other end, we have Java's current erased implementation which produces one class for all reference instantiations and no support for primitive instantiations. (This is a homogeneous translation, and the restriction that Java's generics can only range over reference types comes from the limitations of homogeneous translation with respect to the bytecode set of the JVM, which uses different bytecodes for operations on reference types vs primitive types.) However, erased generics in Java provide both behavioral parametricity (generic methods) and data parametricity (raw and wildcard instantiations of generic types.)

C# supports generics over both reference and struct types with a homogeneous translation; it is able to do so because the .NET bytecode set has a notion of type variables, which can range over reference and struct types. It then generates one set of native code for all reference types (using erasure at the native code level), and a specialized representation for each instantiated struct type. The cost of this approach was that when generics were introduced in .NET, existing collection classes (and other core library classes) had to be rewritten and collection-using programs (which was all of them) had to be modified to use the new generic libraries.

Design choices behind Java Generics

Generics were introduced into Java in Java 5 (though the effort to introduce them began considerably earlier.) The key requirement that drove many of the design choices was that of gradual migration compatibility: it must be possible to evolve an existing non-generic class to be generic in a binary-compatible and source-compatible manner.

To illustrate this requirement, consider a non-generic class A, and a client C which references (or extends) A. The gradual migration compatibility requirements demand that C.class continue to work even after A is generified and recompiled, and that C can be recompiled, without modifying it to be aware of A's generification, and that this continue to work as well. Without these requirements, generifying a class would require a "flag day" where all clients have to be at least recompiled, if not modified. Since such a "flag day" across the entire Java ecosystem was out of the question, we needed a generic type system that allowed core platform classes -- especially Collections -- to be generified without requiring clients be made aware of their generification.

To achieve these goals, a homogeneous translation strategy was chosen, where generic type variables are erased to their bounds as they are incorporated into bytecode. This means that whether a class is generic or not, it still compiles to a single class, with the same name, and whose member signatures are the same. Type safety is verified at compile time, and runtime is unfettered by the generic type system. In turn, this imposed the restriction that generics could only work over reference types, since Object is the most general type available, and it does not extend to primitive types.

Erasure example: a simple Box class

Suppose we have the following class:

class Box<T> {
    private final T t;

    public Box(T t) { this.t = t; }

    public T get() { return t; }
}

Compiling this class today yields the following bytecode, where uses of T are erased to T's bound, Object:

class Box extends java.lang.Object{
private final java.lang.Object t;

public Box(java.lang.Object);
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   aload_0
   5:   aload_1
   6:   putfield    #2; //Field t:Ljava/lang/Object;
   9:   return

public java.lang.Object get();
  Code:
   0:   aload_0
   1:   getfield    #2; //Field t:Ljava/lang/Object;
   4:   areturn
}

Note that some occurrences of Object here really mean Object, such as the supertype of Box, and others are derived from the erasure of the type variable T -- and the bytecode does not tell us which are which. (The only clue is the Signature attribute, which encodes the generic type information, and the signature attribute is used only by the compiler, to reconstruct generic information when reading a classfile. However, as we'll see, some instances of erasure have no such trail of breadcrumbs back to their generic heritage.)

The signature attributes for Box and its members are as follows (use javap -v to see them):

class Box<T extends java.lang.Object> extends java.lang.Object
  Signature: #17                          // <T:Ljava/lang/Object;>Ljava/lang/Object;
{
  public Box(T);
    descriptor: (Ljava/lang/Object;)V
    Signature: #13                          // (TT;)V

  public T get();
    descriptor: ()Ljava/lang/Object;
    Signature: #16                          // ()TT;
}

But the bytecodes themselves, such as the areturn in get(), do not have any sort of signature attribute, so we have no idea whether this bytecode is returning a T or an Object -- and these cannot be reconstructed from information currently in the classfile.

A possible approach

If we're going to extend existing generic classes to support primitives -- such as ArrayList<int> -- we have many of the same compatibility concerns in migrating from "old generics" to "enhanced generics" as we did in migrating from non-generic to generic. We want to enable existing generic classes to be enhanced without having to throw them out and start over, and at the same time not force their existing clients to be recompiled or modified just to keep working with erased generics. Nor do we want to completely subvert the investment users have made in learning how generics work in Java. So we start with the same gradual migration compatibility requirement as we did the first time.

Opt-in

Existing generic classes are riddled with the assumption that a type variable T can always be converted to Object. This assumption takes many forms: conversions between T and Object; conversions between T[] and Object[]; assignment of null to T-valued variables; etc. These assumptions are fine for erased generics; they are not fine for enhanced generics. Which means the type system rules enforced by the compiler are different depending on the range of types a type variable can take on.
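
For concreteness, here is a minimal sketch of such code (the class and member names are illustrative); every line is legal under erased generics today, but would have to be rejected if T were an "any" type variable, since T could then be instantiated with int:

class ErasedAssumptions<T> {
    private T cached = null;                     // null is not a value of int

    Object widen(T t) { return t; }              // relies on T <: Object

    @SuppressWarnings("unchecked")
    T[] newArray(int n) { return (T[]) new Object[n]; }   // an int[] is not an Object[]
}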

Which brings us to compromise conclusion #1: that a class should opt into "enhanced generic" treatment, rather than simply changing the rules for generics in the language and reinterpreting existing source files under the new rules. There are a few ways we could expose this in the source code; for purposes of the prototype we've chosen to write any as a type variable modifier:

class Box<any T> {
    private final T t;

    public Box(T t) { this.t = t; }

    public T get() { return t; }
}

There are other possible choices here: annotate the class as any-generic instead of individual type variables; use some sort of bound expression (T extends Any) rather than a type variable modifier; etc. (The problem with Any as a bound, while evocative, is that it's a fiction. So users will then try and declare variables of type Any and feel deceived when the compiler says no.)

Translation to classes

Existing generics use a homogeneous translation to classes; can we continue doing so for any-generics? While this is in theory possible, it is not currently possible with the existing JVM bytecode instruction set and type system. At root is the asymmetry between primitives and reference types; while a variable of type Object can hold any reference type, there is no type that can hold either a reference type or a primitive type. Accordingly, we have no VM type that we can use to represent the t field in Box<T> and still be able to use it for both reference and primitive instantiations. Similarly, there is nothing to which we can erase the signatures of the methods of Box<T> that works for both reference and primitive instantiations.

If we were concerned only about primitive instantiations, we could consider some sort of "tagged fixnum" approach. However, this approach essentially reverts to boxing for arbitrary value types, so it does not get us to our goal.

On the other hand, a full heterogeneous translation would cause compatibility problems; existing code is full of constructs that assume a homogeneous translation (e.g., raw types.) Which brings us to compromise conclusion #2 -- that we need a hybrid homogeneous-heterogeneous translation, where reference instantiations can continue to use the erased homogeneous translation (and therefore remain compatible with existing code), whereas value instantiations will require a heterogeneous translation. So while List<String> and List<Integer> will both be represented by the same runtime class List.class, List<int> will not be (more generally, for all value types V and W where V != W, List<V> and List<W> will likely be represented by different classes.)
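
To make the erased half of this concrete, the following fragment (plain Java, runnable today; the class name is illustrative) shows that all reference instantiations already share a single runtime class, which is exactly the behavior the hybrid translation preserves for them:

import java.util.ArrayList;
import java.util.List;

class SharedClassDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> boxed = new ArrayList<>();
        // All reference instantiations are represented by the same runtime class.
        System.out.println(strings.getClass() == boxed.getClass());   // prints true
        // Under the proposed hybrid translation, an ArrayList<int> would instead
        // be represented by a distinct, specialized runtime class.
    }
}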

Subtyping

For generics as they exist today, there is a subtyping relationship:

Box<Integer>  <:  Box<?>
Box<?>  <: Box

(The latter occurrence is a raw type.) Since subtyping is transitive, we then have:

Box<Integer> <: Box

Initially, it might also seem sensible that Box<int> could be a subtype of raw Box. But, given our translation strategy, the Box class cannot be a superclass of whatever class represents Box<int>, as then Box<int> would have a field t of type Object, whereas t should be of type int. So Box<int> cannot be a subtype of raw Box. (And for the same reason, Box<int> cannot be a subtype of Box<Integer>.) (The same problem with fields also happens, to a lesser degree, with methods: even if List<int> acquired bridge methods that bridged to the corresponding List<Integer> methods, this just moves the problem to overload selection, as we'd routinely get methods overloaded on return type (int get(int) vs Integer get(int)).)

Similarly, Box<int> cannot be a subtype of Box<Integer> or Box<?>, because if it were, transitivity would take us back to Box<int> <: Box. Without introducing a fictitious Any type that would pretend to be a supertype of both primitive and reference types (though could have no physical representation), we don't get a relationship between primitive-instantiated generic types and wildcard (or raw) generic types. We do have:

ArrayList<int>  <:  List<int>

but not

ArrayList<int>  <:  ArrayList<Integer>
ArrayList<int>  <:  ArrayList<?>
ArrayList<int>  <:  ArrayList
List<int>  <:  List<Integer>
List<int>  <:  List<?>
List<int>  <:  List

Since generics are invariant, it is not surprising that List<int> is not a subtype of List<Integer>. The slightly surprising thing here is that a specialized type cannot interoperate with its raw counterpart. However, this is not an unreasonable restriction; not only are raw types discouraged (having been introduced solely for the purpose of supporting the gradual migration from non-generic code to generic code), but it is still possible to write fully generic code using generic methods -- see "Generic Methods".

Slightly more frustrating is that there is not a very satisfactory treatment for Box<?>. While it is tempting to say Box<?> should mean any instantiation of Box, what should the runtime type of a Box<?> field be? (It can't be raw Box.) Further, there's plenty of code out there that assumes that Box<?> really means Box<? extends Object>; for compatibility reasons we may be forced into this interpretation. (While we're still looking for an acceptable treatment here, we're going to tentatively label this "Compromise conclusion #3.")
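
As a small illustration of why the compatibility pressure is real, here is the sort of existing code (method name illustrative) that bakes in the Box<? extends Object> interpretation; it is only sound if the wildcard ranges over reference instantiations:

static void printContents(java.util.List<Box<?>> boxes) {
    for (Box<?> box : boxes) {
        Object contents = box.get();   // assumes the element can be widened to Object
        System.out.println(contents);
    }
}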

Representation

Given this hybrid translation, we have several choices of what the compiler should produce as the result of compiling an any-generic class. For example, it could produce a traditional erased classfile plus a template classfile to be used to generate specializations. Or it could produce only a template, deriving the erased classfile from the template just as other specializations are. (This is more general, but could result in a startup penalty for the very common case of using erased generic classes.) Or we could produce a classfile with additional metadata that allows the classfile to be both used directly as an erased classfile and as a template for specialization. It is this latter approach we've taken in the prototype.

Specialization

If we were to specialize Box<T> for T=int, we would expect occurrences of T in method signatures, field signatures, and constructors to be replaced with int. The following listing shows the same bytecode listing as earlier, but this time marked up to preserve erased type information. Here, a *T next to a type name or bytecode means that the type in the classfile is derived from the erasure of T, and therefore would need to be adjusted during specialization.

class Box extends java.lang.Object{
private final java.lang.Object*T t;

public Box(java.lang.Object*T);
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   aload_0
   5:   aload_1*T
   6:   putfield    #2; //Field t:Ljava/lang/Object*T;
   9:   return

public java.lang.Object* get();
  Code:
   0:   aload_0
   1:   getfield    #2; //Field t:Ljava/lang/Object*T;
   4:   areturn*T
}

When specializing for T=int, we would replace instances of Object*T with int, and replace the starred a-form bytecodes with the corresponding i-form ones. We encode this in the classfile with attributes that carry the "stars" in the previous example, mapping uses of type names or typed bytecodes to the type variable whose erasure they correspond to. Then, when specializing, we replace the "starred" type names and bytecodes with the appropriate ones for the specialization, as follows:

class Box${T=int} extends java.lang.Object{
private final int t;

public Box${T=int}(int);
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   aload_0
   5:   iload_1
   6:   putfield    #2; //Field t:int;
   9:   return

public int get();
  Code:
   0:   aload_0
   1:   getfield    #2; //Field t:int;
   4:   ireturn
}

In order for on-demand specialization at runtime to be practical, specialization should ideally be entirely mechanical; we would prefer to not do any additional dataflow analysis or typechecking at runtime beyond existing verification. Accordingly, conversion metadata needs to be present in the classfile so that the specializer can be as simple as possible, and the resulting bytecode can then be verified as normal.

Classloading

The approach illustrated here used a naming convention such as Box${T=int} to represent a specialized type. While this is fine for exposition, nominal classes are unsatisfying as a solution for specialization, because they require imposing semantic assumptions based on the structure of a class name, which is brittle. An alternate, and far more flexible, mechanism is outlined here, which uses a structured metadescription of the specialization desired. (In any case, instantiating an instance of a specialized class for the first time, or LDCing a class literal, triggers the specialization of a class, which would in turn trigger the specialization of any specialized classes referred to by that class.)

The current prototype uses a nominal encoding convention, hooking into URLClassLoader to trigger specialization of classes whose names have the form Foo${T=t,U=u,...} (if these names are not found by the ordinary class loading process.) This is purely a short-term expediency; we have no intention of expecting the VM, compilers, or class loaders to understand that a class named Box${T=int} has anything to do with specializing Box.
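
For illustration only, here is a minimal sketch (not the prototype's actual code) of how a class loader can hook this convention; the specialize helper is hypothetical, standing in for the rewriting of the template described in "Specialization" above:

class SpecializingClassLoader extends ClassLoader {
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        int paramStart = name.indexOf("${");
        if (paramStart < 0 || !name.endsWith("}"))
            throw new ClassNotFoundException(name);
        String templateName = name.substring(0, paramStart);                   // e.g. "Box"
        String bindings = name.substring(paramStart + 2, name.length() - 1);   // e.g. "T=int"
        byte[] specialized = specialize(templateName, bindings);               // hypothetical helper
        return defineClass(name, specialized, 0, specialized.length);
    }

    private byte[] specialize(String templateName, String bindings) {
        // Placeholder: load the template classfile and rewrite its "starred" types
        // and bytecodes for the given bindings.
        throw new UnsupportedOperationException("specializer not shown");
    }
}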

A more complicated example

Specialization must not only adjust signatures and bytecode; it must also preserve subtyping relationships. This means, for example, that the language-level relationship

ArrayList<int>  <:  List<int>

must turn into a subclassing relationship of the form

ArrayList${T=int}  <:  List${T=int}

With a single type variable, this is relatively easy; when we specialize a class, we also specialize its supertypes (if they refer to the subclass's type variables.) This gets trickier in the presence of multiple type variables, since we may not specialize all type variables at once. Consider:

class Pair<T,U> { 
    final T t;
    final U u;

    Pair(T t, U u) { this.t = t; this.u = u; }
}

class IntPair<U> extends Pair<int, U> { 
    IntPair(int t, U u) { super(t, u); }
}

class IntLongPair extends IntPair<long> { 
    IntLongPair(int t, long u) { super(t, u); }
}

In this example, we partially specialize Pair as a supertype of IntPair, and then further specialize IntPair as a supertype of IntLongPair. Since IntLongPair indirectly extends Pair<int,long>, we need to ensure that the following subtyping relationships still hold after specialization:

IntLongPair <: IntPair<long>
IntLongPair <: Pair<int, long>

Since the supertype of IntPair is a specialized class, the compiler rewrites the supertype accordingly, and similarly with IntLongPair:

class IntPair<U> extends Pair${T=int}<U> { 
    IntPair(int t, U u) { super(t, u); }
}

class IntLongPair extends IntPair${U=long} { 
    IntLongPair(int t, long u) { super(t, u); }
}

Loading a class requires loading its supertypes, so when IntLongPair is loaded, that will trigger the loading (and hence the on-demand specialization) of IntPair${U=long}, which in turn will trigger the loading of IntPair, which in turn triggers the specialization of Pair${T=int}. When we specialize IntPair, we see that its supertype is already partially specialized; in this case we must merge in the specialization of the remaining type variables to the partially specialized supertype, as below:

class Pair${T=int}<U> { 
    final int t;
    final U u;

    Pair(int t, U u) { this.t = t; this.u = u; }
}

class IntPair${U=long} extends Pair${T=int,U=long} { 
    IntPair(int t, long u) { super(t, u); }
}

We now see that the desired subtypings are consistent with class inheritance:

IntLongPair <: IntPair${U=long}
IntLongPair <: Pair${T=int,U=long}

Implementation details

To make specialization efficient, specializing a class C should not require access to the bytecode for any class other than C. However, the current translation of generic classes to classfiles does not quite support this goal. (For example, the bounds of type variables for enclosing generic classes or generic methods of inner classes are not stored in the classfile for the inner class.) Accordingly, the compiler adds two new attributes -- TypeVariablesMap and BytecodeMapping -- to the classfile, which capture information that would otherwise be erased or whose definition lives in other classfiles.

TypeVariablesMap attribute

The TypeVariablesMap is a catalog of all type variables that are in scope for a class, method, or field (including those of lexically enclosing generic classes or methods), and for each type variable identifies where the type variable is defined, whether it is an any type variable, and its erasure.

Consider the following class:

class Outer<any T> {
    class Inner<any U> {
        <any V> void method() {
            class Local<any W> {
                void munge(T t, U u, V v, W w) {  }
            }
        }
    }
}

The local class Local has a single type variable W, but is enclosed by the generic classes Outer and Inner, and the generic method method. Each of these contributes possibly-specializable type variables to Local. The type variable map for Local contains all the active scopes that contribute type variables, and for each scope, all the type variables contributed by that scope:

TypeVariablesMap:
  LOuter$Inner$1Local;:
    Tvar  Flags  Erased bound
    W     [ANY]  Ljava/lang/Object;
  LOuter$Inner;::method()V:
    Tvar  Flags  Erased bound
    V     [ANY]  Ljava/lang/Object;
  LOuter$Inner;:
    Tvar  Flags  Erased bound
    U     [ANY]  Ljava/lang/Object;
  LOuter;:
    Tvar  Flags  Erased bound
    T     [ANY]  Ljava/lang/Object;

(We call type variables declared with a class or method the explicit type variables for that class or method, and any type variables associated with an enclosing generic class or method implicit type variables. Note that if an implicit type variable is shadowed by another type variable, the shadowed type variable may still appear in signatures and therefore the specializer must distinguish between type variables at multiple levels that have the same name.)
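
As a small illustration of the shadowing concern (using the any syntax from the sketches above; the names are illustrative):

class ShadowOuter<any T> {
    T outerField;                    // uses the outer T

    class ShadowInner<any T> {       // this T shadows the enclosing one
        T innerField;                // uses the inner T
        // Both fields erase to Object; only the TypeVariablesMap tells the
        // specializer which declaration of T each erased use came from.
    }
}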

BytecodeMapping attribute

In the marked-up version of the bytecode for Box, there were two places the *T annotations could occur: on types in declarations (classes, fields, methods) and on bytecodes. The Signature attribute (combined with information in the type variables map) contains enough information for us to specialize signatures; the remaining missing bits are the annotations on bytecodes.

For any method that contains bytecodes that might need to be specialized, the compiler generates a BytecodeMapping attribute which contains mappings from bytecode indexes to additional pre-erasure signature information describing the types being manipulated by that specific bytecode.

The use of bytecode indexes in attributes like BytecodeMapping is not completely satisfying, as these can easily get out of sync if the classfile is transformed with tools that do not have a full understanding of the attribute semantics. A more robust means of associating metadata with bytecodes that can survive common bytecode manipulations intact would be desirable.

Opcode          Category  UTF-8 value
aloadXX         1         local variable type
astoreXX        1         top-of-stack element type
aaload          1         array element type
aastore         1         array element type
areturn         1         enclosing method return type
dupXX           1         top-of-stack element type
if_acmpXX       1         top-of-stack element type
new             2         class type
anewarray       2         array type
multianewarray  2         array type
ldc             2         class literal type
checkcast       2         cast type
instanceof      2         instanceof type
XXfield         3         instantiated field descriptor
invokeXX        3         instantiated method descriptor

This table divides possibly-specializable bytecodes into three categories:

  1. Data-movement operations. Here, we store the unerased type of the data being moved by this opcode, and such opcodes are specialized if and only if the associated unerased type is the instantiation of an any type variable or an array thereof. For example, the areturn bytecode in the get() method of Box is moving a value of type T, and so should be specialized.

  2. Class-oriented operations. Here, we store the unerased instantiated type of the generic class being described by this operation, and such opcodes are specialized if and only if at least one type parameter is an instantiation of an any type variable or array thereof. For example, a new operation that instantiates a Box<int> describes a specialization of Box<T>, and so should be specialized.

  3. Member-oriented operations. Here, the opcode has two signatures to keep track of: the instantiated type of the receiver, and the instantiated type of the member signature. These operations are specialized if either the receiver or the signature has one or more type parameters that are instantiations of any type variables. For example, the fetch from t in the get() method of Box must be specialized both for the type of the field and for the owning class.

Examples of these are shown in the document Overview of specialized classfile format; the BytecodeMapping attributes for Box are illustrated here:

class Box<T extends java.lang.Object> extends java.lang.Object
{
  public Box(T);
    descriptor: (Ljava/lang/Object;)V
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0
         5: aload_1
         6: putfield      #2                  // Field t:Ljava/lang/Object;
         9: return
    BytecodeMapping:
      Code_idx  Signature
          5:    TT;
          6:    LBox<TT;>;::TT;
    Signature: #15                          // (TT;)V

  public T get();
    descriptor: ()Ljava/lang/Object;
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: getfield      #2                  // Field t:Ljava/lang/Object;
         4: areturn
    BytecodeMapping:
      Code_idx  Signature
          1:    LBox<TT;>;::TT;
          4:    TT;
    Signature: #18                          // ()TT;
}
Signature: #19                          // <T:Ljava/lang/Object;>Ljava/lang/Object;

Language restrictions

Because any type variables can take on value types as well as reference types, the type checking rules involving such type variables (henceforth, "avars") must be more restrictive than those for erased type variables. For example, for an avar T, the reference-ness assumptions listed earlier (conversions between T and Object, conversions between T[] and Object[], and assignment of null to T-valued variables) are all disallowed.
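
A sketch of what such checks might reject, assuming the reference-ness assumptions above are exactly what gets outlawed (the precise rule set is still being worked out; this is not legal input to today's javac):

class Restricted<any T> {
    T t;

    void examples(T value) {
        Object o = value;                  // error: T may be a primitive such as int
        t = null;                          // error: null is not a value of every T
        T[] a = (T[]) new Object[10];      // error: an Object[] cannot hold int elements
    }
}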

Language enhancements

There are also some linguistic constructs that are possible on avars (and instantiations of any-generic types) that are not possible on erased type variables or instantiations of erased generic types, such as the array creation expression new T[n] (see "Assumptions of reference-ness" below).

When an any-generic type is instantiated (or partially instantiated) with concrete value types, the compiler generates references to the mangled name. So, for example, a method foo(List<int>) would be mangled to foo(List${T=I}). (As mentioned earlier, the name mangling is a temporary measure, until we have support for dynamically generated classes in the VM, at which point we'll switch over to that.)

Specializer transformations

When specializing an any-generic class, the specializer is going to perform a number of transformations, most of them localized, but some requiring a global view of a class or method: specializing field and method signatures, rewriting the "starred" bytecodes, and rewriting references to other specialized classes using the mangled names described above.

Statics

Consider a generic class today:

class Loader<T> {
    private static int thingsLoaded;

    T load(String name) {
        ++thingsLoaded;
        // load and return the thing
    }

    int getLoadCount() { return thingsLoaded; }
}

Because users are used to a homogeneous translation (due to erasure), user expectation is that the thingsLoaded field is shared across all instantiations of Loader. To remain consistent with this expectation under a partially heterogeneous translation, all static members (fields and methods) must be shared across all specializations of Loader as well. Consequently, static member references (putstatic, getstatic, and invokestatic bytecodes) always refer to the erased class, even from specializations. Similarly, static initializers are run only on loading of the erased class.
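
A usage sketch of that expectation (assuming the Loader class above is completed): no matter how Loader is instantiated, there is exactly one thingsLoaded counter.

void demo() {
    Loader<String> strings = new Loader<>();
    Loader<Integer> numbers = new Loader<>();
    strings.load("a");
    numbers.load("b");
    // Both observe the same count: the static field is shared across all
    // instantiations, and must remain so for any future specialized ones.
    assert strings.getLoadCount() == 2;
    assert numbers.getLoadCount() == 2;
}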

Generic methods

Even though specialized generics cannot interoperate with raw types, it is still possible to write code that is fully generic across all possible instantiations of T with generic methods:

<any T> void good(List<T> list) { ... }
void bad(List list) { .... }
...
List<int> list = new ArrayList<>();
good(list);  // good
bad(list);   // bad

When we invoke a generic method inferring a primitive or value type for a type parameter, we need to specialize a version of that method. Specializing generic methods is harder than specializing classes, because VM implementations are often organized around the assumption that the set of methods in a class is fixed.

Our approach is to use invokedynamic to invoke generic methods that might be specialized. We provide a bootstrap (GenericMethodSpecializer) whose static argument list includes the requested specializations. For invocations where the instantiations of the avars are statically known, this is simple. For invocations where they are not statically known, this also turns out to be simple; this only happens when the generic method invocation exists in another generic method or class. When this happens, the generic method invocation can be specialized at the same time as the containing generic method or class, and the type information is again therefore statically known by invocation time. (One challenging corner case is when method bodies rely on instance context, such as "super" calls in generic methods.)

Key to both cases is the observation that, for specialized invocations of generic methods, the specialized parameters will always be known at bytecode-generation time (which may be at compile time, as in the above example, or may be at specialization time for the case where one generic method calls another) and therefore the linkage can always be static. This is because none of the manipulations on types (e.g., T => ? extends T) can take as input a reference type and produce as output a value type, so the "value-reference footprint" (which type params are values and which are references) of the type parameters will always be statically known, as will the identities of any value-typed parameters.
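
To illustrate the shape of that linkage (this is a sketch, not the actual GenericMethodSpecializer; the extra static arguments and the lookup strategy are assumptions for illustration), a bootstrap can bind the call site to a constant target precisely because the instantiation is known when the invokedynamic instruction is generated:

import java.lang.invoke.*;

class GenericMethodLinkageSketch {
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String methodName,
                                     MethodType callSiteType,
                                     Class<?> declaringClass,     // static argument
                                     String specialization)       // e.g. "T=int", static argument
            throws ReflectiveOperationException {
        // A real bootstrap would trigger specialization of the generic method for the
        // requested instantiation; here we simply resolve a target of the right shape
        // and bind it permanently, since the instantiation cannot change.
        MethodHandle target = lookup.findStatic(declaringClass, methodName, callSiteType);
        return new ConstantCallSite(target);
    }
}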

Migration challenges

Certain patterns of usage are incompatible with specialization because they are tied to implicit assumptions about reference-ness of type variables. Unfortunately, the libraries we most wish to any-fy, such as Collections, exhibit many of these patterns. As a result, additional techniques will be needed for migrating these existing libraries.

Assumptions of reference-ness

The following idiom in classes like ArrayList would fail under specialization because it embodies the incorrect assumption that T[] and Object[] are related by subtyping:

T[] array = (T[]) new Object[n];

These would have to be updated to use:

T[] array = new T[n];

More generally, there are other implementation idioms that rely on the assumption that T <: Object; these will either need to be adjusted to use compatible idioms, or restricted to erased generics.

Reference-primitive overloadings

Some overloadings that are valid today would become problematic under specialization. For example, these methods would have a problem if specialized with T=int:

public void remove(int position);
public void remove(T element);

Such overloads would be problematic both on the specialization side (what methods to generate) and on the overload selection side (which method to invoke.)
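
To make the collision concrete, consider a hypothetical container (names illustrative) carrying both overloads, instantiated with T=int:

class Container<any T> {
    public void remove(int position) { /* remove the element at this index */ }
    public void remove(T element)    { /* remove the first occurrence of this value */ }
}

Container<int> c = new Container<>();
c.remove(3);    // with T=int both declarations become remove(int): which one is meant?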

Incomplete generification

Classes like Collection also have methods that were incompletely generified (these were deliberate compromises made during the initial generification, to preserve compatibility with pre-generic clients). For example, the remove() method, which you might expect to be declared as remove(T), is really declared as remove(Object). For an erased instantiation, this is not a problem, but if the class were to be specialized, these methods should be specialized as if the method was remove(T). Similarly, some methods use wildcard-instantiated types in their signatures (for compatibility with pre-generic clients); Collection.removeAll takes a Collection<?> as an argument, rather than Collection<? extends T>.

Null

A consequence of the assumption that T <: Object is that T-valued variables can be assigned null or compared to null. Null is a valid value of every reference type, and is often used to represent "nothing is there." Primitive and value types, however, have no analogue of null. This is a challenge for methods like Map.get, which are defined to return null if the specified key cannot be found in the map.

An unsatisfying option would be to treat assignment to null as an assignment to the zero value, which is the value that uninitialized fields and array elements are set to (zero for numeric types, false for boolean, etc), and comparison to null as comparing to the zero value. This treatment introduces ambiguities where one cannot distinguish between "no value" and "value equal to zero" (which has a perverse symmetry with the behavior of Map.get today, where a return value of null could mean "no mapping" or could mean "mapped to null".)
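
A short sketch of that ambiguity (the Map<String, int> instantiation is of course hypothetical):

Map<String, int> counts = ...;
int n = counts.get("missing");    // 0 could mean "no mapping" or "mapped to 0"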

Manual control over specialization

The mechanical translation of a generic class into a specialized one is straightforward, but may be too limiting; there are times when we may wish to substitute hand-written code for the mechanically specialized code, either because the generic code cannot be specialized (say, because it uses constructs that are hostile to specialization), or because the hand-written replacement code is more efficient. There are a number of possible features that could affect the specialization process.

Implementation by parts

There are situations where the natural implementation of a generic class (or a specific method in a generic class) for erased reference instantiations differs from the natural implementation for value instantiations. One such example is the Optional class, which can use a single field for its representation for references (since it can overload null to mean "not present"), but requires two fields for a value implementation (a separate "present" bit.) (The inability to do this in C# is a common complaint among expert library developers, suggesting that the one-size-fits-all approach is insufficient.)
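
A sketch of the two natural representations described above (the class and member names are illustrative):

class RefOptional<T> {
    private final T value;                       // null doubles as "not present"
    RefOptional(T value) { this.value = value; }
    boolean isPresent() { return value != null; }
}

class IntOptional {                              // what a T=int instantiation wants
    private final boolean present;               // explicit presence bit
    private final int value;
    IntOptional(boolean present, int value) { this.present = present; this.value = value; }
    boolean isPresent() { return present; }
}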

Overriding specific instantiations of specialized classes

Sometimes, it may be desirable to write a hand-written implementation for specific data types where properties of that specific data type can be exploited. For example, while the generic implementation of ArrayList is probably good enough for most types, for ArrayList<boolean> one might want to use a packed bitfield representation instead, which would use only one eighth the memory of the auto-specialized version -- a substantial savings.

Specializing the specializations

An alternate form of control would be to use the automated specialization, but to inject new (or override existing) instantiation-specific method implementations. (For example, a List class might want a sum() method when instantiated with a numeric type.)

Overriding specific instantiations of generic methods

The "Overriding specific instantiations of a specialized class" issue arises with generic methods as well as generic classes; it is possible that for a generic method:

<any T> void accept(T t) { ... }

there is a substantially better implementation for a specific instantiation, such as T=int.

The "Peeling" Technique

The migration challenges listed above are not theoretical; they show up in the core Collection classes. A possible way out is via peeling a generic class into a generic layer and one or more layers corresponding to "all erased reference instantiations", "all value instantiations", or specific type instantiations. This technique has the potential to address many of the migration challenges that have been posed so far.

Consider the overload pair in a List-like class:

interface ListLike<T> {
    public void remove(int position);
    public void remove(T element);
}

Existing uses of ListLike will all involve reference instantiations, since those are the only instantiations currently allowed in a pre-specialization world. Note that while compatibility requires that reference instantiations have both of these methods, it requires nothing of non-reference instantiations (since none currently exist.)

The intuition behind peeling is to observe that, while we're used to thinking of the existing ListLike as the generic type, it might really be the union of a type that is generic across all instantiations, and a type that is generic across only reference instantiations.

If we were writing this class from scratch in a post-specialization world, we might have written it as:

interface ListLike<any T> {
    void removeByIndex(int position);
    void removeByValue(T element);
}

But such a change now would be neither source-compatible nor binary-compatible. However, peeling allows us to add these methods to the generic layer, and implement them in a reference-specific layer, without requiring them in the specializations, restoring compatibility:

interface ListLike<any T> {
    // New methods added to the generic layer
    void removeByValue(T element);
    void removeByIndex(int pos);

    layer<ref T> {
        // Abstract methods that exist only in the ref layer
        void remove(int pos);
        void remove(T element);

        // Default implementations of the new generic methods
        default void removeByIndex(int pos) { remove(pos); }
        default void removeByValue(T t) { remove(t); }
    }
}

Now, reference instantiations have remove(T) and remove(int) (as well as the new methods removeByIndex and removeByValue), ensuring compatibility, and specializations have the nonproblematic removeByValue(T) and removeByIndex(int). Existing implementations of ListLike would continue to compile, since the new methods have a default implementation for reference instantiations (which simply bridges to the existing remove methods). For value instantiations, removeByIndex and removeByValue are seen as abstract and must be provided, but remove does not exist at all.

This technique also enables the "implementation by parts" technique; it is possible to declare a method abstract in the generic layer and provide concrete implementations in both the value and reference layers. If we allowed layers for T=int, it would also enable the "specializing the specializations" technique.

This same approach works for the problematic method Map.get, eliminating the problem with nullable results in the API for primitive-specialized types: we simply move the existing Map.get into the reference layer, and steer primitive users towards the (added in Java 8) getOrDefault method:

interface Map<any K, any V> { 
    V getOrDefault(K key, V defaultValue);

    layer<ref V> {
        V get(K key);

        default V getOrDefault(K key, V defaultValue) {
            return containsKey(key) ? get(key) : defaultValue;
        }
    }
}

Further investigation

While our experiments have proven that specialization in this manner is practical, much more investigation is needed. Specifically, we need to perform a number of targeted experiments aimed at any-fying core JDK libraries, in particular Collections and Streams. The Streams library is a particularly ripe candidate for specialization, as it already contains manually specialized types for primitive streams; a successful conversion would end up with IntStream extends Stream<int>, allowing us to jettison the large amount of hand-specialized code that the streams library currently contains.

Similarly, while the "peeling" technique is promising, the devil is in the details. Significant further exploration is needed to determine whether the approach addresses the known challenges, whether it scales well to multiple type parameters without excessive additional complexity, and whether it can be made sufficiently intuitive to developers.