Model 3 Classfile Enhancements

Brian Goetz and Maurizio Cimadamore

January 2016

Our earlier experiments ("Model 1") in generic specialization involved rewriting both signatures and bytecodes during the specialization process. While effective, the transformation is extensive, intrusive, complex, and error-prone, and the result is that there is little extractible sharing between the specializations Foo<long> and Foo<int>. Our current strategy ("Model 3") focuses on being able to represent a parametrically polymorphic class in a more natural way, so that specialization is simple, mechanical, and unintrusive. Our goal is to move all information that may require specialization into a section of the constant pool, so a class may be specialized merely by updating that section of the pool, and leaving the rest of the classfile untouched.

In addition to potentially increasing sharing, this means that specialization need not have access to the entire raw classfile for the template (generic) class, but merely only access to the constant pool. This opens up a variety of additional implementation strategies. This document pertains only to the representation of member descriptors; the representation of polymorphic behavior in bytecodes will be treated separately.

Currently, all types in the JVM are described nominally. This allows us to create, for example, nominal descriptions of method signatures by concatenating together type descriptors. This is a useful trick for implementation, but not a great one for representation. Our strategy entails migrating to a structural representation of parameterized types (and other artifacts that depend on them, such as array types or method descriptors) in the classfile, but allows these to be reduced to nominal forms at runtime.

The GenericClass attribute

A classfile representing a generic class must be identified as such, so that when we try to specialize it, we know that it is actually specializable, and we know how to map the requested type parameters to the classes type variables. For an any-generic class (one with one or more "any" type variables), including those inherited from lexically enclosing contexts, we add a GenericClass attribute to the class. This acts as both a declaration that this class is specializable, and also acts as a "table of contents" for any type variable uses that might appear in this class.

GenericClass {
    u2 name_index;
    u4 length;
    u1 classCount;
    struct {
        u2 clazz;
        u1 tvarCount;
        struct {
            u2 name;
            u1 isAny;
            u2 bound;
        } tvars[tvarCount];
    } classes[classCount];
}

GenericClass is structured as a table of class frames, which describe this class as well as any enclosing generic classes. The current class is first, and the remaining frames describe the enclosing classes (if any) proceeding outwards. Each class frame describes all the type variables for that class, whether they are any-tvars ("avars") or not. For example, in a the following class:

class Outer<any T> {
    class Inner<any U> {
    }
}

Outer would have a GenericClass attribute that has one class frame, for itself, and that class frame contains one type variable, T. Inner would have two class frames, one for itself and one for Outer, and the frame for Outer will include the type variables of Outer. The information about enclosing class type variables is necessary for specialization of the inner class, since signatures inside Inner may refer to both T and U.

This layout leads to an implicit numbering of the type variables by flattening the class frames. In our Inner class, Inner.U would be tvar #0 and Outer.T would be tvar #1. As we'll see later, when we refer to type variables in the classfile, we'll refer to them by number, which is an index into the GenericClass attribute according to this flattening rule. (While the classfile uses numeric indexes, the GenericClass attribute contains the nominal information describing the owner and name of the type variable, so tools like javap can present a human-readable view.)

Type Entries

We define a family of CP entries called type entries, which can be one of:

Various constant pool entities will have references to other constant pool entries which can be any kind of type entry; this includes field descriptors, the operand of the new bytecode, and the new constant pool forms themselves.

To represent the signature for array of some type, historically we have prepended the [ character to the type descriptor for the component type. However, if a type does not have a nominal descriptor, such as List<int>, this trick no longer works. We replace this trick with a constant type to describe "array of something":

CONSTANT_ArrayType_info {
    u1 tag;
    u1 arrayDepth;
    u2 componentType;   // TYPE_ENTRY
}

Here, we represent an array of T as the array dimension and a type entry for the component type.

We do the same thing with method descriptors, which have historically be constructed by concatenting the argument types together, but again this falls apart when argument or return types do not have a nominal representation in the classfile. We switch to a structural representation of a method signature:

CONSTANT_MethodDescriptor_info {
    u1 tag;
    u1 argCount;
    u2 returnType;      // TYPE_ENTRY
    u2[argCount] args;  // TYPE_ENTRY
}

Method descriptors can refer to either a UTF8 or a MethodDescriptor constant. (As a beneficial side-effect, switching to this form of method descriptor also has the effect of compressing the classfile, since there is a lot of redundancy in method signatures.)

CONSTANT_ParameterizedType_info {
    u1 tag;
    u2 enclosing;          // ParameterizedType 
    u1 basicType;          // 'L' or 'Q'
    u2 templateClassName;  // UTF8
    u1 count;
    u2 params[count];      // TYPE_ENTRY
}

The templateClassName field is a UTF8 referencing the name of the template class, as in CONSTANT_Class_info. The count field is the number of type parameters; the params field refers to type entries describing the type parameters. The basicType field indicates whether the template class describes a reference class or a value class.

The enclosing field is used when a class is nested inside another parameterized class, such as the example with Outer and Inner above, and provides type parameter bindings for the enclosing context. In this case, the ParameterizedType constant describes the type variable bindings for the template class being described, and references another ParameterizedType for describing the enclosing context. Otherwise, for classes with no enclosing generic classes, it is null (CP 0).

Below are some examples of encodings of types, using R to describe the mapping function from a language-level type to its classfile type descriptor using the function R, and a constructor-like notation for the above constant pool forms:

Finally, we introduce a CP form to describe a type variable by name:

CONSTANT_TypeVar_info {
    u1 tag;
    u1 tvarNumber;  // index into tvar table
    u2 ifErased;    // TYPE_ENTRY
}

The ifErased field is explained in "Erasure", below.

Examples (omitting ifErased temporarily, and using tvar names instead of numbers for clarity):

As we can see, a type or method descriptor can be visualized as a tree, whose leaves are nominal types or type variables, and whose intermediate nodes ParamType, ArrayType, and MethodDescriptor.

Erasure

Special care has been taken to encode erasure in such a way as to maintain compatibility with existing erased generics, while at the same time, keep the JVM out of the business of computing erasure. We do this by introducing a special type token to represent "this type parameter is erased", which we represent by the single-character descriptor "_". This does not mean we are introducing erasure into the JVM type system; by the time the new constant pool forms are resolved, they will have been scrubbed of erased tokens.

To keep the JVM from having to compute erasure for each use of a type variable, the TypeVar constant contains, in addition to the index of the type variable being used, a type entry that describes the erasure to be used if that type variable has been erased. This is because not all uses of a type variable T in a given class are equal; depending on the context in which it appears, sometimes we simply want to propagate the notion that something has been erased, and other times we want to substitute an erasure. Rather than exposing the VM to this complexity, we let the static compiler figure out which cases are which, and let the VM perform a mechanical substitution.

To illustrate this, consider the class below:

class Bar<any V> { }

class Foo<any T extends Runnable, any U> extends Bar<T> {
    T aT;
    Foo<T,U> aFoo;
}

In the parameterized type Foo<Integer, int>, the first type parameter is erased, and its erasure is Runnable. But it would be a mistake to simply propagate the erasure of T into Bar's parameterizations; this would result in specializing Bar with V=Runnable. But what we want here is for the supertype to be Bar<erased>. We want to propagate the erased token here, not expand it. The same is true for the variable aFoo; we want to express Foo<erased,int>, not Foo<Runnable,int>. On the other hand, in the signature of aT, we want a concrete type; here we do want Runnable. The rules for computing erasure are complex; it is better to leave this to the static compiler.

Accordingly, the TypeVar constant carries with it an alternate type to use in the event the associate type variable is erased. We encode Foo as follows:

GenericClass[any T, any U]
class Foo extends ParamType[Bar, TypeVar[T, "_"]] {
    TypeVar[T, "LRunnable;"] aT;
    ParamType[Foo, TypeVar[T, "_"], TypeVar[U, "_"]] aFoo;
}

In the supertype declaration and the signature of aFoo, we propagate the "erased" token when T is erased; in the aT signature, we expand it to Runnable. At runtime, resolution of a TypeVar constant becomes a mechanical process; look up the type variable in the specialization context; if it has a concrete binding, use that, otherwise use the erasure provided at the use site.

A more complicated example would be when we have a class whose type var bound depend on other type vars:

class Foo<any X, any Y extends Bar<X>> {
    void m(X x, Y y) { }
}

We translate this as:

class Foo {
    void m(TypeVar[X, "LObject;"] x,
           TypeVar[Y, ParamType[Bar, TypeVar[X, "_"]] y) { }
}

When X is int and Y is erased, then the signature of m reduces to (int, Bar<int>)V; when both X and Y are erased, it reduces to (Object, Bar)V.

Member References

To illustrate further the complexity of erasure, and why we want to leave it in the hands of the static compiler, suppose we have a method invocation:

class Foo<X> { void m(X x) { } }

class Bar<any Y extends Bound> {
    Foo<Y> foo = ...
    Y y = ...
    foo.m(y)

How do we describe the invocation foo.m(y)? The invokevirtual instruction describes both the receiver type and the descriptor. If Y is erased, we want the descriptor to be List::m(Object); if not, we want it to be List<Y>::m(Y). Note that the as-erased treatment of Y in the method descriptor should describe the erasure of X in Foo, not the erasure of Y in Bar -- because the method being described is a member of Foo.

In the proposed system, we can represent this as:

invokevirtual owner=ParamType[Foo, TypeVar[Y, "_"]]
              desc=m(TypeVar[Y, "LObject;"])

Constant Pool Reduction

The new constants are designed to be amenable to a reduction process, by which they are reduced to "old" constants, which can then be loaded on a non-modified JVM. Each of the new forms reduces to either a UTF8 or Class constant.

This means that ultimately, we need a nominal representation for a parameterized type like List<int>. But ... wasn't the point of this exercise to get away from nominal descriptions? Sort of; the actual point was to get away from associating semantics with a particular name encoding -- we don't want compilers to embed mangled names of the form Foo${T=int} and expect the VM to understand this as Foo<int>. However, at runtime, the JVM is free to dynamically generate nominal identifiers for particular specializations, so it may continue to use string comparison and interning for matching type and method descriptors.

Reduction is performed in a specialization context, which, for each type variable, maintains whether that type variable is erased, and if not erased, its binding (which is a concrete type descriptor). If specialization is triggered by resolving a ParameterizedType constant, we populate this specialization context from the contents of the ParameterizedType constant; if specialization is triggered by ordinary class loading (in which case we want the all-erased "specialization" of the class), we populate the context by treating all type variables as erased.

Populating the specialization context from a ParameterizedType constant involves some care. It is possible that, due to separate compilation, the client could have a "stale" picture of the type variables of the class being specialized. The structure of the ParameterizedType and GenericClass have been chosen to support binary compatibility for common cases (see Compatibility, below). The chain structure of the ParameterizedType and the frames list of the GenericClass work together. We initially populate the specialization context with an array of bindings, whose length is the total number of type variables in the GenericClass, and each binding is initialized to 'erased'. Then we walk the chain of ParameterizedType, matching each ParameterizedType with the GenericClass frame for the class named in the ParameterizedType. If the generic class frame has more type variables than are present in the ParameterizedType, extra variables are assumed to be erased; if the ParameterizedType has more type variables than the GenericClass, this is an ICCE. If an erased type variable is given a non-erased binding, this is also an ICCE (as it indicates that a type variable was either added in the middle, or changed from any to erased.) Otherwise, the bindings are copied into the specialization context from the ParameterizedType to the slots corresponding to that class frame's variables.

Reducing the new constant pool forms is always done bottom-up. This means that for any type entry parameters, we first resolve those (to their UTF8 form) before proceeding. If any of the type entries parameters resolve to a Class constant, we convert this into a UTF8 type descriptor (by prepending L or Q and appending ; to the class name.)

For a TypeVar[T, erased] constant, if T is erased in the current specialization context, we resolve this to UTF8[erased], otherwise we resolve this to the binding of T.

For an ArrayType[n, descr] constant, we resolve this to a UTF8 containing n leading [ characters followed by descr.

For a MethodDescriptor[retDescr, argDescr[]] constant, we resolve this to a UTF8 with a leading (, the argument descriptors concatenated, a ), and the return descriptor.

For a ParameterizedType[template, params[]] constant, the process is somewhat more complicated, as this involves maintaining a cache of loaded parameterizations, mapping parameterizations to a runtime-generated name for the class. We first look in the cache to see if there is already an entry for this (class, parameters) combination; if so, we resolve to a UTF8 describing this class. If it is not present, we dynamically allocate a new class name, put that mapping in the cache, and then resolve to a UTF8 constant describing that new name.

When we go to load a class whose name is one of these new dynamically generated names, we fetch the specialization metadata (class and parameters), and perform specialization accordingly, and enter the new class in the system dictionary.

When reduction is complete, it is an error if there are any remaining erased tokens referred to by method, field, or supertype descriptors.

Reduction Example

Given:

class Foo<any T> {
    void m(Foo<T> foo, Bar<T> bar) { }
}

We would generate a constant pool that looks like:

#1: UTF8["Ljava/lang/Object;"]
#2: UTF8["Foo"]
#3: UTF8["Bar"]
#4: TypeVar[0, #1]                 // T
#5: ParamType[0, 'L', #2, #4]      // Foo<T>
#6: ParamType[0, 'L', #3, #4]      // Bar<T>
#7: UTF8["V"]
#8: MethodDescriptor(2, #7, #5, #6)   // (Foo<T>, Bar<T>)V

We'll generate a CONSTANT_Methodref_info for m that refers (indirectly) to #8. If we specialize Foo<erased>, our constant pool reduces to:

#1: UTF8["Ljava/lang/Object;"]
#2: UTF8["Foo"]
#3: UTF8["Bar"]
#4: UTF8["Ljava/lang/Object;"]
#5: Class[#2]
#6: Class[#3]
#7: UTF8["V"]
#8: UTF8["(LFoo;LBar;)V"]

On the other hand, if we specialize Foo<int>, we get:

#1:  UTF8["Ljava/lang/Object;"]
#2:  UTF8["Foo"]
#3:  UTF8["Bar"]
#4:  UTF8["I"]
#5:  Class[#9]
#6:  Class[#10]
#7:  UTF8["V"]
#8:  UTF8["(LFoo${I};LBar${I};)V"]
#9:  UTF8["Foo${I}"]  // or other nominal encoding
#10: UTF8["Bar${I}"]  // or other nominal encoding

Compatibility

Classes will evolve; some evolutions are compatible, and some are not. The following enumerates the compatibility consequences of the proposed approach.

Generic Methods

Specialization of generic methods was a significant pain point in the Model 1 translation. Here, we propose an alternate translation scheme for generic methods that works with, rather than against, our constant-pool-centric approach, by turning generic method specialization into class specialization. (The Model 1 approach did this behind the scenes at runtime; here, we embrace this at compile time, reducing the runtime overhead and complexity of the scheme.)

For a class with a generic method:

class Foo<any X> {
   static <any Z> void m(Z z) { ... }
}

We desugar as follows:

class Foo<any T> {
    static class Foo$m<any Z> {
       void m(Z z) { ... }
    }
}

Where m is a static method, Foo$m is a static nested class; where it is an non-static method, it is a non-static nested class. We can then rewrite generic method invocation as ordinary invocation on an instance of a suitable receiver object (some instantiation of Foo$m<..>), and we are able to turn method specialization into class specialization. (This fits nicely with the encoding of enclosing type variables in the GenericClass and the encoding of enclosing type parameters in the ParameterizeType chain.) Further, this simplifies a number of translation issues; synthetic methods (such as bridges or desugared lambda bodies) required for the compilation of m() can be placed in the Foo$m class, rather than placing them in Foo and leaving it to the runtime specializer to figure out which methods in Foo need to be specialized in conjunction with m().

Further, we rewrite invocation of generic methods using invokedynamic, primarily so that the language runtime can take care of the details of name mapping and of creating, caching, or finding a suitable receiver object for the desugared generic method.