Template Classes in the VM

November 2017 (v. 0.2)

John Rose and the Valhalla Expert Group

“Nature disclaims in thee; a tailor made thee.” – King Lear

Background

The JVM needs to support classes which can be customized directly, beyond the existing mechanisms of subclassing (with behavior overriding) and reference polymorphism (with generics and interfaces). For example, an ArrayList must be able to adjust its internal structures to support arrays with elements which are not object references. This will heal the classic rift between primitives (like int and short) and object references, and is all the more urgent as we introduce new primitive-like types which are not well represented by pointer polymorphism.

This note proposes a simple yet very powerful mechanism, called template classfiles, which allows a single classfile to define not only the usual class type with its methods, but also a parameterized family of classes and methods derived by the application of various parameters. Such derived types are called species types; the derived methods are called species methods.

Generically, a template consists of a partial definition of some construct, where one or more “holes” are left in the construct, to be filled in later. The crucial decision, in designing a template mechanism for some previous construct, is to choose where such holes may be. For template classfiles, we choose to allow constant pool holes, almost anywhere in the constant pool, with extra metadata that declares how and when those holes may be filled in.

After deciding to allow this sort of patching of constant pools, most of the rest of the design is a series of forced moves, working out the consequences of managing the holes. The result is not fully sufficient to implement the various proposals for extending the Java language, but on the one hand, it provides a framework for prototyping powerful new “hooks” to tailor-make bespoke JVM classes, based on template parameters as well as names.

On the other hand, at the same time, the framework provides a “first cut” for designing the remaining JVM-level support for the new features, notably any-generic classes and methods.

We start by observing the standard components of today’s (non-template) classfile, and the dependencies of those components on the contents of the constant pool. Then we “inject” new dependencies, in the form of holes and corresponding constant pool segments that depend on those holes. Along the way, we enforce various simplifying rules that allow the JVM to retain its strong type safety characteristics.

What’s in a regular classfile?

The JVM’s reason for existence is to manage code and data with strong runtime type safety, specifically for Java and languages with similar runtime requirements.

It is thus not a surprise that a classfile contains named code and data definitions, using and/or defining various (runtime) types, all decorated by metadata.

But every classfile starts with a constant pool (or CP for short), in which are stored, in the form of unresolved symbolic references, all of the types, names, and other constants needed by the rest of the classfile. The rest of the classfile depends on a constant by including the index of that constant. Constants can also depend on simpler constants. For example, a method reference can depend on the referenced method’s name, type, and declaring class, which also are represented as elements of the CP.
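The index-based dependencies described above can be sketched in a few lines of Java. This is a simplified, hypothetical model (the names CpSketch, Utf8, ClassRef, NameAndType, and MethodRef are illustrative, not the real classfile structures); it only shows how a composite constant is assembled by following indexes to simpler constants.

```java
import java.util.List;

// Hypothetical, simplified model of constant pool entries (not a real classfile parser).
// Each composite entry names its parts by CP index, as the text describes.
final class CpSketch {
    interface Entry {}
    record Utf8(String text) implements Entry {}
    record ClassRef(int nameIndex) implements Entry {}
    record NameAndType(int nameIndex, int descriptorIndex) implements Entry {}
    record MethodRef(int classIndex, int nameAndTypeIndex) implements Entry {}

    // Render an entry by following its index references down to Utf8 leaves.
    static String describe(List<Entry> cp, int index) {
        Entry e = cp.get(index);
        if (e instanceof Utf8 u) return u.text();
        if (e instanceof ClassRef c) return describe(cp, c.nameIndex());
        if (e instanceof NameAndType nt) {
            return describe(cp, nt.nameIndex()) + ":" + describe(cp, nt.descriptorIndex());
        }
        if (e instanceof MethodRef m) {
            return describe(cp, m.classIndex()) + "." + describe(cp, m.nameAndTypeIndex());
        }
        throw new IllegalArgumentException("unknown entry at index " + index);
    }

    // A tiny sample pool: a method reference built from simpler constants.
    static List<Entry> sample() {
        return List.of(
            new Utf8("<unused>"),          // 0: placeholder, real pools start at index 1
            new Utf8("java/lang/String"),  // 1
            new ClassRef(1),               // 2
            new Utf8("length"),            // 3
            new Utf8("()I"),               // 4
            new NameAndType(3, 4),         // 5
            new MethodRef(2, 5));          // 6
    }
}
```

Calling CpSketch.describe(CpSketch.sample(), 6) walks from the method reference through its class and name-and-type entries to the Utf8 leaves.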

Once past the constant pool, we start with the classfile proper, defining the data and code of a type (the class of the classfile).

The data definitions are field names and types, which amount to an instance layout (for both value types and object types). A field’s name and type both depend on CP entries. If there are annotations or other metadata, they too depend on CP entries.

The code definitions are various kinds of named and typed methods, usually with bytecodes. As with fields, a method’s name and type both depend on CP entries. The bytecodes of a method (if present) can depend on the CP in very rich ways. For example, if a method makes a temporary array for some purpose, the element type of that array is usually determined by a CP entry, and similarly with temporary objects or values.

Additional metadata records pervasively decorate both code and data, as well as the class itself. These classfile components include relations to other types and annotations, as well as access control and source-level debug or reflective information. Some of these components, especially annotations, depend crucially on CP entries.

Currently, each classfile defines precisely one type, with at most one instance layout.

Here is a condensed list of specific classfile components:

CP: all types and names needed by the rest of the classfile
type: the type defined by the classfile
supers: the superclass and interfaces for the defined type
fields: an instance layout derived from typed, named fields (including inherited), plus statics
v-methods: virtual (overridable) methods
v-table: an instance v-table: a set of typed, named virtual methods (including inherited)
s-methods: non-virtual methods and constructors
bytecodes: a body of code for methods in the class
annotations: placed on the class, its members, method parameters, etc.
metadata: additional metadata (inner classes, reflective and debug info, etc.)

(Some of these terms are further defined below.)

Virtual methods and overriding

The JVM defines, as a central notion, receiver-based virtual method dispatch. For this reason, the JVM pays special attention to a subset of the methods of any concrete class, and organizes them so that an invocation of a particular name and type descriptor, on an instance of that class, will always link to and execute the unique most specific method, thereby overriding (JVM 5.4.5) any other methods of the same name and descriptor. For this account we use the term virtual method (or v-method for short) for any method that takes part in overriding relations. Generically (without prejudice to any particular implementation technique) we will call the JVM’s organization of v-methods, for any given concrete class, that class’s virtual method table, or v-table for short. For any method that is not a v-method we use the term statically dispatched method, or s-method for short. In fact, s-methods are exactly those which are marked static or private, or are constructors.
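As a rough illustration of the s-method/v-method split, the following Java sketch classifies methods with core reflection. The class and method names (DispatchKind, Sample, isSMethod) are made up for this example; constructors are reflected separately as Constructor objects and so do not appear here.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class DispatchKind {
    // An s-method in the sense of this note: static or private.
    // (Constructors are also s-methods, but reflection reports them as Constructor, not Method.)
    static boolean isStaticallyDispatched(Method m) {
        int mods = m.getModifiers();
        return Modifier.isStatic(mods) || Modifier.isPrivate(mods);
    }

    // Illustrative class with one method of each flavor.
    static class Sample {
        static int twice(int x) { return 2 * x; }   // s-method (static)
        private int secret() { return 42; }         // s-method (private)
        public int visible() { return secret(); }   // v-method (takes part in overriding)
    }

    // Convenience wrapper that looks up a method of Sample by name.
    static boolean isSMethod(String name, Class<?>... parameterTypes) {
        try {
            return isStaticallyDispatched(Sample.class.getDeclaredMethod(name, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```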

(Note that HotSpot assigns both virtual tables and interface tables to both concrete and abstract classes, but those details do not concern this account.)

Inheritance

Inheritance is a process by which a given class is deemed to contain components of one or more of its supertypes (a class and interfaces). These components are of two basic kinds, fields and methods. Inherited non-static fields are implicitly present in the layout of the class. Inherited v-methods are added to the v-table of the class. All inherited fields and methods are also resolvable as API points of the class. Note that private fields and methods are not inherited.
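A minimal sketch of how inherited v-methods and overrides combine into a v-table, under simplifying assumptions: only a single superclass chain is modeled (no interfaces), and the types ClassDef and MethodDef are hypothetical stand-ins for loaded classes. It is not a description of any particular JVM's layout.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class VTableSketch {
    record MethodDef(String name, String descriptor, boolean staticallyDispatched) {}
    record ClassDef(String name, ClassDef superclass, List<MethodDef> methods) {}

    // Maps "name + descriptor" to the name of the class whose definition is selected.
    static Map<String, String> vtable(ClassDef c) {
        Map<String, String> table;
        if (c.superclass() == null) {
            table = new LinkedHashMap<>();
        } else {
            table = vtable(c.superclass());   // inherited v-methods come first
        }
        for (MethodDef m : c.methods()) {
            if (m.staticallyDispatched()) continue;           // s-methods never enter the v-table
            table.put(m.name() + m.descriptor(), c.name());   // new slot, or override of an inherited one
        }
        return table;
    }

    static Map<String, String> sample() {
        ClassDef base = new ClassDef("Base", null, List.of(
            new MethodDef("size", "()I", false),
            new MethodDef("helper", "()V", true)));
        ClassDef derived = new ClassDef("Derived", base, List.of(
            new MethodDef("size", "()I", false),                      // overrides Base.size
            new MethodDef("get", "(I)Ljava/lang/Object;", false)));   // adds a new slot
        return vtable(derived);
    }
}
```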

Here come the templates

Template classfiles are the new thing.

A template classfile has, in its constant pool, one or more holes (more formally called template parameters) which allow the class to be (in effect) loaded many times, defining a distinct template species for each distinct combination of species arguments used to fill the holes.

Of course, that’s just the effect; the classfile is organized to make it easy for the JVM to determine which components are shared by all species and which must be separately recomputed for each species. This ability to share classfile components across species is the main reason template classes are more efficient than schemes which lazily generate a new classfile for each species in some other template scheme, outside of the JVM.
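One way to picture "one species per distinct combination of arguments, with shared components" is a table keyed by template and argument list. The sketch below is only an analogy in plain Java; SpeciesTable, Template, and Species are hypothetical names, and arguments are modeled as strings.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SpeciesTable {
    // Stand-in for everything that is shared by all species of one template.
    record Template(String name) {}
    // Stand-in for the per-species part: the shared template plus the hole arguments.
    record Species(Template template, List<String> arguments) {}

    private final Map<String, Template> templates = new ConcurrentHashMap<>();
    private final Map<Species, Species> species = new ConcurrentHashMap<>();

    // Returns the unique species for this combination of arguments, creating it on first use.
    Species speciesOf(String templateName, String... arguments) {
        Template shared = templates.computeIfAbsent(templateName, Template::new);
        Species key = new Species(shared, List.of(arguments));
        return species.computeIfAbsent(key, k -> k);
    }

    int speciesCount() { return species.size(); }
}
```

Asking twice for the same arguments yields the same Species object, while different arguments yield distinct species that still point at one shared Template.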

Most (not all) components of a classfile, and most (not all) entries in the constant pool, which today depend recursively on CP entries, can be adjusted to instead depend on CP holes. In effect, every component and CP entry in a template classfile is associated with a set (possibly empty) of holes on which it depends.

Prime directive: No holes in the concrete

When we start adding holes to the CP, the complexity of the CP structure will increase greatly, leading to justified concerns that the resulting class structure will not be statically verifiable, and cannot be kept type-safe.

In order to meet this concern, and continue to benefit from the solidity and safety of today’s JVM, we will make one simple rule:

API types, class instances, and method invocations never have holes.

In other words, holes will always be an internal matter within a single CP. All real data, real method invocations, and verifiable types will be hole-free.

To emphasize this, we will speak of complete symbolic references to types, methods, and constants, which are referred to in such a way that all of their holes are filled (in the context of the reference).

Only complete type, method, and constant references are ever resolved. We will sometimes emphasize this point by speaking of “complete” runtime types, methods, and constants, but this is slightly misleading: there is no such thing as an incomplete runtime type, because all runtime types are obtained by a resolution process, and resolution cannot proceed until all holes are filled.

Thus, the term concrete type is automatically strengthened to mean not just a type not marked abstract, but a non-abstract type which is also complete. When we refer to abstract runtime types, they too are automatically complete.

Within a CP, before holes are filled, unresolved type references and unresolved constants may be encountered which do depend on holes which are not assigned a value; such types and constants will (contextually) be called incomplete.

Thus, the types which may appear in a resolved symbolic reference to some API must always be either concrete or abstract (but never incomplete). Instances must always be instantiated from concrete types, which implies that all their field types (inherited and local) are themselves concrete or abstract.

Any method invoked on an instance must have a complete, hole-free method type descriptor. This is a slightly subtle point, since types in method descriptors are not always resolved, but in the end they must refer to types which are resolved, or at least in principle resolvable. Thus, although APIs may appear to be rich with incomplete type references, in the end when they are executed, it is always with the holes filled.

Likewise, inside an instance’s v-table all of the method type descriptors are complete. This means that, at times, the JVM may not be able to create a v-table catalog which is shared across all species of a template. The most it can share is a catalog of that subset of the v-methods whose descriptors have no holes. (They can have holes in their bytecodes, which is a very powerful hook.) A very clever JVM might try to anticipate a v-table slot for an incomplete v-method, in advance of creating a species that completes that v-method, but such anticipations must be done with care, since the same incomplete v-method might turn out to need a new v-table slot in some species, and to be an override of an existing slot in others.

How is this “completeness” rule enforced for fields and methods? It is simple: The instructions that refer to fields and methods (getfield, invokestatic, etc.) cannot execute unless their supporting constants are resolved. And those constants cannot be resolved until the holes are filled.

In general, since bytecodes depend on resolvable CP entries, bytecodes cannot be type-checked until all those CP entries are complete. This means, in general, that the verifier must be run on each distinct species of a template. There may be partial verification algorithms available which are robust in the presence of some holes, so it may be possible to perform a partial verification on incomplete bytecodes, and later verify any additional type relations that were unsolvable before the holes were filled.

All of this raises the question of how holes are filled. Before we answer that, we must first say more about the structure of CP holes.

Punching holes in the pool

A constant pool hole is represented by a new entry in the constant pool, a CONSTANT_TemplateParameter structure, as follows:

CONSTANT_TemplateParameter_info {
   u1 tag;  // value 25, or TBD
   u1 kind;  // the kind of the parameter (CONSTANT_FieldType, etc.)
   u2 name_index;  // the name of the parameter
   u2 type_index;  // zero or type constraint on the parameter
   u2 dependency_index;  // zero or a previous hole this one depends on
   u2 default_value_index;  // zero or the index of a default value
}

The kind is one of the following values:

CONSTANT_FieldType: The hole must be filled by a field type.
CONSTANT_MethodType: The hole must be filled by a method type.
CONSTANT_MethodHandle: The hole must be filled by a method handle.
CONSTANT_Dynamic: The hole must be filled by a computed value (not a method or type).

(Note: Currently, the role of CONSTANT_FieldType is played by CONSTANT_Class. This will probably change, and we will assume such a change in this document.)

The name index defines the name of the hole, which is used only by reflective APIs involved with species instantiation.

The type index may be zero, meaning no constraint beyond the kind. Otherwise, the interpretation of the type index depends on the kind:

CONSTANT_FieldType: The hole must be filled by that type or a subtype.
CONSTANT_MethodHandle: The hole must be filled by a method handle of that exact type.
CONSTANT_Dynamic: The hole must be filled by a computed value of that type.

The JVM does not allow CP holes to depend circularly on themselves, either directly or indirectly. For example, a FieldType hole cannot declare itself as a type constraint, nor can its type constraint be a type species which in turn depends on the same hole.

All holes will be filled in by linkage logic (probably bootstrap methods) which can, if necessary, perform appropriate conversions to the required types, if those types are present.

The ordering of holes in the CP is significant and constrained, as is the ordering of CP entries which depend on holes. A constant that depends on a hole must always physically follow that hole; thus its index is greater than the hole’s index. Specifically, if a hole has a non-zero dependency_index, that index must refer to a previous hole.
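To make the layout and the ordering rule concrete, here is a hedged Java sketch that reads one CONSTANT_TemplateParameter_info entry from raw bytes and rejects a dependency_index that does not point backwards. The tag value 25 is the provisional value from the structure above; the reader class, the record, and the ownIndex parameter are assumptions made for this illustration. The numeric encoding of kind is not specified in this note, so it is kept as a raw number, and the sketch does not check that the dependency target is actually a hole.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TemplateParameterReader {
    static final int TAG_TEMPLATE_PARAMETER = 25;  // "value 25, or TBD" in the structure above

    record TemplateParameter(int kind, int nameIndex, int typeIndex,
                             int dependencyIndex, int defaultValueIndex) {}

    // Reads the entry that sits at constant pool index ownIndex.
    static TemplateParameter read(byte[] bytes, int ownIndex) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int tag = in.readUnsignedByte();                     // u1 tag
            if (tag != TAG_TEMPLATE_PARAMETER) {
                throw new IllegalArgumentException("unexpected tag " + tag);
            }
            TemplateParameter p = new TemplateParameter(
                in.readUnsignedByte(),    // u1 kind
                in.readUnsignedShort(),   // u2 name_index
                in.readUnsignedShort(),   // u2 type_index
                in.readUnsignedShort(),   // u2 dependency_index
                in.readUnsignedShort());  // u2 default_value_index
            // A non-zero dependency_index must refer to a previous entry.
            if (p.dependencyIndex() != 0 && p.dependencyIndex() >= ownIndex) {
                throw new IllegalArgumentException("dependency_index must precede index " + ownIndex);
            }
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```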

Thus, each hole is associated with a list of previous holes, on which it is deemed to depend. The hole need not make actual use of the previous holes on which it depends, but the JVM is required to fill the holes in dependency order. Also, each constant is associated (deterministically and syntactically) with a principal dependency, a hole on which it may depend, and whose dependencies are a superset of the actual dependencies of the constant. All this simplifies the analysis of dependencies, in general, by turning the dependency relations into a tree-shaped order, indexed by holes.

(The representation of this tree-shaped order is TBD, but a likely form appears to be CP segments, so we will run with that for now.)

A CONSTANT_TemplateParameter constant implicitly introduces a new constant pool segment, containing constants which are deemed to depend on that hole and all of its dependencies. Their actual dependencies must be contained in the dependencies of the hole that leads the segment.

By organizing dependent constants in CP segments, the JVM has an easier job filling holes in the correct order, while ensuring that constants are never resolved before their holes are filled.

The first constant in a CP may not be a hole, and therefore the first CP segment consists of all the complete constants in the classfile.

Segments must be non-empty. Holes can be defined adjacently, and if a segment begins with two or more holes, those holes together define the dependencies of constants in the segment. Each successive adjacent hole must depend on the previous adjacent hole. The first or only hole in a segment may depend on nothing, or on a hole in any preceding segment.

The hole that begins a segment may have a dependency_index of zero, meaning that the new segment depends on no other segments except the first segment. Otherwise, the dependency_index must refer to a hole in a previous segment, in fact the last hole at the beginning of that segment.

Via hole-to-hole dependencies, segments are organized in a tree structure of dependencies. For any constant C, we define the set of usable template parameters for C as follows. First, if C is in the first CP segment, the set is empty. Otherwise, identify the CP segment of C by finding the last hole S in the CP sequentially before C. (I.e., find the last hole at the beginning of C’s segment.) Then follow the linear chain of dependency_index links from S until the zero index is reached; each hole encountered must be assigned a value before C can be resolved.
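The walk just described is short enough to sketch directly. In the Java below, the constant pool is reduced to an int array (an assumption of this example): entry i holds NOT_A_HOLE for an ordinary constant, or the dependency_index of the hole at index i. The sketch handles ordinary constants C and assumes the ordering rules have already been checked.

```java
import java.util.ArrayList;
import java.util.List;

final class UsableParameters {
    static final int NOT_A_HOLE = -1;

    // holeDependency[i] is NOT_A_HOLE for ordinary constants,
    // otherwise the dependency_index of the hole at index i (0 means "no dependency").
    static List<Integer> usableParameters(int[] holeDependency, int c) {
        // Find the last hole S sequentially before C; index 0 means C is in the first segment.
        int s = c - 1;
        while (s >= 1 && holeDependency[s] == NOT_A_HOLE) s--;

        // Follow dependency_index links from S until the zero index is reached.
        List<Integer> chain = new ArrayList<>();
        while (s >= 1) {
            chain.add(s);
            s = holeDependency[s];
        }
        return chain;
    }
}
```

For example, take a pool where index 3 is a hole with no dependency, indexes 5 and 6 are adjacent holes (5 depends on 3, 6 depends on 5), and index 8 is an unrelated hole. A constant at index 7 then needs holes 6, 5, and 3 filled, while a constant at index 9 needs only hole 8.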

Also, define the usable constants of some constant C by taking the union of the initial segment (whose constants are always usable) plus the constants in the segments associated with all of C’s usable template parameters. Note that this union must always include the constants in C’s own segment. Constants in segments introduced by unrelated holes are not usable from C.

Constants in the first (or only) CP segment can be called root constants. They are always usable from any other constant, and never depend on holes.

Any composite constant C may contain holes, but it must always be the case (i.e., it is a structural constraint) that these holes must be usable constants of C. By induction, this condition is equivalent to requiring that every immediate sub-component of C is a usable constant of C. A constant in any given segment is not required to use all available holes in that segment, and in fact it may depend on none, being already complete in itself, and could be moved into the first CP segment (as a root constant) without semantic change.
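The structural constraint (every immediate sub-component of C must be usable from C) can be checked with the same dependency walk. The following Java sketch uses the same hypothetical array encoding: entry i is NOT_A_HOLE for an ordinary constant, or the dependency_index of the hole at index i. It is one possible reading of the rules above, not a normative check.

```java
import java.util.HashSet;
import java.util.Set;

final class UsableConstants {
    static final int NOT_A_HOLE = -1;

    // Index of the last hole strictly before c, or 0 if c lies in the first segment.
    static int leaderOf(int[] holeDependency, int c) {
        int s = c - 1;
        while (s >= 1 && holeDependency[s] == NOT_A_HOLE) s--;
        return s;
    }

    // All holes reachable from c's segment leader via dependency_index links.
    static Set<Integer> usableHoles(int[] holeDependency, int c) {
        Set<Integer> holes = new HashSet<>();
        for (int s = leaderOf(holeDependency, c); s >= 1; s = holeDependency[s]) {
            holes.add(s);
        }
        return holes;
    }

    // True if ordinary constant c may refer to the constant (or hole) at index r.
    static boolean isUsableFrom(int[] holeDependency, int c, int r) {
        Set<Integer> holes = usableHoles(holeDependency, c);
        if (holeDependency[r] != NOT_A_HOLE) {
            return holes.contains(r);                 // r is itself a hole
        }
        int leader = leaderOf(holeDependency, r);
        return leader == 0 || holes.contains(leader); // root constant, or in a usable segment
    }
}
```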

Segments are not reified as separate CP structures, but rather arise from the presence of holes in the CP, as described above. (We could change this if we wanted, and add internal segment headers inside the CP. I have tried to avoid this as unnecessary.) A segment is uniquely named by the index of the hole which introduces it; if there are two or more such holes, it is the last hole that introduces it. The initial segment is unambiguously referred to by a zero index instead of the index of a hole.

The intention is that non-root constants are organized in small, easy-to-copy CP segments, containing mostly or only constants which actually depend on the holes that introduce the segments. Because we intend them to be small “afterthoughts” in the CP, they have been facetiously called “constant puddles” or “kiddie pools”.

Making templates

After the CP is defined, the main body of the class defines fields and methods and other structures. In a template classfile, these elements can be made parts of one or more templates which depend on one or more CP segments.

As we review the various classfile components already listed, we will note how they can participate in templates.

CP: the CP contains holes and is segmented as described above
type: the this_class field may point to a CP element T in a non-initial segment, if the class has a Template attribute which points to (the last hole in) that segment
supers: the supers may point to any constants usable from T
fields: field types must be usable from T
v-methods, s-methods: If a method has a Template attribute referring to a segment S, it may use any constant usable from S, else it can use only root constants; its type must be a usable constant.
v-table: the v-table is populated with all v-methods which are not templates, or whose templates use no more constants than the enclosing class.
bytecodes: bytecodes can use any constants usable from their method.
annotations: an annotation must contain root constants only, unless it is defined as a CONSTANT_Annotation in a segment usable by the element it annotates.
metadata: metadata not mentioned here must use only root constants.

If the class C (as a whole) is marked as a template, then the class defines as many species types as are created by supplying actual arguments to the usable template parameters of C. The class also defines a single non-instantiated type, which reifies the class apart from any specializations.

If any method M is marked as a template, then that method can only be resolved by first supplying all usable template parameters of M. Like the class, it can be reflected in an un-resolved, incomplete state, but as such cannot be invoked, and its reflected type information will be incomplete.

When a template is specialized, its relevant CP segments are first duplicated and the holes patched with the species arguments. At that point, all constants in the CP segment are ready to be resolved.
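The duplicate-and-patch step might look roughly like this in Java. The entry types (Hole, Filled, Constant) and the use of strings for arguments are assumptions of the sketch; a real JVM would operate on its internal constant pool representation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SpecializeSketch {
    interface Entry {}
    record Hole(String name) implements Entry {}                  // unfilled template parameter
    record Filled(String name, String value) implements Entry {}  // hole patched with an argument
    record Constant(String text) implements Entry {}              // ordinary constant in the segment

    // Copies a segment, replacing each hole with the supplied species argument.
    // The template's own segment is left untouched so it can be specialized again.
    static List<Entry> specialize(List<Entry> segment, Map<String, String> arguments) {
        List<Entry> copy = new ArrayList<>(segment.size());
        for (Entry e : segment) {
            if (e instanceof Hole h) {
                String value = arguments.get(h.name());
                if (value == null) {
                    throw new IllegalArgumentException("no argument for hole " + h.name());
                }
                copy.add(new Filled(h.name(), value));
            } else {
                copy.add(e);  // copied as-is; resolvable now that its holes are filled
            }
        }
        return copy;
    }
}
```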

The JVM is allowed, but not required, to pre-resolve some of them, if it can do so. This is possible, for example, if a constant is complete in itself, and yet is placed after a hole in the CP.

Static fields can be given Template attributes, in which case they are instantiated along with their associated segments. If such a field has a DynamicConstantValue, this value may depend on any CP constants usable from the template segment. This in turn with CONSTANT_Dynamic gives a “hook” for creating arbitrary specialized constants or metadata, associated with any species of class or method. In the absence of the DynamicConstantValue feature, static but specialized access methods (which ldc the specialized value) can do the same trick, less directly.

In this way, a classfile can define any number of (related) species of type, method, and field.

(We could also choose to use modifier bits instead of Template attributes, but then we would have to define detailed rules for which constant segments are relevant to a given template class or method. A regular explicit Template attribute seems reasonable and is more flexible than implicit rules. There seems to be no strong reason to connect the template parameters of a class to its v-methods, even though the Java language makes such a connection automatically. Still, a Template attribute with a zero value could be used as a signal to opt out of some implicit default, such as having a non-static method or field depend only on root constants.)

A simple example

Let’s use the example of a version of List which has been upgraded to include a type hole for its element type.

In this case, the classfile for List will have a Template attribute on the class which points to the non-root segment containing the template parameter for List; call this hole T (as is tradition).

For simplicity, let’s give List the methods size()int and get(int):T.

Here’s the approximate structure of the classfile:

CP {
  T=TemplateParameter[
        kind=FieldType, name="T", type=0, dep=0, def=Erased]
}
Template[T]
interface List {
  Template[T]
    abstract T get(int);
  abstract int size();
}

In this case, the size method is common to all species of List, but the get method must be invoked on a particular species, since only completed constants can be resolved.

Since size is defined without a template, it uses only root constants, and therefore can be resolved from the raw template type List, as InterfaceMethodref[List,size,()int]. This is an important hook for translation strategies, since it provides a way for all species of a given template to implement a common API. It does come with a restriction: The types of such methods must not depend on the type template parameters (T in this case).

Supertypes with holes and conditions

A supertype (class or interface) may be a symbolic reference to a template type species, and that symbolic reference may be incomplete, since it can use holes that are usable by the template subclass. In order to create a concrete type from the template, the holes must be filled, thus allowing the concrete type to refer only to concrete supertypes.

It may be that after holes are filled the list of super-interfaces will contain duplicates. This is permitted and the duplicates are merged.

In a template, the list of super-interfaces can refer to a CONSTANT_Dynamic whose value is either a single interface type or a list of them (reified as Class mirrors). This allows ad hoc programmatic selection of the interfaces to be implemented by any given species. For example, a List type whose element type is Comparable can itself implement Comparable, and provide a lexicographic element-wise comparison algorithm to back it up.

The management of conditional methods (such as compareTo only if the element type is comparable) is TBD, but could be provided by an auto-bridging mechanism, also driven by the CP constants and a BSM which processes them with ad hoc code. Likewise, the embarrassing method List::remove(int) could be made conditional and visible only on the legacy-compatible species of List.

Conditional fields are probably also useful; they can provide additional state which is needed only in some type species. For example, an Optional<T> implementation can use a sentinel value to signal emptiness, if the type T admits a sentinel (such as null), but Optional<long> needs an additional bit to do its job. And Optional<float> and Optional<boolean> don’t. Managing this ad hoc logic probably requires some careful deployment of bootstrap methods to spin up species-specific constants for such varying behaviors.
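Written out by hand in today's Java, the three layouts could look like the sketch below. All class names are hypothetical, and the choice of sentinel for float (a NaN bit pattern that Float.floatToIntBits never produces) is just one possible encoding, assumed here for illustration.

```java
import java.util.Objects;

final class OptionalLayouts {
    // Reference element type: null serves as the "empty" sentinel, so one field suffices.
    static final class OptionalRef<T> {
        private final T value;  // null means empty
        private OptionalRef(T value) { this.value = value; }
        static <T> OptionalRef<T> of(T value) { return new OptionalRef<>(Objects.requireNonNull(value)); }
        static <T> OptionalRef<T> empty() { return new OptionalRef<>(null); }
        boolean isPresent() { return value != null; }
    }

    // long element type: every bit pattern is a legal value, so emptiness needs its own field.
    static final class OptionalOfLong {
        private final long value;
        private final boolean present;  // the "additional bit"
        private OptionalOfLong(long value, boolean present) { this.value = value; this.present = present; }
        static OptionalOfLong of(long value) { return new OptionalOfLong(value, true); }
        static OptionalOfLong empty() { return new OptionalOfLong(0L, false); }
        boolean isPresent() { return present; }
        long get() { return value; }
    }

    // float element type: floatToIntBits canonicalizes every NaN to 0x7fc00000,
    // so another NaN bit pattern is free to act as the sentinel.
    static final class OptionalOfFloat {
        private static final int EMPTY_BITS = 0x7fc00001;
        private final int bits;
        private OptionalOfFloat(int bits) { this.bits = bits; }
        static OptionalOfFloat of(float value) { return new OptionalOfFloat(Float.floatToIntBits(value)); }
        static OptionalOfFloat empty() { return new OptionalOfFloat(EMPTY_BITS); }
        boolean isPresent() { return bits != EMPTY_BITS; }
        float get() { return Float.intBitsToFloat(bits); }
    }
}
```

With conditional fields, a single template could in principle choose among such layouts per species instead of requiring three separate classes.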

Patching the holes

To make use of a type, method, or field species, a CP reference must be formed to it which incorporates the following elements:

the name of the template class C from which the species is derived
the name and type of the method or field (if relevant)
the template arguments, in the same order as the corresponding holes in C’s CP

Note that the template arguments are logically applied after the name and field or method type are specified.

Because of overloading, the type part of a method needs to be spelled with non-class holes intact, since that is how the API element seems to be distinguished from its peers. This seems to be a necessary exception to the rule that constants with holes cannot participate in resolution.

Suppose a class wishes to implement the species type List<float>. In that case, the form of its super will be something like FieldType[List,float]. The same is true for using this species type in an API point or an instanceof check.

If we wish to call the method List<float>::get, we resolve a CONSTANT_InterfaceMethodref constructed from List<float>, the name get, and the type (int)T, where the T is some placeholder that stands for the hole in List.

After holes are filled for an overloaded method, it may be that the method signature matches two methods of the same name and type. When this happens, the methods must have different sets of template parameters, and one set must be a superset of the other. The method with the superset is chosen over the one with the subset. This rule also applies to v-table packing; more specialized methods override less specialized ones with the same filled-in method type descriptor.
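The selection rule can be sketched as a small Java routine. Candidate and its fields are hypothetical; the candidates passed in are assumed to already share the same name and filled-in descriptor. If no candidate's parameter set contains all the others, the sketch reports an error, mirroring the requirement that the sets be ordered by inclusion.

```java
import java.util.List;
import java.util.Set;

final class OverloadChoice {
    record Candidate(String label, String filledDescriptor, Set<String> templateParameters) {}

    // Picks the candidate whose template parameter set contains every other candidate's set.
    static Candidate select(List<Candidate> candidates) {
        for (Candidate c : candidates) {
            boolean dominates = true;
            for (Candidate other : candidates) {
                if (!c.templateParameters().containsAll(other.templateParameters())) {
                    dominates = false;
                    break;
                }
            }
            if (dominates) return c;
        }
        throw new IllegalStateException("template parameter sets are not ordered by inclusion");
    }
}
```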

This rule helps us to solve a long-standing problem: Defining a “wildcard protocol” associated with a generic class. We can do this by adding a second overloading to List::get which uses only root types.

Template[T]
interface List {
  Template[T]
    abstract T get(int);
  abstract Object get(int);  //should be U-Object
  abstract int size();
}

In this scheme, List has a specialized entry point, for callers which are willing to commit to a particular instance, and a generalized one, for generic code.

Subclasses which implement this template must implement both, but translation strategies can arrange for bridges, in whichever direction is needed. The JVM doesn’t need to care about the particulars of the bridging policy.

Here is an implementation which is probably typical in new code:

Template[U]
class ArrayList implements FieldType[List,U] {
  // all instance fields are automatically under U
  private U[] elements;
  Template[U]
    U get(int i) { return elements[i]; }
  abstract Object get(int i);  // generic callers link to this
  Template[U] __Bridge Object get(int i) { return get<U>(i); }
  int size() { return elements.length; }
}

We picked a different name U for the subclass hole, just for clarity.

The abstract method is copied down from List, just to make it clear that it is in the mix.

There are three definitions of get, one which does the actual work and two which manage the generic access. The v-table for any species of this type contains two methods, since the two generic ones have the same signature. Of those two, the templated one “wins” and is installed in the v-table. Not only is it non-abstract, but it also has a larger set of template parameters ({U} > {}).

The result is that generic callers can do an invokeinterface or invokevirtual of the generic get(int)Object, on List or ArrayList, while every instance of ArrayList can take a direct call to a strongly typed get(int)U or (via List) get(int)T.
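For intuition only, here is roughly what one species (element type float) would amount to if written by hand in today's Java. The names AnyList, FloatList, FloatArrayList, and getFloat are inventions of this sketch: Java source cannot overload on return type, so the specialized entry point gets a different name, whereas the template scheme distinguishes the two get methods by descriptor.

```java
// Stands in for the raw template type List: only hole-free API points.
interface AnyList {
    Object get(int i);   // generic entry point, usable by any caller
    int size();
}

// Stands in for the species List<float>.
interface FloatList extends AnyList {
    float getFloat(int i);   // specialized, strongly typed entry point
}

// Stands in for the species ArrayList<float>.
final class FloatArrayList implements FloatList {
    private final float[] elements;   // flat primitive storage, no boxing

    FloatArrayList(float... elements) { this.elements = elements.clone(); }

    @Override public float getFloat(int i) { return elements[i]; }  // does the actual work
    @Override public Object get(int i) { return getFloat(i); }      // bridge: boxes for generic callers
    @Override public int size() { return elements.length; }
}
```

A generic caller holding an AnyList sees boxed Float values, while a caller that has committed to FloatList reads the float directly.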

The use of the raw template type List also makes sense as a query to instanceof; after all, every instance of every species of List is in fact an instance of the template class List.

These uses of raw template types make it clear that constant pool references must be unambiguous about whether a templated type is being mentioned in its raw form or as a particular type species. The raw form has a subset of operations (perhaps a very small one) compared to the species types, but they always include instanceof and checkcast, even though those bytecodes also apply to species.

The fine print

Fixme: Write up the changes to structural checking rules to show exactly where holes can and cannot go. Write up the new CP forms that trigger creation of species.

Appendix: A glossary just for the JVM

Summary:

Classes = Object Classes ⊔ Value Classes
Instances = Object Instances ⊔ Value Instances
Object Instances = Object Class Instances ⊔ Value Class Boxes
Reference Values = Object Instances ⊔ null
Universal Values = Reference Values ⊔ Value Instances ⊔ Primitives

Here, ⊔ (Unicode 2294) denotes disjoint union.

class: A loaded classfile, defining code, data, and types (includes interfaces, etc.).
value class: A class which is specially marked to define one or more value types.
object class: A class which is not marked as a value class.
constant pool (CP): Catalog which includes all symbolic references used externally or internally by some class. May also include constant values.
CP constant: Any element of a constant pool. May be built recursively by reference to other CP constants or CP holes, in the same constant pool.
CP hole: An open space in a constant pool, serving as a formal parameter to a template.
CP segment: A group of constants in a single CP which as a group depend, directly or indirectly, on a particular set of CP holes. (Any kind of constant can be in any segment.)
template class: A class with one or more CP holes.
template argument: A value which is intended to fill a CP hole.
species: A type derived from a template class by filling one or more holes.
incomplete type reference: a class or species reference with one or more (unfilled) holes.
abstract type: an abstract class or species without (unfilled) holes.
concrete type: a non-abstract class or species without (unfilled) holes.
instance: A value or object of a specific concrete type.
concrete descriptor: A field type or method type reference without holes.
statically dispatched method (s-method): A static or private method, or a constructor.
virtual method (v-method): A method which is not statically dispatched.
override: Relation between two virtual methods with identical names and concrete descriptors, defined and/or inherited by the same concrete class.
virtual method dispatch: The method selection performed by invokevirtual or invokeinterface on instances of some concrete class.
virtual method table (v-table): For some concrete class, an abstract mapping from names and concrete descriptors to all methods defined and/or inherited by that class, as used for dispatch.
field layout: A method for packing the non-static fields of a concrete class into a block of heap memory. Distinct species of the same class may pack their corresponding fields very differently.
universal carrier: An internal JVM type which can equally represent a normal reference, a value instance, or a primitive value, without losing the distinction between those types.

Appendix: A field guide to bootstrap methods

Bootstrap methods were introduced in Java SE 7 to aid the implementor of a dynamically typed language to assign language-specific semantics to an invocation site. The new call format was coded as a strongly typed invokedynamic instruction, without the requirement of “open coding” bytecodes around such a site to enact the language semantics.

The responsibility for linkage, argument transformation, and method dispatch (including caching) was transferred away from the call site and onto the bootstrap method itself, assisted by a new API for combining method behaviors (method handles) on the fly.
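As a rough illustration of that division of labor, here is a minimal sketch of a bootstrap method, driven by hand because Java source code cannot directly emit an invokedynamic instruction bound to a user-chosen bootstrap method. The class name BootstrapSketch and the methods greet and linkAndCall are hypothetical, introduced only for this example; the java.lang.invoke calls are the standard API.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BootstrapSketch {
    // Hypothetical target behavior that the call site ends up linked to.
    static String greet(String name) {
        return "Hello, " + name;
    }

    // A bootstrap method with the three standard leading parameters.
    // It decides, once, what the call site means, and returns a CallSite
    // whose target is a method handle for that behavior.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(BootstrapSketch.class, name, type);
        return new ConstantCallSite(target);
    }

    // Simulates by hand what the JVM does for an invokedynamic instruction:
    // call the bootstrap method to link the site, then invoke its target.
    static String linkAndCall(String arg) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "greet",
                    MethodType.methodType(String.class, String.class));
            return (String) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(linkAndCall("Valhalla"));
    }
}
```

At a real invokedynamic site the JVM would call the bootstrap method only on first execution and then reuse the linked call site; the sketch re-links on every call purely to stay short.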

Though initially envisioned for dynamic languages, invokedynamic was designed in a JVM-centric manner, without artificially imposed “field of use” restrictions. (Arguably, it is really an improved static invocation mechanism, and should have been called invokestaticv2.) As a result, the feature was used in Java SE 8 to support lambda capture, which is about equally coupled to dynamic and non-dynamic languages. (Java is at neither extreme; like Objective-C it is between Smalltalk and C++.) The bootstrap methods that generate lambda capture sites, called “lambda metafactories”, acted like macro-expanders to introduce potentially verbose inner class definitions, customized for each capture site.

Similarly, in Java SE 9, string concatenation expressions were refactored to use bootstrap methods, instead of requiring a loosely knit and hard-to-optimize web of StringBuilder method calls.
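The string concatenation bootstrap can also be exercised by hand, which gives a feel for what the compiler delegates to it. The following is a sketch only, assuming Java 9 or later; the class ConcatSketch and the method describe are hypothetical names. It calls StringConcatFactory.makeConcatWithConstants directly with a recipe string in which each \u0001 character marks an argument slot.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class ConcatSketch {
    // Builds the equivalent of ("x=" + x + ", y=" + y) through the
    // bootstrap method instead of a chain of StringBuilder calls.
    static String describe(int x, String y) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "concat", // arbitrary name; this bootstrap method ignores it
                    MethodType.methodType(String.class, int.class, String.class),
                    "x=\u0001, y=\u0001"); // recipe: constants plus two argument slots
            return (String) site.getTarget().invokeExact(x, y);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(1, "a"));
    }
}
```

Because the whole expression is described by one recipe and one method type, the runtime is free to pick a concatenation strategy for the site as a unit.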

A future version of Java will have a feature called “CONSTANT_Dynamic” (“condy” to its friends, as “indy” is for invokedynamic), in which a bootstrap method defines the resolution and semantics, not of a call site, but of a constant pool entry. This will open up a series of new uses of bootstrap methods, to create complex constants and metadata bundles. Initially, a condy constant will be usable only from ldc and (recursively) from a bootstrap method argument list, but it is likely to travel elsewhere, such as arguments to annotations and to separately linked constants (see JDK-8186006).

Beyond condy, it is likely we will plant BSMs in more places. For example, just as indy is used to create lambda objects, using regular rules of composition, BSMs could be planted on bodiless methods and used to “macro expand” the method’s body on first invocation. Today’s native methods are like this, except that they “expand” to calls to native JNI entry points. Tomorrow’s BSM-generated methods can be used to generate many useful things that don’t need to be pre-generated as bundles of repetitive bytecodes:

more flexible and granular native bindings (Project Panama)
recipe-driven toString/equals/hashCode methods (Project Amber)
componentwise methods for comparison, serialization, etc.
auto-generated bridge methods, to implement non-JVM overrides
invisible compatibility methods, to bind old code to evolved APIs

Such a feature really shows its power when an inherited BSM-equipped method is re-instantiated in each concrete class where it is eventually invoked. This is a promising way to generate templated methods like toString or compareTo, which have a common “top level” structure that must be adapted to the fields of each concrete class.

If a whole method, or even a hierarchy of methods, can be generated from a BSM, then so can a type, or even a hierarchy of types. It is likely that BSMs will be useful in many parts of a template class, and especially at static speciation time, when a template is first expanded on a new set of template parameters. A BSM at that point might determine an optional group of fields or methods, that pertain to that particular template species. This is likely to be useful when a template class has many species with some ad hoc polymorphism, where the conditions of each species expansion are computed by one or more BSMs, planted in strategic locations within the template.

Appendix: Language vs. VM

The Java language and Java VM are not fully aligned in their types, operations, and access rules, and this is intentional. The Java language, as defined by the JLS, defines a static type system which is much richer than the JVM’s runtime types. For example, all of List<Number>, List<String>, List<?>, and raw List erase down to the same JVM type, whose descriptor is Ljava/util/List;.
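The erasure described above can be observed from ordinary Java code. This is a small sketch with a hypothetical class name (ErasureSketch); it checks that two differently parameterized lists share one runtime class, and prints the erased descriptor of a method that takes a List and returns void.

```java
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

class ErasureSketch {
    // Two distinct static types, one runtime class.
    static boolean sameRuntimeClass() {
        List<Number> numbers = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return numbers.getClass() == strings.getClass();
    }

    // The JVM-level descriptor for any method of shape void f(List<...>).
    static String erasedDescriptor() {
        return MethodType.methodType(void.class, List.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(erasedDescriptor());
    }
}
```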

In the other direction, the simple Java expression f(x) might be translated into the JVM as any of invokevirtual, invokeinterface, invokespecial, or invokestatic.

When it comes to overriding, the JVM and JLS use different rules. The JVM only pays attention to the runtime type descriptor of a method (as well as the name), and ignores many details of Java’s static types. For example, a method described as void f(List<Number>) cannot override one described as void f(List<String>), but if two methods are described as f(Ljava/util/List;)V one can override the other, regardless of which version of List was in the source code.

Also, because the JVM respects only strict type descriptor identity for detecting overrides, some methods which do override at the Java level do not override at the JVM level. For example, a Java method described as String f() can override one described as Object f(), but not vice-versa, and the JVM will not override either by the other. The required language semantics are implemented by the JVM by using extra translation tricks, in the form of “bridge methods” which the translator generates to link the different descriptors together. In such cases, the JVM will end up allocating two v-table slots to implement the one override relation.
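Bridge methods are visible through reflection, which makes the String f() versus Object f() case easy to inspect. The sketch below assumes compilation with javac; the class names BridgeSketch, Base, and Derived are hypothetical. It lists every method named f that is declared in the subclass and marks the one the translator synthesized.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BridgeSketch {
    static class Base {
        Object f() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        String f() { return "derived"; } // covariant return: a Java-level override
    }

    // Returns the return type of each f declared in Derived,
    // tagging the compiler-generated bridge method.
    static List<String> declaredFs() {
        List<String> out = new ArrayList<>();
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("f")) {
                out.add(m.getReturnType().getSimpleName() + (m.isBridge() ? " (bridge)" : ""));
            }
        }
        Collections.sort(out); // reflection order is unspecified
        return out;
    }

    public static void main(String[] args) {
        System.out.println(declaredFs());
        Base b = new Derived();
        System.out.println(b.f()); // dispatches through the bridge to String f()
    }
}
```

The subclass declares only one f in source, yet two appear at the JVM level: the bridge carries the Object descriptor, overrides Base.f in the JVM's sense, and forwards to the String version.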

These differences can be summarized by saying that the JVM uses erased types and exact descriptor matching to take care of its business, while the Java language manages type checking and method resolution and selection at the level of a fuller set of unerased static types.

Access control rules are sometimes slightly different in the JVM than they are in the language. The translator sometimes grants package-level access to certain private members, if a nestmate requires that access (this is obsoleted by the NestMembers attribute). A nested class which is declared private or protected in source code is translated with a JVM-level access modifier of package or public (respectively).

When it comes to lambdas and method references, there is a large difference of semantic level between the JLS and the JVM. The JVM knows nothing directly about lambdas, not even whether they are simulated by inner classes (when in fact that is the case). The JVM simply provides the right “hooks” for the translation strategy to use when creating and invoking lambdas. Remarkably, the existing invokeinterface instruction turns out to be the precise hook needed for invocation, but construction requires a relatively new JVM feature, invokedynamic, backed up by a bootstrap method which “knows all about” lambda construction, and “has an understanding” with the source language compiler.
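To make the split between construction and invocation concrete, the lambda bootstrap method can be called by hand. This is a sketch under stated assumptions, not a description of what javac emits byte for byte: the class LambdaSketch and the method lengthOf are hypothetical stand-ins for a desugared lambda body, while LambdaMetafactory.metafactory is the standard bootstrap method for lambda capture.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class LambdaSketch {
    // Hypothetical lambda body, in the style of a desugared private static method.
    private static Integer lengthOf(String s) {
        return s.length();
    }

    // Construction: ask the metafactory for a call site that produces Function objects.
    @SuppressWarnings("unchecked")
    static Function<String, Integer> makeLambda() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType bodyType = MethodType.methodType(Integer.class, String.class);
            MethodHandle body = lookup.findStatic(LambdaSketch.class, "lengthOf", bodyType);
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "apply",                                           // interface method to implement
                    MethodType.methodType(Function.class),             // factory shape: () -> Function
                    MethodType.methodType(Object.class, Object.class), // erased signature of apply
                    body,                                              // code to run
                    bodyType);                                         // signature enforced at invocation
            return (Function<String, Integer>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        // Invocation: an ordinary interface call, compiled to invokeinterface.
        System.out.println(makeLambda().apply("hello"));
    }
}
```

Note that the erased signature and the instantiated signature are passed separately, which is one place where the bootstrap method and the source compiler share their “understanding” while the JVM itself sees only erased types.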

The JVM simply connects up the invokedynamic instructions to their bootstrap methods, and gets out of the way. This pattern is expanding, as we place bootstrap methods in other places. It is likely that the source compiler will make more bootstrap methods for more language features in the future, and even that the bootstrap methods will be attached to places other than invocation instructions.

In the present note we are concerned only with JVM semantics, which means we concentrate on erased types. Although this sometimes requires the Java language implementor to play translation tricks, to force the JVM to execute some required Java semantics, it is a fair trade, because it allows the JVM to decouple from some of the more difficult complexities of the Java language, and focus on improving its execution of a simpler set of runtime semantics.

There is another benefit from this division of labor: If many similar constructs (such as many instances of a generic class or method) are translated to a single JVM artifact (a classfile or method), then the footprint of the running JVM program will be more compact, since a common artifact is implementing many logical entities. Of course, this might make the JIT’s job harder, but JVMs already have a trick up their sleeve, called “inlining” (or “splitting”), where a hot method that is a bottleneck will often be copied into a specific usage site, at which point the JIT can see all the details of the particular generic instance being executed.

It seems obvious that the Java-level semantics for enhanced generics will be complex in something like the way generic classes and methods are complex. This is true whether the language designers give us explicit template classes or reified type parameters or some other form of parametric polymorphism at the source level. In any case, it will almost certainly be impractical to “bake in” JLS rules at the JVM level.

Therefore it seems best to require that the Java source compiler be solely responsible for the exact rendering of source-level rules, and that the JVM provide relatively simple, flexible services for spinning the right types, data, and code at runtime. That is the reason this document is so bold as to specify particular new JVM mechanisms, without committing to any particular future features in the JLS, only to the general problem of “healing the rift” between object types, value types, and primitive types.

Appendix: Kinds of polymorphism

(FIXME: This is the place to discuss parametric vs. ad hoc, subtypes, receiver/v-method, reflective invocation.)