a universal carrier in the vm

a universal carrier in the vm

John Rose & Valhalla team

January 2017

The JVM interpreter currently has five basic carrier types to hold temporary values during a computation. These types are reference, int, long, float, and double. They are isolated from each other by the verifier. (The verifier additionally tracks the named classes of references.) Since value types cannot be accurately carried by any of these carrier types, at least one new carrier type is needed. This paper proposes exactly one more, a universal carrier type, and explores possible semantics, implementation, and use cases.

The basic reality is that, since value types do not hide behind pointers, they act like variously-sized bundles of data, requiring either one carrier type per size, or (let’s be real) a single carrier type (or at most a small number) which can accommodate multiple sizes.

With respect to a system of value types built on top of the carrier type, there might be only one size for a given higher-level type, but the carrier type has to support all types, and so it must also support a multitude of sizes.

In some computations, it is necessary to create temporary storage in JVM stack frames for variables which can vary in size over time. We can call such variables “size-polymorphic”. (Some variables are “size-monomorphic”, such as one holding a hypothetical ComplexDouble value type.) Sometimes it will even be necessary to store and retrieve regular object references and/or value types and/or primitive values in the same variables. A well-designed carrier type can cover all these cases, with minimal size and allocation overheads. So let’s design one.

Background: What’s in a value?

(Note to reader: Skip this section if you want to get to the meaty details right away.)

Project Valhalla is experimenting with supporting value types, which are a hybrid between primitives and classes. Like classes, they have fields and methods, so they can be given domain-specific behavior, such as complex absolute value. But like primitives, they are stateless, so they are ideal for creating extra primitive-like types, such as ComplexDouble, UnsignedLong or Long128. Again, like classes their fields can be of any type, which means they can refer to stateful objects, enabling them to be used as cursor-like types, akin to C++ iterators (Java iterators are stateful objects).

Values will be a new and efficient way to create a wide range of semantically immutable data types like File, String, or BigInteger, which unlike objects can be created and destroyed at high rates without any overheads from automatic storage management (GC). A Java value type will be approximately as expensive to compute as a small bundle of Java primitives, or as a C struct value in an auto variable. They will (typically) be small and simple to allocate on stack or heap. On the JVM stack they will be passed by value between methods. In the JVM heap they can be stored compactly, without internal metadata like object headers or V-tables.

But like Java object types, value types will be defined as value classes, with strong encapsulation, fields, methods, constructors, and generic type parameters. We sum up this two-sided design for values in the slogan, “Codes like a class, works like an int.”

A new value can be assigned over an old value, but there is no side-effect which can change the meaning of any given value. Synchronization is not allowed on values, since synchronization has side effects which can be observed, using observations of thread behavior. Even object identity is not available on value type values. Object identity is the semantics of the equality operator “==” when applied to references; when applied to value types the equality operator must respect the payload of the value but not its identity.

The JVM will use an internal descriptor tag Q to keep track of value types, just as it uses L to refer to classes and interfaces. For this reason we will refer to value types as Q-types and legacy reference types as L-types. The verifier will prevent broken code from making risky assignments between Q-types and L-types, just as it has always prevented assignments between L-types and primitive types like integers or longs. (Primitives have descriptors I, J, etc., and we might want to call them I-types, J-types, etc., if there were more than one of each.)

In the internal descriptor language, Q-types will be decorated by class names just like L-types. Again, this prevents broken code from picking a Q-type of some value class Foo and trying to read a field from it named Bar.f, declared in some other class Bar.

Just as L-types and primitives types require distinct mechanisms for carrying around temporary values in the JVM, Q-types require a new mechanism, which we are calling a carrier type, to carry them around in the JVM. We might try to reuse L-types to carry Q-types, just as the type I is reused to carry booleans, but then every Q-type would have to be formatted as an object reference, which would have unacceptable overheads (probably to both Q-types and L-types!).

Unlike L-types which are always one machine word, Q-types are variable in size. This happens not only because an individual Q-type might vary in size over time, but also because the size of a Q-type is not fully predictable outside of its own class file. (One reason is invisible private fields; another is a recompilation that might add or delete public fields.) The size and layout of a value type can only be known by the JVM when the value type’s class file is actually loaded. Until then the Q-type must act like a forward declaration of a type, as in C++. (We might try to avoid this forward-declared state by eagerly loading a class file the first time the JVM encounters a particular Q-type, but this trick is impractical for a number of reasons.)

Side note: For the record, there is at least one alternative to having a single carrier type for value types, and that is to have the interpreter somehow rewrite bytecodes so that every Q-type occurrence is expanded and rewritten to an appropriate primitive or reference, or tuple of several of these, to represent the fields of the Q-type, or some similar construct of explicit size and structure. Such an approach would require at least three engineering feats which seem more difficult than creating a universal carrier type: (1) Expand all Q-types in a given block of bytecode before the first bytecode is executed. (This would require complex look-ahead for bootstrapping.) (2) Create ways for methods to return explicit bundles of values instead of single values, and probably create other bytecodes for working with explicit tuples. (3) Ensure that generic code can fully expand all occurrences of of any type variable which might be bound to a value type. (This would commit our platform to C++-style template expansion before execution of any block of generic code.) In the end, it does not seem likely that we can escape building ourselves a universal carrier type.

Polymorphism and Q-types

But the story is not yet over. One of the more useful things a value type can do is implement an interface. For example, a file system might use 128-bit block identifiers, and its directory nodes may rely on sorting those identifiers (say, as unsigned integers). Java with values can represent this state of affairs using (for example) an TreeMapOfUnsigned128<V> whose key is some value type Unsigned128 and whose value is some other type V. Of course, Java generics will be available to implement such a map, so that the java.util.TreeMap type could be instantiated as TreeMap<Unsigned128,V>. (Making this work out is an important goal in the long run, since we would prefer not to force users to choose between using values and generics in their code.) Now the connection to interfaces comes up, since the key-type of a TreeMap is required to implement the Comparable interface.

In the end, TreeMap (like many other generic classes and algorithms) is required to take a new kind of type parameter, <any K extends Comparable<K>, and the bounding interface Comparable is required to interoperate with value types. So the type Comparable is populated not only by certain references to objects which implement that interface, but also by certain values of value types which also implement it. In source code or bytecode, a given name of type Comparable might at some times be bound to some object reference and at others to some value.

It is customary to use the term polymorphic (“many-formed”) to describe any name or structure that can vary by subtype or shape. Interface-typed variables are inherently polymorphic, and so are type variables. Q-types (like L-types) must be able to hide the details of the values they carry, and so support polymorphism of value-typed variables (as well as the forward-declaration.

This polymorphism is seen with “any-fied” type variables like <any V>, and has a very interesting effect when the generic class itself is also a value type. (Value types are excellent candidates for generic algorithm building!) Such a generic value type will in general have a polymorphic internal layout. A value type representation like Optional<T> or Complex<T> should vary in size and layout along with the size of its T type parameter.

All of this means the JVM must be prepared to represent not only simple, monomorphic value types like ComplexDouble also polymorphic types like Complex<T> or Comparable<T> which can vary in size. In the case of an interface value (especially a wildcard like Comparable<?>) the variation may be dynamic, from moment to moment as new values are encountered. Yes, sometimes the variation can be tied to a code-generation context, as with Complex<double> and its methods, but it is difficult to make this happen always.

At the JVM level, there needs to be some way to process Q-types under the guise of interface types, and to accommodate Q-types as bindings for “any-fied” type variables. To do this we need to widen the JVM’s repertoire of types one more time. Luckily, this last move does not require a second new carrier type, although it seems to call for a second new type descriptor (the “U-type”).

Mixing up the universe

The final twist is that regular object references can sometimes show up in a mix with non-object types. Although a variable of some value type like Complex<T> cannot be assigned an object reference (it is of a Q-type), interface variables of type Comparable<T> can in general be assigned any mix of values and references (of Q-types and L-types), and even the odd null or primitive value.

N.B. The previous paragraph makes no claim about whether references, nulls, or primitives can be contained inside a value type as a field. The insides of both value types and object types are determined by the field declarations in their class files. “Codes like a class” means you can put any fields you like inside a value.)

Mixing polymorphism with value types does not negate the benefit of value types. Rather, open-ended polymorphism complements simple values by integrating them into generic algorithms in a flexible way. When you need an object, you can have an object; when you don’t need one, you don’t have to pay the overheads of object identity, headers, pointers, etc. And the same generic algorithms can work on all kinds of data.

The L-types currently used to represent interfaces and generically typed values do not extend to either primitive values or value types, because of overheads like boxing. Therefore we need a new type descriptor syntax, and a new carrier type, to inform the JVM that we expect a wide range of values and references to appear in some variable. Even the Q-types won’t work; since they suppress object identity and disallow null, they cannot accurately represent values of L-types. We need something that is a union of both L-types and Q-types, with all the primitives thrown in for good measure.

A carrier type that can transmit references and values and primitives unambiguously, and keep boxes and non-boxes distinct also, can justly be called a “universal carrier type” or “universal type kind”. We will use the letter “U” as a descriptor code for this type, and call any type spelled this way a U-type.

A U-type variable must therefore be able to efficiently represent any primitive (one of the descriptors “IJFD”), a reference (L-type), or any value-class value (Q-type). A TreeMap algorithm can be translated to use a U-type (perhaps marked with the interface name Comparable) and operate correctly and efficiently regardless of whether the keys of the TreeMap are Strings (L-type), 128-bit integer values (Q-type), or longs (primitive J-type).

(The letter “U” is supposed to suggest “universal”, while “Q” suggests “quantity” or Latin quod for “which”. The letter “V” for “value” is already taken by the descriptor “V” for “void”.)

As with L-types and Q-types, U-types need to be decorated with a class name. The standard use case for this is to mark the U-type with an interface name (or Object), not a concrete class name. By contrast, Q-types usually need to be marked with concrete value classes. (See below for more discussion of the relation between classes and L/Q/U-types.)

Looking closely at the implementation requirements on the carrier type for Q-types, it appears that, with only minor adjustments, the carrier type can be extended, with straightforward effort, to carry not only the full range of Q-values but also primitives, and not only all those values but also the special values that are object references. Put another way, the U-type carrier is not too strong to use for Q-types. Whatever metadata the JVM uses to distinguish between different Q-types in the universal carrier can be twisted a little more to make encoding space to inject the one-word carrier types “IJFDL”.

Therefore, there seems to be no reason to make an distinction, in the internal JVM representation, between Q-types and U-types. The descriptors “Q” and “U” can and should refer to the same internal data structure, rather than have a pair of almost-twin internal types.

The difference between a Q-type and a U-type is merely a matter of rules for value subsetting, so the U-types can carry a true object reference or null, while Q-types cannot. Bytecodes which operate solely on Q-types are allowed to assume their inputs cannot be null or true object references. Bytecodes which operate on U-types must be prepared for anything.

You can observe such value subsetting in the JVM by observing the verifier rules closely: Whenever there is an allowed implicit conversion between two type descriptors, it is certain that they share a common carrier type, and that the value set of the source type is a subset of the value set of the destination type. This is why “B” can convert to “I” but not vice versa, and likewise “[I” to “Ljava/lang/Object;”.

When Q-types appear in the heap, we expect a high-quality JVM will find a way to shrink the data structure to fit the actual Q-type, rather than use an open-ended carrier type in the heap. After all, a major goal of Valhalla’s is to “squeeze” metadata out of the heap, allowing it to be more contiguous and compact, with fewer pointers and object headers. Even if a Q-type of size two bits uses the U-carrier-type on the stack, it should pack down to a byte or less when stored to the heap.

(The previous paragraph is pretty obvious. For now we leave the question open of what should happen if we need to store a polymorphic variable of Q-type, or a U-type, in the heap. There may be use cases for this, but that question is more tricky than deciding about the types on the JVM stack of method parameters, returns, and locals. A full answer is outside of the scope of this document.)

The final set of carrier types can thus be identified with the descriptor code characters “IJFDLU”. We may speak of the unique “U-carrier-type”, “I-carrier-type”, “L-carrier-type”, etc., in order to emphasize the distinction between a single internal carrier type and a range of externally-visible JVM types. Thus, the integer subrange types (“ZCBS”) as well as the I-type all use the I-carrier-type. Since arrays are objects, descriptors starting with left-bracket “[” all use the L-carrier-type. Finally, all Q-type descriptors use the U-carrier-type. There is no separate Q-carrier-type, nor Z-carrier-type for that matter.

Boxing vs. buffering

None of the previous points are directly affected by the phenomenon known as “boxing”, whereby a primitive like 42 is converted to an object of type Integer. Value types, like primitives, can be boxed as objects. We reserve the term “box” for a first-class object, an instance of java.lang.Object or a subclass, which has a measurable identity. A box is therefore not a value, but is rather a object-shaped envelope for a value (or a primitive), and there are explicit box and unbox operations which convert between box and value.

If we must speak of a value being temporarily stored in a some block of temporary memory, we try to avoid the term “box” and say that the value is “buffered”. Buffering can occur on either heap or stack, and the same value can move around at will. It might be “unbuffered” to registers, or “rebuffered” to a different storage location, perhaps because a containing stack frame is being exited. By contrast, boxes cannot be discarded; only the GC can be trusted to deallocate them, and their individual identity (as Objects) is visible to the user.

Boxes are, in effect, object-like views on underlying values. It seems that we need them for best interoperability between value types and legacy types like Object and interfaces. User-visible box types may be needed to enable users to assign a value to a variable of type java.lang.Object. (Auto-boxing supports this today for primitives, which are the only value types in Java before Valhalla.) Assigning values to L-Object is necessary to interoperate with some important dynamically-typed APIs, such as Core Reflection, or method handles.

(Boxing values into L-Object introduces unavoidable overheads compared with working directly with the original unboxed values. This is why we are not simply using dynamically-typed APIs everywhere.)

Conversely, “un-boxes” may be defined as value-like views on plain old Java objects. These may also be useful, to integrate legacy types like String into value-based algorithms.

It it not yet clear whether the distinction between boxes and un-box must be present in the user model for programming with value types. Perhaps it can be pushed completely into the background; we tentatively give an account of boxes and non-boxes in case the distinction cannot be completely hidden from the user. We do believe that today’s awkwardly obtrusive distinction between Integer and int will not set the pattern for new value types.

Terminology

The following terms will be useful here:

We need to introduce new descriptor letters wherever there is a new type that the verifier must isolate from the pre-existing types. This is why we must talk of “Q” and “U” descriptors, and new carrier types.

Although carrier types are an implementation detail, they may also be detected in the JVM type system by their likely correspondence to “top types”, which are those JVM types which have no supertype, other than oneWord, twoWord, and Top itself.

In generic code, sometiems a variable (or field or parameter or return value) cannot be statically determined to be a Q-type or L-type or primitive. This is a feature not a bug, since generic algorithms are important to Java. Many generically typed variables will probably be U-types, since that gives the natural kind of freedom to most generic algorithms. A variable may also be an L-type, if the generic uses erasure; it probably won’t be a Q-type unless it is a derived type like optional<T>, in a world where optional<any T> is a generic value type.

It seems likely that local variables, method parameters, and method returns can benefit from being translated as U-types instead of instantiated types, since this allows one set of API points for a generic, instead of a potentially different set of API signatures per type variable binding. For example, List<T>.get(int) returns T at the source level, and T might vary greatly, but at the bytecode level a single API point List.get(int) might return U-Object, and use casts (just like today) to convert the return value to a specific type, in contexts where it is known. It is also possible, though not desirable, to translate fields of generic types to U-types; a much better path for such heap-resident variables is to shrink each object to fit a particular T-type. Techniques for this are beyond the scope of this document.

Side note 1: For the record, it may also be possible to use template-expansion techniques to replace U-types by other types. For example, a block of generic code could be expanded to one version which works on a single L-type (such as Object), plus one version which works on each primitive type, plus one version which works on all Q-types (perhaps expressed as Q-Object) or perhaps many versions, one for each distinct Q-type. Like the above-mentioned technique of expanding all Q-types away, replacing them by tuples of primitives and references, a full expansion of generic code to concrete types seems (to some) like a difficult climb to an uncertain perch. Existing prototypes have achieved part of this climb, but still have unsolved problems, such as the complex renumberings required when a one-slot type variable is replaced by a two-slot type (long or double). Even if these problems are solved, the approach seems to require link-time generation of many versions of each block of generic bytecode. Although this works for C++ as a static compilation tactic, it is not so appropriate for a dynamically-compiled system, where the JVM interpreter collects profile information, and the then JIT makes inlining, specialization, and optimization decisions in a profile-driven manner. The dynamic approach can be “walked back” to a static approach, as with AOT compilers, by performing some part of the specialization process off-line. But the static approach, when compulsory, is costly. Once an early phase (AOT or linking) creates many versions of a block of code, the JVM and JIT must deal with each as a unique entity. Cold blocks must be instantiated, even if they are executed only once. Hot blocks get instantiated many times, adding a multiplier to code management costs and taking away decisions from the JIT. For the JVM, where the JIT is a center of optimization decisions, it seems best to aim for as few versions of generic code as possible, preferably a single version, until the JIT works on it.

Side note 2: In this document we are dealing mainly with the types of local variables, method parameters, and return values, which will often be carried in U-types. By using a universal carrier type, a single block of polymorphic code can implement all instances of a generic algorithm or type. (The JVM JIT can of course choose to specialize U-types to specific types, under a variety of circumstances, including on-demand template expansion, so there is no ultimate loss of performance due to U-types.) In direct contrast to this, heap variables will almost certainly have specific monomorphic types, so that heap data structures can be as compact as possible. So all the up-front templating work we need to avoid for code blocks must be performed for object layouts. In short, we want data to be monomorphic, while code is polymorphic. These mismatched tactics are appropriate to the JVM, where object layouts are heavily protected by linker-based abstractions, while variables in bytecode are explicitly sized and typed (before Q-types). The mechanisms for managing this mismatch are very interesting, but outside the present scope.

It is the U-types that require the full breadth of the universal carrier type. Obviously useful universal types are U-Object and interfaces like U-Comparable. An “any-fied” generically typed variable with a type bound B would be typed in the bytecodes as “UB;”, where B must be an interface.

Side note: If for some reason generics are not implemented for Q-types, or if generic types are not allowed to range over both Q-types and L-types, or if such ranging is allowed but a static template expansion tactic is deemed sufficient for all such generics, and assuming we have no other need for a Lisp-like “universal variable” type to hold Q-types, L-types, and primitives, only in that case we will be able to avoid implementing the U-types, and we can get away with only Q-types. In that case, generics might be implemented by versioning with one version erased to L-Object and another erased to Q-Object. However, since most of the complexity of U-types is already found in Q-types, it is wise to plan for U-types.

A similar carrier must also be used for “Q” descriptors which are size-polymorphic. (An example would be Optional<T> where T can range over boolean, Object, and Int128.) In that case the carrier type will not carry object references at runtime. It is an open question whether it is worth tracking the class component of the Q-type (such as “Optional”) or whether the type can be detuned to something like Q-Object or U-Object. (As discussed later, the trade-off is verifier complexity on the one hand, and extra dynamic checks per bytecode on the other.)

Simple value types like UnsignedLong or DoubleComplex, and bound instances like Optional<short>, have fixed (“monomorphic”) layout. The information about the fixed layout should be represented exactly to the JVM using a Q-type descriptor. But those descriptors should also use the same size-polymorphic carrier type as U-types, even though the layouts do not vary. Using a “one size fits all” scheme means that the JVM can lay out interpreter stack frames without first determining the sizes of all relevant value types, an inconvenient task which requires class-loading.

(Since Optional<short> and Optional<long> should have different layouts, the Q-type descriptor for these concrete types will probably include not only the class name Optional but also the type parameters, at least in common use cases. A possible descriptor syntax for conveying this information would be QOptional<S>;, etc.)

In the end, as advertised earlier, we only need the one universal carrier type (the U-carrier-type) for all U-types and Q-types.

It would be possible to define a number of carrier types for a range of possible implementation classes of Q-types, such as “one word”, “two word”, “64 bits plus two references”, etc. But clearly the simplest way to go is to use the single universal carrier type for all U-types and Q-types.

Interestingly, it would be a valid translation strategy to use the U-carrier-type for all variables in a Java program (except object fields and array elements, of course). The main difficulty of doing this (besides a modest loss of performance in the interpreter) would be ensuring that distinct overloaded methods continued to possess distinct descriptors. For example, the overloaded method PrintStream.println has a separate overloading for each primitive type, plus the L-type Object. If javac were to translate all variables using U-types, those distinct overloadings would have to have argument types like U-boolean, U-short, U-char, etc., so that they would not all collapse to a single descriptor PrintStream.println(U).

Being kind: “Q” versus “U” versus “L”

When the universal carrier is used to carry pure value types, the compiler and JVM must make an agreement that they never carry true object references. (Heap references could occur, but would be to non-Object buffers only.) As with other aspects of the JVM’s type system, the agreement to use constrained U-values (to carry Q-types) can be enforced one of two ways: Either the agreement is checked dynamically by every bytecode that consumes a suitably constrained value, or the verifier can be enlisted to track the constraints in its type system. In that case, bytecodes would be free to assume that the type constraint is in effect, without checking.

This design tradeoff can be seen in the way the verifier tracks classes, so that the getfield bytecode does not need to re-check the type of the base object, even though it could do so. If the verifier did not track classes, but only carrier types, then the getfield bytecode would have to perform the dynamic check. In the case of invokeinterface there is always a dynamic check, since the verifier type system does not attempt to track non-class supers. The treatment of Q-types by the verifier and interpreter will similarly apply some mix of these complementary techniques.

We could keep the verifier out of the business of checking Q-types, use only U-type descriptors everywhere for locals, method parameters, and return values. We would rely on the javac compiler to enforce its own static type system; it would be picky about Q-types but erase them when writing code for the JVM. (This is how erased generics are implemented today.) But it seems reasonable that Q-types, although carried by U-types, can be kept verifiably free of null values and true object references, without much complexity in the verifier.

Let’s assume that we will ask the verifier to track all three kinds “LQU”, and to track class names (or at least some names) for each kind also. This leads to some key questions: What operations convert between the kinds “LQU”? And what are the allowed operations on each of the three kinds?

We can start with L-types, since those are more ore less set in stone. An L-type can:

Since U-types can assume L-type values, one can expect that many of the L-type operations will lift to U-types. The identity-sensitive operations will not lift, but generic algorithms will require that the Object and interface API points will lift from to U-types from both L-types and Q-types (and probably from primitives also).

If U-type names are restricted to Object and interfaces, the following operations are natural:

If U-type names are allowed to mention specific classes, then the L-type operations which are class-specific would also apply to U-types. It it not clear yet whether this detail is required for U-types.

Meanwhile, the valid operations on the Q-types are almost exactly the same as for U-types. The list is:

The main difference is that Q-types do not support the object-identity operations of L-types. There is, however, a generalization of the reference equality test (acmp) which applies to Q-types, called a substitutability test (or vcmp). Two values (of any kind whatever) are substitutable for each other if they are both of the same class, and either the same reference, or else both are values and in fact copies of each other.

Note that if two values are “substitutable” in this sense, then their Object.equals method (assuming it is validly constructed) will certainly return true when applied to them. Also, any method applied to two substitutable values will return the same results, subject only to the caveat that the method is pure, i.e., that its behavior depends only on its parameters. The substitutability check can be lifted to U-types also (ucmp) and could provide a low-level semantics for the Java “==” and “!=” operators, when applied to generically typed values.

(The details of the substitutability test are straightforward, but out of scope for this document.)

Since object operations are a superset of value operations, it is semantically safe to allow object references to “leak” into value code. It would be a viable JVM design (regardless of language design) to allow L-type references in all Q-type contexts.

If V is a value type, then the L-type L-V (if it exists) is simply an immutable, value-based POJO box for a value of type V, and there can be no doubt as to how to perform a Q-type operation on it (as long as it is relevant to V’s API). If a POJO box of type L-V somehow “sneaks” into the Q-type Q-V, the Q-type API operations apply cleanly and regularly to the stowaway POJO. Even the vcmp operation is clearly a component-wise substitutability test, not a subversive acmp pointer comparison.

(An overarching point is that it is occasionally reasonable for the JVM to be permissive even where the Java language per se is restrictive. This usually happens when Java language restrictions would be difficult to implement in the JVM, and are not necessary for secure operation.)

But on balance, there are a number of reasons to exclude L-type references (and especially nulls) from Q-types, in which case there must be explicit conversion bytecodes to go between the types, with the verifier enforcing the distinction. It’s worth noting that the unboxing operation for V (from L-V to Q-V) could be implemented as a simple tag bit setting, where the POJO box reference is retained on the interpreter stack (and maybe in the compiler), but with a bit of context that says, “don’t ever treat this reference as a full L-type”. In fact, these considerations suggest a goal (which we will pursue) for the universal carrier type, that it can efficiently convert from some L-V to Q-V, and back again, preferably without frequent storage allocation.

It is an open question whether the JVM needs “U-classes”. A “U-class” U-Foo, as a type, must mean a union of either a true object reference to class Foo, or else a value of class Foo. Since the source code for Foo must be just one or the other, it seems that U-Foo is not very interesting. But it may arise in the course of defining a substitution-based semantic specification for a generic that starts out as U-Object or some such. More practically, it seems likely that every Q-type will be associated with its boxing L-type. In that case Q-UnsignedLong could be a value type (for uint64_t), while L-UnsignedLong could be the type of its boxes, and again U-UnsignedLong would be the union of those two types. All API points of the class UnsignedLong would be applicable to to all three kinds of type (“QLU”). To emphasize: This is an open question.

It is an even more open question how the Java language user model should make these distinctions, which are almost always negligible for programmers. For the JVM, the basic requirement is that legacy L-types and U-types be kept isolated from each other.

(It’s also possible, in principle, for the JVM to omit the box types even if the source language admits their existence to the programmer. The source type L-UnsignedLong could be rendered as either L-UnsignedLong or L-Object in the JVM, and there would be no need for the L-UnsignedLong descriptor to appear anywhere. But this seems like a needless wrinkle, given that the verifier is already in the business of tracking classes of L-types. Also, it is helpful to support L-UnsignedLong if that class was originally an object class, but migrated to a value class.)

In the end, the usefulness of concrete U-classes (as opposed to polymorphic U-interfaces) seems to depend on the usefulness of L-types for value classes, and/or the usefulness of Q-types of object classes. Otherwise U-classes just need to be interfaces and Object.

A complementary question is whether “Q-interfaces” are desirable in the JVM. The type Q-Comparable seems to mean “any value-type that implements comparable”, but that is not necessarily an interesting concept. The type U-Comparable has the same operations; the main difference is that U-Comparable admits the value null, and its substitutability test will sometimes detect object identity.

A “Q-interface” type would be useful as a way to express non-nullable, identity-insensitive values, and would be practical if “un-boxes” of POJOs could be assigned to variables of this type. In the end, the usefulness of Q-interfaces depends whether un-boxes of POJOs are supported.

This leads to some tentative conclusions for interactions between kinds, classes, and interfaces:

The ‘L’ and ‘U’ kinds are nullable, and the ‘Q’ kind is not.

All kinds support a substitutability check (and a corresponding substitutability-hash algorithm).

Deriving box types from value types

In order to integrate value types more closely with legacy L-types such as L-Object, is desirable to inject (via boxing) the Q-type values for any given value class into the L-type for the same class. It is probably also desirable to denote these box types at the JVM level. In that case:

A plain Q-type value is never a box, nor can it be null. That is, L-type and Q-type value sets are totally disjoint, even though there is an injective mapping between those value sets.

Deriving value types from reference types

For object classes, a reverse mapping may also be useful. If it is possible to box a Q-value as an L-value, it is also possible to do the reverse, “unbox” an L-value merely by suppressing its object identity. The rules of operation for such an identity-suppressed reference are documented in the JDK, under the name of Value Based Classes.

Any given object reference, to an object class, could be converted to a Q-type value, called an un-box. This would not copy anything, but would hide the identity of the object, perhaps using a tag bit on the object pointer, or some other contextual indication. Any code using the Q-type would be forced to play by the rules of value-based classes, even though there was a stowaway POJO under the value.

The Q-type version of an object class would also reject null values. For example, under the type Q-String code would be forbidden to store null in a string variable, and would be unable to synchronize on the string value or test its object identity. Converting the same reference back to the normal type L-String or L-Object would restore those reference-sensitive operations.

In principle, any plain old Java object can be wrapped in an un-box. Its information content, including any state, would continue to be encapsulated behind the reference, but its identity would not be directly observable, until the un-box is re-boxed as an L-type. In the case of value-based classes, the JVM could be given additional freedom to discard the original POJO reference, when convenient, and reconstruct an equivalent one, if the L-type value is ever reconstructed.

“Un-boxes” are only a promising speculation at this point, but they may be useful as a way to codify the usage of value-based classes. In particular, calculations which use un-boxes are immune to changes in object identity, allowing (in principle) the JVM freedom to merge equivalent objects into single copies, or split objects into multiple localized copies. Such transformations have been useful in some systems (like Icon) for strings or immutable lists or arrays, and would be legal in the JVM if re-boxing an un-boxed POJO is allowed to produce an equivalent POJO of an indeterminate identity.

So we may add these cases:

Carrier types and conversions

Both Q-types and U-types (in fact, all types) can use the same universal carrier type. The verifier allows implicit conversion from Q-types to U-types since the representation is the same and the set of U-type values is a superset of the set of Q-type values. Other conversions require runtime checks, and/or format adjustment.

An explicit conversion operation (u2v) must be used to convert a U-type value to a Q-type value of the same class. This checked conversion, when applied to a Q-value, is a no-op, since there is no change of carrier type. If there is a null or primitive, a NullPointerException or ValueKindException (or something like that) must be thrown. If the value is an object reference, there are two possibilities, corresponding in fact to two opcodes: u2v will throw the same exception as for primitives, and unbox2v will unbox the desired value type from the POJO. This works if the POJO is a true box, and also if the POJO can be converted to an un-box. If the POJO is not amenable to unboxing, ValueKindException is thrown.

An explicit conversion from a U-type to an L-type of the same class is necessary, and in fact two operations seem desirable again: u2a, u2box. Both conversions, applied to null or L-value, would simply produce the appropriate reference. If the U-value is a primitive or value, there are two possible responses: Either throw a ValueKindException, or box the value into an appropriate POJO.

The opposite conversion to u2a is a2u, and never throws an exception. The opposite to u2box is box2v, which might throw an exception if the input is null or if the object class does not support unboxing. (Recall that all boxes support unboxing, and that value-based classes might also be convertible to un-boxes.) Note that the box2v operation might as well return a Q-type; it can be used as a U-type factory also since Q-types implicitly widen to U-types.

Direct conversions between Q-types and L-types are also plausible, and must be consistent with the composed conversions involving a U-type intermediate. As noted, the box2v operation is already a direct conversion from an L-type to a Q-type. The reverse operation v2box would be composed of v2u plus u2box, but since v2u is implicit, there is no need for a separate operation.

Adding in primitive conversions and type queries, we get a list of U-type operations like this:

Unless a conversion takes a type parameter (PT or C) the value’s kind may change but its class stays the same. Since the class stays the same, the conversion can ignore it and concentrate on checking the storage class of the value and perhaps adjusting its base pointer. This is why kind-changing bytecodes do not need a type field.

The type PT is the primitive pseudo-type (java.lang.Integer.TYPE = int.class, etc.) or perhaps a descriptor (a CONSTANT_Utf8). The JVM has plenty of opportunity to decode type and class names at link time, so the actual bytecode behavior will usually be very simple.

What about primitives?

Java primitive types contribute much complexity to the JVM and language specifications; they are tolerated because on balance they pay their way by avoiding the overheads of object references and allocation, as seen in languages (like Lisp or Smalltalk) which have only one carrier type (a universal one). For performance, the JVM defines the I-carrier-type and three other primitive carrier types. But with the advent of Q-types, there is a cleaner way to avoid those overheads (or most of them), and the complexity of supporting primitive carrier types is less justified.

If primitive carrier types could be removed, then their corresponding type descriptors could be retired–except for backward compatibility use cases. We are not proposing this, currently. Independently, at the JLS level, the primitive types could be defined as built-in value types, and either assigned Q-type descriptors, or special-cased to the legacy descriptors.

Ideally, the language and JVM should just dump the primitive types as special cases, retroactively construing them to be built-in value types like Q-int, etc. The special descriptor letters “I” would simply be mandated abbreviations for these ordinary value classes. Although we can be inspired by this ideal and aim in that direction, it does not seem practical to discard the primitives as special things in their own right; perhaps they can be deprecated and replaced some day but that will take a very long time.

Short of wholesale removal, there are a couple of ways that the special-casing of primitives can be narrowed, and the type system as a whole can be made more regular.

It is very desirable to define true APIs for proxy value classes for each of the primitives, using the standard wrapper classes (java.lang.Integer, etc.) as starting points. The generic type TreeMap should be able to use int or long keys, which requires that there be a story for endowing this primitive types with the interface Comparable. (As an aside, such primitive-friendly generics will probably be integral to creating arrays with long indexes and multi-dimensional arrays.)

The opcode set above can be adjusted to allow primitive values to pass as Q-types (the same Q-types as are required for generic programming with primitives). The following adjustments would achieve this:

Unless primitive carrier types “IJFD” are eliminated or ret-conned as special Q-types, a few special opcodes remain in order to convert between their corresponding Q-types and the original primitive carrier types. In the end, retaining those extra four carrier types gives the JVM an easy way to optimize values carried by those types.

A concrete representation for U-types

The preceding considerations have given us many hints as to what a good implementation of U-types should look like. The following seem to be important:

This suggests a design which attempts to make boxed and unboxed states as similar as possible. Ideally, it should be possible to strip tag bits from a base pointer (or ignore tag bits on the side) in order to access a value’s payload.

At a minimum, a U-type must contain and distinguish two states, L-type and Q-type (and perhaps no other states, except for primitives). Let us start by endowing the U-type with a machine word which can hold a managed pointer, and also with a kind-code which can be one of the logical values Q and L. Let’s also be clever and encode the kind-code in the low bits of the pointer, so that U-types can be passed around easily as machine words. Here is a first cut:

struct UCarrierType {  // take 1
  enum KindBits {
    L_OBJ  = 0,  // word is an oop (ordinary object pointer)
    Q_HBUF = 1,  // word is a stowaway box object (an oop)
    };
  static const int KIND_MASK = 7;
  uintptr_t _word;
  static UCarrierType make(Kind kind, uintptr_t word) {
    assert((word & KIND_MASK) == 0);
    UCarrierType self;
    self._word = uintptr_t(kind) + word;
    return self;
  }
  KindBits kind() {
    return KindBits(_word & KIND_MASK);
  }
  address as_address() {
    return address(_word & ~KIND_MASK);
  }
  bool is_object() {
    return kind() == L_OBJ;  // the only one
  }
  bool is_value() {
    return !is_object();
  }
  bool is_null() {
    return _word == uintptr_t(NULL) + uintptr_t(L_OBJ);
  }
  bool is_on_heap() {  // on heap, as either Q or L?
    switch (kind()) {
      case L_OBJ:  return !is_null();
      case Q_HBUF: return true;
    }
    return false;
  }
  oop as_oop() {
    assert(is_on_heap());
    return oop(address());
  }
  Klass* klass() {
    if (is_null())  return null_klass();
    if (is_on_heap())  return as_oop()->klass();
  }
  address payload() {
    if (is_null())  return NULL;
    if (is_on_heap())  return as_oop()->payload();
  }
  // anything on the heap is easy to represent:
  static UCarrierType from_heap(Kind kind, oop managed_pointer) {
    assert(kind == L_OBJ || kind == Q_HBUF);
    if (kind() == Q_HBUF)  throw_if_null(managed_pointer);
    return make(kind, uintptr_t(managed_pointer));
  }
  UCarrierType box() {
    if (is_object())  return *this;  //or error
    if (kind() == Q_HBUF)  return from_heap(L_OBJ, as_oop());
    return from_heap(L_OBJ, box_on_heap(klass(), payload()));
  }
  UCarrierType unbox() {
    if (!is_object())  return *this;  //or error
    throw_unless_unboxable(as_oop());
    return from_heap(Q_HBUF, as_oop());  // just flip the bit
  }
  static oop box_on_heap(Klass* value_klass, address payload) {
    oop box = value_klass->make_new_box();
    int size = value_klass->size_in_bytes();
    memcpy(box->payload(), payload, size);
    return box;
  }
  // values on the stack or in registers get shoved into the heap:
  static UCarrierType from_stack(Klass* value_klass, address payload) {
    return from_heap(Q_HBUF, box_on_heap(value_klass, payload));
  }
  static UCarrierType from_int(jint payload) {
    return from_stack(int_value_klass(), &payload);
  }
  static UCarrierType from_int(jlong payload) {
    return from_stack(long_value_klass(), &payload);
  }
}

The Q_HBUF state means that the Q-typed value is buffered on the heap. (This is a safe but expensive way to store values.) In fact, we should use the same heap nodes as represent user-visible box objects to buffer the values in the Q_HBUF state. This means that in the Q_HBUF state the box itself (as a POJO) will be invisible to the user; it also means the box can be made visible by simply flipping the tag to L_OBJ, as in the unbox operation above.

(This raises questions about the propagation of object identity through the Q-type. Is it valid for a user-visible box to disappear into a Q-type variable, and then reappear when that variable is reboxed? Probably yes–and who doesn’t enjoy recycling their old boxes? Should users rely on the identity of an L-type value to persist always after conversion and reconversion through a Q-type? Certainly not. Should the language and/or JVM try to obscure such identities, to prevent users from falsely depending on occasional reappearances due to recycling? This is a tricky user-model question which the JVM does not need to settle. The JVM works best when it simply provides low-level “hooks” to implement and enforce whatever policy is decided on by the language and bytecode compiler. Thus, the JVM should provide low-level box and unbox operations, which recycle heap nodes whenever possible, but the language may elect to make new boxes under some circumstances, even if the JVM has one handy.)

Thread-specific storage

The above representation is straightforward and compact, but it violates the requirements for efficient representation of primitives and small values. In order to fix that, we need a better place to store payloads and value classes. In an ideal sense, that place is on the stack; in reality any storage which is somehow private to the thread will work. To be precise, we can say that this storage is thread-specific storage (TSS).

Note that thread-specific storage can be read and written without any care about concurrency or race conditions, and the processing of such storage does not need to perform any handshakes with the GC. These factors tend to make TSS cheaper than object storage. TSS might be in the C heap, the thread stack, the GC’s own heap region (as slices checked out from the GC), or any combination thereof.

The most important interaction between TSS and the GC is that TSS must be treated as a root set, a region where live pointers must be traced. This means that the management of TSD must be structured well enough that all such pointers can be found, any time the GC might need to inspect the thread’s root set.

Here is the updated sketch:

struct UCarrierType {  // take 2
  enum KindBits {
    L_OBJ  = 0,  // word is an oop (ordinary object pointer)
    Q_HBUF = 1,  // word is a stowaway box object (an oop)
    Q_TSS  = 2,  // word points to TSS block, with heap layout
    };
  ...
  bool is_on_tss() {
    return kind() == Q_TSS;
  }
  tssp as_tssp() {
    assert(is_on_tss());
    return tssp(address());
  }
  Klass* klass() {
    ...
    if (is_on_tss())  return as_tssp()->klass();
  }
  address payload() {
    ...
    if (is_on_tss())  return as_tssp()->payload();
  }
  ...
  static UCarrierType from_stack(Klass* value_klass, address payload) {
    if (!value_klass->can_use_tss())
      return from_heap(Q_HBUF, box_on_heap(value_klass, payload));
    tssp buf = value_klass->allocate_tss(value_klass);
    int size = value_klass->size_in_bytes();
    memcpy(buf->payload(), payload, size);
    return make(Q_TSS, buf);
  }
}

The TSS blocks can perhaps be allocated in the same stack frame which contains the interpreter’s U-typed locals and stack values. In that case, care must be taken when returning a U-type from a stack frame: The returned value may need to be copied somewhere safe before its backing buffer goes away. Copying to the heap will always work, but that may be expensive. Similarly, copying to a thread-local TSS block will work, until that resource is exhausted. In that case some sort of thread-local scan may have to garbage-collect and recycle unused TSS blocks, and perhaps compact the remaining ones.

A more localized handshake is also possible, where the caller pre-allocates a TSS block and passes it to the callee, along with its size. The callee would use this block if the returned value fits. (Recall that in many but not all cases the returned value is of a known size. For generic code the returned value’s size is unpredictable.) If the callee cannot use the suggested block, it can use other TSS, or else the heap.

This pre-allocation handshake for return values is especially useful for compiled code to receive return values from unknown sources, because the compiler can probably guess the size accurately, and arrange to recycle TSS space in the caller more vigorously than the callee. For example, if the caller contains a loop over many generic values, it can reuse a TSS buffer many time without talking to the GC or to a TSS manager. If the TSS is allocated in the callee, there is no opportunity for such an optimization.

With a little forethought, we can ensure that the TSS blocks and the heap blocks have a standard layout, so that when loading the value class or computing the payload address, the same code applies in both cases. In that case those crucial operations become simpler:

  oop_or_tssp as_standard_layout() {
    assert(is_on_heap() || is_on_tss());
    return oop_or_tssp(address());
  }
  Klass* klass() {
    if (is_null())  return null_klass();
    return as_standard_layout()->klass();
  }
  address payload() {
    if (is_null())  return NULL;
    return as_standard_layout()->payload();
  }

Note that we are using “stowaway” boxes to buffer value payloads on the heap. This has a couple of advantages. First, boxing and unboxing (for such values) is almost free, consisting of a trivial tag bit adjustment. Sometimes user code has a long chain of boxing and unboxing, because it can’t decide which carrier type to use. In those cases, after an initial boxing on the heap, subsequent unbox and re-box operations are cheap, as long as the value is kept in a U-variable.

Another advantage of using “stowaway” boxes is interoperability. Value-class behavior, on both boxes and values, can then be implemented by common code which simply pulls out the class and payload base address, and goes to town. This is especially true when the TSS block layout is aligned with the heap object format.

Small fry stay for free

Finally, we need to make allocation time for small values sub-linear (see [Small] above). This means that when the interpreter allocates a block of storage for temporary variables (stack and locals), it should allocate another block of storage, somewhat larger, to use as buffer space for those variables, as needed. (Clearly if no U-types or Q-types are present, then the second allocation can be omitted.) Let’s call this storage mini-TSS. It is a kind of TSS, but managed as an immediate, bulk-allocated buffer within the same stack frame.

The mini-TSS block for each U-value will have a fixed size, large enough to buffer primitives and small value types, but not so large that pre-allocation will cause thread stacks to balloon in size. Something in the range between 64 and 160 bits is sufficient. (Note that alignment and atomicity is not an issue in TSS, so we are not limited to power-of-two sizes.)

Let’s add this feature to our sketch. We will put the extra storage block inside the structure of the U-type, although in practice a JVM implementor may wish to segregate these extension blocks in their own array, alongside the locals array. We will add two versions of the feature, one which preserves the standard heap layout everywhere and one which splits the class metadata from the payload. The first preserves more uniform, branch-free code, but requires 96 to 128 bits of mini-TSS, while the second can work even with 64-bit mini-TSS blocks. There is no need to use both techniques at the same time.

struct UCarrierType {  // take 3
  enum KindBits {
    L_OBJ  = 0,  // word is an oop (ordinary object pointer)
    Q_HBUF = 1,  // word is a stowaway box object (an oop)
    Q_TSS  = 2,  // word points to TSS block, with heap layout
    Q_IMM  = 3,  // word points to mini-TSS block, with heap layout
    Q_SPL  = 4,  // word is klass only; mini-TSS has payload only
    };
  static const int MINI_TSS_BYTES = 16;  // could be 8 with Q_SPL
  char _mini_tss[MINI_TSS_BYTES];
  ...
  bool is_on_tss() {
    return (kind() & ~1) == Q_TSS; // Q_TSS=2 or Q_IMM=3
  }
  tssp as_tssp() {
    assert(is_on_tss());
    return tssp(address());
  }
  address mini_tss() {
    return address(&_mini_tss);
  }
  bool is_mini_tss() {
    return kind() >= Q_IMM;
  }
  bool is_split() {
    return kind() == Q_SPL;
  }
  Klass* split_klass() {
    return (Klass*)(address());
  }
  Klass* klass() {
    if (is_null())  return null_klass();
    if (is_split())  return split_klass();
    return as_standard_layout()->klass();
  }
  address payload() {
    if (is_null())  return NULL;
    if (is_split())  return mini_tss();
    return as_standard_layout()->payload();
  }
  ...
  static const bool USE_SPLIT = ...;  // true or false
  static const int HEADER_BYTES = sizeof(Klass*) + ...;
  static UCarrierType from_stack(Klass* value_klass, address payload) {
    if (!value_klass->can_use_tss())
      return from_heap(Q_HBUF, box_on_heap(value_klass, payload));
    int size = value_klass->size_in_bytes();
    UCarrierType result;
    if (USE_SPLIT && size <= MINI_TSS_BYTES) {
      result = make(Q_SPL, address(value_klass));
    } else if (!USE_SPLIT && size + HEADER_BYTES <= MINI_TSS_BYTES) {
      result = make(Q_IMM, result.mini_tss());
    } else {
      result = make(Q_TSS, value_klass->allocate_tss(value_klass));
    }
    memcpy(result.payload(), payload, size);
    return result;
  }
}

The tricky part about using bulk-allocated mini-TSS blocks is that each one closely corresponds to a particular U-type local variable. This will cause difficulties in the interpreter when assigning U-values between different locals and stack slots. Whenever a U-value is moved inside the JVM frame, the interpreter must check whether it is in the Q_IMM or Q_SPL state, and also copy the data between the mini-TSS blocks associated with source and destination. In the case of the Q_IMM state, the word pointer inside the U-value must also be updated to point at the destination mini-TSS block. This may seem like too much trouble, but the alternative is a different kind of trouble: Every conversion of a primitive to a U-value will need to allocate a new plain TSS block, requiring garbage collection cycles.

The extra bookkeeping for mini-TSS blocks looks like this:

  void assign_from_local(UCarrierType& src) {
    (*this) = src;
    if (src.is_mini_tss()) {
      memcpy(mini_tss(), src.mini_tss(), MINI_TSS_BYTES);
    }
  }

Even if mini-TSS blocks are not used, similar extra care can be taken to recycle plain TSS blocks when overwriting old values. This probably requires reference counting of TSS blocks, while mini-TSS blocks do not require reference counts, since they are allocated 1-1 with U-variables.

Packing for a trip

The basic size of a U-variable is one machine word. This makes it easy to store in side data structures. Buffering of values is accomplished by up to three different techniques: heap, TSS, immediate (mini-TSS). In some cases, if the U-variable gets copied from one context to another, it may need to change its buffering technique.

As we already saw, an immediately-buffered U-value will need to be re-buffered before being returned to a caller. If there is a suitable handshake with the caller, the immediate buffering can be preserved. Otherwise, the value must be re-buffered in non-local TSS, or on the heap.

In some cases, the JVM may need to inspect a stack frame during execution (while it is suspended). If any U-values are extracted from the stack frame for later processing, it is probably wise to re-buffer them on the heap.

If U-values need to be communicated to Object-based APIs (like method handles or Core Reflection), only the L_OBJ and Q_HBUF tactics are usable, and (again) TSS values must be re-buffered in the heap.

Universal arrays and lists

For expressing lists of parameters which might include unboxed values (Q-types), the parameter and return type U-Object and the list and array types U-Object[] and List<U-Object> seem necessary. As a poor alternative, plain L-Object values (including boxes) could be paired with boolean parammeters and array elements, to carry the Q-versus-L distinction explicitly in a user-managed channel.

By using the Q_HBUF bit the format of a U-Object[] array can be kept quite simple, differing only from a regular L-Object[] array by the possible presence of the tag bit, on each element. The GC will have to mask off these bits as it processes the array elements, just as it does when scanning U-values in root sets on the stack.

It seems tempting to add something like TSS to an array which contains many unboxed values. However, this may prove unworkable, since arrays (in general) must be prepared to accept concurrent updates from multiple threads. Even the simple act of setting up a mini-TSS with a primitive value, just before setting the control word to use it, would have race conditions that could expose broken intermediate states to concurrent readers of the array. For clever buffering in on-heap value types, the best options appear to be (a) pick the right size and type for the variable in the first place, (b) buffer the value on the heap, and maybe (c) invest in software transactional memory algorithms.

(Option (c) is probably necessary anyway, to support volatile reads and writes to value types larger than 64 bits. But it would be wise to adopt STM technology cautiously and slowly.)

Revisiting the verifier

The verifier tracks class types in detail. Class tracking allows bytecodes which interact with class types to assume that their inputs are well-formed instances of the desired class. The same state of affairs holds true after adding the universal carrier type and bytecodes proposed above. Here are some details:

In general, the verified class of an input tells the JVM that it can assume a certain layout (of object, value, or both) in the memory pointed at by the payload pointer of the Q-type or U-type (or today’s L-type). This class information enables safe field access and method invocation.

For invocations and field reads, the same API points apply to both Q-types and L-types, and so can be uniformly lifted into U-types. But field writes apply only to (some) L-types, and field updates (“withers”) apply only to Q-types, so only field reads can play a uniform role with U-types. It seems easier (given the “stowaway box” feature proposed above) to allow U-types to omit support even for field reads. After all, a cheap u2v instruction, followed by a vgetfield instruction, is sufficient to load a field from a boxed value, or a mix of boxed and unboxed values. For object types, a cheap u2a instruction followed by plain getfield is sufficient to get a field. The same point applies to methods on classes, since every class is either a value class or an object class.

So in the case of U-types it is reasonable to limit access to interface methods and methods on Object (an honorary interface).

Given a U-type (or Q-type) value of class C, and an API point in C or one of its supers, the verifier must check and enforce the symbolic reference class of the API point as follows:

For symmetry with the new opcodes we might retroactively consider certain legacy API point bytecodes as starting with “a”: agetfield, aputfield, ainvokevirtual, ainvokespecial, ainvokeinterface, acheckcast, anew, etc. There is no need to rename getstatic, putstatic, invokestatic, or invokedynamic.

Tabular summary of operations

Here is a table that summarizes the operations mentioned above.

Operations which are described above as optional or doubtful are included, but marked as such.

Natural operations, according to kind
I/J L Q U

can have identity?

no

yes

no

yes

must have identity?

no

yes

no

no

query kind

(nop)

(nop)

(nop)

ukind

query class

aclass? (.getClass)

uclass? (.getClass)

uclass? (.getClass)

check or cast class

checkcast instanceof (or API)

ucheckcast uinst* (or API)

ucheckcast uinst* (or API)

cmp

bitwise
(icmp/lcmp)

identity
(acmp) (or API)

fieldwise
(vcmp) (or API)

kindwise
(ucmp) (or API)

equals, hashCode API

yes

yes

yes

invoke on interface

invokei*

uinvoke

uinvokei*

invoke on class

invokev*

vinvoke

uinvoke (optional)

get field

getfield

vgetfield

ugetfield (optional)

change field

vwithfield

put field

putfield

System.iHC, Obj.wait

yes

synch.

monitor*

box

k2box (i2box or T.valueOf?)

u2box

u2box

unbox

.TValue, unbox2k

unbox2k

change kind
(no box or unbox)

k2u

(i2u or API?)

k2u

(l2u or API?)

u2k

u2k (u2v,u2a,u2i, etc.?)

The case of the getfield family

When a getfield instruction is linked, the JVM’s link resolver can summarize the results of linkage as a single small integer, the offset of the field within the class. Since size of the field depends only on the basic type of the field (one of “IJFDL” and a few more subword types), the getfield instruction only needs to remember the offset and basic type in order to execute. It can blindly add the offset to the object reference base pointer and load the required datum. The only check needed is a null check, if the base object is an L-type or Q-type.

If the verifier did not guarantee the class of the input to getfield, the operation of getfield would have to include at least a type test, and maybe a field lookup also. This would be far more expensive than a simple field load. It is arguable that the efficiency of getfield (and the other field instructions) is the main reason the JVM verifier exists at all. Of course, the other API access instructions (the invoke family) also perform some optimizations, reducing names to small integers such as v-table indexes, but the relative performance gain from these techniques is smaller for method invocation than for field access.

For value types, field access is not much different from getfield, except that instead of an L-type for a base address, there is a Q-type or U-type. Exactly as with objects, the offset and type of the field in a value type can be summarized in a small integer plus a basic type code, and the overall loading process is the same: Form a base address, and extract the typed field at the given offset.

What might be more complex in the case of value types is that they could be buffered in a variety of places. Forming the base address might involve some tag bit testing and branching. For this reason, the proposed design for a U-type carrier tries to minimize this overhead by using a more uniform representation for pointers in the various buffering cases.

Since boxes (the stowaway Q_HBUF state) are allowed in the representation of a Q-type, then the pointer to the boxed object should use the same offsets as for the other Q-states (Q_TSS, Q_IMM).

This implies that the values stored in TSS should have headers that are just as large as headers of heap objects. HotSpot object headers are approximately three times as large as the minimum required metadata overhead (the Klass* pointer). Perhaps the extra overhead can be put to work to help manage TSS blocks, or to hold reference counts to enable in-place modification of values (by vwithfield).

Alternatively, the machine word in the UCarrierType structure could be biased to point past the waste part of the object header. In that case, the GC (when scanning U-types) would simply be required to adjust for the bias. It already does this trick for offset pointers produced by optimized JIT code.

Based on early results from Frederic Parain’s work on the Valhalla interpreter, storing a full header in TSS appears to be a reasonable expenditure. If that is not the case, we can try a more “tense” header-free representation (Q_SPL).

It is probably more important to enable the JVM to intensively recycle unused TSS, and that is why frame-local mini-TSS buffers are probably more interesting to investigate. Also, supposing reference counts are kept on TSS blocks, if vwithfield operates on a TSS block with a single reference, then the vwithfield can immediately recycle the block, patching in the new field value. Reference counts are difficult to use in a multi-threading environment, but since every TSS block is confined to a single thread, there are no race conditions to worry about.

Polymorphic getfield

Given the new kinds of “any-fied” generic types, object and value layouts will sometimes turn out to be polymorphic. A common example of a simple polymorphic layout is ArrayList<any T> where the backing array has a polymorphic layout, varying from byte[] to Object[] to long[], but the object itself has a non-varying (“monomorphic”) layout.

The example of Optional<any T> is more instructive. This value-based object (or purevalue) has a field of type boolean and a variable-sized field of type T. Any particular instance of this type “knows” what its type T is, and its layout is right-sized for the type. So an Optional<byte> might have a two-byte payload, while Optional<long> requires nine bytes.

Just to make things interesting, let’s suppose that the field of type T comes first, and the field of type boolean follows it. (The JVM implementors could make either choice, or change up their choices based on the phase of the moon.) In that case, the first field has a fixed offset (zero bytes after the header) but a variable size, and the second field has a variable offset but a fixed size (one byte).

Here is some example code of this generic class, plus the layouts of two of its instances, as if they were their own classes:

class Optional<any T> {
  T value;            // size = sizeof(T), offset = HDR + 0
  boolean isPresent;  // size = 1, offset = HDR + sizeof(T)
}
class OptionalByte {  // ~~ Optional<byte>
  byte value;         // size = 1, offset = HDR + 0
  boolean isPresent;  // size = 1, offset = HDR + 1
}
class OptionalLong {  // ~~ Optional<long>
  long value;         // size = 8, offset = HDR + 0
  boolean isPresent;  // size = 1, offset = HDR + 8
}

An instruction getfield Optional<byte>.isPresent:Z can be linked exactly like its non-generic non-generic getfield OptionalByte.isPresent:Z. The same small-integer offset (1) can be used in both cases to access the boolean.

(By the way, the identical points apply to the layouts of value classes. We are working with object class examples to simplify things.)

But the cases are not perfectly analogous. We know that the verifier will ensure the exact layout of the OptionalByte operand, but what did it “think” when it pushed the Optional<byte> operand? Did it prove that the operand is of the exact species Optional<byte> of the class Optional, or did it only observe the class Optional itself?

The answer to that question affects two things which must be traded off, the complexity of the verifier type system, and the amount of work the getfield instruction can skip, based on its trust in verified the operand type.

If the verifier only proves the class component Optional, then the getfield instruction will need to perform a quick runtime check that the operand is in fact the right species of Optional. This is a pointer load and compare, probably; it is cheap but perhaps significant relative to the cost of the whole getfield instruction.

The same reasoning applies to getfield Optional<byte>.value:byte, where the offset is zero and the size is one byte. At most the getfield instruction must test that the object is of the right species, and then work the same as getfield OptionalByte.value:byte.

It likely to be possible to access a any-fied generic type under a wildcard, which means the caller doesn’t know the actual species of the class. An instruction getfield Optional<?>.isPresent:Z will link to some description of how to find the layout for any species of Optional, given an instance of that species. But it cannot link with any particular offset; it might be 1 (for T=byte) or 8 (for T=long). At runtime, the getfield instruction can thank the verifier for proving that the input is of class Optional, and then get busy loading a field layout table from the object’s species. The second element of that table will contain the offset of the desired field (1 or 8 or something else). The getfield instruction will then load the boolean at that offset, and declare victory.

What about the complementary case, where the offset is known but the type isn’t? The instruction getfield Optional<?>.value:? needs to push something more well-defined than “?”. In fact, this is a job for U-types. The actual instruction is getfield Optional<?>.value:U. (Or getfield Optional<?>.value:Ujava/lang/Object;, if we don’t go with the cute hyphen syntax for elidable classes.) As with the other wildcard case, the getfield instruction must arrange to load a field layout table, but in this case the table will also report the size and type of the field stored in the object.

The actual work done by the getfield will vary depending on the field’s type, and the getfield code must be ready for anything. If the field is a reference, it is loaded, tagged, and wrapped in a U-type. If it is a primitive or value, the properly sized bundle of bits is loaded into a TSS buffer, and the buffer tagged and wrapped in a U-type. (There are a number of edge cases, such as for very large or very small value types, for volatile variables, and for value types that contain embedded managed pointers.) In all cases (except plain references) the metadata of the field’s class must also be loaded (from the ever-trusty field layout table), and stored in the header of the TSS buffer. It seems likely that code sequences to perform these actions will be “canned” and replayed in the interpreter and maybe the JIT code.

What we have just described is the beginning of the story of how U-types allow generic code to stay generic, even while generic data structures are always tightly packed on the heap. That is a long story told elsewhere. The important point here is that there is a plausible way to write the generic code generically, and that getfield performance does not have to suffer greatly.

Appendix: JVM verification type hierarchy

Here is an updated diagram of the JVM’s verification type hierarchy, showing the position of U-types and Q-types under a common universal carrier type:

 Verification type hierarchy:

                                 top
                     ____________/\____________
                    /                          \
                   /                            \
               oneWord                        twoWord
           ___/  / \  \___                   /       \
          /     |   |     \                 /         \
         /    int float reference        long        double
        /                /     \
 universalCarrier       /       \____________
    /       \          /                     \
   |      +-------+   /                       \
   |    ..|U-types|   |                      +------------------+
   |   .. +-------+   |                      |     L-types      |
   |  ..              |                      |   (Java object   |
+-------+             |                      |  type hierarchy) |
|Q-types|       uninitialized                +------------------+
+-------+        /         \                          |
                /           \                         |
     uninitializedThis  uninitialized(Offset)        null

Like the L-types, the Q-types and U-types have a type structure derived from the Java class hierarchy. Some Java class types, such as PrintStream, might only appear in one of those three places, while others, such as Object or Comparable, might appear in all three.

Within each box containing Java class structure, the verifier consults parts of the class hierarchy to determine subtype/supertype relationships. Because interfaces do not significantly affect verification, the type hierarchy in each box is tree-structured. That is, inheritance is single, not multiple, within each box.

The dotted line between the Q-type and U-type boxes refers to the fact that, if a class Foo shows up in both boxes, then the verification type Q-Foo has a supertype U-Foo, in addition to the Q-types derived from supers of Foo. This is a restricted type of multiple inheritance.

In addition, and independently of the value/object distinction, the verifier is likely to recognize “species types” as a refinement of class types, denoting specific instances of reified generic classes . So Optional<boolean> will be a species of the class Optional, and a subtype of it. As discussed above, pushing this much detail into the verification type system seems necessary to get reasonable performance from getfield.

But there are likely to be complex relations between a generic class, its species, its wildcards, its supers, and the species and wildcards of its supers. (And then there are arbitrarily complex additional constraints that may appear among type variables and captures.) It is impossible that the JVM verification type system should completely encode all such relations, so (as with Java interfaces) there will be some type information lost when projecting the full Java type system down into to verification types. As discussed above, this simply means that some bytecodes will (as with invokeinterface) be required to query the dynamic types of their operands, even if it is somehow “obvious from context” that there is an interesting static type available. The important thing is that such dynamic checking must not swamp the simplest operations, like getfield on a concrete class or species.

Appendix: Digression into the land of uniformly classed descriptors

This section is highly speculative. The concept of classed primitives (and classed voids) is not necessary to a successful implementation of value types and generics. Rather, it is a way that the Java platform might gain additional benefits from the sunk cost of supporting primitive types. Even if such a road is not taken, it is useful to record as a rejected alternative.

It is possible (and perhaps worthwhile) to extend that special remaining advantage to new Q-types like Q-UnsignedLong. The idea would be to retroactively endow the primitive carrier types with classes (just like the “LQU” family). A primitive type descriptor “J” would be ret-conned to be shorthand for “J-long;”, allowing space for additional classes to appear, as “J-UnsignedLong;”. (Something like the hyphen is necessary to signal the newly appended class after the primitive type letter.)

There must be certain enforceable agreements about value classes that would benefit from this special treatment. The source language for the value class would need some way to signal the intended compatibility with the primitive carrier type, perhaps using a special super-type, and the JVM would have to look for it. The obvious main requirement is that the class fit into the proposed primitive carrier type. (No JVM magic can make a ComplexDouble fit into a 32-bit “I” carrier.) This size limit can be verified at link time; a class which turned out to be too big would simply fail to link with code which was attempting to jam it into the wrong carrier. A more subtle limit on these primitive-compatible value classes would be the requirement that the special treatment must not violate encapsulation of the value class. It is probably reasonable to require that the value class allow free conversions to and from its carrier type, as the price of admission to the special efficiency of the carrier type.

For example, an interface like the following could be used to declare compatibility with the long (“J”) carrier type:

interface longCompatible<This extends longCompatible<This>> {
  long toLong();
  This fromLong(long value);  // usage: This.default.fromLong(42L)
}

Or, a generic interface (or an annotation) like this could be used to bind a value-type to a particular carrier type:

interface CarriedBy<any Carr, any This extends CarriedBy<Carr, This>> {
  Carr toCarrier();
  This fromCarrier(Carr c);  // usage: This.default.fromCarrier(42L)
}

Must we avoid void?

The JLS tries to avoid calling void a type, but this is just a rhetorical tactic. In fact, void is an example of what is called unit type, which is any type which can take on at most one value. As such, every unit type has a highly compact representation of zero bits, a property that makes it useful in some settings. And the JVM’s void type (descriptor “V”) in fact occupies no storage; it is a place-holder for something that might have required a non-trivial carrier type. We can treat void as a fifth primitive carrier type (“V” along with “IJFD”). This becomes useful if the primitive types are allowed to carry classes and interoperate with user-defined value types. Such a “void value” occupies zero bits and serves simply as a witness that a call or return was performed.

A type like “V-NotReached;” can be used to indicate (even with error checking) that a method is not intended to be called, or to return. An enum-like type like “V-CLUBS;” can be used to create orderly patterns of overloaded methods.

There is another use of the raw carrier void type (“V” or V-void;), which is to supply the default value of a U-type variable, as a substitute for the L-type value null. If U-types are used only for locals, method parameters, and returns, there is no need for a default value, but if U-types are ever stored in arrays or instance variables, a special empty initial U-value will be technically useful. The payload of such a value should be zero bits, and void, viewed as a unit type and a carrier type, perfectly fits this usage.

Classes everywhere

Accepting all of the above proposals could lead to a modified JVM descriptor language that is much more symmetrical than the current one. A method parameter, return type, or local variable type would consist of one of the letters “UQLIFLDV”, optionally followed by a class annotation. The letter stands for a carrier type. The class annotation is a hyphen followed by a class name and then a semicolon.

If the class annotation is omitted, it is supplied by picking a standard representative class for the carrier type. The standard classes for the carrier types “UQLIFLDV” are any, any, Object, int, long, float, double, and void. To assist in signature normalization, these class names must be omitted. So a method taking a float, an unsigned int, a string, and returning an object might have signature “(FI-java/math/Unsigned;L-java/lang/String)L”. The old form “Ljava/lang/Object;” or “Ljava/lang/String;” gets a hyphen added, and then the string “-java/lang/Object;”, if it occurs, must be omitted.

The verifier tracks both kinds (carrier types) and classes for all values. All values convert implicitly to their unadorned carrier types, but a checkcast-type operation is needed make other changes to classes. As with references, the cast type operation might have “teeth”: It might be able to enforce class-specific invariants either by clipping values to valid ranges, or by throwing exceptions. For example, a cast to the type “I-boolean;” might take steps to ensure that the result value was one of the two valid boolean values (0 or 1).

There are ways to encapsulate primitive-carried value types without hardwiring too many rules into the JVM. Value classes have sole rights for manufacturing their values. (Exception: The one default value per class can be accessed by any code.) The verifier would only allow a cast from a raw carrier type to a given value class C inside the body of C itself. Elsewhere, verifier-enforced static typing would prevent forgeries of extraneous C values.

(This static typing is not sufficent, however, to secure values with secret components. Without further rules for encapsulation, a value squeezed into a primitive carrier can always be forced to cough up its payload, as uninterpreted bits. Object-quality encapsulation can probably only be obtained with the polymorphic kinds “UJQ”.)

If using signed values as carrier types is too unfavorable to unsigned values, it would be possible to declare more neutral standard classes for the “IJ” descriptors. They could be value types Bits32 and Bits64, with no adherence to any arithmetic properties. A slight downside of this scheme would be that normal Java int parameters would have more verbose signatures of “I-int;”, since plain “I” would be reserved for Bits32.

When the verifier first sees a class name associated with a particular carrier type, it may load the class in order to inquire whether the class is compatible with the claimed carrier type. This is not necessary for L-types and U-types, since any type can be boxed as a reference or stored in a U-type. For Q-types, the class must either be a value class, or a reference class that supports unboxing. (In the latter case, either it is a wrapper for another type, or, as a value-based class, it allows viewing as an “un-box”.)

The meaning and representation of any value on the JVM stack would be determined by its kind and its class. Regardless of kind, there would be an internal standard representation in terms of a base pointer and layout, and the class would determine the size and structure of that layout. The kind would also put certain limits on the class, size, layout, and semantics of the value.

The U-kind is universal, but has absolutely no values apart from L-kind and Q-kind values. Except for the L-kind and U-kind, all other kinds are strictly for values and can be efficiently and losslessly stored in Q-kind variables. Nullness and object identity are confined to the L-kind; they can appear in the U-kind but only when it wraps an L-kind value.

Some of the Q-kind values can be further “squeezed” into one of the primitive kinds “IJFDV”, gaining (perhaps) additional performance, at least in the interpreter. In any case, every primitive type is defined primarily as a value class, a Q-type, and secondarily marked as compatible with the primitive carrier type.

Also (and independently of value-squeezing) all Q-kinds can be boxed into L-kinds, and many L-kinds can be unboxed as Q-kinds.

Perhaps in the future the primitive kinds (“IJFDV”) will turn out to be needless detail, and compilers will emit bytecode which only works with “ULQ” types, rendering int as “Q-int;” instead of “I-int;”. On the other hand, if there continue to be practical advantages from primitive kinds, a few more descriptor kinds may be added, such as “K” for Bits128.

In terms of the current JVM verification type system, int, float, long, and double would expand (in place) to a small, independent box, of I-types, F-types, J-types, and D-types, respectively. Each box would contain a small subset of the Java class hierarchy, of only those classes which are “hooked up” to the corresponding carrier type. The void quasi-type could also appear on the map, with an isolated box of its own.

There would be no new relations between boxes. For example, no I-type would be a verification subtype of any Q-type, even if the two verification types were marked with the same class (e.g., I-Unsigned32, Q-Unsigned32), because the carrier types (I-carrier-type, U-carrier-type) are disjoint. Discounting the uninitialized reference types, we would have:

 Verification type hierarchy:

                                   top
      void                     ____/\________
    +-------+                 /              \
    |V-types|                /                \
    +-------+            oneWord               \
            ____________/  / \  \________       \
           /              /   \          \       \
          /              |    int       float     |
   universalCarrier      |  +-------+ +-------+   |
         /\              |  |I-types| |F-types|   |
        /  \             |  +-------+ +-------+   |
       /    \            |                       /
      /  +-------+   reference                  /
     /   |U-types|       |                  twoWord
    /    +-------+   +-------+                /\
   |     ..          |L-types|               /  \
   |    ..           +-------+          long     double
 value ..                |            +-------+ +-------+
+-------+                |            |J-types| |D-types|
|Q-types|               null          +-------+ +-------+
+-------+

(But it seems best to keep on the main road to values, where our classes are brought to us only by the letters U, L, and Q.)