“What we do in the shadows reveals our true values.”
In the two years since the first public proposal [values] there have been vigorous discussions [valhalla-dev] of how to get there, and what specific changes to make to the JVM and its classfile format, in order to unify primitives, references, and values in a common platform that supports efficient generic, object-oriented programming.
Much of the discussion has concentrated on generic specialization [goetz-jvmls15], as a way of implementing full parametric polymorphism in Java and the JVM. This concentration has been intentional and fruitful, since it exposes all the ways in which primitives fail to align sufficiently with references, and forces us to expand the bytecode model. After solving for List<int>, it will be simpler to manage List<Complex<int>>.
Other discussions have concentrated on details of value semantics [valsem-0411] and specific tactics for implementing [simms-vbcs] new bytecodes which work with values. A few experiments have employed value-like APIs to perform useful tasks like vectorizing loops [graves-jvmls16].
Most recently, at the JVM Language Summit (2016), and at the Valhalla EG meeting that week, we got repeated calls for an early-access version of value types that would be suitable for vector, Panama, and GPU experiments. This document outlines a subset of experimental value type support in the JVM (and to a smaller degree, language and libraries), that would be suitable for early adopters.
Looking back, it is reasonable to estimate that there have been many thousands of engineer-hours devoted to mapping out this complex future. Now is the time to take this vision and choose a first version, a sort of “hello world” system for value types.
The present document proposes a minimized but viable subset of value-type functionality with the following goals:
Our non-goals are complementary to the goals:
In other words, before releasing our values to the full light of day, we will prototype with them in the shady area between armchair speculation and public specification. Such a prototype, though limited, is far from useless. It will allow us to experiment with various approaches to the design and implementation of value types. We can also discard approaches as needed! We can also begin to make better estimates of performance and usability, as power-users (most of whom will work closely with the designers and implementors) exercise various early use cases.
The specific features of our minimum (but viable) support for value types can be summarized as follows:
… (Int128, etc.) from which the VM may derive value types.
… (“Q-types”) for describing new value types in class-files.
… (vload, etc.) for moving value types between JVM locals and stack.
… int.class).
… Object type.
Standard Java source code, including generic classes and methods, will be able to refer to values only in their boxed form. However, both method handles and specially-generated bytecodes will be able to work with values in their native, unboxed form.
This work relates to the JVM, not to the language. Therefore non-goals include:
… java.util.Optional.
Given the slogan “codes like a class, works like an int,” which captures the overall vision for value types, this minimal set will deliver something more like “works like an int, if you can catch one”.
By limiting the scope of this work, we believe useful experimentation can be enabled in a production JVM much earlier than if the entire value-type stack were delivered all at once.
The rest of this document goes into the proposed features in detail.
A class may be marked with a special annotation @DeriveValueType (or perhaps an attribute). A class with this marking is called value-capable, meaning it has been endowed with a value type, beyond the class type itself.
(The details are TBD, but will be similar to the restrictions on internal annotations like @Contended or @PolymorphicSignature.)
Example:
    @jvm.internal.value.DeriveValueType
    public final class DoubleComplex {
        public final double re, im;
        private DoubleComplex(double re, double im) {
            this.re = re; this.im = im;
        }
        ... // toString/equals/hashCode, accessors, math functions, etc.
    }
The semantics of the marked class will be the same as if the annotation were not present. But, the annotation will enable the JVM, in addition, to consider the marked class as a source for an associated value type.
As with the full value type proposal, the value-capable class may define fields and methods and implement interfaces. The fields and methods will be available directly on both boxed and unboxed values.
(Until there are bytecodes for directly accessing members of unboxed values, method handles will be available for the purpose, as they are for members of regular objects. See below.)
The super-class of a value-capable class must be Object, and even that should be omitted in the source code of the class.
A class marked as value-capable must qualify as value-based, because its instances will serve as boxes for values of the associated value type. In particular, the class, and all its fields, must be marked final, and constructors must be private.
A class marked as value-capable must not use any of the methods provided on Object on any instance of itself, since that would produce indeterminate results on a boxed version of a value. The equals, hashCode, and toString methods must be replaced completely, with no call via super to Object methods.
As an exception, the getClass method may be used freely; it behaves as if it were replaced in the value-capable class by a constant-returning method.
The other object methods (clone, finalize, wait, notify, and notifyAll) may not be used, and will not be visible on the forthcoming value type derived from the value-capable class.
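The restrictions above can be illustrated with an ordinary Java class that already satisfies them. This is only a sketch: the class name Point2D, its fields, and the factory method of are hypothetical, and the marking annotation is left out because it does not exist in a standard JDK.

```java
// Hypothetical value-based class: final class, final fields, private constructor,
// and equals/hashCode/toString replaced without any call to super.
final class Point2D {
    private final double x;
    private final double y;

    private Point2D(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Static factory, since constructors must be private.
    static Point2D of(double x, double y) {
        return new Point2D(x, y);
    }

    double x() { return x; }
    double y() { return y; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Point2D)) {
            return false;
        }
        Point2D that = (Point2D) other;
        // Compare field state only; identity is never consulted.
        return Double.doubleToLongBits(x) == Double.doubleToLongBits(that.x)
            && Double.doubleToLongBits(y) == Double.doubleToLongBits(that.y);
    }

    @Override
    public int hashCode() {
        return 31 * Double.hashCode(x) + Double.hashCode(y);
    }

    @Override
    public String toString() {
        return "Point2D(" + x + ", " + y + ")";
    }
}
```

Because every Object method that the class exposes is defined purely in terms of field state, two boxes holding the same value are indistinguishable to well-behaved clients.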
Here is a larger example for a “super-long”:
    final class Int128 implements Comparable<Int128> {
        private final long x0, x1;
        private Int128(long x0, long x1) { ... }
        public static Int128 zero() { ... }
        public static Int128 from(int x) { ... }
        public static Int128 from(long x) { ... }
        public static Int128 from(long hi, long lo) { ... }
        public static long high(Int128 i) { ... }
        public static long low(Int128 i) { ... }
        // possibly array input/output methods
        public static boolean equals(Int128 a, Int128 b) { ... }
        public static int hashCode(Int128 a) { ... }
        public static String toString(Int128 a) { ... }
        public static Int128 plus(Int128 a, Int128 b) { ... }
        public static Int128 minus(Int128 a, Int128 b) { ... }
        // more arithmetic ops, bit-shift ops
        public int compareTo(Int128 i) { ... }
        public boolean equals(Int128 i) { ... }
        public int hashCode() { ... }
        public boolean equals(Object x) { ... }
        public String toString() { ... }
    }
Similar types [Long2.java] have been used in a loop vectorization prototype. This example has been defined in a prototype version of the java.lang package. But value-capable types defined as part of this minimal proposal will not appear in any standard API. Instead, at first, they will be segregated somewhere hard to reach, in a package like jdk.experimental.value.
Initial value-capable classes are likely to be extensions of numeric types like long. As such they should have a standard and consistent set of arithmetic and bitwise operations. There is no such set codified at present, and creating one is beyond the scope of the minimal set. Eventually we will need to create a set of interfaces that captures the common operation structure between numeric primitives and numeric values.
A crucial part of being able to provide an experimental release is the ability to mark features as experimental and subject to change. While the ideas expressed in this document are reasonably well baked, it is entirely foreseeable that they might change between an experimental release and a full Valhalla release.
Within a single version of the JVM, the experimental features are further restricted to classes loaded into the JVM’s initial module layer, or into a module selected by a command line option, and are otherwise ignored. These modules are called value-capable modules.
In addition, the class-file format features may be enabled only in class files of a given major and minor version, such as 53.1. In that case, the JVM class loader would ensure that classes of that version were loaded only into value-capable modules, and then consult only the version number when validating and loading the experimental extended features proposed here. It is possible that some minor versions will be used only for experimental features, and never appear in production specifications.
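As a rough illustration of the version gate, the following sketch reads the major and minor version from the standard class-file header (magic number, then minor version, then major version). The class name ClassFileVersion and the isExperimental policy are hypothetical; 53.1 is used only because the text offers it as an example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: extracts the version pair from the first 8 bytes of a class file.
final class ClassFileVersion {
    final int major;
    final int minor;

    private ClassFileVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    static ClassFileVersion read(byte[] classFileBytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classFileBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort(); // minor_version precedes major_version
            int major = in.readUnsignedShort();
            return new ClassFileVersion(major, minor);
        } catch (IOException e) {
            throw new IllegalArgumentException("truncated class-file header", e);
        }
    }

    // Assumption: the experimental features are keyed to exactly version 53.1.
    boolean isExperimental() {
        return major == 53 && minor == 1;
    }
}
```

A loader following the text would combine such a check with a test that the defining module is value-capable before honoring any of the extended features.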
Any use of any part of any feature of this prototype must originate from a class in a value-capable module. The JVM is free to detect and reject attempts from non-value-capable modules. Annotations like @DeriveValueType may be silently ignored.
However, a prototype implementation of this specification may omit checks for such usage, and seem to work (or at least, fail to throw a suitable error). Any such non-rejection would be a bug, not an invitation.
In value-capable modules, the class-file descriptor language is extended to include so-called Q-types, which directly denote unboxed value types. The descriptor syntax is “QInternalName;”, where InternalName is the internal form of a class name. (The internal form substitutes slashes for dots.) The class name must be that of a value-capable class.
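The descriptor syntax is simple enough to show as string manipulation. In this sketch the class QDescriptors and its method names are hypothetical; only the “Q”, internal name, “;” shape comes from the text.

```java
// Hypothetical helper that builds descriptors from a dotted class name.
final class QDescriptors {
    private QDescriptors() {}

    // The internal form substitutes slashes for dots.
    static String internalName(String className) {
        return className.replace('.', '/');
    }

    // Q-type descriptor: denotes the unboxed value type.
    static String qDescriptor(String className) {
        return "Q" + internalName(className) + ";";
    }

    // L-type descriptor: the standard reference type descriptor.
    static String lDescriptor(String className) {
        return "L" + internalName(className) + ";";
    }
}
```

For a class named jdk.experimental.value.Int128 this yields a Q-type descriptor and an L-type descriptor that differ only in their first character.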
By comparison, a standard reference type descriptor may be called an L-type. For a value-capable class C, we may speak of both the Q-type and the L-type of C. Note that usage of L-types is not correlated in any way with usage of Q-types. For example, they can appear together in method types, in arbitrary mixtures.
A Q-type descriptor may appear as the type of a class field defined in a value-capable module. But the same descriptor may not appear in a field reference (CONSTANT_Fieldref) for that field (even in a value-capable module). Thus, the getfield family of instructions does not enter into the implementation of this proposal.
(Method handle factories, described below, will support field loads and updates.)
A Q-type descriptor may appear as an array element type in a class of a value-capable module. (Again, this is only in a value-capable module, and probably in a specific experimental class-file version. Let’s stop repeating this, since the limitation has already been set down as a blanket statement.) There are no bytecodes for creating, reading, or writing such arrays, but the prototype makes method handles available for these functions.
A field or array of a Q-type is initialized to the default value of that value type, rather than null. This default value is defined (at least for now) as a value all of whose fields are themselves of default value. Such a default may be obtained from a suitable method handle, such as the MethodHandles.empty combinator.
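Q-types cannot be tried on a standard JDK, but the same two routes to a default value can be sketched with primitive types standing in for value types. The class DefaultValues and its method names are hypothetical; MethodHandles.empty requires Java 9 or later.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Array;

// Hypothetical helper showing two ways to obtain a type's default value (boxed).
final class DefaultValues {
    private DefaultValues() {}

    // Route 1: a fresh one-element array is initialized to the default value.
    static Object viaArray(Class<?> type) {
        return Array.get(Array.newInstance(type, 1), 0);
    }

    // Route 2: MethodHandles.empty yields a handle returning the default value.
    static Object viaEmpty(Class<?> type) {
        MethodHandle handle = MethodHandles.empty(MethodType.methodType(type));
        try {
            return handle.invoke(); // generic invoke boxes a primitive result
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For int both routes produce a boxed zero, and for a reference type such as String both produce null, which is exactly the contrast with Q-types that the text draws.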
A Q-type descriptor may appear as the parameter or return type of a method defined in a class file. As described below, the verifier requires the stacked value corresponding to such a parameter or return value to match the Q-type (not the corresponding L-type or any other type).
Any method reference (a constant tagged CONSTANT_Methodref or CONSTANT_InterfaceMethodref) may mention Q-types in its descriptor. After resolution of such a constant, the definition of such a method may not be native, and must use new bytecodes to work directly with the Q-typed values.
Likewise, a CONSTANT_Fieldref constant may mention a Q-type in its descriptor.
Note that the Java language does not provide any direct way to mention Q-types in class files. However, bytecode generators may mention such types and work with them. It is also likely that work in the Valhalla project will create experimental language features to allow source code to work with Q-types.
Since our value types will have names and members like reference types, but are distinct from all reference types, it is necessary to extend some constant pool structures to interoperate with Q-types.
Naturally, as a result of extending descriptor syntax, method and field descriptors can mention Q-types. Doing this requires no additional format changes in the constant pool.
However, some occurrences of types in the constant pool mention “raw” class names, without the normal descriptor envelope characters (L before and ; after). Specifically, a CONSTANT_Class constant refers to such a raw class name, and is defined to produce (at present) an L-type with no provision for requesting the corresponding Q-type. What is a class-file to do if it needs to mention the Q-type?
There is a simple answer: Pick a character which is illegal as a prefix to class names, and use it as an escape prefix within the UTF8 string argument to a CONSTANT_Class constant. If the escape prefix is present, the rest of the UTF8 string is a descriptor, not a class name.
In order to preserve normalization of names, UTF8 strings for CONSTANT_Class constants may not begin with “;L” or “;[”.
(To avoid confusion between current forms of class names and these additional forms, there will only be one way to express any particular type as a CONSTANT_Class string. Therefore, the descriptor itself may not begin with L or [, since type names that begin with those descriptors are already expressible, today, as “raw” class names in a CONSTANT_Class constant. Otherwise, Class[";[Lfoo;"] and Class["[Lfoo;"] would mean the same thing, which is surely confusing.)
This minimal prototype adopts this answer, using semicolon ; (ASCII decimal code 59) as the escape character. Thus, the types int.class and void.class may now be obtained via class-file constants with the UTF8 strings “;I” and “;V”. The choice of semicolon is natural here, since a class name cannot contain a semicolon (unless it is an array type), and descriptor syntax is often found following semicolons in class files.
(Alternatively, we could repurpose CONSTANT_MethodType to resolve to a Class value, since it already takes a descriptor argument, albeit a method descriptor. But this seems more disruptive than extending CONSTANT_Class.)
The L-type and Q-type for the example Int128 can now be expressed as twin CONSTANT_Class constants, with UTF8 strings like “pkg/Int128” and “;Qpkg/Int128;” (where pkg is something like jdk/experimental/value).
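The escape convention can be sketched as a small classifier for the UTF8 string of a CONSTANT_Class constant. The class ClassConstantNames and its methods are hypothetical and model only the rules stated in the text: a leading semicolon introduces a descriptor, and that descriptor may not begin with L or [.

```java
// Hypothetical helper modeling the escaped-descriptor form of CONSTANT_Class strings.
final class ClassConstantNames {
    static final char ESCAPE = ';'; // ASCII 59

    private ClassConstantNames() {}

    // True if the string uses the escape prefix rather than a raw class name.
    static boolean isEscapedDescriptor(String utf8) {
        return !utf8.isEmpty() && utf8.charAt(0) == ESCAPE;
    }

    // Returns the descriptor that follows the escape prefix, enforcing normalization.
    static String descriptorOf(String utf8) {
        if (!isEscapedDescriptor(utf8)) {
            throw new IllegalArgumentException("raw class name, not a descriptor: " + utf8);
        }
        String descriptor = utf8.substring(1);
        if (descriptor.startsWith("L") || descriptor.startsWith("[")) {
            // Such types are already expressible as raw class names.
            throw new IllegalArgumentException("non-normalized constant: " + utf8);
        }
        return descriptor;
    }

    // Builds the escaped string for the Q-type of a class given in internal form.
    static String qTypeConstant(String internalName) {
        return ESCAPE + "Q" + internalName + ";";
    }
}
```

With this model, the twin constants for a class differ only in that the Q-type string carries the escape prefix and the descriptor envelope around the same internal name.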
When used with the ldc or ldc_w bytecodes, or as a bootstrap method static argument, a CONSTANT_Class beginning with an escaped descriptor resolves to the Class object for the given type (which, apart from a Q-type, must be a primitive type or void). The resolution process is similar to that applied to the descriptor components of a CONSTANT_MethodType constant.
When used as the class component of a CONSTANT_Methodref or CONSTANT_Fieldref constant, a CONSTANT_Class for a Q-type implies that the receiver will be a Q-type instead of the normal L-type. Eventually there may be bytecodes which use such member references directly. (These may be some vinvoke, vgetfield, or just an overloading on invokespecial and getfield.) For now, as noted below, such member references are limited to the specification of CONSTANT_MethodHandle constants.
When resolving a CONSTANT_Methodref against a Q-type, none of the methods of java.lang.Object may appear; the JVM or method handle runtime may require special filtering logic to enforce this.
As an exception, the Object.getClass method may be permitted, but it must return the corresponding L-type, as a constant.
There does not yet appear to be any advantage to customizing the getClass method on a Q-type to return the Q-type itself, and the dangers of confusion are significant.
In the full value-type design, a Q-type must inherit default methods from its interface supertypes. This is a key form of interoperability between values and generic algorithms and data structures (like sorting and TreeMap). Making this work in the minimal version requires boxing the value and running the default method on the box. Further steps are necessary but not part of this minimal design: The execution of default methods must be optimized to each specific value type. Also, there must be a framework for ensuring that the interface methods themselves are appropriate to value-based types (no nulls or synchronization, limited ==, etc.).
Q-types, like other type descriptor types, can be mentioned in many places. The basic list is:
… method_info and field_info structures)
… CONSTANT_NameAndType)
… CONSTANT_Class constants)
… [) in any descriptor
… CONSTANT_Class references)
… CONSTANT_Class) of some bytecodes (described below)
The JVM might use invisible boxing of Q-types to simplify the prototyping of many execution paths. This of course works against a key value proposition of values, the flattening of data in the heap. In fact, the minimal model requires special processing of Q-types in array elements and object (or value) fields, at least enough special processing to initialize such fields to the default value of the Q-type, which is not (and cannot be) the default null of an L-type.
So when the class loader loads a class whose fields are Q-types, it must resolve (and perhaps load) the classes of those Q-types, and obtain enough information about the Q-type definition to lay out the new class which contains the Q-type field. This information includes at least the size of the type, and may eventually include alignment and a map of managed references contained in the Q-type.
(The minimal model will probably not support putting references in value-types, in order to simplify connections to the GC. But object references stored in values are just as necessary to the final design as values in objects.)
Array types must be created whose component type is a Q-type. They will differ from arrays of corresponding L-types just as Integer[].class differs from int[].class. Likewise, the super-type of a value-bearing array will (like an int[]) be Object only, and not a different array type. Such arrays will not convert to any other array type, and must be manipulated by explicitly obtained method handles.
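The analogy with primitive arrays can be checked on any JDK. In this sketch the class ArrayTypeRelations and its method convertsTo are hypothetical names; the reflective calls are standard.

```java
// Hypothetical helper that asks whether one array type converts to another type.
final class ArrayTypeRelations {
    private ArrayTypeRelations() {}

    // True if a value of type 'from' may be assigned to a variable of type 'to'.
    static boolean convertsTo(Class<?> from, Class<?> to) {
        return to.isAssignableFrom(from);
    }

    public static void main(String[] args) {
        // A reference array participates in array covariance.
        System.out.println(convertsTo(Integer[].class, Number[].class));
        System.out.println(convertsTo(Integer[].class, Object[].class));
        // A primitive array converts only to Object (and the array marker interfaces).
        System.out.println(convertsTo(int[].class, Object[].class));
        System.out.println(convertsTo(int[].class, Object.class));
    }
}
```

A value-bearing array, as described above, is expected to sit where int[] sits in this picture: assignable to Object, but to no other array type.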
(In the minimal model, we will not attempt to make value-bearing arrays inherit from interfaces implemented by the value types. Although it seems desirable, further work on JVM type structure is needed to make this happen. Interface types are firmly in the L-type camp, at present, and interface arrays are arrays of references.)
The following new bytecode instructions are added:
vload pushes a value (a Q-type) from a local onto the stack.
vstore pops a value (a Q-type) from the stack into a local.
vreturn pops a value (a Q-type) from the stack and returns it from the current method.
(N.B. These are macro-instructions, encoded with a prefix. Read on.)
Values are stored in single locals, not groups of locals as with long and double (which consume pairs of locals). (The slot pairing convention for long and double is likely to go away by the time specialized generics are introduced.)
The syntax of these instructions uses a bytecode type prefix syntax, with a bytecode called typed analogous to the wide bytecode, but taking a constant pool reference as a parameter. The type prefix must be followed by one of the standard bytecodes aload, astore, or areturn, to compose vload, vstore, or vreturn bytecodes.
(Although it is most intriguing to think of other uses for bytecode type prefixes, this proposal defines only these three specific usages. In addition, more code points may be allocated, either to represent vload, etc., more directly, or to perform other operations. In addition, if a bytecode instruction can incorporate a type prefix, it has considerably more use cases than just Q-types. Such “universal instructions” may be thought of in terms of names like uload instead of vload or legacy codes like iload or aload. It may be possible to retire or repurpose the existing data movement bytecodes with a more general type model. More experiments are inevitable!)
The code point for typed is decimal 212 (hex 0xd4), just as the code point for wide is decimal 196 (hex 0xc4). Every typed bytecode is followed immediately by a two-byte reference into the constant pool.
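One possible byte layout for the prefixed instructions is sketched below. The prefix code point and the two-byte constant pool index come from the text, and the opcodes for aload, astore, and areturn are the standard JVM values. The rest is an assumption: that the prefixed instruction keeps its ordinary operand (a one-byte local index for aload and astore), and that the index is big-endian like other class-file indices. The class TypedPrefixEncoder is hypothetical.

```java
// Hypothetical encoder for the typed-prefix macro-instructions.
final class TypedPrefixEncoder {
    static final int TYPED = 0xd4;   // prefix code point given in the text
    static final int ALOAD = 0x19;   // standard JVM opcode
    static final int ASTORE = 0x3a;  // standard JVM opcode
    static final int ARETURN = 0xb0; // standard JVM opcode

    private TypedPrefixEncoder() {}

    static byte[] vload(int cpIndex, int local) {
        return withLocal(cpIndex, ALOAD, local);
    }

    static byte[] vstore(int cpIndex, int local) {
        return withLocal(cpIndex, ASTORE, local);
    }

    static byte[] vreturn(int cpIndex) {
        checkCpIndex(cpIndex);
        return new byte[] {(byte) TYPED, (byte) (cpIndex >> 8), (byte) cpIndex, (byte) ARETURN};
    }

    // Assumed layout: typed, cp index (2 bytes), opcode, local index (1 byte).
    private static byte[] withLocal(int cpIndex, int opcode, int local) {
        checkCpIndex(cpIndex);
        if (local < 0 || local > 0xff) {
            throw new IllegalArgumentException("local index out of range: " + local);
        }
        return new byte[] {
            (byte) TYPED, (byte) (cpIndex >> 8), (byte) cpIndex, (byte) opcode, (byte) local
        };
    }

    private static void checkCpIndex(int cpIndex) {
        if (cpIndex < 1 || cpIndex > 0xffff) {
            throw new IllegalArgumentException("constant pool index out of range: " + cpIndex);
        }
    }
}
```

Under these assumptions a vload of local 3 whose type is constant pool entry 0x0102 would occupy five bytes, two more than the three bytes of prefix and index alone.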
The referenced constant must be of type CONSTANT_Class (not CONSTANT_Utf8 as for “naked” descriptors). The class constant must be for a Q-type (other types may be allowed in the future). Thus, its UTF8 string must be of the form “;QInternalName;”.
The first use of such a prefix resolves the given class constant to the corresponding Q-type. This process ensures that the underlying class is in fact value-capable. As usual, a LinkageError is thrown if this resolution process fails.
(A resolution step is not appropriate for CONSTANT_Utf8 constants in some JVM implementations such as HotSpot, which is why the prefix cannot refer directly to a UTF8 constant. If there were a CONSTANT_Descriptor constant we would use that, but CONSTANT_Class is close enough. This encoding requires that CONSTANT_Class constants be enhanced to resolve to types other than L-types, which is a separate part of this proposal.)
The JVM may use Q-type resolution to acquire information about the Q-type’s size and alignment requirements, so as to properly “pack” it into the interpreter stack frame. Or the JVM may simply use boxed representations (L-types) internally and ignore sizing information.
Initially, the only valid use of a Q-type as the class component of a CONSTANT_Methodref is as a CONSTANT_MethodHandle constant.
In the minimal prototype, the receiver of an invokevirtual or invokeinterface instruction may not be a Q-type, even though the constant pool structure can express this (by referring to a Q-type as the class component of a CONSTANT_Methodref). Method handles and invokedynamic will always allow bytecode to invoke methods on Q-types, and this is sufficient for a start. Such a method handle may in fact internally box up the Q-type and run the corresponding L-type method, but this is a tactic that can be improved and optimized in Java support libraries, without pervasive cuts to the interpreter.
When setting up the entry state for a method, if a Q-type appears in the method’s argument descriptors, the verifier notes that the Q-type (not the L-type!) is present in the corresponding local at entry.
When returning from a method, if the method return type is a Q-type, the same Q-type must be present at the top of the stack.
When performing an invocation (in any mode), the stack must contain matching Q-types at the positions corresponding to any Q-types in the argument descriptors of the method reference. After the invocation, if the return type descriptor was a Q-type, the stack will be found to contain that Q-type at the top.
As with the primitive types int and float, a Q-type will not convert to any other verification type than itself, or the verification super-types oneWord or top. This affects matching of values at method calls, and also at control flow merge points. Q-types do not convert to L-types, not even their boxes or the supertypes (Object, interfaces) of their L-types.
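The conversion rule for Q-types is small enough to state as a predicate. This is a toy model rather than verifier code: verification types are represented as plain strings, and the class QTypeVerification and its method names are hypothetical.

```java
// Hypothetical model of the verifier's conversion rule for Q-types.
final class QTypeVerification {
    static final String ONE_WORD = "oneWord";
    static final String TOP = "top";

    private QTypeVerification() {}

    // A Q-type is written here in descriptor form, e.g. "Qpkg/Int128;".
    static boolean isQType(String type) {
        return type.length() > 2 && type.startsWith("Q") && type.endsWith(";");
    }

    // A Q-type converts only to itself, oneWord, or top.
    static boolean qTypeConvertsTo(String from, String to) {
        if (!isQType(from)) {
            throw new IllegalArgumentException("not a Q-type: " + from);
        }
        return to.equals(from) || to.equals(ONE_WORD) || to.equals(TOP);
    }
}
```

In particular, the predicate is false for the box type of the same class and for Object, which is the point of the last sentence above.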
Bytecodes which interact with Q-types are only these:
typed (operand is a class which must be a Q-type)
… typed: areturn, aload, astore, and slot-specific variants)
… Methodref) may not, not even for static members
ldc and ldc_w (of a Q-type, or perhaps a dynamically generated constant)
Many existing bytecodes take operands which are constant pool references, any of which might directly or indirectly refer to a Q-type. Unless specified otherwise, these bytecodes will reject occurrences of Q-types. They include:
getfield and its variants (use accessor method handles instead)
aaload and its variants (use accessor method handles instead)
new, anewarray, multianewarray (use factory method handles instead)
checkcast, instanceof (Q-types like primitives do not exhibit polymorphism)
In a fuller implementation of value types, some of these (but not all) are candidates for interoperation with Q-types.
The public, all-static class jdk.experimental.value.ValueTypeSupport (in an internal module) will contain all methods of the runtime support for values in this initial prototype.
ValueTypeSupport will contain the following public member class with public methods for reflecting Q-types:
    static class ValueType<T> {
        // static members cannot use the class type parameter, so they declare their own
        static <T> boolean classHasValueType(Class<T> x);
        static <T> ValueType<T> forClass(Class<T> x);
        Class<T> valueClass();
        Class<T> boxClass();
    }
The predicate classHasValueType is true if the argument represents either a Q-type or (the L-type of) a value-capable class. The factory forClass returns the Q-type for the L-type of a value-capable class. (If given a Q-type class, it returns it directly. If given any other type, it throws IllegalArgumentException; users might want to test with classHasValueType first to avoid the exception.)
The two accessors valueClass and boxClass return distinct java.lang.Class objects for the Q-type and the original (value-capable) L-type, respectively.
(Note that the original value-capable class does not have special status with respect to this API; from the point of view of someone working with value types, it is merely the box class for the value. Eventually, value types will be directly defined by class files, and the box type will be derived indirectly.)
The legacy lookup method Class.forName will continue to return the L-type, for reasons of compatibility. This condition is likely to persist. (In the future, the source language construct T.class is likely to produce something more natural to the source code type assigned to T, under the slogan “works like an int”.)
The pseudo-class returned from valueClass is distinct from (unequal to) the class returned from boxClass, or perhaps originally passed to forClass (e.g., from code which has no other access to Q-types). This pseudo-class directly reflects the Q-type just as a pseudo-class like int.class or void.class directly reflects a primitive type (or even void).
(Note: The use of pseudo-classes has precedent, with the primitive pseudo-classes like int.class. But it is not yet clear whether pseudo-classes for Q-types will be a permanent part of the design. For now, they are necessary to enable use of existing reflection mechanisms, such as MethodType objects to encode Q-types for the lookup of method handles.)
The members reflected by a Q-class are identical to those reflected by the corresponding L-class, except their “declaring class” properties (e.g., Method.getDeclaringClass) refer back to the Q-class instead of the L-class.
As is normal with reflection, invoking the methods of a Q-class must work exclusively with boxed forms of the receiver, arguments, return values, and field values.
Classes for Q-types may appear in reflective APIs wherever primitive pseudo-types (like int.class) can appear. These APIs include both core reflection (Class and the types in java.lang.reflect) and also the newer APIs in java.lang.invoke, such as MethodType and MethodHandles.Lookup. Constant pool constants that work with these types can refer to Q-types as well as L-types, and the distinctions are surfaced, reflectively, as suitable choices of Class objects (either box or value).
It is undefined (in this proposal) how or whether legacy wrapper types (java.lang.Integer) or primitive pseudo-types (int.class) interact with the methods of ValueType.
(When pseudo-classes need to be distinguished from normal java.lang.Class objects, we can use the shorthand term “crass”, where the “r” sound suggests that the thing exists only to reify a distinction necessary at runtime. The main class is the thing returned by Class.forName, and which represents a class file in 1-1 correspondence; a “crass” is anything else typed as java.lang.Class. A more principled approach to reflection [cimadamore-refman] uses “type mirrors” of a suitably refined interface type hierarchy.)
You can use the reflective APIs to create and manipulate arrays, load and store fields, invoke methods, and obtain method handles.
Method handle transforms which change types (such as asType) will support value-type boxing and unboxing just as they can express primitive boxing and unboxing. Thus, the following code creates a method handle which will box a DoubleComplex value into an object:
Class<DoubleComplex> lt = DoubleComplex.class;
Class<DoubleComplex> qt = ValueType.forClass(lt).valueClass();
MethodHandle mh = identity(qt).asType(methodType(Object.class, qt));
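Class objects for Q-types are not available in a standard JDK, but the same asType pattern can be tried today with a primitive type standing in for the Q-type. In this sketch the class BoxingHandleDemo and its members are hypothetical; the method handle calls are the standard java.lang.invoke API.

```java
import java.lang.invoke.MethodHandle;

import static java.lang.invoke.MethodHandles.identity;
import static java.lang.invoke.MethodType.methodType;

// Hypothetical demo: int plays the role of the Q-type, Integer the role of its box.
final class BoxingHandleDemo {
    // identity(int.class) has type (int)int; asType adapts the return to Object by boxing.
    static final MethodHandle BOX_INT =
        identity(int.class).asType(methodType(Object.class, int.class));

    private BoxingHandleDemo() {}

    static Object box(int x) {
        try {
            // The call site type (int)Object matches the adapted handle exactly.
            return (Object) BOX_INT.invokeExact(x);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The handle accepts an unboxed int and returns an Integer, which mirrors how the handle above would accept an unboxed DoubleComplex value and return its box.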
Of course, the type-converting method MethodHandle.invoke will allow users to work with method handles over Q-types, either in terms of L-types as supported by the current Java language, or (in suitable bytecodes) more directly in terms of Q-types.
As noted before, instances of a value-capable class (which is an L-type) serve as boxes for values of the corresponding Q-type. The various reflective APIs work directly with these boxes. The method handle APIs also allow conversion operators to be surfaced as method handles or applied implicitly for argument conversions.
Since value-capable classes are value-based, it is inappropriate to synchronize on their instances, make distinctions among them by means of reference equality comparisons, attempt to mutate their fields, or attempt to treat a null reference as a point in the domain of the boxed type.
A future JVM may assist in detecting (or even suppressing) some of these errors, and it may provide additional optimizations in the presence of such boxes (which do not require a full escape analysis).
However, such assistance or optimization appears to be unnecessary in this minimal version of the design. Code which works with Q-types will, by its very nature, be immune to such bugs, since Q-types are non-synchronizable, non-mutable, non-nullable, and identity-agnostic.
Given the ability to invoke method handles that work with Q-types, all other semantic features of value types can (temporarily) be accessed solely through method handles. These include:
The MethodHandles.Lookup and MethodHandles APIs will work on Q-types (represented as Class objects), and surface methods which can perform nearly all of these functions.
Pre-existing method handle API points will be adjusted as follows:
MethodType factory methods will accept Class objects representing Q-types, just as they accept primitive types today.
invoke, asType, and explicitCastArguments will treat Q-type/L-type pairs just as they treat primitive/wrapper pairs.
Lookup.in will allow free conversion (without loss of privilege modes) between Q-type/L-type pairs.
The findVirtual method of Lookup will expose all accessible non-static methods on a Q-type, if the lookup class is a Q-type.
The findConstructor method of Lookup will expose all accessible constructors of the original value-capable class, for both the Q-type and the legacy L-type. The return type of a method handle produced by findConstructor will be identical with the lookup class, even if it is a Q-type.
The identity method handle factory method will accept Q-types.
The empty method handle factory method will accept Q-types, producing a method handle that returns the default value of the type.
… arrayConstructor, arrayLength, arrayElementGetter, and arrayElementSetter, plus eventually the var-handle variants.)
(Yes, a value type method is obtained with findVirtual, despite the fact that virtuality is not present on a final class. The poorer alternatives are to co-opt findSpecial, or make a new API point findDirect to carry the nice, fine distinction. Since Java is already comfortable with the notion of “final virtual” methods, we will continue with what we have.)
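Since the prototype offers no bytecodes for value-bearing arrays, all array work goes through method handles. The shape of such code can be sketched with int[] standing in for an array of a Q-type. The class ArrayHandleDemo and its methods are hypothetical; arrayConstructor and arrayLength require Java 9 or later.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical demo: creates, fills, and reads an array purely through method handles.
final class ArrayHandleDemo {
    private static final MethodHandle NEW_ARRAY = MethodHandles.arrayConstructor(int[].class);
    private static final MethodHandle LENGTH = MethodHandles.arrayLength(int[].class);
    private static final MethodHandle GET = MethodHandles.arrayElementGetter(int[].class);
    private static final MethodHandle SET = MethodHandles.arrayElementSetter(int[].class);

    private ArrayHandleDemo() {}

    // Allocates an array of the given length and stores 'value' in every element.
    static int[] filled(int length, int value) {
        try {
            int[] array = (int[]) NEW_ARRAY.invokeExact(length);
            int n = (int) LENGTH.invokeExact(array);
            for (int i = 0; i < n; i++) {
                SET.invokeExact(array, i, value);
            }
            return array;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Reads one element through the getter handle.
    static int get(int[] array, int index) {
        try {
            return (int) GET.invokeExact(array, index);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

With Q-types accepted as element types, the same four factories would be handed the Class object of the value-bearing array type instead of int[].class.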
Similarly, core reflection API points will be adjusted:
* `java.lang.reflect.Method`, `Field`, and `Constructor` may have self-types (`getDeclaringClass`) that are Q-types. Such members are derived from the `Class` objects representing Q-types.
* `java.lang.reflect.Method` will work with Q-types, as discussed earlier. Reflected method types will correctly report the distinction between Q-types and their boxes (L-types). The invocation method will accept boxed L-types where Q-types are required.
* `java.lang.reflect.Field` will work with Q-types. However, fields of boxed Q-types may only be read, not written.
* `java.lang.reflect.Constructor` will work with Q-types. The `newInstance` method of a Q-type constructor will be reinterpreted as a factory method; the boxed value returned will not be guaranteed to be a fresh object. (This reinterpretation may be extended later to the L-type constructor, since the class is value-based.)
* `java.lang.reflect.Array` will accept Q-types as component types.
Some care must be taken in the reflection APIs to ensure that Q-types are not accidentally tied into the subtype/supertype relations of their corresponding L-types. No Q-type is a subtype or supertype of any other Q-type or of any L-type. No Q-type is a subtype of `Object`, and Q-types declare only their own methods (which therefore never use virtual-dispatch polymorphism). As an exception, default methods from interfaces are inherited into Q-types.
As value-based classes, value-capable classes are required to override all relevant methods from `Object`. The derived Q-types do not inherit or respond to the standard methods of `Object`.
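The read-only rule for fields of boxed Q-types resembles the way core reflection already treats `final` instance fields. The sketch below uses an ordinary final class as a stand-in for a box; `PointBox` and `ReflectiveFieldAccess` are hypothetical names, and nothing here involves a real Q-type.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for the box (L-type) of a value type: an ordinary
// final class with final fields. This is NOT a real Q-type.
final class PointBox {
    public final int x;
    public final int y;

    PointBox(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class ReflectiveFieldAccess {
    // A reflective read of a final field succeeds.
    static int readX(PointBox p) {
        try {
            Field f = PointBox.class.getField("x");
            return f.getInt(p);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A reflective write to a final field is rejected with
    // IllegalAccessException when setAccessible(true) has not been called.
    static boolean writeIsRejected(PointBox p) {
        try {
            Field f = PointBox.class.getField("x");
            f.setInt(p, 99);
            return false;
        } catch (IllegalAccessException expected) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```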
The following additional functions do not (as yet) fit in the `MethodHandle` API, and so are placed in the runtime support class `jdk.experimental.value.ValueTypeSupport`.
`ValueTypeSupport` will contain the following static methods:
static MethodHandle defaultValueConstant(Class<?> type);
static MethodHandle substitutabilityTest(Class<?> type);
static MethodHandle substitutabilityHashCode(Class<?> type);
static MethodHandle findWither(Lookup lookup, Class<?> refc, String name, Class<?> type);
The `defaultValueConstant` method returns a method handle which takes no arguments and returns a default value of the given type. It is equivalent to (but probably more efficient than) creating a one-element array of that value type and loading its element. This method may be useful for implementing `MethodHandles.empty` and similar combinators. (The method may support non-Q-types. If it does, an L-type will result in a method handle that returns `null`, not a box containing the default value.)
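The one-element-array equivalence can be tried out today for primitives and references. The following is a minimal sketch under that restriction, not the proposed implementation; the class `DefaultValueSketch` and the helper `defaultOf` are invented for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Array;

final class DefaultValueSketch {
    // Stand-in for the proposed ValueTypeSupport.defaultValueConstant, limited
    // to the primitive and reference types that exist today.
    static MethodHandle defaultValueConstant(Class<?> type) {
        Object oneElement = Array.newInstance(type, 1); // elements start at the default
        Object dflt = Array.get(oneElement, 0);         // boxed if type is primitive
        return MethodHandles.constant(type, dflt);      // handle of type ()type
    }

    // Convenience wrapper (hypothetical) that invokes the handle and boxes the result.
    static Object defaultOf(Class<?> type) {
        try {
            return defaultValueConstant(type).invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

As the text notes for L-types, a reference type such as `String.class` yields a handle that returns `null`.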
The `substitutabilityTest` method returns a method handle which compares two operands of the given type for substitutability. Specifically, if the type is a Q-type, fields are compared pairwise for substitutability, and the result is the logical conjunction of all the comparisons. Primitives and references are substitutable if and only if they compare equal using the appropriate version of the Java `==` operator, except that floats and doubles are first converted to their “raw bits” before comparison. (The method may support non-Q-types. If it does, an L-type will be compared using `acmp` reference comparison, with a possible exception for Q-type boxes.)
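The raw-bits rule matters because `==` on floating-point values is not a substitutability test. A small sketch follows, using an ordinary final class `Vec2` as a hypothetical stand-in for a value type with two `double` fields; none of these names comes from the proposal.

```java
// Hypothetical value-like class; today it is an ordinary final class.
final class Vec2 {
    final double x;
    final double y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

final class SubstitutabilitySketch {
    // Doubles are compared by raw bits, so a NaN matches a NaN with the same
    // bit pattern, while 0.0 and -0.0 are distinguished.
    static boolean substitutable(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    // Fields are compared pairwise; the result is the conjunction.
    static boolean substitutable(Vec2 a, Vec2 b) {
        return substitutable(a.x, b.x) && substitutable(a.y, b.y);
    }
}
```

By contrast, `Double.NaN == Double.NaN` is false and `0.0 == -0.0` is true, which is the opposite of what substitutability requires in both cases.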
Likewise, the `substitutabilityHashCode` method returns a method handle which accepts a single operand of the given type, and produces a hash code which is guaranteed to be equal for two values of that type if they are substitutable for each other, and is likely to be different otherwise. (The method may support non-Q-types. If it does, an L-type will be hashed using `System.identityHashCode`, and primitives hashed using their own bit patterns.)
(It is an open question whether to expand the size of this hash code to 64 bits. It will probably be defined, for starters, as a 32-bit composition of the hash codes of the value type fields, using legacy hash code values. The composition of sub-codes will probably use, at first, a base-31 polynomial, even though that composition technique is deeply suboptimal.)
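A base-31 polynomial over per-field hash codes might look like the following. This is a sketch under assumptions: the seed of zero, the field order, and the example field types `(double, long)` are not specified by the text, and the class name is invented.

```java
final class HashSketch {
    // Base-31 polynomial over per-field legacy hash codes, in field order.
    // The starting value 0 is an assumption; the text does not fix a seed.
    static int combine(int... fieldHashes) {
        int h = 0;
        for (int fh : fieldHashes) {
            h = 31 * h + fh;
        }
        return h;
    }

    // Hash for a hypothetical value type with fields (double re, long tag),
    // using the existing static hash methods of the wrapper classes.
    static int hash(double re, long tag) {
        return combine(Double.hashCode(re), Long.hashCode(tag));
    }
}
```

Because equal raw bits imply equal `Double.hashCode` results, values that are substitutable for each other receive equal hash codes under this composition.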
The `findWither` method works analogously to `Lookup.findSetter`, except that the resulting method handle always creates a new value: a full copy of the old value in which the specified field has been changed to contain the new value. Since values have no identity, this is the only logically possible way to update a field value.
In order to restrict the use of wither primitives, the `refc` parameter will be checked against the lookup class; if they are not the same type (and not a coordinated pair of Q-type and L-type), the access will fail. The access restriction may be broadened later. A value type may of course define named wither methods that encapsulate primitive wither actions. Eventually, a `withfield` bytecode might be created to express field update directly, in which case the same issues of access restriction must be addressed.
(The name “wither method” does not mean a way to blight or shrivel something, certainly a shady activity. It refers to a naming convention for methods that perform functional update of record values. For a complex number `c`, the call `c.withRe(0)` would return a new pure-imaginary complex number. By contrast, `c.setRe(0)`, a call to a setter method, would seem to mutate the complex number, removing any non-zero real component. Setter methods are appropriate to mutable objects, while wither methods are appropriate to values. Note that a method can in fact be a getter, setter, or wither method even if it does not begin with one of those standard words. The eventual conventions for value types may well discourage forms like `withRe(0)` in favor of simply `re(0)`.)
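The wither convention can be written today on an ordinary immutable class. In the sketch below, `Complex` is a plain final class standing in for a value type, and the accessor and wither names follow the convention described above; none of this is part of the proposed API.

```java
// Hypothetical value-like class written as an ordinary final class.
final class Complex {
    private final double re;
    private final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    double re() { return re; }
    double im() { return im; }

    // Withers: each returns a full copy with exactly one field changed.
    // The receiver is never mutated.
    Complex withRe(double newRe) { return new Complex(newRe, im); }
    Complex withIm(double newIm) { return new Complex(re, newIm); }
}
```

A named wither such as `withRe` is the kind of method that could encapsulate a primitive wither obtained from `findWither`.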
It is likely that these methods in `ValueTypeSupport` will eventually become virtual methods of `Lookup` itself (if that is the leading argument), else static methods of `MethodHandles`.
This minimal proposal is by nature temporary and provisional. It gives a necessary foundation for further work, rather than a final specification. Some of the further work will be similarly provisional in nature, but over time we will build on our successes and learn from our mistakes, eventually creating a well-designed specification that can take its place in the sun.
The present set of features that support value types will be difficult to work with; this is intentional. The rest of this document sketches a few additional features which may enable experiments that are not practical or possible in the minimized proposal.
This last section may be safely skipped, since any such features will be given their own supporting documentation if they are pursued. It may be of interest, however, to people who have noticed missing features in the minimal values proposal.
At a minimum, no language changes are needed to work with Q-types. A combination of JVM hacks (value-capable classes), annotation-driven classfile transformations, and direct bytecode generation is enough to exercise interesting micro-benchmarks. Method handles supply a useful alternative to direct bytecode generation, and they will be made fully capable of working with Q-types (as described above).
Nevertheless, there is nothing like language support. It is likely that very early experiments with `javac` will create simple ways to refer to Q-types and create variables for them, directly in Java code (subject to contextual restrictions, of course).
In particular, constructors for objects have a very different bytecode shape than seemingly-equivalent constructors for value types. (The syntax for Java object constructors is a perfectly fine notation for value type constructors, as long as all fields are final.) It would be reasonable for javac to take on the burden of byte-compiling both versions of each constructor of a value-capable class.
Likewise, direct invocation of value type constructors, and direct access of value type methods and fields, would be convenient to use from Java source code, even if they had to be compiled to invokedynamic calls, until bytecode support was completed.
Additional enhancements to the constant pool may allow creation of constants derived from bootstrap methods. Such features are not in the scope of the present document. They are described in the OpenJDK RFE JDK-8161256. This RFE mentions the present enhancement of `CONSTANT_Class`.
If this RFE is implemented, it may be possible to delay a few of the steps described in this section, such as using Q-types as receiver types for `CONSTANT_MethodHandle` constants. The key requirement, in any case, is that invokedynamic instructions be able to refer to a full range of operations on Q-types, since the invokedynamic instructions are standing in as temporary place-holders for bytecodes we are not yet implementing.
Independently of user-bootstrapped constants, Q-types in the constant pool might be carried, most gracefully, by variations on the `CONSTANT_Class` constant. Right now, we choose to mangle type descriptors in `CONSTANT_Class` constants as an easy-to-implement place-holder, but the final design could introduce new constant pool types to carry the required distinctions.
For example, `CONSTANT_Class` could be kept as-is, and re-labeled `CONSTANT_ReferenceType`. Then, a new `CONSTANT_Type` constant could support arbitrary descriptors. (Perhaps it would have other substructure required by reified generic parameters, but that’s probably yet another kind of constant.) Or, a `CONSTANT_ValueType` tag could be introduced for symmetry with `CONSTANT_ReferenceType`, and some other way could be found for mentioning primitive pseudo-classes. (They are useful as parameters to BSMs.)
A value-capable class, compiled from Java source, may have additional annotations (or perhaps attributes) on selected fields and methods which cause the introduction of Q-types, as a bytecode-level transformation when the value-capable class’s file is loaded or compiled.
Two transformations which seem useful may be called Q-replacement and Q-overloading. The first deletes L-types and replaces them by Q-types, while the second simply copies methods, replacing some or all of the L-types in their descriptors by corresponding Q-types. This set of ideas is tracked as JDK-8164889.
An alternative to annotation-driven Q-replacement would be an experimental language feature allowing Q-types to be mentioned directly in Java source. Such experiments are likely to happen as part of Project Valhalla, and may happen early enough to make transformation unnecessary.
The library method handle `defaultValueConstant` could be replaced by a new `vnew` bytecode, or by a prefixed `aconst_null` bytecode.
The library method handle `substitutabilityTest` could be replaced by a new `vcmp` bytecode, or by a prefixed `if_acmpeq` bytecode.
The library method handle `findWither` could be replaced by a new `vwithfield` bytecode.
The library method handle `findGetter` could be replaced by a suitably enhanced `getfield` bytecode.
The library method handle `arrayConstructor` could be replaced by a suitably enhanced `anewarray` or `multianewarray` bytecode.
The library method handle `arrayElementGetter` could be replaced by a new `vaload` bytecode, or a prefixed `aaload` bytecode.
The library method handle `arrayElementSetter` could be replaced by a new `vastore` bytecode, or a prefixed `aastore` bytecode.
The library method handle `arrayLength` could be replaced by a suitably enhanced `arraylength` bytecode.
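For orientation, the four array factories named above already exist in `java.lang.invoke.MethodHandles` (two of them since JDK 9) and work on primitive arrays today. The sketch below exercises them on `int[]`; the class `ArrayHandlesToday` is illustrative, and its use with Q-type arrays is what the proposal would add.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

final class ArrayHandlesToday {
    // Builds an int[n], stores value at index 0, then returns {a[0], a.length}.
    // Each handle is the library counterpart of an array bytecode.
    static int[] roundTrip(int n, int value) {
        try {
            MethodHandle mk  = MethodHandles.arrayConstructor(int[].class);   // (int)int[]
            MethodHandle set = MethodHandles.arrayElementSetter(int[].class); // (int[],int,int)void
            MethodHandle get = MethodHandles.arrayElementGetter(int[].class); // (int[],int)int
            MethodHandle len = MethodHandles.arrayLength(int[].class);        // (int[])int

            int[] a = (int[]) mk.invokeExact(n);
            set.invokeExact(a, 0, value);
            int first = (int) get.invokeExact(a, 0);
            int length = (int) len.invokeExact(a);
            return new int[] { first, length };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```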
In some cases, supplying Q-replaced API points in classes is just a matter of providing suitable bridge methods. Bytecode transformers or generators can avoid the need to specify the bodies of such bridge methods if the bridges are (instead of bytecodes) endowed with suitably organized bootstrap methods. This set of ideas has many additional uses, including auto-generation of standard `equals`, `hashCode`, and `toString` methods. It is tracked as JDK-8164891.
As suggested above, L-types for values are value-based, and some version of the JVM may attempt to enforce this in various ways, such as the following:
* Synchronizing on a boxed Q-type value may throw an exception such as `IllegalMonitorStateException`.
* Reference comparison (`==`, or the `acmp` instruction) may report “true” on two equivalent boxed Q-type values, even if the references previously returned “false”, or “false” when they previously returned “true”. Such variation would of course be subject to the logic of substitutability of the underlying Q-types. Two boxes that were once detected as equal references would be permanently substitutable for each other.
* Reflective attempts to store into the fields of a boxed Q-type value may fail, even after `setAccessible` is called.
* Reflective attempts to invoke a constructor of the box type may fail, even after `setAccessible` is called.
A box whose identity status is uncertain from observation to observation is called a “heisenbox”. To pursue the analogy, a reference equality (`==`, `acmp`) observation of `true` for two heisenboxes “collapses” them into the same object, since they are then proven fully inter-substitutable, hence their Q-values are equivalent also. Two copies of the reference can later decohere, reporting inequality, despite the continued inter-substitutability of the boxed values. The equality predicate could be investigated by wiring it to a box containing Schrödinger’s cat, with many puzzling and sad results…
This set of ideas is tracked as JDK-8163133.
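Code that must tolerate heisenboxes should compare boxed values rather than box references. As a rough analogy only, today's `Integer` boxes already make reference comparison an unreliable test of value equality; the class `BoxComparison` below is invented for illustration and does not model any proposed JVM behavior.

```java
final class BoxComparison {
    // Robust: compares the boxed values, so the answer does not depend on
    // which box objects happen to be in hand.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Fragile: the answer depends on whether the two references denote the
    // same box object, which callers generally cannot control.
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }
}
```

A `true` result from `sameReference` does imply that `sameValue` is also `true`, which mirrors the one-way “collapse” described above; a `false` result implies nothing about the values.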
[values]: http://cr.openjdk.java.net/~jrose/values/values.html
[valhalla-dev]: http://mail.openjdk.java.net/pipermail/valhalla-dev/
[goetz-jvmls15]: http://www.oracle.com/technetwork/java/jvmls2015-goetz-2637900.pdf
[valsem-0411]: http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-April/000118.html
[simms-vbcs]: http://mail.openjdk.java.net/pipermail/valhalla-dev/2016-June/001981.html
[graves-jvmls16]: http://youtu.be/Z2XgO1H6xPM?list=PLX8CzqL3ArzUY6rQAQTwI_jKvqJxrRrP_
[value-based]: http://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html
[goetz-jvmls16]: http://www.youtube.com/watch?v=Tc9vs_HFHVo
[Long2.java]: http://hg.openjdk.java.net/panama/panama/jdk/file/70b3ceb485cf/src/java.base/share/classes/java/lang/Long2.java
[cimadamore-refman]: http://cr.openjdk.java.net/~mcimadamore/reflection-manifesto.html
[JDK-8164891]: http://bugs.openjdk.java.net/browse/JDK-8164891
[JDK-8161256]: http://bugs.openjdk.java.net/browse/JDK-8161256
[JDK-8164889]: http://bugs.openjdk.java.net/browse/JDK-8164889
[JDK-8163133]: http://bugs.openjdk.java.net/browse/JDK-8163133