This document describes changes to the Java Virtual Machine Specification to clean up its treatment of types. Changes mainly fall into one of the following categories:
Clarifying the treatment of boolean, byte, short, and char: they are legitimate types, but when placed on the stack, values of these types are implicitly converted to int.
Revisions to verification: distinguishing between, and making appropriate use of, class/interface types and loaded classes; refinements to subtyping and instruction encoding; some bug fixes (including JDK-8122946).
Centralizing rules for identifying classes in descriptors that are subject to loader constraints.
Centralizing rules for testing that a value is an instance of a type.
Eliminating unnecessary references to java.lang.Class; instead, the representation of classes and interfaces is usually an implementation detail.
Removing unnecessary references to the Java language.
One behavioral change for JVM implementations is proposed: tightening the verification check that an interface named by invokespecial is a direct superinterface (4.10.1.9.invokespecial).
All other changes are only presentational, with no intention to change the behavior of JVM implementations.
Changes are described with respect to existing sections of the JVM Specification. New text is indicated like this and deleted text is indicated like this. Explanation and discussion, as needed, is set aside in grey boxes.
Chapter 2: The Structure of the Java Virtual Machine
2.2 Data Types
The Java Virtual Machine operates on two kinds of types: primitive types and reference types. There are, correspondingly, two kinds of values that can be stored in variables, passed as arguments, returned by methods, and operated upon: primitive values and reference values.
The Java Virtual Machine expects that nearly all type checking is done prior to run time, typically by a compiler, and does not have to be done by the Java Virtual Machine itself. Values of primitive types need not be tagged or otherwise be inspectable to determine their types at run time, or to be distinguished from values of reference types. Instead, the instruction set of the Java Virtual Machine distinguishes its operand types using instructions intended to operate on values of specific types. For instance, iadd, ladd, fadd, and dadd are all Java Virtual Machine instructions that add two numeric values and produce numeric results, but each is specialized for its operand type: int, long, float, and double, respectively. For a summary of type support in the Java Virtual Machine instruction set, see 2.11.1.
The Java Virtual Machine contains explicit support for objects. An object is either a dynamically allocated class instance or an array. A reference to an object is considered to have Java Virtual Machine type reference. References are polymorphic: a single reference may also be a value of multiple class types, interface types, or array types. Values of type reference can be thought of as pointers to objects. More than one reference to an object may exist. Objects are always operated on, passed, and tested via values of type reference.
2.6 Frames
2.6.1 Local Variables
Each frame (2.6) contains an array of variables known as its local variables. The length of the local variable array of a frame is determined at compile-time and supplied in the binary representation of a class or interface along with the code for the method associated with the frame (4.7.3).
A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. A pair of local variables can hold a value of type long or double.
Local variables are addressed by indexing. The index of the first local variable is zero. An integer is considered to be an index into the local variable array if and only if that integer is between zero and one less than the size of the local variable array.
A value of type long or type double occupies two consecutive local variables. Such a value may only be addressed using the lesser index. For example, a value of type double stored in the local variable array at index n actually occupies the local variables with indices n and n+1; however, the local variable at index n+1 cannot be loaded from. It can be stored into. However, doing so invalidates the contents of local variable n.
The Java Virtual Machine does not require n to be even. In intuitive terms, values of types long and double need not be 64-bit aligned in the local variables array. Implementors are free to decide the appropriate way to represent such values using the two local variables reserved for the value.
The Java Virtual Machine uses local variables to pass parameters on method invocation. On class method invocation, any parameters are passed in consecutive local variables starting from local variable 0. On instance method invocation, local variable 0 is always used to pass a reference to the object on which the instance method is being invoked (this in the Java programming language). Any parameters are subsequently passed in consecutive local variables starting from local variable 1.
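The slot assignment described above can be sketched in Java. This is only an illustration, not part of the specification: the class name LocalSlots and its methods are hypothetical, and each parameter is abbreviated to a single descriptor character, with L standing in for any reference type.

```java
import java.util.ArrayList;
import java.util.List;

public class LocalSlots {
    /** Number of local variables occupied by a value of the given (abbreviated) type. */
    static int slotsFor(char type) {
        // long (J) and double (D) occupy a pair of local variables; all other types occupy one.
        return (type == 'J' || type == 'D') ? 2 : 1;
    }

    /**
     * Returns the local variable index at which each parameter starts.
     * paramTypes holds one character per parameter (hypothetical shorthand, not a real descriptor).
     */
    static List<Integer> parameterSlots(boolean isInstanceMethod, String paramTypes) {
        List<Integer> slots = new ArrayList<>();
        // For an instance method, local variable 0 holds the reference to the receiver.
        int next = isInstanceMethod ? 1 : 0;
        for (char type : paramTypes.toCharArray()) {
            slots.add(next);
            next += slotsFor(type);
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(parameterSlots(false, "IDL")); // [0, 1, 3]
        System.out.println(parameterSlots(true, "IDL"));  // [1, 2, 4]
    }
}
```

Note that the double parameter pushes the following parameter two indices further along, and that nothing requires its starting index to be even.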
2.11 Instruction Set Summary
2.11.1 Types and the Java Virtual Machine
Most of the instructions in the Java Virtual Machine instruction set encode type information about the operations they perform. For instance, the iload instruction (6.5.iload) loads the contents of a local variable, which must be an int, onto the operand stack. The fload instruction (6.5.fload) does the same with a float value. The two instructions may have identical implementations, but have distinct opcodes.
For the majority of typed instructions, the instruction type is represented explicitly in the opcode mnemonic by a letter: i for an int operation, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions for which the type is unambiguous do not have a type letter in their mnemonic. For instance, arraylength always operates on an object that is an array. Some instructions, such as goto, an unconditional control transfer, do not operate on typed operands.
Given the Java Virtual Machine's one-byte opcode size, encoding types into opcodes places pressure on the design of its instruction set. If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte. Instead, the instruction set of the Java Virtual Machine provides a reduced level of type support for certain operations. In other words, the instruction set is intentionally not orthogonal. Separate instructions can be used to convert between unsupported and supported data types as necessary.
Table 2.11.1-A summarizes the type support in the instruction set of the Java Virtual Machine. A specific instruction, with type information, is built by replacing the T in the instruction template in the opcode column by the letter in the type column. If the type column for some instruction template and type is blank, then no instruction exists supporting that type of operation. For instance, there is a load instruction for type int, iload, but there is no load instruction for type byte.
Note that most instructions in Table 2.11.1-A do not have forms for the integral types byte, char, and short. None have forms for the boolean type. Whenever values of types byte and short are loaded onto the operand stack, they are implicitly converted by sign extension to values of type int. Similarly, whenever values of types boolean and char are loaded onto the operand stack, they are implicitly converted by zero extension to values of type int. Thus, most operations on values originally of types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.
(Deleted text: A compiler encodes loads of literal values of types byte and short using Java Virtual Machine instructions that sign-extend those values to values of type int at compile-time or run-time. Loads of literal values of types boolean and char are encoded using instructions that zero-extend the literal to a value of type int at compile-time or run-time. Likewise, loads from arrays of values of type boolean, byte, short, and char are encoded using Java Virtual Machine instructions that sign-extend or zero-extend the values to values of type int.)
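As a rough illustration of the two conversions, the Java sketch below relies on the Java language's widening conversions to int, which sign-extend byte and short and zero-extend char. The class and method names are hypothetical, and the sketch shows the arithmetic effect only, not how a Java Virtual Machine implementation performs it.

```java
public class ImplicitWidening {
    // Sign extension: the high bit of the narrow value is copied into the new high bits.
    static int fromByte(byte b) { return b; }
    static int fromShort(short s) { return s; }

    // Zero extension: the new high bits are always zero.
    static int fromChar(char c) { return c; }

    public static void main(String[] args) {
        System.out.println(fromByte((byte) 0xFF));     // -1
        System.out.println(fromShort((short) 0x8000)); // -32768
        System.out.println(fromChar((char) 0xFFFF));   // 65535
    }
}
```

The same 16-bit pattern 0xFFFF therefore becomes -1 when it originates as a short but 65535 when it originates as a char.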
Table 2.11.1-A. Type support in the Java Virtual Machine instruction set
opcode | byte | short | int | long | float | double | char | reference |
---|---|---|---|---|---|---|---|---|
Tipush | bipush | sipush | | | | | | |
Tconst | | | iconst | lconst | fconst | dconst | | aconst |
Tload | | | iload | lload | fload | dload | | aload |
Tstore | | | istore | lstore | fstore | dstore | | astore |
Tinc | | | iinc | | | | | |
Taload | baload | saload | iaload | laload | faload | daload | caload | aaload |
Tastore | bastore | sastore | iastore | lastore | fastore | dastore | castore | aastore |
Tadd | | | iadd | ladd | fadd | dadd | | |
Tsub | | | isub | lsub | fsub | dsub | | |
Tmul | | | imul | lmul | fmul | dmul | | |
Tdiv | | | idiv | ldiv | fdiv | ddiv | | |
Trem | | | irem | lrem | frem | drem | | |
Tneg | | | ineg | lneg | fneg | dneg | | |
Tshl | | | ishl | lshl | | | | |
Tshr | | | ishr | lshr | | | | |
Tushr | | | iushr | lushr | | | | |
Tand | | | iand | land | | | | |
Tor | | | ior | lor | | | | |
Txor | | | ixor | lxor | | | | |
i2T | i2b | i2s | | i2l | i2f | i2d | | |
l2T | | | l2i | | l2f | l2d | | |
f2T | | | f2i | f2l | | f2d | | |
d2T | | | d2i | d2l | d2f | | | |
Tcmp | | | | lcmp | | | | |
Tcmpl | | | | | fcmpl | dcmpl | | |
Tcmpg | | | | | fcmpg | dcmpg | | |
if_TcmpOP | | | if_icmpOP | | | | | if_acmpOP |
Treturn | | | ireturn | lreturn | freturn | dreturn | | areturn |
The mapping between Java Virtual Machine original types and Java Virtual Machine computational types is summarized by Table 2.11.1-B.
Certain Java Virtual Machine instructions such as pop and swap operate on the operand stack without regard to type; however, such instructions are constrained to use only on values of certain categories of computational types, also given in Table 2.11.1-B.
Table 2.11.1-B. Original and computational types in the Java Virtual Machine
Original type | Computational type | Category |
---|---|---|
boolean | int | 1 |
byte | int | 1 |
char | int | 1 |
short | int | 1 |
int | int | 1 |
float | float | 1 |
reference | reference | 1 |
returnAddress | returnAddress | 1 |
long | long | 2 |
double | double | 2 |
2.11.2 Load and Store Instructions
The load and store instructions transfer values between the local variables (2.6.1) and the operand stack (2.6.2) of a Java Virtual Machine frame (2.6):
Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>.
Store a value from the operand stack into a local variable: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>.
Load a constant on to the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>.
Gain access to more local variables using a wider index, or to a larger immediate operand: wide.
Instructions that access fields of objects and elements of arrays (2.11.5) also transfer data to and from the operand stack.
Instruction mnemonics shown above with trailing letters between angle brackets (for instance, iload_<n>) denote families of instructions (with members iload_0, iload_1, iload_2, and iload_3 in the case of iload_<n>). Such families of instructions are specializations of an additional generic instruction (iload) that takes one operand. For the specialized instructions, the operand is implicit and does not need to be stored or fetched. The semantics are otherwise the same (iload_0 means the same thing as iload with the operand 0). The letter between the angle brackets specifies the type of the implicit operand for that family of instructions: for <n>, a nonnegative integer; for <i>, an int; for <l>, a long; for <f>, a float; and for <d>, a double. Forms for type int are used in many cases to perform operations on values of type byte, char, and short (2.11.1).
This notation for instruction families is used throughout this specification.
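To make the relationship between a family member and its generic instruction concrete, here is a minimal Java sketch that maps a Tload_<n> opcode to the equivalent generic load opcode and implicit index. The numeric opcode values (iload is 21, iload_0 is 26, and so on through aload_3 at 45) come from the instruction set chapter rather than from this section, and the class and method names are hypothetical.

```java
public class LoadFamilies {
    static final int ILOAD = 21;   // generic loads occupy opcodes 21..25 (iload, lload, fload, dload, aload)
    static final int ILOAD_0 = 26; // specialized loads occupy opcodes 26..45, four per type
    static final int ALOAD_3 = 45;

    /**
     * For a specialized load opcode, returns {genericOpcode, implicitIndex}.
     * Returns null if the opcode is not a member of a Tload_<n> family.
     */
    static int[] expand(int opcode) {
        if (opcode < ILOAD_0 || opcode > ALOAD_3) {
            return null;
        }
        int offset = opcode - ILOAD_0;
        // Each family has four members (<n> = 0..3), laid out in the same type order as the generic forms.
        return new int[] { ILOAD + offset / 4, offset % 4 };
    }

    public static void main(String[] args) {
        int[] r = expand(28); // iload_2
        System.out.println(r[0] + " " + r[1]); // 21 2
    }
}
```

Under this mapping, executing opcode 28 (iload_2) has the same effect as executing iload with the operand 2.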
2.11.8 Method Invocation and Return Instructions
The following five instructions invoke methods:
invokevirtual invokes an instance method of an object, dispatching on the (virtual) type of the object. This is the normal method dispatch in the Java programming language.
invokeinterface invokes an interface method, searching the methods implemented by the particular run-time object to find the appropriate method.
invokespecial invokes an instance method requiring special handling, either an instance initialization method (2.9.1) or a method of the current class or its supertypes.
invokestatic invokes a class (static) method in a named class.
invokedynamic invokes the method which is the target of the call site object bound to the invokedynamic instruction. The call site object was bound to a specific lexical occurrence of the invokedynamic instruction by the Java Virtual Machine as a result of running a bootstrap method before the first execution of the instruction. Therefore, each occurrence of an invokedynamic instruction has a unique linkage state, unlike the other instructions which invoke methods.
The method return instructions, which are distinguished by return type, are ireturn (used to return values of type boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, the return instruction is used to return from methods declared to be void, instance initialization methods, and class or interface initialization methods.
Chapter 4: The class File Format
4.3 Descriptors
A descriptor is a string representing the type of a field or method. Descriptors are represented in the class file format using modified UTF-8 strings (4.4.7) and thus may be drawn, where not further constrained, from the entire Unicode codespace.
4.3.1 Grammar Notation
Descriptors are specified using a grammar. The grammar is a set of productions that describe how sequences of characters can form syntactically correct descriptors of various kinds. Terminal symbols of the grammar are shown in fixed width font, and should be interpreted as ASCII characters. Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative definitions for the nonterminal then follow on succeeding lines.
The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.
The phrase (one of) on the right-hand side of a production signifies that each of the terminal symbols on the following line or lines is an alternative definition.
4.3.2 Field Descriptors
A field descriptor represents the type of a class, instance, or local variable.
- FieldDescriptor:
    - FieldType
- FieldType:
    - BaseType
    - ClassType
    - ArrayType
- BaseType:
    - (one of)
    - B C D F I J S Z
- ClassType:
    - L ClassName ;
- ArrayType:
    - [ ComponentType
- ComponentType:
    - FieldType
The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters.
ClassName represents a binary class or interface name encoded in internal form (4.2.1).
A field descriptor mentions a class or interface name if the name appears as a ClassName in the descriptor. This includes a ClassName nested in the ComponentType of an ArrayType.
This definition of mentions allows us to eliminate boilerplate in a handful of other sections and gives us a single location to identify the classes that are subject to loading constraints, resolution, etc. The definition is flexible enough to support new kinds of types that may be added in the future.
The interpretation of field descriptors as types is shown in Table 4.3-A. See 2.2, 2.3, and 2.4 for the meaning of these types.
A field descriptor representing an array type is valid only if it represents a type with 255 or fewer dimensions.
Table 4.3-A. Interpretation of field descriptors
FieldType term | Type |
---|---|
B | byte |
C | char |
D | double |
F | float |
I | int |
J | long |
L ClassName ; | reference |
S | short |
Z | boolean |
[ | reference |
The field descriptor of an instance variable of type int is simply I.
The field descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the binary name for class Object is used.
The field descriptor of an instance variable of the multidimensional array type double[][][] is [[[D.
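A minimal Java sketch of how the grammar, the definition of "mentions", and the 255-dimension limit fit together is shown below. It is an illustration under stated assumptions, not a reference parser: the class and method names are hypothetical, and the sketch does not check that ClassName is a well-formed binary name in internal form (4.2.1).

```java
public class FieldDescriptors {
    /** Counts the leading '[' characters, that is, the number of array dimensions. */
    static int dimensions(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;
        }
        return dims;
    }

    /**
     * Returns the class or interface name mentioned by the descriptor,
     * or null if the element type is a BaseType. Throws if the string
     * does not follow the FieldDescriptor grammar or has too many dimensions.
     */
    static String mentionedClassName(String descriptor) {
        int dims = dimensions(descriptor);
        if (dims > 255) {
            throw new IllegalArgumentException("more than 255 dimensions");
        }
        String element = descriptor.substring(dims);
        if (element.length() == 1 && "BCDFIJSZ".indexOf(element.charAt(0)) >= 0) {
            return null; // BaseType: no class name is mentioned
        }
        // L ClassName ; where the first ';' must also be the last character
        if (element.length() >= 3 && element.charAt(0) == 'L'
                && element.indexOf(';') == element.length() - 1) {
            return element.substring(1, element.length() - 1);
        }
        throw new IllegalArgumentException("not a field descriptor: " + descriptor);
    }

    public static void main(String[] args) {
        System.out.println(mentionedClassName("Ljava/lang/Object;"));  // java/lang/Object
        System.out.println(mentionedClassName("[[Ljava/lang/String;")); // java/lang/String
        System.out.println(mentionedClassName("[[[D"));                 // null
    }
}
```

Because the array prefix is stripped before the element type is examined, a ClassName nested in the ComponentType of an ArrayType is found by the same code path as a top-level one.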
The "Interpretation" column of Table 4.3-A is redundant; these details are better left to sections 2.2, 2.3 and 2.4.
4.3.3 Method Descriptors
A method descriptor contains zero or more parameter descriptors, representing the types of parameters that the method takes, and a return descriptor, representing the type of the value (if any) that the method returns.
- MethodDescriptor:
    - ( {ParameterDescriptor} ) ReturnDescriptor
- ParameterDescriptor:
    - FieldType
- ReturnDescriptor:
    - FieldType
    - VoidDescriptor
- VoidDescriptor:
    - V
The character V indicates that the method returns no value (its result is void).
A method descriptor mentions a class or interface name if the name appears as a ClassName in the FieldType of a parameter descriptor or return descriptor.
The method descriptor for the method:
Object m(int i, double d, Thread t) {...}
is:
(IDLjava/lang/Thread;)Ljava/lang/Object;
Note that the internal forms of the binary names of Thread and Object are used.
A method descriptor is valid only if it represents method parameters with a total length of 255 or less, where that length includes the contribution for this in the case of instance or interface method invocations. The total length is calculated by summing the contributions of the individual parameters, where a parameter of type long or double contributes two units to the length and a parameter of any other type contributes one unit.
A method descriptor is the same whether the method it describes is a class method or an instance method. Although an instance method is passed this, a reference to the object on which the method is being invoked, in addition to its intended arguments, that fact is not reflected in the method descriptor. The reference to this is passed implicitly by the Java Virtual Machine instructions which invoke instance methods (2.6.1, 4.11).
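The total-length rule can be sketched as a small Java routine that walks the parameter descriptors. This is a hedged illustration: the class name, method name, and the hasReceiver flag are hypothetical, and the sketch does only enough parsing to count units, without validating class names or the return descriptor.

```java
public class MethodDescriptors {
    /**
     * Sums the parameter contributions of a method descriptor.
     * hasReceiver adds the one-unit contribution for this.
     */
    static int parameterLength(String descriptor, boolean hasReceiver) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("missing '('");
        }
        int units = hasReceiver ? 1 : 0;
        int i = 1;
        while (true) {
            if (i >= descriptor.length()) {
                throw new IllegalArgumentException("missing ')'");
            }
            if (descriptor.charAt(i) == ')') {
                return units;
            }
            boolean isArray = false;
            while (i < descriptor.length() && descriptor.charAt(i) == '[') {
                isArray = true;
                i++;
            }
            if (i >= descriptor.length()) {
                throw new IllegalArgumentException("truncated array type");
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                int semi = descriptor.indexOf(';', i);
                if (semi < 0) {
                    throw new IllegalArgumentException("unterminated class type");
                }
                units += 1;
                i = semi + 1;
            } else if ("BCDFIJSZ".indexOf(c) >= 0) {
                // Only a bare long or double takes two units; an array of them is a reference.
                units += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
                i++;
            } else {
                throw new IllegalArgumentException("bad parameter type at index " + i);
            }
        }
    }

    public static void main(String[] args) {
        String d = "(IDLjava/lang/Thread;)Ljava/lang/Object;";
        System.out.println(parameterLength(d, false)); // 4
        System.out.println(parameterLength(d, true));  // 5
    }
}
```

A validity check would then compare the returned value against 255; the same descriptor yields different totals depending on whether a receiver is passed, even though the descriptor itself does not record that fact.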
4.7 Attributes
4.7.16 The RuntimeVisibleAnnotations Attribute
4.7.16.1 The element_value structure
- class_info_index
The class_info_index item denotes a class literal as the value of this element-value pair.
The class_info_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing a return descriptor (4.3.3). The return descriptor gives the type corresponding to the class literal represented by this element_value structure. Types correspond to class literals as follows:
For a class literal C.class, where C is the name of a class or interface, the corresponding type is C. The return descriptor in the constant_pool will be a ClassType.
For a class literal T[].class, where T[] is an array type, the corresponding type is T[]. The return descriptor in the constant_pool will be an ArrayType.
For a class literal p.class, where p is the name of a primitive type, the corresponding type is p. The return descriptor in the constant_pool will be a BaseType character.
For a class literal void.class, the corresponding type is void. The return descriptor in the constant_pool will be V.
For example, the class literal Object.class corresponds to the type Object, so the constant_pool entry is Ljava/lang/Object;, whereas the class literal int.class corresponds to the type int, so the constant_pool entry is I.
The class literal void.class corresponds to void, so the constant_pool entry is V, whereas the class literal Void.class corresponds to the type Void, so the constant_pool entry is Ljava/lang/Void;.
...
4.9 Constraints on Java Virtual Machine Code
The code for a method, instance initialization method (2.9.1), or class or interface initialization method (2.9.2) is stored in the code array of the Code attribute of a method_info structure of a class file (4.7.3). This section describes the constraints associated with the contents of the Code_attribute structure.
Initialization methods are methods. No need to call them out separately.
4.9.1 Static Constraints
The static constraints on a class file are those defining the well-formedness of the file. These constraints have been given in the previous sections, except for static constraints on the code in the class file. The static constraints on the code in a class file specify how Java Virtual Machine instructions must be laid out in the code array and what the operands of individual instructions must be.
The static constraints on the instructions in the code array are as follows:
Only instances of the instructions documented in 6.5 may appear in the code array. Instances of instructions using the reserved opcodes (6.2) or any opcodes not documented in this specification must not appear in the code array.
If the class file version number is 51.0 or above, then neither the jsr opcode nor the jsr_w opcode may appear in the code array.
The opcode of the first instruction in the code array begins at index 0.
For each instruction in the code array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its operands.
The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.
The last byte of the last instruction in the code array must be the byte at index code_length - 1.
The static constraints on the operands of instructions in the code array are as follows:
The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, ifle, iflt, ifge, ifgt, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmple, if_icmplt, if_icmpge, if_icmpgt, if_acmpeq, if_acmpne) must be the opcode of an instruction within this method.
The target of a jump or branch instruction must never be the opcode used to specify the operation to be modified by a wide instruction; a jump or branch target may be the wide instruction itself.
Each target, including the default, of each tableswitch instruction must be the opcode of an instruction within this method.
Each tableswitch instruction must have a number of entries in its jump table that is consistent with the value of its low and high jump table operands, and its low value must be less than or equal to its high value.
No target of a tableswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.
Each target, including the default, of each lookupswitch instruction must be the opcode of an instruction within this method.
Each lookupswitch instruction must have a number of match-offset pairs that is consistent with the value of its npairs operand. The match-offset pairs must be sorted in increasing numerical order by signed match value.
No target of a lookupswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a wide instruction itself.
The operands of each ldc instruction and each ldc_w instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be loadable (4.4), and not any of the following:
An entry of kind CONSTANT_Long or CONSTANT_Double.
An entry of kind CONSTANT_Dynamic that references a CONSTANT_NameAndType_info structure which indicates a descriptor of J (denoting long) or D (denoting double).
The operands of each ldc2_w instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be loadable, and in particular one of the following:
An entry of kind CONSTANT_Long or CONSTANT_Double.
An entry of kind CONSTANT_Dynamic that references a CONSTANT_NameAndType_info structure which indicates a descriptor of J (denoting long) or D (denoting double).
The subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.
I don't know what this constraint means; perhaps something to do with two-slot constants (compare 4.4.5)? This is not the place to enforce constant pool layout rules.
The operands of each getfield, putfield, getstatic, and putstatic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of kind CONSTANT_Fieldref.
The indexbyte operands of each invokevirtual instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of kind CONSTANT_Methodref.
The indexbyte operands of each invokespecial and invokestatic instruction must represent a valid index into the constant_pool table. If the class file version number is less than 52.0, the constant pool entry referenced by that index must be of kind CONSTANT_Methodref; if the class file version number is 52.0 or above, the constant pool entry referenced by that index must be of kind CONSTANT_Methodref or CONSTANT_InterfaceMethodref.
The indexbyte operands of each invokeinterface instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of kind CONSTANT_InterfaceMethodref.
The value of the count operand of each invokeinterface instruction must reflect the number of local variables necessary to store the arguments to be passed to the interface method, as implied by the descriptor of the CONSTANT_NameAndType_info structure referenced by the CONSTANT_InterfaceMethodref constant pool entry.
The fourth operand byte of each invokeinterface instruction must have the value zero.
The indexbyte operands of each invokedynamic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of kind CONSTANT_InvokeDynamic.
The third and fourth operand bytes of each invokedynamic instruction must have the value zero.
Only the invokespecial instruction is allowed to invoke an instance initialization method (2.9.1).
The method name of the CONSTANT_Methodref or CONSTANT_InterfaceMethodref referenced by one of the instructions invokevirtual, invokestatic, or invokeinterface must not be <init>.
No other method whose name begins with the character '<' ('\u003c') may be called by the method invocation instructions. In particular, the class or interface initialization method specially named <clinit> is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
Only the invokespecial instruction is allowed to invoke an instance initialization method (2.9.1). No instruction is allowed to invoke a class initialization method, because it cannot be referenced (4.4.2); such methods are only invoked implicitly by the Java Virtual Machine (5.5).
The operands of each instanceof, checkcast, new, and anewarray instruction, and the indexbyte operands of each multianewarray instruction, must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of kind CONSTANT_Class.
No new instruction may reference a constant pool entry of kind CONSTANT_Class that represents an array type (4.3.2). The new instruction cannot be used to create an array.
No anewarray instruction may be used to create an array of more than 255 dimensions.
No anewarray instruction may reference a constant pool entry of kind CONSTANT_Class that represents an array type with more than 254 dimensions.
Rephrased to be more direct: the dimensionality of an array, as opposed to an array type, is not a well-defined property. (Consider an Object[] whose components are 255-dimension arrays.)
A multianewarray instruction must be used only to create an array of a type that has at least as many dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not required to create all of the dimensions of the array type referenced by its indexbyte operands, it must not attempt to create more dimensions than are in the array type.
The dimensions operand of each multianewarray instruction must not be zero.
The atype operand of each newarray instruction must take one of the values T_BOOLEAN (4), T_CHAR (5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INT (10), or T_LONG (11).
The index operand of each iload, fload, aload, istore, fstore, astore, iinc, and ret instruction must be a non-negative integer no greater than max_locals - 1.
The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and astore_<n> instruction must be no greater than max_locals - 1.
The index operand of each lload, dload, lstore, and dstore instruction must be no greater than max_locals - 2.
The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be no greater than max_locals - 2.
The indexbyte operands of each wide instruction modifying an iload, fload, aload, istore, fstore, astore, iinc, or ret instruction must represent a non-negative integer no greater than max_locals - 1.
The indexbyte operands of each wide instruction modifying an lload, dload, lstore, or dstore instruction must represent a non-negative integer no greater than max_locals - 2.
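The local variable index constraints at the end of this list reduce to a single bounds check that depends on whether the instruction accesses one local variable or a pair. The Java sketch below illustrates that check; the class and method names are hypothetical, and it is not drawn from any particular Java Virtual Machine implementation.

```java
public class LocalIndexCheck {
    /**
     * Returns true if the index is acceptable for an instruction that accesses
     * a single local variable (iload, fload, aload, istore, fstore, astore, iinc, ret)
     * or, when twoSlot is true, a pair (lload, dload, lstore, dstore).
     */
    static boolean isValidIndex(int index, int maxLocals, boolean twoSlot) {
        // A long or double also occupies index + 1, so the limit is one lower.
        int limit = maxLocals - (twoSlot ? 2 : 1);
        return index >= 0 && index <= limit;
    }

    public static void main(String[] args) {
        System.out.println(isValidIndex(2, 3, false)); // true
        System.out.println(isValidIndex(2, 3, true));  // false
        System.out.println(isValidIndex(1, 3, true));  // true
    }
}
```

With max_locals equal to 3, index 2 is the last usable index for a one-variable access, while a long or double must start at index 1 or lower so that both of its local variables exist.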
4.9.2 Structural Constraints
The structural constraints on the code array specify constraints on relationships between Java Virtual Machine instructions. The structural constraints are as follows:
Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.
An instruction operating on values of type int is also permitted to operate on values of type boolean, byte, char, and short.
(As noted in 2.3.4 and 2.11.1, the Java Virtual Machine implicitly converts values of types boolean, byte, short, and char to type int, allowing instructions expecting values of type int to operate on them.)
If an instruction can be executed along several different execution paths, the operand stack must have the same depth (2.6.2) prior to the execution of the instruction, regardless of the path taken.
At no point during execution can the operand stack grow to a depth greater than that implied by the max_stack item.
At no point during execution can more values be popped from the operand stack than it contains.
At no point during execution can the order of the local variable pair holding a value of type long or double be reversed or the pair split up. At no point can the local variables of such a pair be operated on individually.
No local variable (or local variable pair, in the case of a value of type long or double) can be accessed before it is assigned a value.
Each invokespecial instruction must name one of the following:
an instance initialization method (2.9.1)
a method in the current class or interface
a method in a superclass of the current class
a method in a direct superinterface of the current class or interface
a method in Object
If an invokespecial instruction names an instance initialization method, then the target reference on the operand stack must be an uninitialized class instance. An instance initialization method must never be invoked on an initialized class instance. In addition:
If the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.
If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.
If an invokespecial instruction names a method which is not an instance initialization method, then the target reference on the operand stack must be a class instance whose type is assignment compatible with the current class (JLS §5.2).
The general rule for invokespecial is that the class or interface named by invokespecial must be "above" the caller class or interface, while the receiver object targeted by invokespecial must be "at" or "below" the caller class or interface. The latter clause is especially important: a class or interface can only perform invokespecial on its own objects. See 4.10.1.9.invokespecial for an explanation of how the latter clause is implemented in Prolog.
Each instance initialization method, except for the instance initialization method derived from the constructor of class Object, must call either another instance initialization method of this or an instance initialization method of its direct superclass super before its instance members are accessed.
However, instance fields of this that are declared in the current class may be assigned by putfield before calling any instance initialization method.
When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.
If there is an uninitialized class instance in a local variable in code protected by an exception handler, then (i) if the handler is inside an <init> method, the handler must throw an exception or loop forever; and (ii) if the handler is not inside an <init> method, the uninitialized class instance must remain uninitialized.
There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.
The type of every value that is the target of an invokevirtual, invokeinterface, or invokespecial instruction (that is, the type of the target reference on the operand stack) must be a subtype (4.10.1.2) of the type specified in the instruction.
The types of the arguments to each method invocation must be subtypes of the types given by the method descriptor (4.3.3), where descriptor types boolean, byte, char, and short are interpreted as type int.
Each return instruction must match its method's return type:
If the method returns a boolean, byte, char, short, or int, only the ireturn instruction may be used.
If the method returns a float, long, or double, only an freturn, lreturn, or dreturn instruction, respectively, may be used.
If the method returns a reference type, only an areturn instruction may be used, and the type of the returned value must be a subtype of the return descriptor of the method (4.3.3).
All methods declared to return void must use only the return instruction.
Initialization methods are already required to be declared to return void (4.6).
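The return-instruction constraint above can be sketched as a small table of Prolog facts. This is only an illustration: returnInstructionFor and validReturn are hypothetical names that do not appear in the specification, and the sketch does not check the additional subtype requirement on the value returned by areturn.

```prolog
% Hypothetical helper, not part of the specification: relates a method's
% declared return type to the only return instruction its code may use.
returnInstructionFor(boolean, ireturn).
returnInstructionFor(byte,    ireturn).
returnInstructionFor(char,    ireturn).
returnInstructionFor(short,   ireturn).
returnInstructionFor(int,     ireturn).
returnInstructionFor(float,   freturn).
returnInstructionFor(long,    lreturn).
returnInstructionFor(double,  dreturn).
returnInstructionFor(class(_, _), areturn).   % class and interface types
returnInstructionFor(arrayOf(_),  areturn).   % array types
returnInstructionFor(void,    return).

% validReturn(ReturnType, Instruction) succeeds iff Instruction is the
% return instruction permitted for ReturnType.
validReturn(ReturnType, Instruction) :-
    returnInstructionFor(ReturnType, Instruction).

% ?- validReturn(boolean, ireturn).   succeeds
% ?- validReturn(boolean, return).    fails
```

The four small integral types share the ireturn row because, as noted earlier in this section, their values are treated as int on the operand stack.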
The type of every value accessed by a getfield instruction or modified by a putfield instruction (that is, the type of the target reference on the operand stack) must be a subtype of the type specified in the instruction.
The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor of the field (4.3.2) of the class instance or class being stored into:
If the descriptor type is boolean, byte, char, short, or int, then the value must be an int.
If the descriptor type is float, long, or double, then the value must be a float, long, or double, respectively.
If the descriptor type is a reference type, then the value must be of a type that is a subtype of the descriptor type.
The type of every value stored into an array by an aastore instruction must be a reference type.
The component type of the array being stored into by the aastore instruction must also be a reference type.
Each athrow instruction must throw only values that are instances of class Throwable or of subclasses of Throwable.
Each class mentioned in a catch_type item of the exception_table array of the method's Code_attribute structure must be Throwable or a subclass of Throwable.
If getfield or putfield is used to access a protected field declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be the class type of the current class or one of its subclasses.
If invokevirtual or invokespecial is used to access a protected method declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be the class type of the current class or one of its subclasses.
Execution never falls off the bottom of the code array.
No return address (a value of type returnAddress) may be loaded from a local variable.
The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.
No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using try-finally constructs from within a finally clause.)
Each instance of type returnAddress can be returned to at most once.
If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress, then that instance can never be used as a return address.
4.10 Verification of class Files
4.10.1 Verification by Type Checking
4.10.1.1 Accessors for Java Virtual Machine Artifacts
We stipulate the existence of 28 Prolog predicates ("accessors") that have certain expected behavior but whose formal definitions are not given in this specification.
- classClassName(Class, ClassName)
  Extracts the name, ClassName, of the class Class.
- classIsInterface(Class)
  True iff the class, Class, is an interface.
- classIsNotFinal(Class)
  True iff the class, Class, is not a final class.
- classSuperClassName(Class, SuperClassName)
  Extracts the name, SuperClassName, of the superclass of class Class.
- classInterfaceNames(Class, InterfaceNames)
  Extracts a list, InterfaceNames, of the names of the direct superinterfaces of the class Class.
- classMethods(Class, Methods)
  Extracts a list, Methods, of the methods declared in the class Class.
- classAttributes(Class, Attributes)
  Extracts a list, Attributes, of the attributes of the class Class.
  Each attribute is represented as a functor application of the form attribute(AttributeName, AttributeContents), where AttributeName is the name of the attribute. The format of the attribute's contents is unspecified.
- classDefiningLoader(Class, Loader)
  Extracts the defining class loader, Loader, of the class Class.
- isBootstrapLoader(Loader)
  True iff the class loader Loader is the bootstrap class loader.
- loadedClass(Name, InitiatingLoader, ClassDefinition)
  True iff there exists a class named Name whose representation (in accordance with this specification) when loaded by the class loader InitiatingLoader is ClassDefinition.
- methodName(Method, Name)
  Extracts the name, Name, of the method Method.
- methodAccessFlags(Method, AccessFlags)
  Extracts the access flags, AccessFlags, of the method Method.
- methodDescriptor(Method, Descriptor)
  Extracts the descriptor, Descriptor, of the method Method.
- methodAttributes(Method, Attributes)
  Extracts a list, Attributes, of the attributes of the method Method.
- isInit(Method)
  True iff Method (regardless of class) is <init>.
- isNotInit(Method)
  True iff Method (regardless of class) is not <init>.
- isNotFinal(Method, Class)
  True iff Method in class Class is not final.
- isStatic(Method, Class)
  True iff Method in class Class is static.
- isNotStatic(Method, Class)
  True iff Method in class Class is not static.
- isPrivate(Method, Class)
  True iff Method in class Class is private.
- isNotPrivate(Method, Class)
  True iff Method in class Class is not private.
- isProtected(MemberClass, MemberName, MemberDescriptor)
  True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is protected.
- isNotProtected(MemberClass, MemberName, MemberDescriptor)
  True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is not protected.
- parseFieldDescriptor(Descriptor, Type)
  Converts a field descriptor, Descriptor, into the corresponding verification type Type (4.10.1.2).
  The verification type derived from descriptor types byte, short, boolean, and char is int.
- parseMethodDescriptor(Descriptor, ArgTypeList, ReturnType)
  Converts a method descriptor, Descriptor, into a list of verification types, ArgTypeList, corresponding to the method argument types, and a verification type, ReturnType, corresponding to the return type.
  The verification type derived from descriptor types byte, short, boolean, and char is int. A void return is represented with the special symbol void.
- parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)
  Extracts the instruction stream, ParsedCode, of the method Method in Class, as well as the maximum operand stack size, MaxStack, the maximal number of local variables, FrameSize, the exception handlers, Handlers, and the stack map StackMap.
  The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.
- samePackageName(Class1, Class2)
  True iff the package names of Class1 and Class2 are the same.
- differentPackageName(Class1, Class2)
  True iff the package names of Class1 and Class2 are different.
The above accessors are used to define loadedSuperclasses, which produces a list of a class's superclasses.
loadedSuperclasses(Class, [ Superclass | Rest ]) :-
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, L),
loadedClass(SuperclassName, L, Superclass),
loadedSuperclasses(Superclass, Rest).
loadedSuperclasses(Class, []) :-
classClassName(Class, 'java/lang/Object'),
classDefiningLoader(Class, BL),
isBootstrapLoader(BL).
The loadedSuperclasses predicate replaces superclassChain (4.10.1.2), which had the same effect, but produced a list of class types rather than loaded classes. (Despite the change in representation, all superclasses are loaded in either case.)
This helps reduce reliance on class types when what is really wanted is classes. The verifier already has a first-class notion of a class, a black-box "live" representation of a loaded class file. Using an extra layer of indirection to encode these classes as verification type structures is unnecessary and risks problems when the type system evolves.
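To make the behavior of loadedSuperclasses concrete, the following sketch runs it against a toy class hierarchy. The facts at the top are hypothetical stand-ins for the stipulated accessors (the atoms objectClass, numberClass, integerClass, and bl are invented for illustration); only the two loadedSuperclasses clauses are taken from the text.

```prolog
% Hypothetical stand-ins for the stipulated accessors: three loaded classes,
% all defined by a single loader, bl, which plays the bootstrap loader.
classClassName(objectClass,  'java/lang/Object').
classClassName(numberClass,  'java/lang/Number').
classClassName(integerClass, 'java/lang/Integer').

classSuperClassName(numberClass,  'java/lang/Object').
classSuperClassName(integerClass, 'java/lang/Number').

classDefiningLoader(_, bl).
isBootstrapLoader(bl).

loadedClass(Name, bl, Class) :- classClassName(Class, Name).

% The predicate as defined in the text.
loadedSuperclasses(Class, [ Superclass | Rest ]) :-
    classSuperClassName(Class, SuperclassName),
    classDefiningLoader(Class, L),
    loadedClass(SuperclassName, L, Superclass),
    loadedSuperclasses(Superclass, Rest).
loadedSuperclasses(Class, []) :-
    classClassName(Class, 'java/lang/Object'),
    classDefiningLoader(Class, BL),
    isBootstrapLoader(BL).

% ?- loadedSuperclasses(integerClass, Supers).
% Supers = [numberClass, objectClass].
```

Note that the resulting list holds the loaded classes themselves, nearest superclass first, rather than class(Name, Loader) type terms.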
When type checking a method's body, it is convenient to access information about the method. For this purpose, we define an environment, a six-tuple consisting of:
- a class
- a method
- the declared return type of the method (or void)
- the instructions in a method
- the maximal size of the operand stack
- a list of exception handlers
We specify accessors to extract information from the environment.
allInstructions(Environment, Instructions) :-
Environment = environment(_Class, _Method, _ReturnType,
Instructions, _, _).
exceptionHandlers(Environment, Handlers) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, _, Handlers).
maxOperandStackLength(Environment, MaxStack) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, MaxStack, _Handlers).
currentClassLoader(Environment, Loader) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, Loader).
thisClass(Environment, class(ClassName, L)) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
thisClass(Environment, Class) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _).
thisType(Environment, class(ClassName, L)) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
thisMethodReturnType(Environment, ReturnType) :-
Environment = environment(_Class, _Method, ReturnType,
_Instructions, _, _).
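As an illustration of how these accessors are used, the sketch below builds one environment term and queries it. The class fooClass, the method fooMethod, the loader appLoader, and the predicate sampleEnvironment are all hypothetical; the accessor clauses follow the definitions above, using the thisClass variant that yields the loaded class.

```prolog
% Hypothetical stubs for two of the stipulated accessors.
classClassName(fooClass, 'Foo').
classDefiningLoader(fooClass, appLoader).

% Accessors following the definitions in the text.
maxOperandStackLength(Environment, MaxStack) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              _Instructions, MaxStack, _Handlers).

thisClass(Environment, Class) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _).

thisType(Environment, class(ClassName, L)) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _),
    classDefiningLoader(Class, L),
    classClassName(Class, ClassName).

% A made-up environment: class, method, declared return type, instructions,
% maximal operand stack size, exception handlers.
sampleEnvironment(environment(fooClass, fooMethod, int,
                              [instruction(0, iconst_0),
                               instruction(1, ireturn)],
                              1, [])).

% ?- sampleEnvironment(E), thisClass(E, C).              C = fooClass
% ?- sampleEnvironment(E), thisType(E, T).               T = class('Foo', appLoader)
% ?- sampleEnvironment(E), maxOperandStackLength(E, M).  M = 1
```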
We specify additional predicates to extract higher-level information from the environment.
offsetStackFrame(Environment, Offset, StackFrame) :-
allInstructions(Environment, Instructions),
member(stackMap(Offset, StackFrame), Instructions).
currentClassLoader(Environment, Loader) :-
thisClass(Environment, class(_, Loader)).
The old thisClass predicate operated on types, not classes. While both are useful, often it's unnecessary to work with types, so thisClass now operates on a live class; thisType preserves the old behavior.
These changes are made to reduce reliance on class types when what is really wanted is classes. The verifier already has a first-class notion of a class, a black-box "live" representation of a loaded class file. Using an extra layer of indirection to encode these classes as verification type structures is unnecessary and risks problems when the type system evolves.
Finally, we specify a general predicate used throughout the type rules:
notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).
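A brief sketch of how notMember behaves on ground terms; the sample queries are illustrative only.

```prolog
% The predicate as defined in the text.
notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).

% ?- notMember(long, [int, float]).   succeeds: long unifies with no element
% ?- notMember(int,  [int, float]).   fails: int unifies with the first element
```

Because the test uses \=, which asks whether two terms cannot be unified, the predicate is meant to be called with X bound: an unbound X unifies with any element, so the query fails on any non-empty list.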
The principle guiding the determination as to which accessors are stipulated and which are fully specified is that we do not want to over-specify the representation of the class file. Providing specific accessors to the Class or Method term would force us to completely specify the format for a Prolog term representing the class file.
4.10.1.2 Verification Type System
The type checker enforces a type system based upon a hierarchy of verification types, illustrated below.
Verification type hierarchy:
                             top
                 ____________/\____________
                /                          \
               /                            \
            oneWord                       twoWord
           /   |   \                     /       \
          /    |    \                   /         \
        int  float  reference        long        double
                     /     \
                    /       \_____________
                   /                      \
                  /                        \
           uninitialized                    +---------------------+
            /         \                     |      reference      |
           /           \                    |   type hierarchy    |
          /             \                   +---------------------+
uninitializedThis  uninitialized(Offset)               |
                                                       |
                                                      null
The "reference type hierarchy" was previously referred to as the "Java reference type hierarchy". But the reference type subtyping graph doesn't rely on the Java language at all, and in fact, as of Java 5, differs significantly from it.
Most verification types have a direct correspondence with the primitive and reference types described in 2.2 and represented by field descriptors in Table 4.3-A:
The primitive types double, float, int, and long (field descriptors D, F, I, J) each correspond to the verification type of the same name.
The primitive types byte, char, short, and boolean (field descriptors B, C, S, Z) all correspond to the verification type int.
Class and interface types (field descriptors beginning L) correspond to verification types that use the functor class. The verification type class(N, L) represents the type of the class or interface whose binary name is N as loaded by the loader L. Note that L is an initiating loader (5.3) of the class represented by class(N, L) and may, or may not, be the class's defining loader.
For example, the class type Object would be represented as class('java/lang/Object', L), where the defining loader of class 'java/lang/Object', as loaded by L, is the bootstrap loader.
Array types (field descriptors beginning [) correspond to verification types that use the functor arrayOf. Note that the primitive types byte, char, short, and boolean do not correspond to verification types, but an array type whose element type is byte, char, short, or boolean does correspond to a verification type; such verification types support the baload, bastore, caload, castore, saload, sastore, and newarray instructions.
The verification type arrayOf(T) represents the array type whose component type is the verification type T.
The verification type arrayOf(byte) represents the array type whose component type is byte.
The verification type arrayOf(char) represents the array type whose component type is char.
The verification type arrayOf(short) represents the array type whose component type is short.
The verification type arrayOf(boolean) represents the array type whose component type is boolean.
For example, the array types int[] and Object[] would be represented by the verification types arrayOf(int) and arrayOf(class('java/lang/Object', BL)) respectively. The array types byte[] and boolean[][] would be represented by the verification types arrayOf(byte) and arrayOf(arrayOf(boolean)) respectively.
The remaining verification types are described as follows:
The verification types top, oneWord, twoWord, and reference describe abstract unions of other types, as illustrated above, and are represented in Prolog as atoms whose name denotes the verification type in question.
The verification types uninitialized, uninitializedThis, and uninitialized(Offset) describe references to objects created with new that have not yet been initialized (2.9.2). uninitialized and uninitializedThis are represented with an atom. The verification type uninitialized(Offset) is represented by applying the functor uninitialized to an argument representing the numerical value of the Offset.
The verification type null describes the result of the aconst_null instruction, and is represented in Prolog as an atom.
The subtyping rules for verification types are as follows.
Subtyping is reflexive.
isAssignable(X, X).
The verification types which are not reference types in the Java programming language have subtype rules of the form:
isAssignable(v, X) :- isAssignable(the_direct_supertype_of_v, X).
That is, v is a subtype of X if the direct supertype of v is a subtype of X. The rules are:
The type top is a supertype of all other types.
isAssignable(oneWord, top).
isAssignable(twoWord, top).
A type is a subtype of some other type, X, if its direct supertype is a subtype of X.
isAssignable(int, X) :- isAssignable(oneWord, X).
isAssignable(float, X) :- isAssignable(oneWord, X).
isAssignable(long, X) :- isAssignable(twoWord, X).
isAssignable(double, X) :- isAssignable(twoWord, X).
isAssignable(reference, X) :- isAssignable(oneWord, X).
isAssignable(class(_, _), X) :- isAssignable(reference, X).
isAssignable(arrayOf(_), X) :- isAssignable(reference, X).
isAssignable(null, X) :- isAssignable(reference, X).
isAssignable(uninitialized, X) :- isAssignable(reference, X).
isAssignable(uninitializedThis, X) :- isAssignable(uninitialized, X).
isAssignable(uninitialized(_), X) :- isAssignable(uninitialized, X).
The type null is a subtype of all reference types.
isAssignable(null, class(_, _)).
isAssignable(null, arrayOf(_)).
isAssignable(null, X) :- isAssignable(class('java/lang/Object', BL), X),
isBootstrapLoader(BL).
These subtype rules are not necessarily the most obvious formulation of subtyping. There is a clear split between subtyping rules among reference types, and rules for the remaining verification types. The split allows us to state general subtyping relations between reference types and other verification types. These relations hold independently of a reference type's position in the type hierarchy, and help to prevent excessive class loading by a Java Virtual Machine implementation. For example, we do not want to start climbing the superclass hierarchy in response to a query of the form class(foo, L) <: twoWord.
We also have a rule that says subtyping is reflexive, so together these rules cover most verification types that are not reference types.
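The following sketch collects the reflexive rule and the direct-supertype rules from above into one loadable program, leaving out the rules for widening between reference types. It is meant only to show how a query such as class(foo, l) <: twoWord is answered without consulting any loaded class; the atoms foo and l are placeholders.

```prolog
% Subtyping is reflexive.
isAssignable(X, X).

% top is a supertype of all other types.
isAssignable(oneWord, top).
isAssignable(twoWord, top).

% A type is a subtype of X if its direct supertype is a subtype of X.
isAssignable(int, X)    :- isAssignable(oneWord, X).
isAssignable(float, X)  :- isAssignable(oneWord, X).
isAssignable(long, X)   :- isAssignable(twoWord, X).
isAssignable(double, X) :- isAssignable(twoWord, X).

isAssignable(reference, X)   :- isAssignable(oneWord, X).
isAssignable(class(_, _), X) :- isAssignable(reference, X).
isAssignable(arrayOf(_), X)  :- isAssignable(reference, X).
isAssignable(null, X)        :- isAssignable(reference, X).

isAssignable(uninitialized, X)     :- isAssignable(reference, X).
isAssignable(uninitializedThis, X) :- isAssignable(uninitialized, X).
isAssignable(uninitialized(_), X)  :- isAssignable(uninitialized, X).

% ?- isAssignable(int, top).                       succeeds
% ?- isAssignable(uninitialized(7), reference).    succeeds
% ?- isAssignable(class(foo, l), top).             succeeds
% ?- isAssignable(class(foo, l), twoWord).         fails
```

The last query fails after walking class, reference, oneWord, with no call to loadedClass, which is the point of keeping these rules separate from the rules that relate one reference type to another.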
Subtype rules for the reference types are specified recursively with isWideningReference.
isAssignable(class(X, Lx), class(Y, Ly)) :-
isJavaAssignable(class(X, Lx), class(Y, Ly)).
isAssignable(arrayOf(X), class(Y, L)) :-
isJavaAssignable(arrayOf(X), class(Y, L)).
isAssignable(arrayOf(X), arrayOf(Y)) :-
isJavaAssignable(arrayOf(X), arrayOf(Y)).
isAssignable(From, To) :- isWideningReference(From, To).
The isWideningReference predicate is only defined for reference types, and will fail to match any non-reference inputs. So it's unnecessary to restrict the form of the inputs to avoid testing non-reference types.
The verifier allows any reference type to be widened to an interface type.
isJavaAssignable(class(_, _), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
isWideningReference(class(_, _), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
isWideningReference(arrayOf(_), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
This approach is less strict than the Java Programming Language, which will not allow an assignment to an interface unless the value is statically known to implement or extend the interface. The Java Virtual Machine instead uses a run-time check to ensure that invocations of interface methods actually operate on objects that implement the interface (6.5.invokeinterface). But there is no requirement that a reference stored by a local variable of an interface type refers to an object that actually implements that interface.
A class type can be widened to another class type if that type refers to the loaded class or one of its superclasses.
isJavaAssignable(From, To) :-
isJavaSubclassOf(From, To).
isWideningReference(class(ClassName, L1), class(ClassName, L2)) :-
L1 \= L2,
loadedClass(ClassName, L1, Class),
loadedClass(ClassName, L2, Class).
isWideningReference(class(From, L1), class(To, L2)) :-
From \= To,
loadedClass(From, L1, FromClass),
loadedClass(To, L2, ToClass),
loadedSuperclasses(FromClass, Supers),
member(ToClass, Supers).
A bug in the previous rules failed to allow the same class to be treated as a subtype of itself when referenced in the context of different initiating class loaders. It's not clear if this has any practical impact (are subtype tests ever performed between types referenced from different classes?), but the first rule above addresses it.
In the case in which two class types have the same name and the same initiating class loader, neither of these rules applies. If the types are the same, that's an identity, not a widening. The reflexive isAssignable rule applies, and the class should not be loaded.
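The two class-type rules can be exercised with a small sketch. Everything in the first half is a hypothetical stand-in for the stipulated accessors: three loaded classes defined by a bootstrap loader bl, and the assumption that every initiating loader (such as an invented loader app) resolves each name to the same loaded class. The loadedSuperclasses and isWideningReference clauses follow the rules given in this document.

```prolog
% Hypothetical stand-ins for the stipulated accessors.
classClassName(objectClass, 'java/lang/Object').
classSuperClassName(integerClass, 'java/lang/Number').
classSuperClassName(numberClass,  'java/lang/Object').
classDefiningLoader(_, bl).
isBootstrapLoader(bl).

% Assumption: any initiating loader yields the same loaded class.
loadedClass('java/lang/Object',  _, objectClass).
loadedClass('java/lang/Number',  _, numberClass).
loadedClass('java/lang/Integer', _, integerClass).

loadedSuperclasses(Class, [ Superclass | Rest ]) :-
    classSuperClassName(Class, SuperclassName),
    classDefiningLoader(Class, L),
    loadedClass(SuperclassName, L, Superclass),
    loadedSuperclasses(Superclass, Rest).
loadedSuperclasses(Class, []) :-
    classClassName(Class, 'java/lang/Object'),
    classDefiningLoader(Class, BL),
    isBootstrapLoader(BL).

% Same name, different initiating loaders, same loaded class.
isWideningReference(class(ClassName, L1), class(ClassName, L2)) :-
    L1 \= L2,
    loadedClass(ClassName, L1, Class),
    loadedClass(ClassName, L2, Class).
% Different names: the target must be a loaded superclass of the source.
isWideningReference(class(From, L1), class(To, L2)) :-
    From \= To,
    loadedClass(From, L1, FromClass),
    loadedClass(To, L2, ToClass),
    loadedSuperclasses(FromClass, Supers),
    member(ToClass, Supers).

% ?- isWideningReference(class('java/lang/Integer', app),
%                        class('java/lang/Object', bl)).    succeeds
% ?- isWideningReference(class('java/lang/Number', app),
%                        class('java/lang/Number', bl)).    succeeds
% ?- isWideningReference(class('java/lang/Number', bl),
%                        class('java/lang/Number', bl)).    fails (identity)
```

The third query fails by design: identical types are handled by the reflexive isAssignable rule, so no class needs to be loaded to relate a type to itself.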
Array types are subtypes of Object. The intent is also that array types are subtypes of Cloneable and java.io.Serializable.
isJavaAssignable(arrayOf(_), class('java/lang/Object', BL)) :-
isBootstrapLoader(BL).
isJavaAssignable(arrayOf(_), X) :-
isArrayInterface(X).
isArrayInterface(class('java/lang/Cloneable', BL)) :-
isBootstrapLoader(BL).
isArrayInterface(class('java/io/Serializable', BL)) :-
isBootstrapLoader(BL).
isWideningReference(arrayOf(_), class('java/lang/Object', L)) :-
loadedClass('java/lang/Object', L, ObjectClass),
classDefiningLoader(ObjectClass, BL),
isBootstrapLoader(BL).
A bug in the previous rules caused array types not to be treated as subtypes of class('java/lang/Object', L) unless L is the bootstrap loader. Since L is the initiating loader, that rule failed to support the common case of java/lang/Object being referenced outside of bootstrap classes.
The previous rules also failed to allow an array type to be treated as a subtype of an arbitrary interface type. In practice, it is possible to, say, pass an array as an argument to a method expecting a Runnable. The earlier rules for interface types address this, making it unnecessary to single out Cloneable and Serializable for special treatment.
Subtyping between arrays of primitive type is the identity relation.
isJavaAssignable(arrayOf(X), arrayOf(Y)) :-
atom(X),
atom(Y),
X = Y.
Subtyping between arrays of reference type is covariant.
isJavaAssignable(arrayOf(X), arrayOf(Y)) :-
compound(X), compound(Y), isJavaAssignable(X, Y).
isWideningReference(arrayOf(X), arrayOf(Y)) :-
isWideningReference(X, Y).
The subtyping rule for arrays of primitive types is an identity conversion, not a widening; and it is already covered by the reflexive rule for isAssignable.
The subtyping rule for arrays of reference types does not need to check that the inputs are reference types: if they are not, isWideningReference will not succeed.
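A minimal sketch of the covariant array rule in isolation. The first clause is a hypothetical stand-in for the class-type rules (a single known widening, with an invented loader l); the second clause is the array rule from the text.

```prolog
% Hypothetical stand-in for the class-type rules: one known widening.
isWideningReference(class('java/lang/Integer', l), class('java/lang/Number', l)).

% Arrays of reference type are covariant.
isWideningReference(arrayOf(X), arrayOf(Y)) :-
    isWideningReference(X, Y).

% ?- isWideningReference(arrayOf(class('java/lang/Integer', l)),
%                        arrayOf(class('java/lang/Number', l))).           succeeds
% ?- isWideningReference(arrayOf(arrayOf(class('java/lang/Integer', l))),
%                        arrayOf(arrayOf(class('java/lang/Number', l)))).  succeeds
% ?- isWideningReference(arrayOf(int), arrayOf(long)).                     fails
% ?- isWideningReference(arrayOf(int), arrayOf(int)).                      fails
```

The last two queries fail because no isWideningReference clause matches the atom int. Identical primitive array types are still related, but by the reflexive isAssignable rule rather than by widening.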
Subclassing is reflexive.
isJavaSubclassOf(class(SubclassName, L), class(SubclassName, L)).
isJavaSubclassOf(class(SubclassName, LSub), class(SuperclassName, LSuper)) :-
superclassChain(SubclassName, LSub, Chain),
member(class(SuperclassName, L), Chain),
loadedClass(SuperclassName, L, Sup),
loadedClass(SuperclassName, LSuper, Sup).
This relation is expressed directly with isWideningReference, above. No need to introduce another predicate.
superclassChain(ClassName, L, [class(SuperclassName, Ls) | Rest]) :-
loadedClass(ClassName, L, Class),
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, Ls),
superclassChain(SuperclassName, Ls, Rest).
superclassChain('java/lang/Object', L, []) :-
loadedClass('java/lang/Object', L, Class),
classDefiningLoader(Class, BL),
isBootstrapLoader(BL).
This predicate is moved to 4.10.1.1 and renamed loadedSuperclasses.
4.10.1.3 Instruction Representation
Bug fix: member references and instructions like checkcast and ldc can refer to array types, so CONSTANT_Class structures need to be encoded as types, not class names.
Individual bytecode instructions are represented in Prolog as terms whose functor is the name of the instruction and whose arguments are its parsed operands.
For example, an aload instruction is represented as the term aload(N), which includes the index N that is the operand of the instruction.
The instructions as a whole are represented as a list of terms of the form:
instruction(Offset, AnInstruction)
For example, instruction(21, aload(1)).
The order of instructions in this list must be the same as in the class file.
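As a small illustration of this representation, the sketch below defines a made-up instruction list and two helpers. The predicates sampleCode, instructionAt, and offsetsAscending are hypothetical and not part of the specification; the second helper reads "same order as in the class file" as strictly ascending offsets, which is an assumption of this sketch.

```prolog
% A made-up instruction stream in the representation described above.
sampleCode([ instruction(0, aload(0)),
             instruction(1, aload(1)),
             instruction(2, areturn) ]).

% Hypothetical helper: the instruction found at a given offset.
instructionAt(Code, Offset, Instruction) :-
    member(instruction(Offset, Instruction), Code).

% Hypothetical helper: offsets appear in strictly ascending order.
offsetsAscending([]).
offsetsAscending([_]).
offsetsAscending([instruction(O1, _), instruction(O2, I) | Rest]) :-
    O1 < O2,
    offsetsAscending([instruction(O2, I) | Rest]).

% ?- sampleCode(C), instructionAt(C, 1, I).   I = aload(1)
% ?- sampleCode(C), offsetsAscending(C).      succeeds
```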
Some instructions have operands that refer to entries in the constant_pool table representing fields, methods, and dynamically-computed call sites. Such entries are represented as functor applications of the form:
class(N, L) or arrayOf(T) for a constant pool entry that is a CONSTANT_Class_info structure (4.4.1).
  These are verification types, as described in 4.10.1.2.
  If the name_index item of the structure gives the name of a class or interface, N is that name, and L is the class loader of the class or interface containing the constant pool.
  If the name_index item of the structure gives an array type, T is the array component type.
field(FieldClassType, FieldName, FieldDescriptor) for a constant pool entry that is a CONSTANT_Fieldref_info structure (4.4.2).
  FieldClassType is the verification type of the class, interface, or array type referenced by the class_index item in the structure. FieldName and FieldDescriptor correspond to the name and field descriptor referenced by the name_and_type_index item of the structure.
method(MethodClassType, MethodName, MethodDescriptor) for a constant pool entry that is a CONSTANT_Methodref_info structure (4.4.2).
  MethodClassType is the verification type of the class, interface, or array type referenced by the class_index item of the structure. MethodName and MethodDescriptor correspond to the name and method descriptor referenced by the name_and_type_index item of the structure.
imethod(MethodClassType, MethodName, MethodDescriptor) for a constant pool entry that is a CONSTANT_InterfaceMethodref_info structure (4.4.2).
  MethodClassType is the verification type of the class, interface, or array type referenced by the class_index item of the structure. MethodName and MethodDescriptor correspond to the name and method descriptor referenced by the name_and_type_index item of the structure.
string(Value) for a constant pool entry that is a CONSTANT_String_info structure (4.4.3).
  Value is the string referenced by the string_index item of the structure.
int(Value) for a constant pool entry that is a CONSTANT_Integer_info structure (4.4.4).
  Value is the int constant represented by the bytes item of the structure.
float(Value) for a constant pool entry that is a CONSTANT_Float_info structure (4.4.4).
  Value is the float constant represented by the bytes item of the structure.
long(Value) for a constant pool entry that is a CONSTANT_Long_info structure (4.4.5).
  Value is the long constant represented by the high_bytes and low_bytes items of the structure.
double(Value) for a constant pool entry that is a CONSTANT_Double_info structure (4.4.5).
  Value is the double constant represented by the high_bytes and low_bytes items of the structure.
methodHandle(Kind, Reference) for a constant pool entry that is a CONSTANT_MethodHandle_info structure (4.4.8).
  Kind is the value of the reference_kind item of the structure. Reference is the value of the reference_index item of the structure.
methodType(MethodDescriptor) for a constant pool entry that is a CONSTANT_MethodType_info structure (4.4.9).
  MethodDescriptor is the method descriptor referenced by the descriptor_index item of the structure.
dconstant(ConstantName, FieldDescriptor) for a constant pool entry that is a CONSTANT_Dynamic_info structure (4.4.10).
  ConstantName and FieldDescriptor correspond to the name and field descriptor referenced by the name_and_type_index item of the structure. (The bootstrap_method_attr_index item is irrelevant to verification.)
dmethod(CallSiteName, MethodDescriptor) for a constant pool entry that is a CONSTANT_InvokeDynamic_info structure (4.4.10).
  CallSiteName and MethodDescriptor correspond to the name and method descriptor referenced by the name_and_type_index item of the structure. (The bootstrap_method_attr_index item is irrelevant to verification.)
We've combined the two lists of constant forms into one. Because CONSTANT_Class_info is relevant to both lists, it's awkward to keep the two separate. It would require, say, a forward reference from the first list to the second.
For clarity, we assume that field and method descriptors (4.3.2, 4.3.3) are mapped into more readable names: the leading L and trailing ; are dropped from class names, and the BaseType characters used for primitive types are mapped to the names of those types.
The descriptor should always be processed with parseFieldDescriptor, so its format doesn't need to be specified.
For example, a getfield instruction whose operand refers to a constant pool entry representing a field foo of type F in class Bar would be represented as getfield(field(class('Bar', L), 'foo', 'F')), where L is the class loader of the class containing the instruction. An ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).
The ldc instruction, among others, has an operand that refers to a loadable entry in the constant_pool table. There are nine kinds of loadable entry (see Table 4.4-C), represented by functor applications of the following forms:
int(Value) for a constant pool entry that is a CONSTANT_Integer_info structure (4.4.4).
  Value is the int constant represented by the bytes item of the structure.
  For example, an ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).
float(Value) for a constant pool entry that is a CONSTANT_Float_info structure (4.4.4).
  Value is the float constant represented by the bytes item of the structure.
long(Value) for a constant pool entry that is a CONSTANT_Long_info structure (4.4.5).
  Value is the long constant represented by the high_bytes and low_bytes items of the structure.
double(Value) for a constant pool entry that is a CONSTANT_Double_info structure (4.4.5).
  Value is the double constant represented by the high_bytes and low_bytes items of the structure.
class(ClassName) for a constant pool entry that is a CONSTANT_Class_info structure (4.4.1).
  ClassName is the name of the class or interface referenced by the name_index item in the structure.
string(Value) for a constant pool entry that is a CONSTANT_String_info structure (4.4.3).
  Value is the string referenced by the string_index item of the structure.
methodHandle(Kind, Reference) for a constant pool entry that is a CONSTANT_MethodHandle_info structure (4.4.8).
  Kind is the value of the reference_kind item of the structure. Reference is the value of the reference_index item of the structure.
methodType(MethodDescriptor) for a constant pool entry that is a CONSTANT_MethodType_info structure (4.4.9).
  MethodDescriptor is the method descriptor referenced by the descriptor_index item of the structure.
dconstant(ConstantName, FieldDescriptor) for a constant pool entry that is a CONSTANT_Dynamic_info structure (4.4.10).
  ConstantName and FieldDescriptor correspond to the name and field descriptor referenced by the name_and_type_index item of the structure. (The bootstrap_method_attr_index item is irrelevant to verification.)
4.10.1.6 Type Checking Methods with Code
For the initial type state of an instance method, we compute the type of this and put it in a list. The type of this in the <init> method of Object is Object; in other <init> methods, the type of this is uninitializedThis; otherwise, the type of this in an instance method is class(N, L) where N is the name of the class containing the method and L is its defining class loader.
For the initial type state of a static method, this is irrelevant, so the list is empty.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags),
methodName(Method, MethodName),
MethodName \= '`<init>`'.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags).
An instance initialization method cannot be static, so checking for it is redundant.
methodInitialThisType(Class, Method, [This]) :-
methodAccessFlags(Method, AccessFlags),
notMember(static, AccessFlags),
instanceMethodInitialThisType(Class, Method, This).
instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
methodName(Method, '`<init>`'),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classClassName(Class, 'java/lang/Object').
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '`<init>`'),
classClassName(Class, ClassName),
classDefiningLoader(Class, CurrentLoader),
superclassChain(ClassName, CurrentLoader, Chain),
Chain \= [].
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '`<init>`'),
loadedSuperclasses(Class, Supers),
Supers \= [].
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
methodName(Method, MethodName),
MethodName \= '`<init>`',
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
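The case analysis for the initial type of this can be sketched in Python. This is an informal model, not the normative Prolog: the function name, the tuple encoding ("class", name, loader), the string "bootstrap" for the bootstrap loader, and the has_superclass flag (standing in for a non-empty list of loaded superclasses) are all assumptions of this example.

```python
UNINITIALIZED_THIS = "uninitializedThis"

def method_initial_this_type(class_name, loader, method_name,
                             is_static, has_superclass):
    """Return the list holding the initial type of `this`, or None if no rule applies."""
    if is_static:
        # `this` is irrelevant for a static method, so the list is empty.
        return []
    if method_name == "<init>":
        if class_name == "java/lang/Object" and loader == "bootstrap":
            # In Object's <init>, `this` already has type Object.
            return [("class", "java/lang/Object", loader)]
        if has_superclass:
            # In every other <init>, `this` starts out uninitialized.
            return [UNINITIALIZED_THIS]
        return None
    # Any other instance method: `this` has the type of the containing class.
    return [("class", class_name, loader)]
```

A call such as method_initial_this_type("Foo", "app", "<init>", False, True) yields the one-element list holding uninitializedThis.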
...
4.10.1.8 Type Checking for protected Members
Member references can refer to array types, so the rules in this section need to be defined in terms of a type, not a class name.
All instructions that access members must contend with the rules concerning protected members. This section describes the protected check that corresponds to JLS §6.6.2.1.
The protected check applies only to protected members of superclasses of the current class. protected members in other classes will be caught by the access checking done at resolution (5.4.4). There are four cases:
If the name of a class is not the name of any superclass referenced type is not a class type with the same name as a superclass, it cannot be a superclass, and so it can safely be ignored.
passesProtectedCheck(Environment, MemberClassName, MemberName,
                     MemberDescriptor, StackFrame) :-
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    superclassChain(CurrentClassName, CurrentLoader, Chain),
    notMember(class(MemberClassName, _), Chain).
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
                     MemberDescriptor, StackFrame) :-
    thisClass(Environment, CurrentClass),
    \+ hasSuperclassWithName(CurrentClass, MemberClassName).
passesProtectedCheck(Environment, arrayOf(_), MemberName,
                     MemberDescriptor, StackFrame).
hasSuperclassWithName(Class, SuperclassName) :-
    loadedSuperclasses(Class, Supers),
    member(Super, Supers),
    classClassName(Super, SuperclassName).
If the MemberClassName is the same as the name of a superclass, the class being resolved may indeed be a superclass. In this case, if no superclass named MemberClassName in a different run-time package has a protected member named MemberName with descriptor MemberDescriptor, the protected check does not apply.
This is because the actual class being resolved will either be one of these superclasses, in which case we know that it is either in the same run-time package, and the access is legal; or the member in question is not protected and the check does not apply; or it will be a subclass, in which case the check would succeed anyway; or it will be some other class in the same run-time package, in which case the access is legal and the check need not take place; or the verifier need not flag this as a problem, since it will be caught anyway because resolution will perforce fail.
passesProtectedCheck(Environment, MemberClassName, MemberName,
                     MemberDescriptor, StackFrame) :-
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    superclassChain(CurrentClassName, CurrentLoader, Chain),
    member(class(MemberClassName, _), Chain),
    classesInOtherPkgWithProtectedMember(
      class(CurrentClassName, CurrentLoader),
      MemberName, MemberDescriptor, MemberClassName, Chain, []).
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
                     MemberDescriptor, StackFrame) :-
    thisClass(Environment, CurrentClass),
    hasSuperclassWithName(CurrentClass, MemberClassName),
    loadedSuperclasses(CurrentClass, Chain),
    classesInOtherPkgWithProtectedMember(
      CurrentClass, MemberName, MemberDescriptor, MemberClassName, Chain, []).
If there does exist a protected superclass member in a different run-time package, then load MemberClassName; if the member in question is not protected, the check does not apply. (Using a superclass member that is not protected is trivially correct.)
passesProtectedCheck(Environment, MemberClassName, MemberName,
                     MemberDescriptor,
                     frame(_Locals, [Target | Rest], _Flags)) :-
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    superclassChain(CurrentClassName, CurrentLoader, Chain),
    member(class(MemberClassName, _), Chain),
    classesInOtherPkgWithProtectedMember(
      class(CurrentClassName, CurrentLoader),
      MemberName, MemberDescriptor, MemberClassName, Chain, List),
    List \= [],
    loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
    isNotProtected(ReferencedClass, MemberName, MemberDescriptor).
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
                     MemberDescriptor,
                     frame(_Locals, [Target | Rest], _Flags)) :-
    thisClass(Environment, CurrentClass),
    hasSuperclassWithName(CurrentClass, MemberClassName),
    loadedSuperclasses(CurrentClass, Chain),
    classesInOtherPkgWithProtectedMember(
      CurrentClass, MemberName, MemberDescriptor, MemberClassName, Chain, List),
    List \= [],
    loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
    isNotProtected(ReferencedClass, MemberName, MemberDescriptor).
Otherwise, use of a member of an object of type Target requires that Target be assignable to the type of the current class.
passesProtectedCheck(Environment, MemberClassName, MemberName,
                     MemberDescriptor,
                     frame(_Locals, [Target | Rest], _Flags)) :-
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    superclassChain(CurrentClassName, CurrentLoader, Chain),
    member(class(MemberClassName, _), Chain),
    classesInOtherPkgWithProtectedMember(
      class(CurrentClassName, CurrentLoader),
      MemberName, MemberDescriptor, MemberClassName, Chain, List),
    List \= [],
    loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
    isProtected(ReferencedClass, MemberName, MemberDescriptor),
    isAssignable(Target, class(CurrentClassName, CurrentLoader)).
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
                     MemberDescriptor,
                     frame(_Locals, [Target | Rest], _Flags)) :-
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    hasSuperclassWithName(CurrentClass, MemberClassName),
    loadedSuperclasses(CurrentClass, Chain),
    classesInOtherPkgWithProtectedMember(
      CurrentClass, MemberName, MemberDescriptor, MemberClassName, Chain, List),
    List \= [],
    loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
    isProtected(ReferencedClass, MemberName, MemberDescriptor),
    thisType(Environment, ThisType),
    isAssignable(Target, ThisType).
The predicate classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Chain, List) is true if List is the set of classes in Chain with name MemberClassName that are in a different run-time package than Class which have a protected member named MemberName with descriptor MemberDescriptor.
classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
[class(MemberClassName, L) | T]) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail],
[Super | T]) :-
classClassName(Super, MemberClassName),
differentRuntimePackage(Class, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
T) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail],
T) :-
classClassName(Super, MemberClassName),
differentRuntimePackage(Class, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [class(MemberClassName, L) | Tail],
                                     T) :-
    sameRuntimePackage(Class, class(MemberClassName, L)),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [Super | Tail],
                                     T) :-
    classClassName(Super, MemberClassName),
    sameRuntimePackage(Class, Super),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
sameRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L),
classDefiningLoader(Class2, L),
samePackageName(Class1, Class2).
differentRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L1),
classDefiningLoader(Class2, L2),
L1 \= L2.
differentRuntimePackage(Class1, Class2) :-
differentPackageName(Class1, Class2).
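The run-time package tests, and the prose description of classesInOtherPkgWithProtectedMember, can be sketched in Python as follows. This is an informal model under stated assumptions: LoadedClass, its string loader field, package_name, and the is_protected callback are hypothetical names for this example, and the filter follows the prose definition of List rather than the clause-by-clause Prolog.

```python
from typing import Callable, List, NamedTuple

class LoadedClass(NamedTuple):
    name: str    # internal-form name, e.g. "java/util/AbstractList"
    loader: str  # hypothetical identifier for the defining class loader

def package_name(cls: LoadedClass) -> str:
    # Everything before the last '/' ("" for the unnamed package).
    return cls.name.rpartition("/")[0]

def same_runtime_package(a: LoadedClass, b: LoadedClass) -> bool:
    # Same defining loader and same package name.
    return a.loader == b.loader and package_name(a) == package_name(b)

def different_runtime_package(a: LoadedClass, b: LoadedClass) -> bool:
    # Different defining loaders, or different package names.
    return a.loader != b.loader or package_name(a) != package_name(b)

def classes_in_other_pkg_with_protected_member(
        current: LoadedClass,
        member_class_name: str,
        chain: List[LoadedClass],
        is_protected: Callable[[LoadedClass], bool]) -> List[LoadedClass]:
    # Superclasses named member_class_name, in another run-time package,
    # that declare the member as protected.
    return [s for s in chain
            if s.name == member_class_name
            and different_runtime_package(current, s)
            and is_protected(s)]
```

Two classes with the same package name but different defining loaders are in different run-time packages, which is why the loader comparison comes first.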
4.10.1.9 Type Checking Instructions
getfield
A getfield instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, declared in a class type FieldClassName FieldClassType, and one can validly replace a type matching FieldClassName FieldClassType with type FieldType on the incoming operand stack yielding the outgoing type state. FieldClassName must not be an array type. protected fields are subject to additional checks (4.10.1.8).
Array types are allowed here.
instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
CP = field(FieldClassType, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, StackFrame),
currentClassLoader(Environment, CurrentLoader),
validTypeTransition(Environment,
[class(FieldClassName, CurrentLoader)], FieldType,
StackFrame, NextStackFrame),
passesProtectedCheck(Environment, FieldClassType, FieldName,
FieldDescriptor, StackFrame),
validTypeTransition(Environment, [FieldClassType], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
getstatic
A getstatic instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, and one can validly push FieldType on the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(getstatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(_FieldClassName, _FieldName, FieldDescriptor),
CP = field(_FieldClassType, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
validTypeTransition(Environment, [], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
invokeinterface
An invokeinterface instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting an interface method named MethodName with descriptor Descriptor that is a member of an interface a type MethodIntfName MethodClassType.
MethodName is not <init>.
MethodName is not <clinit>.
Its second operand, Count, is a valid count operand (see below).
One can validly replace types matching the type MethodIntfName MethodClassType and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodIntfName, MethodName, Descriptor),
CP = imethod(MethodClassType, _MethodName, Descriptor),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodIntfName, CurrentLoader) | OperandArgList],
StackArgList),
reverse([MethodClassType | OperandArgList], StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
countIsValid(Count, StackFrame, TempFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Array types are allowed here.
The Count operand of an invokeinterface instruction is valid if it equals the size of the arguments to the instruction. This is equal to the difference between the size of InputFrame and OutputFrame.
countIsValid(Count, InputFrame, OutputFrame) :-
InputFrame = frame(_Locals1, OperandStack1, _Flags1),
OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
length(OperandStack1, Length1),
length(OperandStack2, Length2),
Count =:= Length1 - Length2.
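The count check can be sketched in Python. This is an informal model, not the normative rule: operand stacks are modelled as plain lists with the top of the stack first, and the example assumes the verifier's list representation gives a two-word value such as long an extra top entry.

```python
def count_is_valid(count, input_stack, output_stack):
    # Count must equal the number of stack entries consumed by the
    # instruction: the receiver plus all argument entries.
    return count == len(input_stack) - len(output_stack)

# Receiver plus (int, long) arguments, with the top of the stack first.
# Assumption: the long argument contributes two entries ("top" and "long").
before = ["top", "long", "int", "class(Intf)", "rest"]
after = ["rest"]
print(count_is_valid(4, before, after))
```

Under that assumption, an invocation with an int and a long argument needs a Count of 4: one entry for the receiver, one for the int, and two for the long.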
invokespecial
An invokespecial instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor that is a member of a class type MethodClassName MethodClassType.
Either:
MethodName is not <init>.
MethodName is not <clinit>.
MethodClassType is the current class, a superclass of the current class, or a direct superinterface of the current class.
One can validly replace types matching the current class and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
One can validly replace types matching the class MethodClassName and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor.
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
(CP = method(MethodClassType, MethodName, Descriptor) ;
CP = imethod(MethodClassType, MethodName, Descriptor)),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
validSpecialMethodClassType(Environment, MethodClassType),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
isAssignable(class(CurrentClassName, CurrentLoader),
class(MethodClassName, CurrentLoader)),
reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
StackArgList),
thisType(Environment, ThisType),
reverse([ThisType | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
validSpecialMethodClassType(Environment, class(MethodClassName, L)) :-
loadedClass(MethodClassName, L, MethodClass),
\+ classIsInterface(MethodClass),
thisClass(Environment, ThisClass),
loadedSuperclasses(ThisClass, Supers),
member(MethodClass, Supers).
validSpecialMethodClassType(Environment, class(MethodClassName, L)) :-
loadedClass(MethodClassName, L, MethodClass),
classIsInterface(MethodClass),
thisClass(Environment, ThisClass),
classInterfaceNames(ThisClass, InterfaceNames),
member(MethodClassName, InterfaceNames).
Interface methods are allowed here (4.9.1).
The old rules attempt to enforce the constraints on MethodClassName via an isAssignable call, deferring some complexity to subtype testing. Unfortunately, this isn't correct for interfaces: every reference type is "assignable" to every interface type.
In practice, HotSpot appears to perform the following check:
If a Methodref is used, test for isAssignable.
If an InterfaceMethodref is used, make sure it is named as a direct superinterface.
This allows, for example, the name of a valid interface that is not related to the current class to appear in a Methodref. Verification succeeds, and no error occurs until resolution of the Methodref.
That's some unnecessary complexity that doesn't quite align with 4.9.2. Instead, these new rules directly test for a valid class/interface name: the current class, a superclass, or a direct superinterface. The rules do some class loading, but note that the same loading occurred before in the isAssignable test.
Another problem with the old rules is a redundant subtyping check via validTypeTransition. Given that CurrentClass <: MethodClassType and the stack operand <: CurrentClass, there's no need to also check that the stack operand <: MethodClassType.
Array types are syntactically allowed here, but the validSpecialMethodClassType clause will reject them.
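The intended check on the type named by invokespecial can be sketched in Python. The sketch follows the prose condition (the current class, a superclass, or a direct superinterface) rather than transcribing the Prolog clauses; LoadedClass and its fields are hypothetical stand-ins for the verifier's representation of loaded classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class LoadedClass:
    name: str
    is_interface: bool = False
    superclass: Optional["LoadedClass"] = None
    # Names listed directly in the class's interfaces table.
    interface_names: List[str] = field(default_factory=list)

def loaded_superclasses(cls: LoadedClass) -> List[LoadedClass]:
    # Walk the superclass chain, nearest superclass first.
    chain = []
    current = cls.superclass
    while current is not None:
        chain.append(current)
        current = current.superclass
    return chain

def valid_special_method_class_type(this_class: LoadedClass,
                                    method_class: LoadedClass) -> bool:
    if method_class == this_class:
        return True  # the current class itself
    if method_class.is_interface:
        # Only a direct superinterface is acceptable.
        return method_class.name in this_class.interface_names
    return method_class in loaded_superclasses(this_class)
```

Note that an interface inherited through a superclass is not a direct superinterface, so naming it is rejected; this is the tightened behavior the new rules propose.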
The isAssignable validSpecialMethodClassType clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.
The first validTypeTransition clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame. The effect of validTypeTransition is to pop the first type from the operand stack in StackFrame and check it is a subtype of the first term of StackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.
A sharp-eyed reader might notice that enforcing this structural constraint supersedes the structural constraint pertaining to invokespecial of a protected method. Thus, the Prolog code above makes no reference to passesProtectedCheck (4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck to ensure the actual receiver type is compatible with the current class when certain protected instance initialization methods are named.
The second validTypeTransition clause enforces the structural constraint that any method invocation instruction must target a receiver object whose type is compatible with the type named by the instruction. To see why, consider that StackArgList2 simulates the list of types on the operand stack expected by the method, starting with the type named by the instruction. Again, the actual types on the operand stack are in StackFrame, and the effect of validTypeTransition is to check the actual receiver type in StackFrame is compatible with the type named by the instruction in StackArgList2.
Or:
MethodName is <init>.
Descriptor specifies a void return type.
One can validly pop types matching the argument types given in Descriptor and an uninitialized type, UninitializedArg, off the incoming operand stack, yielding OperandStack.
The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with OperandStack and then replacing all instances of UninitializedArg with the type of instance being initialized.
If the instruction calls an instance initialization method on a class instance created by an earlier new instruction, and the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '`<init>`', Descriptor),
(CP = method(MethodClassType, '`<init>`', Descriptor) ;
CP = imethod(MethodClassType, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitializedThis, Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClassType, This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '`<init>`', Descriptor),
(CP = method(MethodClassType, '`<init>`', Descriptor) ;
CP = imethod(MethodClassType, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitialized(Address), Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClassType, This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClassName, '`<init>`',
Descriptor, NextStackFrame).
passesProtectedCheck(Environment, MethodClassType, '`<init>`',
Descriptor, NextStackFrame).
Array types are syntactically allowed here, but the rewrittenUninitializedType clause will reject them.
I've confirmed that HotSpot behavior in JDK 13 is to allow interface method references named <init>, delaying any errors until resolution/runtime.
To compute what type the uninitialized argument's type needs to be rewritten to, there are two cases:
If we are initializing an object within its constructor, its type is initially uninitializedThis. This type will be rewritten to the type of the class of the <init> method.
The second case arises from initialization of an object created by new. The uninitialized arg type is rewritten to MethodClass MethodClassType, the type of the method holder of <init>. We check whether there really is a new instruction at Address.
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, MethodClass).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClassType, MethodClassType) :-
thisType(Environment, MethodClassType).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, class(thisClassName, thisLoader)),
superclassChain(thisClassName, thisLoader, [MethodClass | Rest]).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClassType, MethodClassType) :-
MethodClassType = class(MethodClassName, _),
thisClass(Environment, ThisClass),
classSuperClassName(ThisClass, MethodClassName).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClass, MethodClass) :-
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClass)), Instructions).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClassType, MethodClassType) :-
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClassType)), Instructions).
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
From1 \= Old,
substitute(Old, New, FromRest, ToRest).
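The effect of substitute on a list of types can be sketched in Python. This is an informal equivalent for illustration only; the string encodings of the types are assumptions of this example.

```python
def substitute(old, new, items):
    # Replace every occurrence of `old` with `new`; leave other entries as is.
    return [new if item == old else item for item in items]

# After a successful <init> call, every copy of the uninitialized type in
# the locals (and likewise the operand stack) becomes the initialized type.
locals_before = ["uninitializedThis", "int", "uninitializedThis"]
locals_after = substitute("uninitializedThis", "class(Foo)", locals_before)
print(locals_after)
```

Replacing all occurrences matters because the same uninitialized reference may have been copied into several locals and stack slots before the <init> call.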
The rule for invokespecial of an <init> method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object within its constructor, invokespecial can cause a superclass <init> method to be invoked, and that invocation could fail, leaving this uninitialized. This situation cannot be created using source code in the Java programming language, but can be created by programming in bytecode directly.
In this situation, the original frame holds an uninitialized object in local variable 0 and has flag flagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit flag. But if the invocation of an <init> method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit flag to a properly initialized object, so the object is permanently unusable.
If not for this situation, the flags of the exception stack frame would always be the same as the flags of the input stack frame.
invokestatic
An invokestatic instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor.
MethodName is not <init>.
MethodName is not <clinit>.
One can validly replace types matching the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokestatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(_MethodClassName, MethodName, Descriptor),
(CP = method(_MethodClassType, MethodName, Descriptor) ;
CP = imethod(_MethodClassType, MethodName, Descriptor)),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Interface methods are allowed here (4.9.1).
invokevirtual
An invokevirtual instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor that is a member of a class type MethodClassName MethodClassType.
MethodName is not <init>.
MethodName is not <clinit>.
One can validly replace types matching the class MethodClassName MethodClassType and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
If the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
CP = method(MethodClassType, MethodName, Descriptor),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, ArgList),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList),
reverse([MethodClassType | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
canPop(StackFrame, ArgList, PoppedFrame),
passesProtectedCheck(Environment, MethodClassName, MethodName,
Descriptor, PoppedFrame),
passesProtectedCheck(Environment, MethodClassType, MethodName,
Descriptor, PoppedFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
ldc, ldc_w, ldc2_w
An ldc instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is loadable (4.4), but not long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
Type \= long,
Type \= double,
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
loadableConstant(CP, Type) :-
member([CP, Type], [
[int(_), int],
[float(_), float],
[long(_), long],
[double(_), double]
]).
loadableConstant(CP, Type) :-
isBootstrapLoader(BL),
member([CP, Type], [
[class(_), class('java/lang/Class', BL)],
[class(_,_), class('java/lang/Class', BL)],
[arrayOf(_), class('java/lang/Class', BL)],
[string(_), class('java/lang/String', BL)],
[methodHandle(_,_), class('java/lang/invoke/MethodHandle', BL)],
[methodType(_,_), class('java/lang/invoke/MethodType', BL)]
]).
loadableConstant(CP, Type) :-
CP = dconstant(_, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, Type).
An ldc_w instruction is type safe iff the equivalent ldc instruction is type safe.
instructionHasEquivalentTypeRule(ldc_w(CP), ldc(CP)).
An ldc2_w instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is either long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc2_w(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
(Type = long ; Type = double),
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
multianewarray
A multianewarray instruction with operands CP and Dim is type safe iff CP refers to a constant pool entry denoting an array type whose dimension is greater than or equal to Dim, Dim is strictly positive, and one can validly replace Dim int types on the incoming operand stack with the type denoted by CP yielding the outgoing type state.
instructionIsTypeSafe(multianewarray(CP, Dim), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = arrayOf(_),
classDimension(CP, Dimension),
Dimension >= Dim,
Dim > 0,
/* Make a list of Dim ints */
findall(int, between(1, Dim, _), IntList),
validTypeTransition(Environment, IntList, CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The dimension dimensions of an array type whose component type is also an array type is one more than the dimension dimensions of its component type.
classDimension(arrayOf(X), Dimension) :-
classDimension(X, Dimension1),
Dimension is Dimension1 + 1.
classDimension(_, Dimension) :-
Dimension = 0.
arrayDimensions(arrayOf(X), XDimensions + 1) :-
arrayDimensions(X, XDimensions).
arrayDimensions(Type, 0) :-
Type \= arrayOf(_).
Renamed this predicate, since the element type is not necessarily a class type. Also addressed a bug: the second rule previously would match array types as well as non-array types.
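As a rough illustration of the dimension check used by the multianewarray rule, the sketch below counts dimensions from a field descriptor string rather than from the verifier's arrayOf terms. The class ArrayDims and its methods are hypothetical names introduced here, and the input is assumed to be a well-formed descriptor.

```java
// Hypothetical helper, not part of the specification. It works on field
// descriptors such as "[[I" instead of the verifier's arrayOf(...) terms.
class ArrayDims {
    // Mirrors arrayDimensions: a non-array type has 0 dimensions, and an
    // array type has one more dimension than its component type.
    static int dimensions(String descriptor) {
        if (!descriptor.startsWith("[")) {
            return 0; // this case applies to non-array types only
        }
        return 1 + dimensions(descriptor.substring(1));
    }

    // The side conditions of the multianewarray rule: Dim is strictly
    // positive and does not exceed the dimensions of the array type.
    static boolean dimsOk(String descriptor, int dim) {
        return dim > 0 && dimensions(descriptor) >= dim;
    }

    public static void main(String[] args) {
        System.out.println(dimensions("[[I"));
        System.out.println(dimsOk("[[I", 3));
    }
}
```

Keeping the base case restricted to non-array inputs is what makes the two cases mutually exclusive, which is the point of the bug fix noted above.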
putfield
A putfield instruction with operand CP is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a field whose declared type is FieldType, declared in a class type FieldClassName FieldClassType. FieldClassName must not be an array type.
Array types are allowed here.
Either:
One can validly pop types matching FieldType and FieldClassName FieldClassType off the incoming operand stack yielding the outgoing type state.
protected fields are subject to additional checks (4.10.1.8).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
CP = field(FieldClassType, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], PoppedFrame),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, PoppedFrame),
currentClassLoader(Environment, CurrentLoader),
canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)],
NextStackFrame),
passesProtectedCheck(Environment, FieldClassType, FieldName,
FieldDescriptor, PoppedFrame),
canPop(StackFrame, [FieldType, FieldClassType], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Or:
- If the instruction occurs in an instance initialization method of the class FieldClassName FieldClassType, then one can validly pop types matching FieldType and uninitializedThis off the incoming operand stack yielding the outgoing type state. This allows instance fields of this that are declared in the current class to be assigned prior to complete initialization of this.
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, _FieldName, FieldDescriptor),
CP = field(FieldClassType, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
CurrentClass = class(FieldClassName, _),
thisType(Environment, FieldClassType),
Environment = environment(_, CurrentMethod, _, _, _, _),
isInit(CurrentMethod),
canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
putstatic
A putstatic instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, and one can validly pop a type matching FieldType off the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(putstatic(CP), _Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(_FieldClassName, _FieldName, FieldDescriptor),
CP = field(_FieldClassType, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Chapter 5: Loading, Linking, and Initializing
5.4 Linking
5.4.2 Preparation
Preparation involves creating the static fields for a class or interface and initializing such fields to their default values (2.3, 2.4). This does not require the execution of any Java Virtual Machine code; explicit initializers for static fields are executed as part of initialization (5.5), not preparation.
During preparation of a class or interface C, the Java Virtual Machine also imposes loading constraints (5.3.4):
Let L1 be the defining loader of C. For each instance method m declared in C that can override (5.4.5) an instance method declared in a superclass or superinterface <D, L2>, the Java Virtual Machine imposes loading constraints as follows for each class or interface name N mentioned by the descriptor of m (4.3.3), the Java Virtual Machine imposes the loading constraint N^L1 = N^L2.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
Then Ti^L1 = Ti^L2 for i = 0 to n.
For each instance method m declared in a superinterface <I, L3> of C, if C does not itself declare an instance method that can override m, then a method is selected (5.4.6) with respect to C and the method m in <I, L3>. Let <D, L2> be the class or interface that declares the selected method. The Java Virtual Machine imposes loading constraints as follows. For each class or interface name N mentioned by the descriptor of m, the Java Virtual Machine imposes the loading constraint N^L2 = N^L3.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
Then Ti^L2 = Ti^L3 for i = 0 to n.
Preparation may occur at any time following creation but must be completed prior to initialization.
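The set of class or interface names "mentioned by the descriptor" can be pictured with a small sketch. The code below is only an illustration under the assumption that the input is a well-formed field or method descriptor; the class DescriptorNames and the method mentionedNames are hypothetical names, not part of the specification or of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the specification.
class DescriptorNames {
    // Returns the names (in internal form, e.g. java/lang/String) that
    // appear as L<name>; in a well-formed descriptor, in order of appearance.
    static List<String> mentionedNames(String descriptor) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < descriptor.length()) {
            if (descriptor.charAt(i) == 'L') {
                int end = descriptor.indexOf(';', i);
                names.add(descriptor.substring(i + 1, end));
                i = end + 1; // skip past the whole name
            } else {
                i++; // '(', ')', '[', or a primitive/void type code
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(mentionedNames("([Ljava/lang/String;I)Ljava/util/List;"));
    }
}
```

Array brackets are skipped, so an array type contributes only the name of its element type when that element type is a class or interface, and primitive types contribute nothing.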
5.4.3 Resolution
5.4.3.2 Field Resolution
To resolve an unresolved symbolic reference from D to a field in a class or interface C, the symbolic reference to C given by the field reference must first be resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of a class or interface reference can be thrown as a result of failure of field resolution. If the reference to C can be successfully resolved, an exception relating to the failure of resolution of the field reference itself can be thrown.
When resolving a field reference, field resolution first attempts to look up the referenced field in C and its superclasses:
If C declares a field with the name and descriptor specified by the field reference, field lookup succeeds. The declared field is the result of the field lookup.
Otherwise, field lookup is applied recursively to the direct superinterfaces of the specified class or interface C.
Otherwise, if C has a superclass S, field lookup is applied recursively to S.
Otherwise, field lookup fails.
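The lookup order above (the class itself, then direct superinterfaces, then the superclass) can be sketched with reflection. This is an approximation only: it uses java.lang.reflect and Class objects in place of the JVM's internal structures and descriptors, and the names FieldLookup, lookup, I, P, and Q are hypothetical.

```java
import java.lang.reflect.Field;

// Hypothetical sketch, not part of the specification.
class FieldLookup {
    interface I { int X = 1; }
    static class P { int X; int y; }
    static class Q extends P implements I { }

    // Returns the declared field found by the lookup, or null on failure.
    static Field lookup(Class<?> c, String name, Class<?> type) {
        // 1. A field declared by C itself.
        for (Field f : c.getDeclaredFields()) {
            if (f.getName().equals(name) && f.getType() == type) {
                return f;
            }
        }
        // 2. Recursively, the direct superinterfaces of C.
        for (Class<?> i : c.getInterfaces()) {
            Field f = lookup(i, name, type);
            if (f != null) {
                return f;
            }
        }
        // 3. Recursively, the superclass of C, if any; otherwise lookup fails.
        Class<?> s = c.getSuperclass();
        return (s == null) ? null : lookup(s, name, type);
    }

    public static void main(String[] args) {
        System.out.println(lookup(Q.class, "X", int.class).getDeclaringClass().getSimpleName());
    }
}
```

In this sketch a lookup of X starting at Q finds the field declared in I rather than the one in P, because superinterfaces are searched before the superclass.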
Then, the result of field resolution is determined:
If field lookup failed, field resolution throws a NoSuchFieldError.
Otherwise, field lookup succeeded. Access control is applied for the access from D to the field which is the result of field lookup (5.4.4). Then:
If access control failed, field resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced field is actually declared. Let L2 be the defining loader of D.
Given that the type of the referenced field is Tf: if Tf is not an array type, let T be Tf; otherwise, let T be the element type of Tf.
The Java Virtual Machine imposes the loading constraint that T^L1 = T^L2.
For any class or interface name N mentioned by the descriptor of the referenced field (4.3.2), the Java Virtual Machine imposes the loading constraint N^L1 = N^L2 (5.3.4).
If imposing this constraint results in any loading constraints being violated (5.3.4), then field resolution fails. Otherwise, field resolution succeeds.
5.4.3.3 Method Resolution
To resolve an unresolved symbolic reference from D to a method in a class C, the symbolic reference to C given by the method reference is first resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of a class reference can be thrown as a result of failure of method resolution. If the reference to C can be successfully resolved, exceptions relating to the resolution of the method reference itself can be thrown.
When resolving a method reference:
If C is an interface, method resolution throws an IncompatibleClassChangeError.
Otherwise, method resolution attempts to locate the referenced method in C and its superclasses:
If C declares exactly one method with the name specified by the method reference, and the declaration is a signature polymorphic method (2.9.3), then method lookup succeeds.
All the class names mentioned in the descriptor are resolved (5.4.3.1). The descriptor specified by the method reference is resolved, as if by resolution of an unresolved symbolic reference to a method type (5.4.3.5).
The resolved method is the signature polymorphic method declaration. It is not necessary for C to declare a method with the descriptor specified by the method reference.
Otherwise, if C declares a method with the name and descriptor specified by the method reference, method lookup succeeds.
Otherwise, if C has a superclass, step 2 of method resolution is recursively invoked on the direct superclass of C.
Otherwise, method resolution attempts to locate the referenced method in the superinterfaces of the specified class C:
If the maximally-specific superinterface methods of C for the name and descriptor specified by the method reference include exactly one method that does not have its ACC_ABSTRACT flag set, then this method is chosen and method lookup succeeds.
Otherwise, if any superinterface of C declares a method with the name and descriptor specified by the method reference that has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set, one of these is arbitrarily chosen and method lookup succeeds.
Otherwise, method lookup fails.
A maximally-specific superinterface method of a class or interface C for a particular method name and descriptor is any method for which all of the following are true:
The method is declared in a superinterface (direct or indirect) of C.
The method is declared with the specified name and descriptor.
The method has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set.
Where the method is declared in interface I, there exists no other maximally-specific superinterface method of C with the specified name and descriptor that is declared in a subinterface of I.
The result of method resolution is determined as follows:
If method lookup failed, method resolution throws a NoSuchMethodError.
Otherwise, method lookup succeeded. Access control is applied for the access from D to the method which is the result of method lookup (5.4.4). Then:
If access control failed, method resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced method m is actually declared. Let L2 be the defining loader of D.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
The Java Virtual Machine imposes the loading constraints Ti^L1 = Ti^L2 for i = 0 to n.
For each class or interface name N mentioned by the descriptor of the referenced method (4.3.3), the Java Virtual Machine imposes the loading constraint N^L1 = N^L2 (5.3.4).
If imposing these constraints results in any loading constraints being violated (5.3.4), then method resolution fails. Otherwise, method resolution succeeds.
When resolution searches for a method in the class's superinterfaces, the best outcome is to identify a maximally-specific non-abstract method. It is possible that this method will be chosen by method selection, so it is desirable to add class loader constraints for it.
Otherwise, the result is nondeterministic. This is not new: The Java® Virtual Machine Specification has never identified exactly which method is chosen, and how "ties" should be broken. Prior to Java SE 8, this was mostly an unobservable distinction. However, beginning with Java SE 8, the set of interface methods is more heterogeneous, so care must be taken to avoid problems with nondeterministic behavior. Thus:
Superinterface methods that are private and static are ignored by resolution. This is consistent with the Java programming language, where such interface methods are not inherited.
Any behavior controlled by the resolved method should not depend on whether the method is abstract or not.
Note that if the result of resolution is an abstract method, the referenced class C may be non-abstract. Requiring C to be abstract would conflict with the nondeterministic choice of superinterface methods. Instead, resolution assumes that the run time class of the invoked object has a concrete implementation of the method.
5.4.3.4 Interface Method Resolution
To resolve an unresolved symbolic reference from D to an interface method in an interface C, the symbolic reference to C given by the interface method reference is first resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of an interface reference can be thrown as a result of failure of interface method resolution. If the reference to C can be successfully resolved, exceptions relating to the resolution of the interface method reference itself can be thrown.
When resolving an interface method reference:
If C is not an interface, interface method resolution throws an IncompatibleClassChangeError.
Otherwise, if C declares a method with the name and descriptor specified by the interface method reference, method lookup succeeds.
Otherwise, if the class Object declares a method with the name and descriptor specified by the interface method reference, which has its ACC_PUBLIC flag set and does not have its ACC_STATIC flag set, method lookup succeeds.
Otherwise, if the maximally-specific superinterface methods (5.4.3.3) of C for the name and descriptor specified by the method reference include exactly one method that does not have its ACC_ABSTRACT flag set, then this method is chosen and method lookup succeeds.
Otherwise, if any superinterface of C declares a method with the name and descriptor specified by the method reference that has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set, one of these is arbitrarily chosen and method lookup succeeds.
Otherwise, method lookup fails.
The result of interface method resolution is determined as follows:
If method lookup failed, interface method resolution throws a NoSuchMethodError.
Otherwise, method lookup succeeded. Access control is applied for the access from D to the method which is the result of method lookup (5.4.4). Then:
If access control failed, interface method resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced interface method m is actually declared. Let L2 be the defining loader of D.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
The Java Virtual Machine imposes the loading constraints Ti^L1 = Ti^L2 for i = 0 to n.
For each class or interface name N mentioned by the descriptor of the referenced method (4.3.3), the Java Virtual Machine imposes the loading constraint N^L1 = N^L2 (5.3.4).
If imposing these constraints results in any loading constraints being violated (5.3.4), then interface method resolution fails. Otherwise, interface method resolution succeeds.
Access control is necessary because interface method resolution may pick a private method of interface C. (Prior to Java SE 8, the result of interface method resolution could be a non-public method of class Object or a static method of class Object; such results were not consistent with the inheritance model of the Java programming language, and are disallowed in Java SE 8 and above.)
5.4.3.5 Method Type and Method Handle Resolution
To resolve an unresolved symbolic reference to a method type, it is as if resolution occurs of unresolved symbolic references to classes and interfaces (5.4.3.1) whose names correspond to the types given in are mentioned by the method descriptor (4.3.3), in the order in which they are mentioned.
Any exception that can be thrown as a result of failure of resolution of a class reference to a class or interface can thus be thrown as a result of failure of method type resolution.
The result of successful method type resolution is a reference to an instance of java.lang.invoke.MethodType which represents the method descriptor.
Method type resolution occurs regardless of whether the run-time constant pool actually contains symbolic references to classes and interfaces indicated in the method descriptor. Also, the resolution is deemed to occur on unresolved symbolic references, so a failure to resolve one method type will not necessarily lead to a later failure to resolve another method type with the same textual method descriptor, if suitable classes and interfaces can be loaded by the later time.
Resolution of an unresolved symbolic reference to a method handle is more complicated. Each method handle resolved by the Java Virtual Machine has an equivalent instruction sequence called its bytecode behavior, indicated by the method handle's kind. The integer values and descriptions of the nine kinds of method handle are given in Table 5.4.3.5-A.
Symbolic references by an instruction sequence to fields or methods are indicated by C.x:T, where x and T are the name and descriptor (4.3.2, 4.3.3) of the field or method, and C is the class or interface in which the field or method is to be found.
Table 5.4.3.5-A. Bytecode Behaviors for Method Handles
Kind | Description | Interpretation |
---|---|---|
1 | REF_getField | getfield C.f:T |
2 | REF_getStatic | getstatic C.f:T |
3 | REF_putField | putfield C.f:T |
4 | REF_putStatic | putstatic C.f:T |
5 | REF_invokeVirtual | invokevirtual C.m:(A*)T |
6 | REF_invokeStatic | invokestatic C.m:(A*)T |
7 | REF_invokeSpecial | invokespecial C.m:(A*)T |
8 | REF_newInvokeSpecial | new C; dup; invokespecial C.<init>:(A*)V |
9 | REF_invokeInterface | invokeinterface C.m:(A*)T |
Let MH be the symbolic reference to a method handle (5.1) being resolved. Also:
Let R be the symbolic reference to the field or method contained within MH.
R is derived from the CONSTANT_Fieldref, CONSTANT_Methodref, or CONSTANT_InterfaceMethodref structure referred to by the reference_index item of the CONSTANT_MethodHandle from which MH is derived.
For example, R is a symbolic reference to C.f for bytecode behavior of kind 1, and a symbolic reference to C.<init> for bytecode behavior of kind 8.
If MH's bytecode behavior is kind 7 (REF_invokeSpecial), then C must be the current class or interface, a superclass of the current class, a direct superinterface of the current class or interface, or Object.
Let T be the type of the field referenced by R, or the return type of the method referenced by R. Let A* be the sequence (perhaps empty) of parameter types of the method referenced by R.
T and A* are derived from the CONSTANT_NameAndType structure referred to by the name_and_type_index item in the CONSTANT_Fieldref, CONSTANT_Methodref, or CONSTANT_InterfaceMethodref structure from which R is derived.
To resolve MH, all symbolic references to classes, interfaces, fields, and methods in MH's bytecode behavior are resolved, using the following four steps:
R is resolved. This occurs as if by field resolution (5.4.3.2) when MH's bytecode behavior is kind 1, 2, 3, or 4, and as if by method resolution (5.4.3.3) when MH's bytecode behavior is kind 5, 6, 7, or 8, and as if by interface method resolution (5.4.3.4) when MH's bytecode behavior is kind 9.
The following constraints apply to the result of resolving R. These constraints correspond to those that would be enforced during verification or execution of the instruction sequence for the relevant bytecode behavior.
If MH's bytecode behavior is kind 8 (REF_newInvokeSpecial), then R must resolve to an instance initialization method declared in class C.
If R resolves to a protected member, then the following rules apply depending on the kind of MH's bytecode behavior:
For kinds 1, 3, and 5 (REF_getField, REF_putField, and REF_invokeVirtual): If C.f or C.m resolved to a protected field or method, and C is in a different run-time package than the current class, then C must be assignable to the current class.
For kind 8 (REF_newInvokeSpecial): If C.<init> resolved to a protected method, then C must be declared in the same run-time package as the current class.
R must resolve to a static or non-static member depending on the kind of MH's bytecode behavior:
For kinds 1, 3, 5, 7, and 9 (REF_getField, REF_putField, REF_invokeVirtual, REF_invokeSpecial, and REF_invokeInterface): C.f or C.m must resolve to a non-static field or method.
For kinds 2, 4, and 6 (REF_getStatic, REF_putStatic, and REF_invokeStatic): C.f or C.m must resolve to a static field or method.
Resolution occurs as if of unresolved symbolic references to classes and interfaces whose names correspond to each type in A*, and to the type T, in that order.
This is phrased incorrectly; not all types correspond to class and interface names. It's also unnecessary: the next step will perform MethodType resolution which, as described above, resolves all the mentioned classes and interfaces.
A reference to an instance of java.lang.invoke.MethodType is obtained as if by resolution of an unresolved symbolic reference to a method type that contains the method descriptor specified in Table 5.4.3.5-B for the kind of MH.
It is as if the symbolic reference to a method handle contains a symbolic reference to the method type that the resolved method handle will eventually have. The detailed structure of the method type is obtained by inspecting Table 5.4.3.5-B.
Table 5.4.3.5-B. Method Descriptors for Method Handles
Kind | Description | Method descriptor |
---|---|---|
1 | REF_getField | (C)T |
2 | REF_getStatic | ()T |
3 | REF_putField | (C,T)V |
4 | REF_putStatic | (T)V |
5 | REF_invokeVirtual | (C,A*)T |
6 | REF_invokeStatic | (A*)T |
7 | REF_invokeSpecial | (C,A*)T |
8 | REF_newInvokeSpecial | (A*)C |
9 | REF_invokeInterface | (C,A*)T |
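The descriptor shapes in Table 5.4.3.5-B can be observed with the java.lang.invoke API. The sketch below is an analogy rather than constant pool resolution: it obtains handles through MethodHandles.Lookup, whose findVirtual, findStatic, and findConstructor methods produce handles with the corresponding type shapes. The class HandleTypes and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; the handles are created via the Lookup API,
// which is assumed here to mirror the shapes in Table 5.4.3.5-B.
class HandleTypes {
    // Shape (C,A*)T: the receiver type String is prepended to A* = ().
    static String virtualType() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return mh.type().toMethodDescriptorString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Shape (A*)T: no receiver for a static method.
    static String staticType() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                .findStatic(Math.class, "abs", MethodType.methodType(int.class, int.class));
            return mh.type().toMethodDescriptorString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Shape (A*)C: a constructor handle returns the constructed class.
    static String constructorType() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                .findConstructor(StringBuilder.class, MethodType.methodType(void.class, String.class));
            return mh.type().toMethodDescriptorString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(virtualType());
        System.out.println(staticType());
        System.out.println(constructorType());
    }
}
```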
In steps 1, 3, and 4 1 and 3, any exception that can be thrown as a result of failure of resolution of a symbolic reference to a class, interface, field, or method can be thrown as a result of failure of method handle resolution. In step 2, any failure due to the specified constraints causes a failure of method handle resolution due to an IllegalAccessError.
The intent is that resolving a method handle can be done in exactly the same circumstances that the Java Virtual Machine would successfully verify and resolve the symbolic references in the bytecode behavior. In particular, method handles to private, protected, and static members can be created in exactly those classes for which the corresponding normal accesses are legal.
The result of successful method handle resolution is a reference to an instance of java.lang.invoke.MethodHandle which represents the method handle MH.
The type descriptor of this java.lang.invoke.MethodHandle instance is the java.lang.invoke.MethodType instance produced in the third step of method handle resolution above.
The type descriptor of a method handle is such that a valid call to invokeExact in java.lang.invoke.MethodHandle on the method handle has exactly the same stack effects as the bytecode behavior. Calling this method handle on a valid set of arguments has exactly the same effect and returns the same result (if any) as the corresponding bytecode behavior.
If the method referenced by R has the ACC_VARARGS flag set (4.6), then the java.lang.invoke.MethodHandle instance is a variable arity method handle; otherwise, it is a fixed arity method handle.
A variable arity method handle performs argument list boxing (JLS §15.12.4.2) when invoked via invoke, while its behavior with respect to invokeExact is as if the ACC_VARARGS flag were not set.
Method handle resolution throws an IncompatibleClassChangeError if the method referenced by R has the ACC_VARARGS flag set and either A* is an empty sequence or the last parameter type in A* is not an array type. That is, creation of a variable arity method handle fails.
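The difference between variable arity and fixed arity handles can be observed from Java code. The sketch below uses MethodHandles.Lookup.findStatic as a stand-in for constant pool resolution, on the assumption that a handle looked up for a method with ACC_VARARGS set is likewise a variable arity handle. The class VarArity and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the specification.
class VarArity {
    private static MethodHandle find(Class<?> owner, String name, MethodType type) {
        try {
            return MethodHandles.lookup().findStatic(owner, name, type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Arrays.asList(T...) has ACC_VARARGS set.
    static boolean asListIsVariableArity() {
        return find(Arrays.class, "asList",
                MethodType.methodType(List.class, Object[].class)).isVarargsCollector();
    }

    // Math.abs(int) does not.
    static boolean absIsVariableArity() {
        return find(Math.class, "abs",
                MethodType.methodType(int.class, int.class)).isVarargsCollector();
    }

    // invoke collects the trailing arguments into an Object[] before the call.
    static int collectedSize() {
        try {
            MethodHandle mh = find(Arrays.class, "asList",
                    MethodType.methodType(List.class, Object[].class));
            List<?> list = (List<?>) mh.invoke("a", "b");
            return list.size();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(asListIsVariableArity());
        System.out.println(absIsVariableArity());
        System.out.println(collectedSize());
    }
}
```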
An implementation of the Java Virtual Machine is not required to intern method types or method handles. That is, two distinct symbolic references to method types or method handles which are structurally identical might not resolve to the same instance of java.lang.invoke.MethodType or java.lang.invoke.MethodHandle respectively.
The java.lang.invoke.MethodHandles class in the Java SE Platform API allows creation of method handles with no bytecode behavior. Their behavior is defined by the method of java.lang.invoke.MethodHandles that creates them. For example, a method handle may, when invoked, first apply transformations to its argument values, then supply the transformed values to the invocation of another method handle, then apply a transformation to the value returned from that invocation, then return the transformed value as its own result.
5.5 Initialization
Initialization of a class or interface consists of executing its class or interface initialization method (2.9.2).
A class or interface C may be initialized only as a result of:
The execution of any one of the Java Virtual Machine instructions new, getstatic, putstatic, or invokestatic that references C (6.5.new, 6.5.getstatic, 6.5.putstatic, 6.5.invokestatic).
Upon execution of a new instruction, the class to be initialized is the class referenced by the instruction.
Upon execution of a getstatic, putstatic, or invokestatic instruction, the class or interface to be initialized is the class or interface that declares the resolved field or method.
The first invocation of a java.lang.invoke.MethodHandle instance which was the result of method handle resolution (5.4.3.5) for a method handle of kind 2 (REF_getStatic), 4 (REF_putStatic), 6 (REF_invokeStatic), or 8 (REF_newInvokeSpecial).
This implies that the class of a bootstrap method is initialized when the bootstrap method is invoked for an invokedynamic instruction (6.5.invokedynamic), as part of the continuing resolution of the call site specifier.
Invocation of certain reflective methods in the class library (2.12), for example, in class Class or in package java.lang.reflect.
If C is a class, the initialization of one of its subclasses.
If C is an interface that declares a non-abstract, non-static method, the initialization of a class that implements C directly or indirectly.
Its designation as the initial class or interface at Java Virtual Machine startup (5.2).
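Two of these triggers can be observed from ordinary Java code: an access to a non-constant static field initializes the declaring class, and initializing a class first initializes its superclass. The sketch below records the order of events; the class names InitOrderDemo, Parent, and Child are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the specification.
class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static int counter = 0; // not a constant, so access needs getstatic
        static { LOG.add("Child"); }
    }

    // Touches Child.counter once and returns a snapshot of the log.
    static List<String> run() {
        LOG.add("before");
        Child.counter++; // getstatic and putstatic on a field declared in Child
        LOG.add("after");
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call to run, nothing is logged for Parent or Child until the static field access executes, and Parent is logged before Child.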
Prior to initialization, a class or interface must be linked, that is, verified, prepared, and optionally resolved.
Because the Java Virtual Machine is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface. The implementation of the Java Virtual Machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the class or interface has already been verified and prepared, and that the Class object class or interface contains state that indicates one of four situations:
This Class object class or interface is verified and prepared but not initialized.
This Class object class or interface is being initialized by some particular thread.
This Class object class or interface is fully initialized and ready for use.
This Class object class or interface is in an erroneous state, perhaps because initialization was attempted and failed.
Here and below, we eliminate the unnecessary assertion that the initialization state of the class is stored by an instance of java.lang.Class. The specification need not concern itself with how classes are internally represented and how this representation relates to instances of java.lang.Class.
For each class or interface C, there is a unique initialization lock LC. The mapping from C to LC is left to the discretion of the Java Virtual Machine implementation. For example, LC could be the Class object for C, or the monitor associated with that Class object. The procedure for initializing C is then as follows:
Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.
If the Class object for C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this procedure.
Thread interrupt status is unaffected by execution of the initialization procedure.
If the Class object for C indicates that initialization is in progress for C by the current thread, then this must be a recursive request for initialization. Release LC and complete normally.
If the Class object for C indicates that C it has already been initialized, then no further action is required. Release LC and complete normally.
If the Class object for C is in an erroneous state, then initialization is not possible. Release LC and throw a NoClassDefFoundError.
Otherwise, record the fact that initialization of the Class object for C is in progress by the current thread, and release LC.
Then, initialize each final static field of C with the constant value in its ConstantValue attribute (4.7.2), in the order the fields appear in the ClassFile structure.
Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, ..., SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-abstract, non-static method. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of the interfaces array of C), the enumeration recurs on I's superinterfaces (in the order of the interfaces array of I) before returning I.
For each S in the list [ SC, SI1, ..., SIn ], if S has not yet been initialized, then recursively perform this entire procedure for S. If necessary, verify and prepare S first.
If the initialization of S completes abruptly because of a thrown exception, then acquire LC, label the Class object for C as erroneous, notify all waiting threads, release LC, and complete abruptly, throwing the same exception that resulted from initializing SC S.
Next, determine whether assertions are enabled for C by querying its defining class loader.
Next, execute the class or interface initialization method of C.
If the execution of the class or interface initialization method completes normally, then acquire LC, label the
Class
object for C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.Otherwise, the class or interface initialization method must have completed abruptly by throwing some exception E. If the class of E is not
Error
or one of its subclasses, then create a new instance of the classExceptionInInitializerError
with E as the argument, and use this object in place of E in the following step. If a new instance ofExceptionInInitializerError
cannot be created because anOutOfMemoryError
occurs, then use anOutOfMemoryError
object in place of E in the following step.Acquire LC, label
theC as erroneous, notify all waiting threads, release LC, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.Class
object for
A Java Virtual Machine implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the Java memory model, all happens-before orderings (JLS §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.
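The failure path of the procedure can be observed from ordinary Java code. The following is a minimal sketch, not part of the specification; the class names InitFailureDemo and Broken are hypothetical. The first use of Broken runs its initialization method, which throws, so the exception is wrapped in an ExceptionInInitializerError and the class is labeled erroneous. The second use finds the erroneous state and gets a NoClassDefFoundError instead.

```java
public class InitFailureDemo {
    // Hypothetical class whose class initialization method always fails.
    static class Broken {
        static int value = compute();

        static int compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Touches Broken.value and reports the simple name of the Throwable that results.
    static String touch() {
        try {
            return "value=" + Broken.value;
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    // Both attempts are made exactly once, in order, when InitFailureDemo is initialized.
    static final String[] RESULTS = { touch(), touch() };

    public static void main(String[] args) {
        System.out.println(RESULTS[0]); // first attempt: initialization method fails
        System.out.println(RESULTS[1]); // second attempt: class is already erroneous
    }
}
```

The two attempts are captured in a static array so that the outcome does not depend on how many times touch() is later called.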
Chapter 6: The Java Virtual Machine Instruction Set
6.5 Instructions
aastore
- Operation
Store into reference array
- Format
aastore
- Forms
aastore = 83 (0x53)
- Operand Stack
..., arrayref, index, value →
...
- Description
The arrayref must be of type reference and must refer to an array whose components are of type reference. The index must be of type int, and value must be of type reference. The arrayref, index, and value are popped from the operand stack.
If value is null, then value is stored as the component of the array at index.
Otherwise, value is non-null. If the type of value is assignment compatible with the type of the components of the array referenced by arrayref, then value is stored as the component of the array at index. If value is a value of the component type of the array referenced by arrayref, then value is stored as the component of the array at index.
The following rules are used to determine whether a value that is not null is assignment compatible with the array component type. If S is the type of the object referred to by value, and T is the reference type of the array components, then aastore determines whether assignment is compatible as follows:
If S is a class type, then:
If T is a class type, then S must be the same class as T, or S must be a subclass of T;
If T is an interface type, then S must implement interface T.
If S is an array type SC[], that is, an array of components of type SC, then:
If T is a class type, then T must be Object.
If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3).
If T is an array type TC[], that is, an array of components of type TC, then one of the following must be true:
TC and SC are the same primitive type.
TC and SC are reference types, and type SC is assignable to TC by these run-time rules.
Whether value is a value of the array component type is determined according to the rules given for checkcast.
Appealing to "assignment compatible" is a roundabout way to say what we really mean: value must be a value of the array's component type.
aastore, checkcast, and instanceof use the same rules to interpret types. It's helpful to consolidate those rules in one place, so that readers can clearly see that the rules are the same, and so that future enhancements to the type system have fewer rules to maintain.
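As an illustration only, the sketch below stores three values into an array whose run-time component type is String, through a variable of static type Object[]. The class name ArrayStoreDemo and the helper store are hypothetical. The store of null and the store of a String succeed; the store of an Integer fails the run-time check, because an Integer is not a value of the component type.

```java
public class ArrayStoreDemo {
    // Attempts arr[0] = value and reports the outcome.
    static String store(Object[] arr, Object value) {
        try {
            arr[0] = value; // a store into a reference array, compiled to aastore
            return "stored";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        }
    }

    public static void main(String[] args) {
        Object[] strings = new String[1]; // run-time component type is String
        System.out.println(store(strings, "ok"));               // stored
        System.out.println(store(strings, null));               // stored
        System.out.println(store(strings, Integer.valueOf(1))); // ArrayStoreException
    }
}
```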
- Run-time Exceptions
If arrayref is null, aastore throws a NullPointerException.
Otherwise, if index is not within the bounds of the array referenced by arrayref, the aastore instruction throws an ArrayIndexOutOfBoundsException.
Otherwise, if the non-null value is not a value of the array component type, aastore throws an ArrayStoreException.
"Otherwise" here implies "arrayref is not null".
checkcast
- Operation
Check whether object is of given type
- Format
checkcast
indexbyte1
indexbyte2
- Forms
checkcast = 192 (0xc0)
- Operand Stack
..., objectref →
..., objectref
- Description
The objectref must be of type reference. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type.
If objectref is null, then the operand stack is unchanged.
Otherwise, the named class, array, or interface type is resolved (5.4.3.1). If objectref is a value of the type given by the resolved class, interface, or array type, the operand stack is unchanged; otherwise, the checkcast instruction throws a ClassCastException.
The following rules are used to determine whether a reference to an object is a value of a reference type, T.
If the reference is to an instance of a class C, then:
If T is the type of a class D, then the reference is a value of T if C is D or a subclass of D.
If T is the type of an interface I, then the reference is a value of T if C implements I.
If the reference is to an array with component type SC, then:
If T is a class type, then the reference is a value of T if T is Object.
If T is an interface type, then the reference is a value of T if T is Cloneable or java.io.Serializable (as loaded by the bootstrap class loader).
It's unnecessary and especially risky to tie JVMS to the Java Language Specification here; we certainly don't want language changes to accidentally impact the routine behavior of JVM instructions.
If T is an array type TC[], that is, an array of components of type TC, then the reference is a value of T if one of the following is true:
TC and SC are the same type.
Bug fix: an earlier cleanup of these rules (JDK-8069130) removed cases to handle an interface type S. These cases appeared vacuous at the top level, but were necessary to support a recursive analysis for array types. Rather than restoring the old rules, it's probably easier to follow if the recursion is contained within the array type discussion.
Further, recursion to the top level is no longer a good fit, because the rules are expressed in terms of a specific reference, not types.
TC is the class type Object.
TC is a class type, SC is a class type, and the class of SC is a subclass of the class of TC.
TC is an interface type, SC is a class type, and the class of SC implements the interface of TC.
TC is an interface type, SC is an interface type, and the interface of SC extends the interface of TC.
TC is the interface type Cloneable or java.io.Serializable (as loaded by the bootstrap class loader), and SC is an array type.
TC is an array type TCC[], SC is an array type SCC[], and one of these tests of array component types applies recursively to TCC and SCC.
- Linking Exceptions
During resolution of the symbolic reference to the class, array, or interface type, any of the exceptions documented in 5.4.3.1 can be thrown.
- Run-time Exception
Otherwise, if objectref is not null and is not a value of the type given by the resolved class, interface, or array type, the checkcast instruction throws a ClassCastException.
- Notes
The checkcast instruction is very similar to the instanceof instruction (6.5.instanceof). It differs in its treatment of null, its behavior when its test fails (checkcast throws an exception, instanceof pushes a result code), and its effect on the operand stack.
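The array cases of these rules can be exercised from Java code. The sketch below is illustrative only: the class name CheckcastDemo and the helper castable are hypothetical, and it relies on the library method Class.cast, which is assumed here to apply the same membership test as the checkcast instruction rather than being the instruction itself.

```java
import java.io.Serializable;

public class CheckcastDemo {
    // Returns true if ref passes the cast to the given type, false if it is rejected.
    static boolean castable(Object ref, Class<?> type) {
        try {
            type.cast(ref);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object strings = new String[0];
        Object ints = new int[0];
        Object nested = new Integer[0][0];

        System.out.println(castable(strings, Object[].class));       // TC is Object
        System.out.println(castable(strings, CharSequence[].class)); // String implements CharSequence
        System.out.println(castable(strings, Cloneable.class));      // arrays are Cloneable
        System.out.println(castable(strings, Serializable.class));   // arrays are Serializable
        System.out.println(castable(nested, Number[][].class));      // recursive component test
        System.out.println(castable(nested, Cloneable[].class));     // component type is itself an array
        System.out.println(castable(ints, Object[].class));          // int is not a reference type
        System.out.println(castable(strings, Integer[].class));      // String is not a subclass of Integer
        System.out.println(castable(null, String.class));            // null always passes
    }
}
```

The first six calls and the last one succeed; the casts of int[] to Object[] and of String[] to Integer[] are rejected.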
instanceof
- Operation
Determine if object is of given type
- Format
instanceof
indexbyte1
indexbyte2
- Forms
instanceof = 193 (0xc1)
- Operand Stack
..., objectref →
..., result
- Description
The objectref, which must be of type reference, is popped from the operand stack. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type.
If objectref is null, the instanceof instruction pushes an int result of 0 onto the operand stack.
Otherwise, the named class, array, or interface type is resolved (5.4.3.1). If objectref is a value of the type given by the resolved class, interface, or array type, the instanceof instruction pushes an int result of 1 onto the operand stack; otherwise, it pushes an int result of 0.
The following rules are used to determine whether an objectref that is not null is an instance of the resolved type. If S is the type of the object referred to by objectref, and T is the resolved class, array, or interface type, then instanceof determines whether objectref is an instance of T as follows:
If S is a class type, then:
If T is a class type, then S must be the same class as T, or S must be a subclass of T;
If T is an interface type, then S must implement interface T.
If S is an array type SC[], that is, an array of components of type SC, then:
If T is a class type, then T must be Object.
If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3).
If T is an array type TC[], that is, an array of components of type TC, then one of the following must be true:
TC and SC are the same primitive type.
TC and SC are reference types, and type SC can be cast to TC by these run-time rules.
Whether objectref is a value of the type given by the resolved class, interface, or array type is determined according to the rules given for checkcast.
aastore, checkcast, and instanceof use the same rules to interpret types. It's helpful to consolidate those rules in one place, so that readers can clearly see that the rules are the same, and so that future enhancements to the type system have fewer rules to maintain.
- Linking Exceptions
During resolution of the symbolic reference to the class, array, or interface type, any of the exceptions documented in 5.4.3.1 can be thrown.
- Notes
The instanceof instruction is very similar to the checkcast instruction (6.5.checkcast). It differs in its treatment of null, its behavior when its test fails (checkcast throws an exception, instanceof pushes a result code), and its effect on the operand stack.
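The difference in the treatment of null, and in what happens when the test fails, is visible at the Java language level. The following sketch is illustrative; the class name NullTreatmentDemo and its helper methods are hypothetical.

```java
public class NullTreatmentDemo {
    // The instanceof test yields false for null and for a failed test.
    static boolean isString(Object ref) {
        return ref instanceof String;
    }

    // The cast lets null through unchanged and throws when the test fails.
    static String asString(Object ref) {
        return (String) ref;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));         // false
        System.out.println(asString(null) == null); // true
        System.out.println(isString("x"));          // true
        System.out.println(isString(42));           // false
        try {
            asString(42);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```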
new
- Operation
Create new object
- Format
new
indexbyte1
indexbyte2
- Forms
new = 187 (0xbb)
- Operand Stack
... →
..., objectref
- Description
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class or interface. The named class or interface is resolved (5.4.3.1) and should result in a non-abstract class. Memory for a new instance of that class is allocated from the garbage-collected heap, and the instance variables of the new object are initialized to the default initial values of their types (2.3, 2.4). The objectref, a reference to the instance, is pushed onto the operand stack.
On successful resolution of the class, it is initialized if it has not already been initialized (5.5).
- Linking Exceptions
During resolution of the symbolic reference to the class or interface, any of the exceptions documented in 5.4.3.1 can be thrown.
Otherwise, if the symbolic reference to the class or interface type resolves to an interface or an abstract class, new throws an InstantiationError.
- Run-time Exception
Otherwise, if execution of this new instruction causes initialization of the referenced class, new may throw an Error as detailed in 5.5.
This is an unnecessary JLS reference, and also appears to be out of date: JLS 15.9.4 doesn't describe class initialization at all.
- Notes
The new instruction does not completely create a new instance; instance creation is not completed until an instance initialization method (2.9.1) has been invoked on the uninitialized instance.
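The index computation shared by new, checkcast, instanceof, putfield, and putstatic can be written out directly. This is a small sketch with hypothetical names (ConstantPoolIndex, index); the masking with 0xFF is needed only because Java's byte type is signed, whereas the operand bytes are treated as unsigned.

```java
public class ConstantPoolIndex {
    // Combines two unsigned operand bytes into a 16-bit constant pool index:
    // (indexbyte1 << 8) | indexbyte2
    static int index(byte indexbyte1, byte indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(index((byte) 0x01, (byte) 0x02)); // 258
        System.out.println(index((byte) 0xFF, (byte) 0xFE)); // 65534
    }
}
```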
putfield
- Operation
Set field in object
- Format
putfield
indexbyte1
indexbyte2
- Forms
putfield = 181 (0xb5)
- Operand Stack
..., objectref, value →
...
- Description
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class in which the field is to be found. The referenced field is resolved (5.4.3.2).
The type of a value stored by a putfield instruction must be compatible with the descriptor of the referenced field (4.3.2). If the field descriptor type is boolean, byte, char, short, or int, then the value must be an int. If the field descriptor type is float, long, or double, then the value must be a float, long, or double, respectively. If the field descriptor type is a reference type, then the value must be a value of the field descriptor type. If the field is final, it must be declared in the current class, and the instruction must occur in an instance initialization method of the current class (2.9.1).
The value and objectref are popped from the operand stack.
The objectref must be of type reference but not an array type.
If the value is of type int and the field descriptor type is boolean, then the int value is narrowed by taking the bitwise AND of value and 1, resulting in value'. Otherwise, the value undergoes value set conversion (2.8.3), resulting in value'.
The referenced field in objectref is set to value'.
- Linking Exceptions
During resolution of the symbolic reference to the field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.
Otherwise, if the resolved field is a static field, putfield throws an IncompatibleClassChangeError.
Otherwise, if the resolved field is final, it must be declared in the current class, and the instruction must occur in an instance initialization method of the current class. Otherwise, an IllegalAccessError is thrown.
- Run-time Exception
Otherwise, if objectref is null, the putfield instruction throws a NullPointerException.
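The narrowing applied when an int is stored into a boolean field is plain integer arithmetic, so it can be modeled outside the JVM. The sketch below is only a model of that one step; the class name BooleanNarrowing and the method narrow are hypothetical, and nothing here performs an actual putfield.

```java
public class BooleanNarrowing {
    // Models value' = value & 1, the narrowing applied to an int stored in a boolean field.
    static int narrow(int value) {
        return value & 1;
    }

    public static void main(String[] args) {
        System.out.println(narrow(0));  // 0
        System.out.println(narrow(1));  // 1
        System.out.println(narrow(2));  // 0: only the lowest bit survives
        System.out.println(narrow(3));  // 1
        System.out.println(narrow(-1)); // 1: all bits set, lowest bit is 1
    }
}
```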
putstatic
- Operation
Set static field in class
- Format
putstatic
indexbyte1
indexbyte2
- Forms
putstatic = 179 (0xb3)
- Operand Stack
..., value →
...
- Description
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class or interface in which the field is to be found. The referenced field is resolved (5.4.3.2).
On successful resolution of the field, the class or interface that declared the resolved field is initialized if that class or interface has not already been initialized (5.5).
The type of a value stored by a putstatic instruction must be compatible with the descriptor of the referenced field (4.3.2). If the field descriptor type is boolean, byte, char, short, or int, then the value must be an int. If the field descriptor type is float, long, or double, then the value must be a float, long, or double, respectively. If the field descriptor type is a reference type, then the value must be a value of the field descriptor type. If the field is final, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface (2.9.2).
The value is popped from the operand stack.
If the value is of type int and the field descriptor type is boolean, then the int value is narrowed by taking the bitwise AND of value and 1, resulting in value'. Otherwise, the value undergoes value set conversion (2.8.3), resulting in value'.
The referenced field in the class or interface is set to value'.
- Linking Exceptions
During resolution of the symbolic reference to the class or interface field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.
Otherwise, if the resolved field is not a static (class) field or an interface field, putstatic throws an IncompatibleClassChangeError.
Otherwise, if the resolved field is final, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface. Otherwise, an IllegalAccessError is thrown.
- Run-time Exception
Otherwise, if execution of this putstatic instruction causes initialization of the referenced class or interface, putstatic may throw an Error as detailed in 5.5.
- Notes
A putstatic instruction may be used only to set the value of an interface field on the initialization of that field. Interface fields may be assigned to only once, on execution of an interface variable initialization expression when the interface is initialized (5.5, JLS §9.3.1).
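That a static field write initializes the declaring class before the store takes effect can be seen in a short trace. The following is an illustrative sketch; the names PutstaticInitDemo, Counter, LOG, and TRACE are hypothetical. The assignment to Counter.count is the first use of Counter, so its static initializer runs between the "before" and "after" markers.

```java
public class PutstaticInitDemo {
    static final StringBuilder LOG = new StringBuilder();

    // Hypothetical class whose initialization leaves a marker in the log.
    static class Counter {
        static int count;

        static {
            LOG.append("init;");
        }
    }

    // Recorded once, when PutstaticInitDemo itself is initialized.
    static final String TRACE = run();

    private static String run() {
        LOG.append("before;");
        Counter.count = 5; // static field write: triggers initialization of Counter
        LOG.append("after;");
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(TRACE); // before;init;after;
    }
}
```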