init Methods
Changes to the Java® Virtual Machine Specification • Version 14-internal+0-adhoc.dlsmith.20190628
This document describes changes to the Java Virtual Machine Specification to allow the names <init> and <clinit> to be used for methods that are not instance, class, or interface initialization methods. Initialization methods are distinguished from regular methods by the corresponding descriptor: instance initialization methods have a void return, and class and interface initialization methods have a void return and no parameters.
No versioning is applied to the feature: in the unlikely event that an old class file declares or invokes a method named <init> or <clinit> without an appropriate descriptor, the class is no longer rejected as malformed.
Changes are described with respect to existing sections of the JVM Specification. New text is indicated like this and deleted text is indicated like this. Explanation and discussion, as needed, is set aside in grey boxes.
History of corner-case behavior (tested on JDK 7 and later versions):
A method named <init> with a non-void return type has historically caused a ClassFormatError. This was formally specified in JVMS 9.
Abstract methods named <init> that return void were historically allowed in interfaces, but were un-invokeable. In JDK 8, non-abstract methods named <init> that return void were allowed in version 52.0 class files, and were verified as if they were instance initialization methods. Since JDK 9 and JVMS 9, methods in interfaces with name <init> cause a ClassFormatError.
Proposed behavior: we now allow non-void methods named <init> in classes and interfaces, and do not treat them as instance initialization methods. The pre-JDK 9 treatment of <init> methods in interfaces is best considered a bug.
A class has zero or more instance initialization methods, each typically corresponding to a constructor written in the Java programming language.
A method is an instance initialization method if all of the following are true:
It is defined in a class (not an interface).
It has the special name <init>.
It is void (4.3.3).
In a class, any non-void method named <init> is not an instance initialization method. In an interface, any method named <init> is not an instance initialization method. Such methods cannot be invoked by any Java Virtual Machine instruction (4.4.2, 4.9.2) and are rejected by format checking (4.6, 4.8).
A method with the name <init> that is not void is not an instance initialization method, and is treated like any other method.
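The three conditions listed above can be read as a single predicate over a method's name, descriptor, and declaring type. The following is a minimal sketch, not part of the specification; the class name `InitMethods` and its method are hypothetical, and the descriptor is assumed to be a well-formed method descriptor (4.3.3).

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class InitMethods {
    /**
     * Returns true if a method with this name and descriptor is an instance
     * initialization method under the proposed definition.
     * Assumes the descriptor is a well-formed method descriptor (4.3.3).
     */
    static boolean isInstanceInit(String name, String descriptor, boolean declaredInInterface) {
        return !declaredInInterface            // defined in a class, not an interface
                && name.equals("<init>")       // has the special name
                && descriptor.endsWith(")V");  // return descriptor V, that is, void
    }
}
```

Under this reading, a method named `<init>` with descriptor `()I` is an ordinary method, as is any method named `<init>` in an interface.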
The declaration and use of an instance initialization method is constrained by the Java Virtual Machine. For the declaration, the method must appear in a class, and the method's access_flags item and code array are constrained (4.6, 4.9.2). For a use, an instance initialization method may be invoked only by the invokespecial instruction on an uninitialized class instance (4.10.1.9).
Because the name <init> is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.
History of corner-case behavior (tested on JDK 7 and later versions):
A method named <clinit> with a non-void return type has historically caused a ClassFormatError. This was formally specified in JVMS 9.
A void <clinit> method with parameters was historically silently ignored: it was not recognized as a class initialization method, never run during initialization, and impossible to invoke. As of JVMS 9, such declarations were prohibited, but JDK 9+ only enforces this on version 51+ class files.
A void, 0-parameter <clinit> method without access flag ACC_STATIC was historically treated as a class initialization method, and the flag setting was ignored. (Verification treats the method as if it were ACC_STATIC, although this is not specified in 4.6 or 4.10.1.6.) As of JDK 7, in version 51+ class files these methods were silently ignored. As of JDK 9, version 51+ class files were prohibited from declaring such methods.
Proposed behavior: we now allow methods named <clinit> with parameters or non-void returns in classes and interfaces, and do not treat them as class or interface initialization methods. In 4.6, we impose strict constraints on all access flags of class and interface initialization methods in version 57 class files, while specifying the implicit treatment of older access flags.
A class or interface has at most one class or interface initialization method and is initialized by the Java Virtual Machine invoking that method (5.5).
A method is a class or interface initialization method if all of the following are true:
It has the special name <clinit>.
It is void and has no parameters (4.3.3).
In a class file whose version number is 51.0 or above, the method has its ACC_STATIC flag set and takes no arguments (4.6).
The requirement for ACC_STATIC was introduced in Java SE 7, and for taking no arguments in Java SE 9. In a class file whose version number is 50.0 or below, a method named <clinit> that is void is considered the class or interface initialization method regardless of the setting of its ACC_STATIC flag or whether it takes arguments.
Other methods named <clinit> in a class file are not class or interface initialization methods. They are never invoked by the Java Virtual Machine itself, cannot be invoked by any Java Virtual Machine instruction (4.9.1), and are rejected by format checking (4.6, 4.8).
This discussion conflates the definition of "class initialization method" with constraints on class initialization methods and other methods named <clinit>.
A method with the name <clinit> that is not void or that has one or more parameters is not a class or interface initialization method, and is treated like any other method.
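Because a void method with no parameters has exactly one possible descriptor, the proposed definition reduces to a comparison of two strings. A minimal sketch, with the hypothetical class name `ClinitMethods` and assuming the same rule applies to classes and interfaces alike:

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class ClinitMethods {
    /**
     * Returns true if a method with this name and descriptor is a class or
     * interface initialization method under the proposed definition:
     * special name <clinit>, void return, and no parameters.
     */
    static boolean isClassOrInterfaceInit(String name, String descriptor) {
        return name.equals("<clinit>") && descriptor.equals("()V");
    }
}
```

A method named `<clinit>` with descriptor `(I)V` or `()I` fails this test and is therefore treated like any other method.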
The declaration of the access_flags item of a class or interface initialization method is constrained by the Java Virtual Machine (4.6).
Class and interface initialization methods cannot be invoked by any Java Virtual Machine instruction (4.9.2).
Because the name <clinit> is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.
The following five instructions invoke methods:
invokevirtual invokes an instance method of an object, dispatching on the (virtual) type of the object. This is the normal method dispatch in the Java programming language.
invokeinterface invokes an interface method, searching the methods implemented by the particular run-time object to find the appropriate method.
invokespecial invokes an instance method requiring special handling, either an instance initialization method (2.9.1) or a method of the current class or its supertypes.
invokestatic invokes a class (static) method in a named class.
invokedynamic invokes the method which is the target of the call site object bound to the invokedynamic instruction. The call site object was bound to a specific lexical occurrence of the invokedynamic instruction by the Java Virtual Machine as a result of running a bootstrap method before the first execution of the instruction. Therefore, each occurrence of an invokedynamic instruction has a unique linkage state, unlike the other instructions which invoke methods.
The method return instructions, which are distinguished by return type, are ireturn (used to return values of type boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, the return instruction is used to return from methods declared to be void, instance initialization methods, and class or interface initialization methods.
Initialization methods are "methods declared to be void".
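The choice of return instruction follows directly from the return descriptor of the method. The sketch below illustrates that mapping; the class name `ReturnInstructions` is hypothetical, the descriptor is assumed to be well-formed, and class names inside it are assumed not to contain parentheses.

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class ReturnInstructions {
    /**
     * Returns the mnemonic of the return instruction for a method descriptor.
     * Assumes a well-formed descriptor whose class names contain no ')'.
     */
    static String forDescriptor(String methodDescriptor) {
        // The return descriptor starts right after the closing parenthesis.
        char first = methodDescriptor.charAt(methodDescriptor.indexOf(')') + 1);
        switch (first) {
            case 'Z': case 'B': case 'C': case 'S': case 'I':
                return "ireturn";  // boolean, byte, char, short, int
            case 'J':
                return "lreturn";  // long
            case 'F':
                return "freturn";  // float
            case 'D':
                return "dreturn";  // double
            case 'L': case '[':
                return "areturn";  // class, interface, or array reference
            case 'V':
                return "return";   // void, which covers initialization methods
            default:
                throw new IllegalArgumentException("bad return descriptor: " + first);
        }
    }
}
```

Since initialization methods always have return descriptor `V`, they fall under the `return` case without needing a case of their own.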
class File Format
Names of methods, fields, local variables, and formal parameters are stored as unqualified names. An unqualified name must contain at least one Unicode code point and must not contain any of the ASCII characters . ; [ / (that is, period or semicolon or left square bracket or forward slash).
Method names are further constrained so that, with the exception of the special method names <init> and <clinit> (2.9), they must not contain the ASCII characters < or > (that is, left angle bracket or right angle bracket).
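The two naming rules can be sketched as a pair of predicates. This is an illustration only; the class name `Names` and its methods are hypothetical, and names are assumed to be already decoded into Java strings.

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class Names {
    /** Unqualified name: at least one code point and none of . ; [ / */
    static boolean isUnqualifiedName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.' || c == ';' || c == '[' || c == '/') {
                return false;
            }
        }
        return true;
    }

    /** Method name: an unqualified name without < or >, or one of the two special names. */
    static boolean isMethodName(String s) {
        if (s.equals("<init>") || s.equals("<clinit>")) {
            return true;
        }
        return isUnqualifiedName(s) && s.indexOf('<') < 0 && s.indexOf('>') < 0;
    }
}
```

For example, `x<y` passes `isUnqualifiedName` (so it can name a field) but fails `isMethodName`, and `<foo>` is rejected as a method name because only the two special names are exempt.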
Note that a field name or interface method name may be <init> or <clinit>, but no method invocation instruction may reference <clinit> and only the invokespecial instruction (6.5.invokespecial) may reference <init>.
Design discussion: since it is no longer the case that all methods whose names have the form <foo> are "special", we could consider relaxing this constraint, allowing < and > to be used freely in method names. On the other hand, it may be useful to hold additional names like <foo> in reserve, should the JVM have additional special-purpose needs in the future.
CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info Structures
Fields, methods, and interface methods are represented by similar structures:
CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
CONSTANT_Methodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
CONSTANT_InterfaceMethodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
The items of these structures are as follows:
The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).
The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).
The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).
The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure (4.4.1) representing a class or interface type that has the field or method as a member.
In a CONSTANT_Fieldref_info structure, the class_index item may be either a class type or an interface type.
In a CONSTANT_Methodref_info structure, the class_index item must be a class type, not an interface type.
In a CONSTANT_InterfaceMethodref_info structure, the class_index item must be an interface type, not a class type.
The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info structure (4.4.6). This constant_pool entry indicates the name and descriptor of the field or method.
In a CONSTANT_Fieldref_info structure, the indicated descriptor must be a field descriptor (4.3.2). Otherwise, the indicated descriptor must be a method descriptor (4.3.3).
If the name of the method in a CONSTANT_Methodref_info structure begins with a '<' ('\u003c'), then the name must be the special name <init>, representing an instance initialization method (2.9.1). The return type of such a method must be void.
In a CONSTANT_Methodref_info structure or a CONSTANT_InterfaceMethodref_info structure, the referenced name must be a valid method name (4.2.2).
CONSTANT_NameAndType_info Structure
The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:
CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}
The items of the CONSTANT_NameAndType_info structure are as follows:
The tag item has the value CONSTANT_NameAndType (12).
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing a valid unqualified name denoting a field or method (4.2.2). (Replaced wording: representing either a valid unqualified name denoting a field or method (4.2.2), or the special method name <init> (2.9.1).)
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing a valid field descriptor or method descriptor (4.3.2, 4.3.3).
CONSTANT_MethodHandle_info Structure
The CONSTANT_MethodHandle_info structure is used to represent a method handle:
CONSTANT_MethodHandle_info {
    u1 tag;
    u1 reference_kind;
    u2 reference_index;
}
The items of the CONSTANT_MethodHandle_info structure are the following:
The tag item has the value CONSTANT_MethodHandle (15).
The value of the reference_kind item must be in the range 1 to 9. The value denotes the kind of this method handle, which characterizes its bytecode behavior (5.4.3.5).
The value of the reference_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be as follows:
If the value of the reference_kind item is 1 (REF_getField), 2 (REF_getStatic), 3 (REF_putField), or 4 (REF_putStatic), then the constant_pool entry at that index must be a CONSTANT_Fieldref_info structure (4.4.2) representing a field for which a method handle is to be created.
If the value of the reference_kind item is 5 (REF_invokeVirtual) or 8 (REF_newInvokeSpecial), then the constant_pool entry at that index must be a CONSTANT_Methodref_info structure (4.4.2) representing a class's method or constructor (2.9.1) for which a method handle is to be created.
If the value of the reference_kind item is 6 (REF_invokeStatic) or 7 (REF_invokeSpecial), then if the class file version number is less than 52.0, the constant_pool entry at that index must be a CONSTANT_Methodref_info structure representing a class's method for which a method handle is to be created; if the class file version number is 52.0 or above, the constant_pool entry at that index must be either a CONSTANT_Methodref_info structure or a CONSTANT_InterfaceMethodref_info structure (4.4.2) representing a class's or interface's method for which a method handle is to be created.
If the value of the reference_kind item is 9 (REF_invokeInterface), then the constant_pool entry at that index must be a CONSTANT_InterfaceMethodref_info structure representing an interface's method for which a method handle is to be created.
If the value of the reference_kind item is 5 (REF_invokeVirtual), 6 (REF_invokeStatic), 7 (REF_invokeSpecial), or 9 (REF_invokeInterface), the method referenced by a CONSTANT_Methodref_info structure or a CONSTANT_InterfaceMethodref_info structure must not be an instance, class, or interface initialization method (2.9.1, 2.9.2). (Replaced wording: the name of the method represented by a CONSTANT_Methodref_info structure or a CONSTANT_InterfaceMethodref_info structure must not be <init> or <clinit>.)
If the value is 8 (REF_newInvokeSpecial), the method referenced by a CONSTANT_Methodref_info structure must be an instance initialization method. (Replaced wording: the name of the method represented by a CONSTANT_Methodref_info structure must be <init>.)
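The reference_index rules above amount to two checks: which constant pool tag a given reference_kind may point at, and whether the referenced method may be an initialization method. The following sketch illustrates both. The class name `MethodHandleChecks` and its methods are hypothetical, and the second check is a simplification: it decides from the referenced name and descriptor alone, without distinguishing class from interface references.

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class MethodHandleChecks {
    static final int CONSTANT_Fieldref = 9;
    static final int CONSTANT_Methodref = 10;
    static final int CONSTANT_InterfaceMethodref = 11;

    /** True if reference_index may point at a constant pool entry with this tag. */
    static boolean tagAllowed(int referenceKind, int tag, int majorVersion) {
        switch (referenceKind) {
            case 1: case 2: case 3: case 4:   // REF_getField .. REF_putStatic
                return tag == CONSTANT_Fieldref;
            case 5: case 8:                   // REF_invokeVirtual, REF_newInvokeSpecial
                return tag == CONSTANT_Methodref;
            case 6: case 7:                   // REF_invokeStatic, REF_invokeSpecial
                return tag == CONSTANT_Methodref
                        || (majorVersion >= 52 && tag == CONSTANT_InterfaceMethodref);
            case 9:                           // REF_invokeInterface
                return tag == CONSTANT_InterfaceMethodref;
            default:
                return false;                 // reference_kind must be in the range 1 to 9
        }
    }

    /** True if a method reference with this name and descriptor is acceptable for the kind. */
    static boolean targetAllowed(int referenceKind, String name, String descriptor) {
        // Simplification: classify by name and descriptor only.
        boolean instanceInit = name.equals("<init>") && descriptor.endsWith(")V");
        boolean classInit = name.equals("<clinit>") && descriptor.equals("()V");
        switch (referenceKind) {
            case 5: case 6: case 7: case 9:
                return !instanceInit && !classInit;
            case 8:
                return instanceInit;
            default:
                return true;                  // field kinds are not covered by this rule
        }
    }
}
```

With these rules, a REF_invokeVirtual handle may refer to a method named `<init>` with descriptor `()I`, which the replaced wording would have rejected on the name alone.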
CONSTANT_Dynamic_info and CONSTANT_InvokeDynamic_info Structures
Most structures in the constant_pool table represent entities directly, by combining names, descriptors, and values recorded statically in the table. In contrast, the CONSTANT_Dynamic_info and CONSTANT_InvokeDynamic_info structures represent entities indirectly, by pointing to code which computes an entity dynamically. The code, called a bootstrap method, is invoked by the Java Virtual Machine during resolution of symbolic references derived from these structures (5.1, 5.4.3.6). Each structure specifies a bootstrap method as well as an auxiliary name and type that characterize the entity to be computed. In more detail:
The CONSTANT_Dynamic_info structure is used to represent a dynamically-computed constant, an arbitrary value that is produced by invocation of a bootstrap method in the course of an ldc instruction (6.5.ldc), among others. The auxiliary type specified by the structure constrains the type of the dynamically-computed constant.
The CONSTANT_InvokeDynamic_info structure is used to represent a dynamically-computed call site, an instance of java.lang.invoke.CallSite that is produced by invocation of a bootstrap method in the course of an invokedynamic instruction (6.5.invokedynamic). The auxiliary type specified by the structure constrains the method type of the dynamically-computed call site.
CONSTANT_Dynamic_info {
    u1 tag;
    u2 bootstrap_method_attr_index;
    u2 name_and_type_index;
}
CONSTANT_InvokeDynamic_info {
    u1 tag;
    u2 bootstrap_method_attr_index;
    u2 name_and_type_index;
}
The items of these structures are as follows:
The tag item of a CONSTANT_Dynamic_info structure has the value CONSTANT_Dynamic (17).
The tag item of a CONSTANT_InvokeDynamic_info structure has the value CONSTANT_InvokeDynamic (18).
The value of the bootstrap_method_attr_index item must be a valid index into the bootstrap_methods array of the bootstrap method table of this class file (4.7.23).
CONSTANT_Dynamic_info structures are unique in that they are syntactically allowed to refer to themselves via the bootstrap method table. Rather than mandating that such cycles are detected when classes are loaded (a potentially expensive check), we permit cycles initially but mandate a failure at resolution (5.4.3.6).
The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info structure (4.4.6). This constant_pool entry indicates a name and descriptor.
In a CONSTANT_Dynamic_info structure, the indicated descriptor must be a field descriptor (4.3.2).
In a CONSTANT_InvokeDynamic_info structure, the indicated name must be a valid method name (4.2.2) and the indicated descriptor must be a method descriptor (4.3.3). If the name is <clinit>, the method descriptor must not be ()V. If the name is <init>, the method descriptor must not have return descriptor V.
The check for a valid method name is a bug fix: names like x<y are valid as Dynamic "field" names, but not as InvokeDynamic "method" names.
The check prohibiting InvokeDynamic call sites that look like initialization method invocations previously appeared in verification (4.10.1.9.invokedynamic), but is a simple structural check that logically belongs here.
In practice, in JDK 12 some of these checks occur as part of format checking, and some only occur if a corresponding invokedynamic instruction is verified.
Each method, including each instance initialization method (2.9.1) and the class or interface initialization method (2.9.2), is described by a method_info structure.
No two methods in one class file may have the same name and descriptor (4.3.3).
The structure has the following format:
method_info {
    u2 access_flags;
    u2 name_index;
    u2 descriptor_index;
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items of the method_info structure are as follows:
The value of the access_flags item is a mask of flags used to denote access permission to and properties of this method. The interpretation of each flag, when set, is specified in Table 4.6-A.
Table 4.6-A. Method access and property flags
Flag Name | Value | Interpretation |
---|---|---|
ACC_PUBLIC | 0x0001 | Declared public; may be accessed from outside its package. |
ACC_PRIVATE | 0x0002 | Declared private; accessible only within the defining class and other classes belonging to the same nest (5.4.4). |
ACC_PROTECTED | 0x0004 | Declared protected; may be accessed within subclasses. |
ACC_STATIC | 0x0008 | Declared static. |
ACC_FINAL | 0x0010 | Declared final; must not be overridden (5.4.5). |
ACC_SYNCHRONIZED | 0x0020 | Declared synchronized; invocation is wrapped by a monitor use. |
ACC_BRIDGE | 0x0040 | A bridge method, generated by the compiler. |
ACC_VARARGS | 0x0080 | Declared with variable number of arguments. |
ACC_NATIVE | 0x0100 | Declared native; implemented in a language other than the Java programming language. |
ACC_ABSTRACT | 0x0400 | Declared abstract; no implementation is provided. |
ACC_STRICT | 0x0800 | Declared strictfp; floating-point mode is FP-strict. |
ACC_SYNTHETIC | 0x1000 | Declared synthetic; not present in the source code. |
An instance initialization method (that is, a method with name <init> and return descriptor V) may have at most one of its ACC_PUBLIC, ACC_PRIVATE, and ACC_PROTECTED flags set, and may also have its ACC_VARARGS, ACC_STRICT, and ACC_SYNTHETIC flags set, but must not have any of the other flags in Table 4.6-A set.
In a class file whose version number is 57.0 or above, a class or interface initialization method (that is, a method declared with name <clinit> and descriptor ()V) must have its ACC_STATIC flag set, and may also have its ACC_STRICT and ACC_SYNTHETIC flags set, but must not have any of the other flags in Table 4.6-A set.
In a class file whose version number is less than 57.0, a class or interface initialization method may have any of the flags in Table 4.6-A set, in any combination, but all flags except ACC_STATIC, ACC_STRICT, and ACC_SYNTHETIC are treated by the Java Virtual Machine as if they were not set. If the major version number is 51 to 56, inclusive, the ACC_STATIC flag must be set; if the major version number is 50 or below, the ACC_STATIC flag is treated by the Java Virtual Machine as if it were set.
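The version-dependent treatment of access flags on a class or interface initialization method can be summarized in two functions: one that decides whether the declared flags are legal, and one that computes the flags as the Java Virtual Machine treats them. This is a sketch under the proposed rules only; the class name `ClinitFlags`, its methods, and the mask constant `TABLE_4_6_A` are hypothetical.

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class ClinitFlags {
    static final int ACC_STATIC = 0x0008;
    static final int ACC_STRICT = 0x0800;
    static final int ACC_SYNTHETIC = 0x1000;
    /** Union of all flag values listed in Table 4.6-A. */
    static final int TABLE_4_6_A = 0x1DFF;
    /** The only flags that are meaningful on a <clinit> method with descriptor ()V. */
    static final int MEANINGFUL = ACC_STATIC | ACC_STRICT | ACC_SYNTHETIC;

    /** True if access_flags is legal for an initialization method at this major version. */
    static boolean flagsLegal(int accessFlags, int majorVersion) {
        boolean isStatic = (accessFlags & ACC_STATIC) != 0;
        if (majorVersion >= 57) {
            // ACC_STATIC required; no other Table 4.6-A flag beyond STRICT and SYNTHETIC.
            return isStatic && (accessFlags & TABLE_4_6_A & ~MEANINGFUL) == 0;
        }
        if (majorVersion >= 51) {
            return isStatic;       // 51 to 56: any combination, but ACC_STATIC must be set
        }
        return true;               // 50 or below: any combination is accepted
    }

    /** Flags as treated by the JVM, assuming flagsLegal returned true. */
    static int effectiveFlags(int accessFlags, int majorVersion) {
        int flags = accessFlags & MEANINGFUL;  // all other flags are treated as not set
        if (majorVersion <= 50) {
            flags |= ACC_STATIC;               // treated as if it were set
        }
        return flags;
    }
}
```

For example, a version 50 class file declaring `<clinit>` with only ACC_PUBLIC set is accepted, and the method is treated as if its flags were exactly ACC_STATIC.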
Methods of classes other than instance initialization methods and the class initialization method may have any of the flags in Table 4.6-A set. However, each non-initialization method of a class may have at most one of its ACC_PUBLIC, ACC_PRIVATE, and ACC_PROTECTED flags set (JLS §8.4.3).
Methods of interfaces other than the interface initialization method may have any of the flags in Table 4.6-A set except ACC_PROTECTED, ACC_FINAL, ACC_SYNCHRONIZED, and ACC_NATIVE (JLS §9.4); exactly one of the ACC_PUBLIC or ACC_PRIVATE flags must be set. In a class file whose version number is less than 52.0, each non-initialization method of an interface must have its ACC_PUBLIC and ACC_ABSTRACT flags set; in a class file whose version number is 52.0 or above, each method of an interface must have exactly one of its ACC_PUBLIC and ACC_PRIVATE flags set.
If a method of a class or interface has its ACC_ABSTRACT flag set, it must not have any of its ACC_PRIVATE, ACC_STATIC, ACC_FINAL, ACC_SYNCHRONIZED, ACC_NATIVE, or ACC_STRICT flags set.
An instance initialization method (2.9.1) may have at most one of its ACC_PUBLIC, ACC_PRIVATE, and ACC_PROTECTED flags set, and may also have its ACC_VARARGS, ACC_STRICT, and ACC_SYNTHETIC flags set, but must not have any of the other flags in Table 4.6-A set.
In a class file whose version number is 51.0 or above, a method whose name is <clinit> must have its ACC_STATIC flag set.
A class or interface initialization method (2.9.2) is called implicitly by the Java Virtual Machine. The value of its access_flags item is ignored except for the setting of the ACC_STATIC and ACC_STRICT flags, and the method is exempt from the preceding rules about legal combinations of flags.
The ACC_BRIDGE flag is used to indicate a bridge method generated by a compiler for the Java programming language.
The ACC_VARARGS flag indicates that this method takes a variable number of arguments at the source code level. A method declared to take a variable number of arguments must be compiled with the ACC_VARARGS flag set to 1. All other methods must be compiled with the ACC_VARARGS flag set to 0.
That's how all flags work. No need to be so descriptive about this one flag.
The ACC_SYNTHETIC flag indicates that this method was generated by a compiler and does not appear in source code, and is not one of the methods named in 4.7.8. (Replaced wording: unless it is one of the methods named in 4.7.8.)
All bits of the access_flags item not assigned in Table 4.6-A are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing a valid unqualified method name (4.2.2). (Replaced wording: representing either a valid unqualified name denoting a method (4.2.2), or (if this method is in a class rather than an interface) the special method name <init>, or the special method name <clinit>.)
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure representing a valid method descriptor (4.3.3). Furthermore:
If this method is in a class rather than an interface, and the name of the method is <init>, then the descriptor must denote a void method.
If the name of the method is <clinit>, then the descriptor must denote a void method, and, in a class file whose version number is 51.0 or above, a method that takes no arguments.
If this method is declared in an interface, and the name of the method is <init>, then the descriptor must not denote a void method.
Instance initialization methods are not allowed in interfaces.
A future edition of this specification may require that the last parameter descriptor of the method descriptor is an array type if the ACC_VARARGS flag is set in the access_flags item.
The value of the attributes_count item indicates the number of additional attributes of this method.
Each value of the attributes table must be an attribute_info structure (4.7).
A method can have any number of optional attributes associated with it.
The attributes defined by this specification as appearing in the attributes table of a method_info structure are listed in Table 4.7-C.
The rules concerning attributes defined to appear in the attributes table of a method_info structure are given in 4.7.
The rules concerning non-predefined attributes in the attributes table of a method_info structure are given in 4.7.1.
Code Attribute
The Code attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). A Code attribute contains the Java Virtual Machine instructions and auxiliary information for a method, including an instance initialization method and a class or interface initialization method (2.9.1, 2.9.2).
If the method is either native or abstract, and is not a class or interface initialization method, then its method_info structure must not have a Code attribute in its attributes table. Otherwise, its method_info structure must have exactly one Code attribute in its attributes table.
The Code attribute has the following format:
Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items of the Code_attribute structure are as follows:
The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing the string "Code".
The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
The value of the max_stack item gives the maximum depth of the operand stack of this method (2.6.2) at any point during execution of the method.
The value of the max_locals item gives the number of local variables in the local variable array allocated upon invocation of this method (2.6.1), including the local variables used to pass parameters to the method on its invocation.
The greatest local variable index for a value of type long or double is max_locals - 2. The greatest local variable index for a value of any other type is max_locals - 1.
The value of the code_length item gives the number of bytes in the code array for this method.
The value of code_length must be greater than zero (as the code array must not be empty) and less than 65536.
The code array gives the actual bytes of Java Virtual Machine code that implement the method.
When the code array is read into memory on a byte-addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned. (Refer to the descriptions of those instructions for more information on the consequences of code array alignment.)
The detailed constraints on the contents of the code array are extensive and are given in a separate section (4.9).
The value of the exception_table_length item gives the number of entries in the exception_table table.
Each entry in the exception_table array describes one exception handler in the code array. The order of the handlers in the exception_table array is significant (2.10).
Each exception_table entry contains the following four items:
The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc must be a valid index into the code array of the opcode of an instruction. The value of end_pc either must be a valid index into the code array of the opcode of an instruction or must be equal to code_length, the length of the code array. The value of start_pc must be less than the value of end_pc.
The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must be active while the program counter is within the interval [start_pc, end_pc).
The fact that end_pc is exclusive is a historical mistake in the design of the Java Virtual Machine: if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is 1 byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.
The value of the handler_pc item indicates the start of the exception handler. The value of the item must be a valid index into the code array and must be the index of the opcode of an instruction.
If the value of the catch_type item is nonzero, it must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure (4.4.1) representing a class of exceptions that this exception handler is designated to catch. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.
The verifier checks that the class is Throwable or a subclass of Throwable (4.9.2).
If the value of the catch_type item is zero, this exception handler is called for all exceptions.
This is used to implement finally (3.13).
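The half-open interval used by start_pc and end_pc in an exception_table entry can be illustrated with two small predicates. This is a sketch only: the class name `ExceptionRanges` and its methods are hypothetical, and the requirement that both values fall on instruction opcodes is deliberately left out, since checking it needs the decoded instruction boundaries.

```java
// Hypothetical helper for illustration; not defined by the JVMS.
final class ExceptionRanges {
    /**
     * Checks the numeric constraints on a handler range: start_pc is a valid
     * index into the code array, start_pc < end_pc, and end_pc <= code_length.
     * Omits the rule that both values must fall on instruction opcodes.
     */
    static boolean rangeWellFormed(int startPc, int endPc, int codeLength) {
        return 0 <= startPc && startPc < endPc && endPc <= codeLength;
    }

    /** True if the handler is active at pc: start_pc inclusive, end_pc exclusive. */
    static boolean covers(int startPc, int endPc, int pc) {
        return startPc <= pc && pc < endPc;
    }
}
```

Because end_pc is exclusive, a handler that protects the whole code array uses end_pc equal to code_length, which is why that value is allowed even though it is not a valid index into the array.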
The value of the attributes_count item indicates the number of attributes of the Code attribute.
Each value of the attributes table must be an attribute_info structure (4.7).
A Code attribute can have any number of optional attributes associated with it.
The attributes defined by this specification as appearing in the attributes table of a Code attribute are listed in Table 4.7-C.
The rules concerning attributes defined to appear in the attributes table of a Code attribute are given in 4.7.
The rules concerning non-predefined attributes in the attributes table of a Code attribute are given in 4.7.1.
The code for a method, instance initialization method (2.9.1), or class or interface initialization method (2.9.2) is stored in the code array of the Code attribute of a method_info structure of a class file (4.7.3). This section describes the constraints associated with the contents of the Code_attribute structure.
The static constraints on a class file are those defining the well-formedness of the file. These constraints have been given in the previous sections, except for static constraints on the code in the class file. The static constraints on the code in a class file specify how Java Virtual Machine instructions must be laid out in the code array and what the operands of individual instructions must be.
The static constraints on the instructions in the code
array are as follows:
Only instances of the instructions documented in 6.5 may appear in the code
array. Instances of instructions using the reserved opcodes (6.2) or any opcodes not documented in this specification must not appear in the code
array.
If the class file version number is 51.0 or above, then neither the jsr opcode nor the jsr_w opcode may appear in the code array.
The opcode of the first instruction in the code
array begins at index 0
.
For each instruction in the code
array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its operands.
The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.
The last byte of the last instruction in the code
array must be the byte at index code_length - 1
.
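The layout constraints above amount to a single linear walk over the code array. The following Python sketch is illustrative only: the function name `instruction_offsets` and the table `LENGTHS` are hypothetical, the table covers only a handful of fixed-length opcodes, and variable-length instructions (tableswitch, lookupswitch, wide) are deliberately left out.

```python
# Lengths (opcode byte plus operand bytes) for a small, illustrative subset
# of fixed-length instructions. A real checker needs the full instruction
# set and special handling for tableswitch, lookupswitch, and wide.
LENGTHS = {
    0x00: 1,  # nop
    0x03: 1,  # iconst_0
    0x10: 2,  # bipush
    0x11: 3,  # sipush
    0x2A: 1,  # aload_0
    0xA7: 3,  # goto
    0xAC: 1,  # ireturn
    0xB1: 1,  # return
    0xB7: 3,  # invokespecial
}


def instruction_offsets(code):
    """Return the opcode offsets found in `code` (a bytes object).

    Raises ValueError if an opcode is outside the subset above or if the
    last instruction does not end exactly at index code_length - 1.
    """
    offsets = []
    index = 0  # the opcode of the first instruction is at index 0
    while index < len(code):
        opcode = code[index]
        if opcode not in LENGTHS:
            raise ValueError("unsupported opcode %#04x at index %d" % (opcode, index))
        offsets.append(index)
        # Next opcode index = current opcode index + instruction length.
        index += LENGTHS[opcode]
    if index != len(code):
        raise ValueError("last instruction runs past the end of the code array")
    return offsets
```

For the five-byte sequence aload_0, invokespecial #1, return, the walk yields the offsets 0, 1, and 4; a lone bipush opcode with no operand byte is rejected because the instruction would extend past the end of the array.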
The static constraints on the operands of instructions in the code
array are as follows:
The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, ifle, iflt, ifge, ifgt, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmple, if_icmplt, if_icmpge, if_icmpgt, if_acmpeq, if_acmpne) must be the opcode of an instruction within this method.
The target of a jump or branch instruction must never be the opcode used to specify the operation to be modified by a wide instruction; a jump or branch target may be the wide instruction itself.
Each target, including the default, of each tableswitch instruction must be the opcode of an instruction within this method.
Each tableswitch instruction must have a number of entries in its jump table that is consistent with the value of its low and high jump table operands, and its low value must be less than or equal to its high value.
No target of a tableswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.
Each target, including the default, of each lookupswitch instruction must be the opcode of an instruction within this method.
Each lookupswitch instruction must have a number of match-offset pairs that is consistent with the value of its npairs operand. The match-offset pairs must be sorted in increasing numerical order by signed match value.
No target of a lookupswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a wide instruction itself.
The operands of each ldc instruction and each ldc_w instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be loadable (4.4), and not any of the following:
An entry of kind CONSTANT_Long
or CONSTANT_Double
.
An entry of kind CONSTANT_Dynamic
that references a CONSTANT_NameAndType_info
structure which indicates a descriptor of J
(denoting long
) or D
(denoting double
).
The operands of each ldc2_w instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be loadable, and in particular one of the following:
An entry of kind CONSTANT_Long
or CONSTANT_Double
.
An entry of kind CONSTANT_Dynamic
that references a CONSTANT_NameAndType_info
structure which indicates a descriptor of J
(denoting long
) or D
(denoting double
).
The subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.
The operands of each getfield, putfield, getstatic, and putstatic instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_Fieldref
.
The indexbyte operands of each invokevirtual instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_Methodref
.
The indexbyte operands of each invokespecial and invokestatic instruction must represent a valid index into the constant_pool
table. If the class
file version number is less than 52.0, the constant pool entry referenced by that index must be of kind CONSTANT_Methodref
; if the class
file version number is 52.0 or above, the constant pool entry referenced by that index must be of kind CONSTANT_Methodref
or CONSTANT_InterfaceMethodref
.
The indexbyte operands of each invokeinterface instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_InterfaceMethodref
.
The indexbyte operands of each invokevirtual, invokeinterface, invokespecial, and invokestatic instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_Methodref
or CONSTANT_InterfaceMethodref
.
An invokevirtual instruction must reference a constant pool entry of kind CONSTANT_Methodref
.
An invokeinterface instruction must reference a constant pool entry of kind CONSTANT_InterfaceMethodref
.
In a class file whose version number is less than 52.0, an invokespecial or an invokestatic instruction must reference a constant pool entry of kind CONSTANT_Methodref
.
The CONSTANT_Methodref
or CONSTANT_InterfaceMethodref
referenced by an invokevirtual, invokeinterface, invokespecial, or invokestatic instruction must not have name <clinit>
with descriptor ()V
.
The CONSTANT_Methodref
or CONSTANT_InterfaceMethodref
referenced by an invokevirtual, invokeinterface, or invokestatic instruction must not have name <init>
with return descriptor V
.
The value of the count operand of each invokeinterface instruction must reflect the number of local variables necessary to store the arguments to be passed to the interface method, as implied by the descriptor of the CONSTANT_NameAndType_info
structure referenced by the CONSTANT_InterfaceMethodref
constant pool entry.
The fourth operand byte of each invokeinterface instruction must have the value zero.
The indexbyte operands of each invokedynamic instruction must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_InvokeDynamic
.
The third and fourth operand bytes of each invokedynamic instruction must have the value zero.
Only the invokespecial instruction is allowed to invoke an instance initialization method (2.9.1).
No other method whose name begins with the character '<
' ('\u003c
') may be called by the method invocation instructions. In particular, the class or interface initialization method specially named <clinit>
is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
The operands of each instanceof, checkcast, new, and anewarray instruction, and the indexbyte operands of each multianewarray instruction, must represent a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of kind CONSTANT_Class
.
No new instruction may reference a constant pool entry of kind CONSTANT_Class
that represents an array type (4.3.2). The new instruction cannot be used to create an array.
No anewarray instruction may be used to create an array of more than 255 dimensions.
A multianewarray instruction must be used only to create an array of a type that has at least as many dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not required to create all of the dimensions of the array type referenced by its indexbyte operands, it must not attempt to create more dimensions than are in the array type.
The dimensions operand of each multianewarray instruction must not be zero.
The atype operand of each newarray instruction must take one of the values T_BOOLEAN
(4), T_CHAR
(5), T_FLOAT
(6), T_DOUBLE
(7), T_BYTE
(8), T_SHORT
(9), T_INT
(10), or T_LONG
(11).
The index operand of each iload, fload, aload, istore, fstore, astore, iinc, and ret instruction must be a non-negative integer no greater than max_locals - 1
.
The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and astore_<n> instruction must be no greater than max_locals - 1
.
The index operand of each lload, dload, lstore, and dstore instruction must be no greater than max_locals - 2
.
The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be no greater than max_locals - 2
.
The indexbyte operands of each wide instruction modifying an iload, fload, aload, istore, fstore, astore, iinc, or ret instruction must represent a non-negative integer no greater than max_locals - 1
.
The indexbyte operands of each wide instruction modifying an lload, dload, lstore, or dstore instruction must represent a non-negative integer no greater than max_locals - 2
.
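The two name-based constraints on method references, as worded above in terms of the name <clinit> with descriptor ()V and the name <init> with return descriptor V, can be sketched as a small predicate. This is a hedged illustration, not a definitive implementation: the function name `invoke_reference_allowed` is hypothetical, and the sketch assumes the name and descriptor have already been read from the referenced CONSTANT_NameAndType_info structure.

```python
def invoke_reference_allowed(opcode, name, descriptor):
    """Apply the two name-based static constraints on method references.

    `opcode` is one of "invokevirtual", "invokeinterface", "invokespecial",
    or "invokestatic"; `name` and `descriptor` come from the constant pool.
    """
    # No invocation instruction may reference <clinit> with descriptor ()V.
    if name == "<clinit>" and descriptor == "()V":
        return False
    # Only invokespecial may reference <init> with return descriptor V.
    if name == "<init>" and descriptor.endswith(")V") and opcode != "invokespecial":
        return False
    # Any other use of these names is an ordinary method reference.
    return True
```

Under these rules a reference such as `<init>` with descriptor `()Ljava/lang/String;` is treated like any other method and may be the operand of invokestatic.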
The structural constraints on the code
array specify constraints on relationships between Java Virtual Machine instructions. The structural constraints are as follows:
Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.
An instruction operating on values of type int
is also permitted to operate on values of type boolean
, byte
, char
, and short
.
(As noted in 2.3.4 and 2.11.1, the Java Virtual Machine internally converts values of types boolean, byte, short, and char to type int.)
If an instruction can be executed along several different execution paths, the operand stack must have the same depth (2.6.2) prior to the execution of the instruction, regardless of the path taken.
At no point during execution can the operand stack grow to a depth greater than that implied by the max_stack
item.
At no point during execution can more values be popped from the operand stack than it contains.
At no point during execution can the order of the local variable pair holding a value of type long
or double
be reversed or the pair split up. At no point can the local variables of such a pair be operated on individually.
No local variable (or local variable pair, in the case of a value of type long
or double
) can be accessed before it is assigned a value.
Each invokespecial instruction must name one of the following:
an instance initialization method (2.9.1) (that is, an <init>
method with return descriptor V
)
a method in the current class or interface
a method in a superclass of the current class
a method in a direct superinterface of the current class or interface
a method in Object
If an invokespecial instruction names an instance initialization method, then the target reference on the operand stack must be an uninitialized class instance. An instance initialization method must never be invoked on an initialized class instance. In addition:
If the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.
If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.
If an invokespecial instruction names a method which is not an instance initialization method, then the target reference on the operand stack must be a class instance whose type is assignment compatible with the current class (JLS §5.2).
The general rule for invokespecial is that the class or interface named by invokespecial must be "above" the caller class or interface, while the receiver object targeted by invokespecial must be "at" or "below" the caller class or interface. The latter clause is especially important: a class or interface can only perform invokespecial on its own objects. See 4.10.1.9.invokespecial for an explanation of how the latter clause is implemented in Prolog.
Each instance initialization method, except for the instance initialization method derived from the constructor of class Object
, must call either another instance initialization method of this
or an instance initialization method of its direct superclass super
before its instance members are accessed.
However, instance fields of this
that are declared in the current class may be assigned by putfield before calling any instance initialization method.
When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.
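If there is an uninitialized class instance in a local variable in code protected by an exception handler, then i) if the handler is inside an <init> method, the handler must throw an exception or loop forever; and ii) if the handler is not inside an <init> method, the uninitialized class instance must remain uninitialized.
There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.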
The type of every class instance that is the target of a method invocation instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class or interface type specified in the instruction.
The types of the arguments to each method invocation must be method invocation compatible with the method descriptor (JLS §5.3, 4.3.3).
Each return instruction must match its method's return type:
If the method returns a boolean
, byte
, char
, short
, or int
, only the ireturn instruction may be used.
If the method returns a float
, long
, or double
, only an freturn, lreturn, or dreturn instruction, respectively, may be used.
If the method returns a reference
type, only an areturn instruction may be used, and the type of the returned value must be assignment compatible with the return descriptor of the method (4.3.3).
All instance initialization methods, class or interface initialization methods, and methods declared to return void
must use only the return instruction.
The type of every class instance accessed by a getfield instruction or modified by a putfield instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class type specified in the instruction.
The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor of the field (4.3.2) of the class instance or class being stored into:
If the descriptor type is boolean
, byte
, char
, short
, or int
, then the value must be an int
.
If the descriptor type is float
, long
, or double
, then the value must be a float
, long
, or double
, respectively.
If the descriptor type is a reference
type, then the value must be of a type that is assignment compatible with the descriptor type.
The type of every value stored into an array by an aastore instruction must be a reference
type.
The component type of the array being stored into by the aastore instruction must also be a reference
type.
Each athrow instruction must throw only values that are instances of class Throwable
or of subclasses of Throwable
.
Each class mentioned in a catch_type
item of the exception_table
array of the method's Code_attribute
structure must be Throwable
or a subclass of Throwable
.
If getfield or putfield is used to access a protected
field declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.
If invokevirtual or invokespecial is used to access a protected
method declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.
Execution never falls off the bottom of the code
array.
No return address (a value of type returnAddress
) may be loaded from a local variable.
The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.
No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using try
-finally
constructs from within a finally
clause.)
Each instance of type returnAddress
can be returned to at most once.
If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress
, then that instance can never be used as a return address.
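Among the structural constraints above, the rule that each return instruction must match its method's return type is simple enough to sketch directly. The Python below is a minimal illustration under stated assumptions: the names `PRIMITIVE_RETURNS` and `required_return` are hypothetical, and the input is assumed to be a well-formed method descriptor (4.3.3). It does not check assignment compatibility of the value returned by areturn.

```python
# Return instruction required for each primitive (or void) return descriptor.
PRIMITIVE_RETURNS = {
    "Z": "ireturn", "B": "ireturn", "C": "ireturn", "S": "ireturn", "I": "ireturn",
    "F": "freturn",
    "J": "lreturn",
    "D": "dreturn",
    "V": "return",
}


def required_return(method_descriptor):
    """Name the only return instruction permitted for this descriptor."""
    # The return descriptor is everything after the closing parenthesis.
    return_descriptor = method_descriptor[method_descriptor.rindex(")") + 1:]
    first = return_descriptor[0]
    if first in ("L", "["):
        # Class, interface, and array types are all reference types.
        return "areturn"
    return PRIMITIVE_RETURNS[first]
```

Instance initialization methods and class or interface initialization methods always have return descriptor V, so the sketch maps them to the return instruction without a special case.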
class Files
We stipulate the existence of 28 Prolog predicates ("accessors") that have certain expected behavior but whose formal definitions are not given in this specification.
classClassName(Class, ClassName)
Extracts the name, ClassName, of the class Class.
classIsInterface(Class)
True iff the class, Class, is an interface.
classIsNotFinal(Class)
True iff the class, Class, is not a final class.
classSuperClassName(Class, SuperClassName)
Extracts the name, SuperClassName, of the superclass of class Class.
classInterfaces(Class, Interfaces)
Extracts a list, Interfaces, of the direct superinterfaces of the class Class.
classMethods(Class, Methods)
Extracts a list, Methods, of the methods declared in the class Class.
classAttributes(Class, Attributes)
Extracts a list, Attributes, of the attributes of the class Class.
Each attribute is represented as a functor application of the form attribute(AttributeName, AttributeContents), where AttributeName is the name of the attribute. The format of the attribute's contents is unspecified.
classDefiningLoader(Class, Loader)
Extracts the defining class loader, Loader, of the class Class.
isBootstrapLoader(Loader)
True iff the class loader Loader is the bootstrap class loader.
loadedClass(Name, InitiatingLoader, ClassDefinition)
True iff there exists a class named Name whose representation (in accordance with this specification) when loaded by the class loader InitiatingLoader is ClassDefinition.
methodName(Method, Name)
Extracts the name, Name, of the method Method.
methodAccessFlags(Method, AccessFlags)
Extracts the access flags, AccessFlags, of the method Method.
methodDescriptor(Method, Descriptor)
Extracts the descriptor, Descriptor, of the method Method.
methodAttributes(Method, Attributes)
Extracts a list, Attributes, of the attributes of the method Method.
isInit(Method)
True iff Method (regardless of class) is an instance initialization method (2.9.1).
isNotInit(Method)
True iff Method (regardless of class) is not an instance initialization method.
isNotFinal(Method, Class)
True iff Method in class Class is not final.
isStatic(Method, Class)
True iff Method in class Class is static.
isNotStatic(Method, Class)
True iff Method in class Class is not static.
isPrivate(Method, Class)
True iff Method in class Class is private.
isNotPrivate(Method, Class)
True iff Method in class Class is not private.
isProtected(MemberClass, MemberName, MemberDescriptor)
True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is protected.
isNotProtected(MemberClass, MemberName, MemberDescriptor)
True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is not protected.
parseFieldDescriptor(Descriptor, Type)
Converts a field descriptor, Descriptor, into the corresponding verification type Type (4.10.1.2).
parseMethodDescriptor(Descriptor, ArgTypeList, ReturnType)
Converts a method descriptor, Descriptor, into a list of verification types, ArgTypeList, corresponding to the method argument types, and a verification type, ReturnType, corresponding to the return type.
parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)
Extracts the instruction stream, ParsedCode, of the method Method in Class, as well as the maximum operand stack size, MaxStack, the maximal number of local variables, FrameSize, the exception handlers, Handlers, and the stack map StackMap.
The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.
samePackageName(Class1, Class2)
True iff the package names of Class1 and Class2 are the same.
differentPackageName(Class1, Class2)
True iff the package names of Class1 and Class2 are different.
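When type checking a method's body, it is convenient to access information about the method. For this purpose, we define an environment, a six-tuple consisting of:
a class
a method
the declared return type of the method
the instructions in a method
the maximal size of the operand stack
a list of exception handlers
We specify accessors to extract information from the environment.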
allInstructions(Environment, Instructions) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              Instructions, _, _).

exceptionHandlers(Environment, Handlers) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              _Instructions, _, Handlers).

maxOperandStackLength(Environment, MaxStack) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              _Instructions, MaxStack, _Handlers).

thisClass(Environment, class(ClassName, L)) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _),
    classDefiningLoader(Class, L),
    classClassName(Class, ClassName).

thisMethodReturnType(Environment, ReturnType) :-
    Environment = environment(_Class, _Method, ReturnType,
                              _Instructions, _, _).
We specify additional predicates to extract higher-level information from the environment.
offsetStackFrame(Environment, Offset, StackFrame) :-
    allInstructions(Environment, Instructions),
    member(stackMap(Offset, StackFrame), Instructions).

currentClassLoader(Environment, Loader) :-
    thisClass(Environment, class(_, Loader)).
Finally, we specify a general predicate used throughout the type rules:
notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).
The principle guiding the determination as to which accessors are stipulated and which are fully specified is that we do not want to over-specify the representation of the class file. Providing specific accessors to the Class or Method term would force us to completely specify the format for a Prolog term representing the class file.
Individual bytecode instructions are represented in Prolog as terms whose functor is the name of the instruction and whose arguments are its parsed operands.
For example, an aload instruction is represented as the term aload(N), which includes the index N that is the operand of the instruction.
The instructions as a whole are represented as a list of terms of the form:
instruction(Offset, AnInstruction)
For example,
instruction(21, aload(1))
.
The order of instructions in this list must be the same as in the class
file.
Some instructions have operands that refer to entries in the constant_pool
table representing fields, methods, and dynamically-computed call sites. Such entries are represented as functor applications of the form:
field(FieldClassName, FieldName, FieldDescriptor)
for a constant pool entry that is a CONSTANT_Fieldref_info
structure (4.4.2).
FieldClassName
is the name of the class referenced by the class_index
item in the structure. FieldName
and FieldDescriptor
correspond to the name and field descriptor referenced by the name_and_type_index
item of the structure.
method(MethodClassName, MethodName, MethodDescriptor)
for a constant pool entry that is a CONSTANT_Methodref_info
structure (4.4.2).
MethodClassName
is the name of the class referenced by the class_index
item of the structure. MethodName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure.
imethod(MethodIntfName, MethodName, MethodDescriptor)
for a constant pool entry that is a CONSTANT_InterfaceMethodref_info
structure (4.4.2).
MethodIntfName
is the name of the interface referenced by the class_index
item of the structure. MethodName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure.
dmethod(CallSiteName, MethodDescriptor)
for a constant pool entry that is a CONSTANT_InvokeDynamic_info
structure (4.4.10).
CallSiteName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure. (The bootstrap_method_attr_index
item is irrelevant to verification.)
For clarity, we assume that field and method descriptors (4.3.2, 4.3.3) are mapped into more readable names: the leading L
and trailing ;
are dropped from class names, and the BaseType characters used for primitive types are mapped to the names of those types.
For example, a getfield instruction whose operand refers to a constant pool entry representing a field foo of type F in class Bar would be represented as getfield(field('Bar', 'foo', 'F')).
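The mapping from field descriptors to readable names described above can be sketched as follows. This Python is illustrative only: the names `BASE_TYPES` and `readable_field_type` are hypothetical, and because the text does not say how array descriptors are rendered, the sketch assumes they are left unchanged.

```python
# BaseType characters (4.3.2) mapped to the names of the primitive types.
BASE_TYPES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean",
}


def readable_field_type(descriptor):
    """Map a field descriptor to the more readable name used in the rules."""
    if descriptor in BASE_TYPES:
        return BASE_TYPES[descriptor]
    if descriptor.startswith("L") and descriptor.endswith(";"):
        # Drop the leading L and the trailing ; from class names.
        return descriptor[1:-1]
    # Assumption: array descriptors such as [I are passed through as-is.
    return descriptor
```

With this mapping, the descriptor `LF;` of the field foo in the example becomes the atom 'F' that appears in the term for getfield.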
The ldc instruction, among others, has an operand that refers to a loadable entry in the constant_pool
table. There are nine kinds of loadable entry (see Table 4.4-C), represented by functor applications of the following forms:
int(Value)
for a constant pool entry that is a CONSTANT_Integer_info
structure (4.4.4).
Value
is the int
constant represented by the bytes
item of the structure.
For example, an ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).
float(Value)
for a constant pool entry that is a CONSTANT_Float_info
structure (4.4.4).
Value
is the float
constant represented by the bytes
item of the structure.
long(Value)
for a constant pool entry that is a CONSTANT_Long_info
structure (4.4.5).
Value
is the long
constant represented by the high_bytes
and low_bytes
items of the structure.
double(Value)
for a constant pool entry that is a CONSTANT_Double_info
structure (4.4.5).
Value
is the double
constant represented by the high_bytes
and low_bytes
items of the structure.
class(ClassName)
for a constant pool entry that is a CONSTANT_Class_info
structure (4.4.1).
ClassName
is the name of the class or interface referenced by the name_index
item in the structure.
string(Value)
for a constant pool entry that is a CONSTANT_String_info
structure (4.4.3).
Value
is the string referenced by the string_index
item of the structure.
methodHandle(Kind, Reference)
for a constant pool entry that is a CONSTANT_MethodHandle_info
structure (4.4.8).
Kind
is the value of the reference_kind
item of the structure. Reference
is the value of the reference_index
item of the structure.
methodType(MethodDescriptor)
for a constant pool entry that is a CONSTANT_MethodType_info
structure (4.4.9).
MethodDescriptor
is the method descriptor referenced by the descriptor_index
item of the structure.
dconstant(ConstantName, FieldDescriptor)
for a constant pool entry that is a CONSTANT_Dynamic_info
structure (4.4.10).
ConstantName
and FieldDescriptor
correspond to the name and field descriptor referenced by the name_and_type_index
item of the structure. (The bootstrap_method_attr_index
item is irrelevant to verification.)
To interpret method invocation instructions, the following predicates identify references to methods that are instance, class, or interface initialization methods.
isInitReference(MethodRef) :-
    MethodRef = method(_, '<init>', MethodDescriptor),
    parseMethodDescriptor(MethodDescriptor, _, void).

isInitReference(MethodRef) :-
    MethodRef = imethod(_, '<init>', MethodDescriptor),
    parseMethodDescriptor(MethodDescriptor, _, void).

isClinitReference(MethodRef) :-
    MethodRef = method(_, '<clinit>', MethodDescriptor),
    parseMethodDescriptor(MethodDescriptor, [], void).

isClinitReference(MethodRef) :-
    MethodRef = imethod(_, '<clinit>', MethodDescriptor),
    parseMethodDescriptor(MethodDescriptor, [], void).
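The two predicates above rely on splitting a method descriptor into its argument types and return type. The Python sketch below shows one way this could work; it is not the specification's definition. The function names are hypothetical, the input is assumed to be a well-formed descriptor, and where the Prolog parseMethodDescriptor yields verification types, this sketch keeps the raw descriptor strings.

```python
def parse_method_descriptor(descriptor):
    """Split a method descriptor into (argument descriptors, return descriptor)."""
    if not descriptor.startswith("("):
        raise ValueError("a method descriptor must start with '('")
    close = descriptor.index(")")
    params, ret = descriptor[1:close], descriptor[close + 1:]
    args, i = [], 0
    while i < len(params):
        start = i
        while params[i] == "[":   # array dimensions prefix the element type
            i += 1
        if params[i] == "L":      # a class type runs to the next ';'
            i = params.index(";", i)
        i += 1
        args.append(params[start:i])
    return args, ret


def is_init_reference(name, descriptor):
    """Name <init>, any arguments, return descriptor V."""
    _args, ret = parse_method_descriptor(descriptor)
    return name == "<init>" and ret == "V"


def is_clinit_reference(name, descriptor):
    """Name <clinit>, no arguments, return descriptor V."""
    args, ret = parse_method_descriptor(descriptor)
    return name == "<clinit>" and args == [] and ret == "V"
```

A reference named `<init>` whose descriptor returns a reference type therefore fails `is_init_reference` and is handled as a regular method reference.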
Non-abstract, non-native methods are type correct if they have code and the code is type correct.
methodIsTypeSafe(Class, Method) :-
    doesNotOverrideFinalMethod(Class, Method),
    methodAccessFlags(Method, AccessFlags),
    methodAttributes(Method, Attributes),
    notMember(native, AccessFlags),
    notMember(abstract, AccessFlags),
    member(attribute('Code', _), Attributes),
    methodWithCodeIsTypeSafe(Class, Method).
A method with code is type safe if it is possible to merge the code and the stack map frames into a single stream such that each stack map frame precedes the instruction it corresponds to, and the merged stream is type correct. The method's exception handlers, if any, must also be legal.
methodWithCodeIsTypeSafe(Class, Method) :-
    parseCodeAttribute(Class, Method, FrameSize, MaxStack,
                       ParsedCode, Handlers, StackMap),
    mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
    methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
    Environment = environment(Class, Method, ReturnType, MergedCode,
                              MaxStack, Handlers),
    handlersAreLegal(Environment),
    mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).
Let us consider exception handlers first.
An exception handler is represented by a functor application of the form:
handler(Start, End, Target, ClassName)
whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.
An exception handler is legal if its start (Start
) is less than its end (End
), there exists an instruction whose offset is equal to Start
, there exists an instruction whose offset equals End
, and the handler's exception class is assignable to the class Throwable
. The exception class of a handler is Throwable
if the handler's class entry is 0, otherwise it is the class named in the handler.
An additional requirement exists for a handler inside an instance initialization method if one of the instructions covered by the handler is invokespecial of an instance initialization method. In this case, the fact that a handler is running means the object under construction is likely broken, so it is important that the handler does not swallow the exception and allow the enclosing instance initialization method to return normally to the caller. Accordingly, the handler is required to either complete abruptly by throwing an exception to the caller of the enclosing instance initialization method, or to loop forever.
handlersAreLegal(Environment) :-
    exceptionHandlers(Environment, Handlers),
    checklist(handlerIsLegal(Environment), Handlers).

handlerIsLegal(Environment, Handler) :-
    Handler = handler(Start, End, Target, _),
    Start < End,
    allInstructions(Environment, Instructions),
    member(instruction(Start, _), Instructions),
    offsetStackFrame(Environment, Target, _),
    instructionsIncludeEnd(Instructions, End),
    currentClassLoader(Environment, CurrentLoader),
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    isBootstrapLoader(BL),
    isAssignable(ExceptionClass, class('java/lang/Throwable', BL)),
    initHandlerIsLegal(Environment, Handler).

instructionsIncludeEnd(Instructions, End) :-
    member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
    member(endOfCode(End), Instructions).

handlerExceptionClass(handler(_, _, _, 0),
                      class('java/lang/Throwable', BL), _) :-
    isBootstrapLoader(BL).

handlerExceptionClass(handler(_, _, _, Name),
                      class(Name, L), L) :-
    Name \= 0.

initHandlerIsLegal(Environment, Handler) :-
    notInitHandler(Environment, Handler).

notInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isNotInit(Method).

notInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method),
    member(instruction(_, invokespecial(CP)), Instructions),
    CP = method(MethodClassName, MethodName, Descriptor),
    MethodName \= '<init>'.

notInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method),
    member(instruction(_, invokespecial(CP)), Instructions),
    \+ isInitReference(CP).

initHandlerIsLegal(Environment, Handler) :-
    isInitHandler(Environment, Handler),
    sublist(isApplicableInstruction(Target), Instructions,
            HandlerInstructions),
    noAttemptToReturnNormally(HandlerInstructions).

isInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method),
    member(instruction(_, invokespecial(CP)), Instructions),
    CP = method(MethodClassName, '<init>', Descriptor).

isInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method),
    member(instruction(_, invokespecial(CP)), Instructions),
    isInitReference(CP).

isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
    Offset >= HandlerStart.

noAttemptToReturnNormally(Instructions) :-
    notMember(instruction(_, return), Instructions).

noAttemptToReturnNormally(Instructions) :-
    member(instruction(_, athrow), Instructions).
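Two parts of the handler rules above lend themselves to a short illustration: the range conditions checked by handlerIsLegal, and the disjunction in noAttemptToReturnNormally. The Python below is a sketch under stated assumptions, not a replacement for the Prolog. Both function names are hypothetical, instructions are modeled as (offset, name) pairs, and the endOfCode(End) term is modeled by a separate `code_length` argument. The checks on the handler target's stack map frame and on the exception class are omitted.

```python
def handler_range_is_legal(start, end, instruction_offsets, code_length):
    """Start < End, Start is an instruction offset, and End is either an
    instruction offset or the end of the code."""
    offsets = set(instruction_offsets)
    return (
        start < end
        and start in offsets
        and (end in offsets or end == code_length)
    )


def no_attempt_to_return_normally(handler_instructions):
    """Mirror the two Prolog clauses: the handler code contains no return
    instruction, or it contains an athrow instruction."""
    names = [name for _offset, name in handler_instructions]
    return "return" not in names or "athrow" in names
```

Note that the second function, like the Prolog it mirrors, accepts handler code that contains both a return and an athrow instruction.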
Let us now turn to the stream of instructions and stack map frames.
Merging instructions and stack map frames into a single stream involves four cases:
Merging an empty StackMap and a list of instructions yields the original list of instructions.
mergeStackMapAndCode([], CodeList, CodeList).
Given a list of stack map frames beginning with the type state for the instruction at Offset, and a list of instructions beginning at Offset, the merged list is the head of the stack map frame list, followed by the head of the instruction list, followed by the merge of the tails of the two lists.
mergeStackMapAndCode([stackMap(Offset, Map) | RestMap],
[instruction(Offset, Parse) | RestCode],
[stackMap(Offset, Map),
instruction(Offset, Parse) | RestMerge]) :-
mergeStackMapAndCode(RestMap, RestCode, RestMerge).
Otherwise, given a list of stack map frames beginning with the type state for the instruction at OffsetM, and a list of instructions beginning at OffsetP, then, if OffsetP < OffsetM, the merged list consists of the head of the instruction list, followed by the merge of the stack map frame list and the tail of the instruction list.
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
[instruction(OffsetP, Parse) | RestCode],
[instruction(OffsetP, Parse) | RestMerge]) :-
OffsetP < OffsetM,
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
RestCode, RestMerge).
Otherwise, the merge of the two lists is undefined. Since the instruction list has monotonically increasing offsets, the merge of the two lists is not defined unless every stack map frame offset has a corresponding instruction offset and the stack map frames are in monotonically increasing order.
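As an illustration of the cases above, the following sketch repeats the three clauses and merges a small, hypothetical method body with a single stack map frame. The instruction terms (iload(1), ifeq(4)), the frame placeholder frameAt4, and the predicates exampleCode, exampleFrames and exampleMerge are assumptions made for this example only.

```prolog
% The three defined cases, repeated from the clauses above.
mergeStackMapAndCode([], CodeList, CodeList).
mergeStackMapAndCode([stackMap(Offset, Map) | RestMap],
                     [instruction(Offset, Parse) | RestCode],
                     [stackMap(Offset, Map),
                      instruction(Offset, Parse) | RestMerge]) :-
    mergeStackMapAndCode(RestMap, RestCode, RestMerge).
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
                     [instruction(OffsetP, Parse) | RestCode],
                     [instruction(OffsetP, Parse) | RestMerge]) :-
    OffsetP < OffsetM,
    mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
                         RestCode, RestMerge).

% Hypothetical method body: three instructions, then the end-of-code marker.
exampleCode([instruction(0, iload(1)),
             instruction(1, ifeq(4)),
             instruction(4, return),
             endOfCode(5)]).

% Hypothetical stack map: one frame, for the branch target at offset 4.
exampleFrames([stackMap(4, frameAt4)]).

% Merged is the single stream that the type checker would walk.
exampleMerge(Merged) :-
    exampleFrames(Frames),
    exampleCode(Code),
    mergeStackMapAndCode(Frames, Code, Merged).
```

Under these assumptions, exampleMerge(Merged) places stackMap(4, frameAt4) immediately before instruction(4, return) and leaves the other entries in order. A frame at an offset where no instruction starts, such as stackMap(3, f), makes the merge fail, which corresponds to the undefined fourth case.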
To determine if the merged stream for a method is type correct, we first infer the method's initial type state.
The initial type state of a method consists of an empty operand stack and local variable types derived from the type of this and the arguments, as well as the appropriate flag, depending on whether this is an instance initialization method.
methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
ReturnType):-
methodDescriptor(Method, Descriptor),
parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
expandTypeList(RawArgs, Args),
methodInitialThisType(Class, Method, ThisList),
flags(ThisList, Flags),
append(ThisList, Args, ThisArgs),
expandToLength(ThisArgs, FrameSize, top, Locals).
Given a list of types, the following clause produces a list where every type of size 2 has been substituted by two entries: one for itself, and one top entry. The result then corresponds to the representation of the list as 32-bit words in the Java Virtual Machine.
expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
sizeOf(Item, 1),
expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
sizeOf(Item, 2),
expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].
expandToLength(List, Size, _Filler, List) :-
length(List, Size).
expandToLength(List, Size, Filler, Result) :-
length(List, ListLength),
ListLength < Size,
Delta is Size - ListLength,
length(Extra, Delta),
checklist(=(Filler), Extra),
append(List, Extra, Result).
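The following sketch shows how these helper clauses combine to produce the initial locals of a method. It assumes a hypothetical instance method with descriptor (IJ)V in a class named Foo, a frame size of 6, and a loader atom appLoader. The sizeOf facts are stand-ins for the definition given elsewhere in the specification, and maplist/2 is used where the clauses above use checklist/2.

```prolog
% Assumed size facts for a few verification types (illustration only).
sizeOf(int, 1).
sizeOf(long, 2).
sizeOf(double, 2).
sizeOf(class(_, _), 1).

expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
    sizeOf(Item, 1),
    expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
    sizeOf(Item, 2),
    expandTypeList(List, Result).

flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].

% Same logic as expandToLength above; maplist/2 replaces checklist/2.
expandToLength(List, Size, _Filler, List) :-
    length(List, Size).
expandToLength(List, Size, Filler, Result) :-
    length(List, ListLength),
    ListLength < Size,
    Delta is Size - ListLength,
    length(Extra, Delta),
    maplist(=(Filler), Extra),
    append(List, Extra, Result).

% Initial locals and flags for the hypothetical method Foo.m(IJ)V.
exampleLocals(Locals, Flags) :-
    ThisList = [class('Foo', appLoader)],
    expandTypeList([int, long], Args),
    flags(ThisList, Flags),
    append(ThisList, Args, ThisArgs),
    expandToLength(ThisArgs, 6, top, Locals).
```

Here the long argument occupies two local slots (long followed by top), the remaining slots are padded with top, and the flag list is empty because this is not uninitializedThis.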
For the initial type state of an instance method, we compute the type of this and put it in a list. The type of this in the instance initialization method of Object is Object; in other instance initialization methods, the type of this is uninitializedThis; otherwise, the type of this in an instance method is class(N, L) where N is the name of the class containing the method and L is its defining class loader.
For the initial type state of a static method, this is irrelevant, so the list is empty.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags),
methodName(Method, MethodName),
MethodName \= '<init>'.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags).
An instance initialization method cannot be static, so checking for it is redundant.
methodInitialThisType(Class, Method, [This]) :-
methodAccessFlags(Method, AccessFlags),
notMember(static, AccessFlags),
instanceMethodInitialThisType(Class, Method, This).
instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
methodName(Method, '<init>'),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classClassName(Class, 'java/lang/Object').
instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
isInit(Method),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classClassName(Class, 'java/lang/Object').
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '<init>'),
classClassName(Class, ClassName),
classDefiningLoader(Class, CurrentLoader),
superclassChain(ClassName, CurrentLoader, Chain),
Chain \= [].
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
isInit(Method),
classClassName(Class, ClassName),
classDefiningLoader(Class, CurrentLoader),
superclassChain(ClassName, CurrentLoader, Chain),
Chain \= [].
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
methodName(Method, MethodName),
MethodName \= '<init>',
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
isNotInit(Method),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
We now compute whether the merged stream for a method is type correct, using the method's initial type state:
If we have a stack map frame and an incoming type state, the type state must be assignable to the one in the stack map frame. We may then proceed to type check the rest of the stream with the type state given in the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
frameIsAssignable(frame(Locals, OperandStack, Flags), MapFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
A merged code stream is type safe relative to an incoming type state T if it begins with an instruction I that is type safe relative to T, and I satisfies its exception handlers (see below), and the tail of the stream is type safe given the type state following that execution of I.
NextStackFrame indicates what falls through to the following instruction. For an unconditional branch instruction, it will have the special value afterGoto. ExceptionStackFrame indicates what is passed to exception handlers.
mergedCodeIsTypeSafe(Environment, [instruction(Offset, Parse) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
instructionIsTypeSafe(Parse, Environment, Offset,
frame(Locals, OperandStack, Flags),
NextStackFrame, ExceptionStackFrame),
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, NextStackFrame).
After an unconditional branch (indicated by an incoming type state of afterGoto), if we have a stack map frame giving the type state for the following instructions, we can proceed and type check them using the type state provided by the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
afterGoto) :-
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
It is illegal to have code after an unconditional branch without a stack map frame being provided for it.
mergedCodeIsTypeSafe(_Environment, [instruction(_, _) | _MoreCode],
afterGoto) :-
write_ln('No stack frame after unconditional branch'),
fail.
If we have an unconditional branch at the end of the code, stop.
mergedCodeIsTypeSafe(_Environment, [endOfCode(Offset)],
afterGoto).
Branching to a target is type safe if the target has an associated stack frame, Frame, and the current stack frame, StackFrame, is assignable to Frame.
targetIsTypeSafe(Environment, StackFrame, Target) :-
offsetStackFrame(Environment, Target, Frame),
frameIsAssignable(StackFrame, Frame).
An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
exceptionHandlers(Environment, Handlers),
sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
ApplicableHandlers).
An exception handler is applicable to an instruction if the offset of the instruction is greater than or equal to the start of the handler's range and less than the end of the handler's range.
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
Offset >= Start,
Offset < End.
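To make the half-open range concrete, the sketch below filters a hypothetical exception table for the handlers that cover a given offset. The table contents and the predicates exampleHandlers and applicableTargets are assumptions for illustration, and include/3 is used in the role that sublist/3 plays in the clauses above.

```prolog
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
    Offset >= Start,
    Offset < End.

% Hypothetical exception table: handler(Start, End, Target, ClassName).
% A class name of 0 denotes a handler that catches any Throwable.
exampleHandlers([handler(0, 8, 20, 'java/io/IOException'),
                 handler(4, 12, 30, 0)]).

% Targets is the list of handler targets whose range covers Offset.
applicableTargets(Offset, Targets) :-
    exampleHandlers(Handlers),
    include(isApplicableHandler(Offset), Handlers, Applicable),
    findall(T, member(handler(_, _, T, _), Applicable), Targets).
```

With this table, offset 2 is covered only by the first handler, offset 5 by both, and offset 8 only by the second, because the end of a handler's range is exclusive.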
An instruction satisfies an exception handler if the instruction's outgoing type state is ExcStackFrame, and the handler's target (the initial instruction of the handler code) is type safe assuming an incoming type state T. The type state T is derived from ExcStackFrame by replacing the operand stack with a stack whose sole element is the handler's exception class.
instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
Handler = handler(_, _, Target, _),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
/* The stack consists of just the exception. */
ExcStackFrame = frame(Locals, _, Flags),
TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
operandStackHasLegalLength(Environment, TrueExcStackFrame),
targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
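The derivation of the handler's incoming type state can be sketched in isolation. The example below reuses the handlerExceptionClass clauses and builds the frame seen at the handler's target. The fact isBootstrapLoader(bootstrapLoader), the loader atom appLoader, and the predicate handlerEntryFrame are assumptions for illustration. The stack-length and target checks of the full clause are left out.

```prolog
% Assumed stand-in for the bootstrap loader test defined elsewhere.
isBootstrapLoader(bootstrapLoader).

handlerExceptionClass(handler(_, _, _, 0),
                      class('java/lang/Throwable', BL), _) :-
    isBootstrapLoader(BL).
handlerExceptionClass(handler(_, _, _, Name),
                      class(Name, L), L) :-
    Name \= 0.

% Keep the locals and flags of ExcStackFrame; replace the operand stack
% with a stack whose sole element is the handler's exception class.
handlerEntryFrame(Handler, CurrentLoader, ExcStackFrame, TrueExcStackFrame) :-
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    ExcStackFrame = frame(Locals, _, Flags),
    TrueExcStackFrame = frame(Locals, [ExceptionClass], Flags).
```

For a catch-all handler (class name 0) the single stack entry is Throwable as loaded by the bootstrap loader. For a named class it is that class paired with the current loader. In both cases the flags, including flagThisUninit if present, carry over unchanged.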
An invokedynamic instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a dynamic call site with name CallSiteName with descriptor Descriptor.
CallSiteName is not <init>.
CallSiteName is not <clinit>.
One can validly replace types matching the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokedynamic(CP,0,0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = dmethod(CallSiteName, Descriptor),
CallSiteName \= '<init>',
CallSiteName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokedynamic(CP,0,0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = dmethod(_, Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The check for inappropriate method names has moved to 4.4.10.
An invokeinterface instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting an interface method named MethodName with descriptor Descriptor that is a member of an interface MethodIntfName.
MethodName is not <init>.
CP does not reference an instance initialization method.
MethodName is not <clinit>.
CP does not reference a class or interface initialization method.
Its second operand, Count, is a valid count operand (see below).
One can validly replace types matching the type MethodIntfName and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodIntfName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodIntfName, CurrentLoader) | OperandArgList],
StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
countIsValid(Count, StackFrame, TempFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodIntfName, _, Descriptor),
\+ isInitReference(CP),
\+ isClinitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodIntfName, CurrentLoader) | OperandArgList],
StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
countIsValid(Count, StackFrame, TempFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The Count operand of an invokeinterface instruction is valid if it equals the size of the arguments to the instruction. This is equal to the difference between the size of InputFrame and OutputFrame.
countIsValid(Count, InputFrame, OutputFrame) :-
InputFrame = frame(_Locals1, OperandStack1, _Flags1),
OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
length(OperandStack1, Length1),
length(OperandStack2, Length2),
Count =:= Length1 - Length2.
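A small worked case may help. Suppose an invokeinterface instruction names a hypothetical interface method with descriptor (IJ)V in an interface Bar. The sketch assumes the operand stack is listed top first and that a long occupies two entries (top, then long), so the receiver and arguments together account for four entries. The frames and the predicate exampleFrames are illustrative assumptions.

```prolog
countIsValid(Count, InputFrame, OutputFrame) :-
    InputFrame = frame(_Locals1, OperandStack1, _Flags1),
    OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
    length(OperandStack1, Length1),
    length(OperandStack2, Length2),
    Count =:= Length1 - Length2.

% Hypothetical frames before and after popping the receiver and arguments.
% Before: long argument (two entries), int argument, receiver, one other int.
% After: only the unrelated int remains.
exampleFrames(frame([], [top, long, int, class('Bar', appLoader), int], []),
              frame([], [int], [])).
```

Under these assumptions, countIsValid(4, Before, After) succeeds for the frames given by exampleFrames(Before, After), while a Count of 3 is rejected.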
An invokespecial instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor that is a member of a class MethodClassName.
Either:
MethodName is not <init>.
CP does not reference an instance initialization method.
MethodName is not <clinit>.
CP does not reference a class or interface initialization method.
One can validly replace types matching the current class and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
One can validly replace types matching the class MethodClassName and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor.
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
isAssignable(class(CurrentClassName, CurrentLoader),
class(MethodClassName, CurrentLoader)),
reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, _, Descriptor),
\+ isInitReference(CP),
\+ isClinitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
isAssignable(class(CurrentClassName, CurrentLoader),
class(MethodClassName, CurrentLoader)),
reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The isAssignable clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.
The first validTypeTransition clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame. The effect of validTypeTransition is to pop the first type from the operand stack in StackFrame and check it is a subtype of the first term of StackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.
A sharp-eyed reader might notice that enforcing this structural constraint supersedes the structural constraint pertaining to invokespecial of a protected method. Thus, the Prolog code above makes no reference to passesProtectedCheck (4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck to ensure the actual receiver type is compatible with the current class when certain protected instance initialization methods are named.
The second validTypeTransition clause enforces the structural constraint that any method invocation instruction must target a receiver object whose type is compatible with the type named by the instruction. To see why, consider that StackArgList2 simulates the list of types on the operand stack expected by the method, starting with the type named by the instruction. Again, the actual types on the operand stack are in StackFrame, and the effect of validTypeTransition is to check the actual receiver type in StackFrame is compatible with the type named by the instruction in StackArgList2.
Or:
MethodName is <init>.
CP references an instance initialization method.
Descriptor specifies a void return type.
One can validly pop types matching the argument types given in Descriptor and an uninitialized type, UninitializedArg, off the incoming operand stack, yielding OperandStack.
The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with OperandStack and then replacing all instances of UninitializedArg with the type of instance being initialized.
If the instruction calls an instance initialization method on a class instance created by an earlier new instruction, and the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '<init>', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitializedThis, Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, _, Descriptor),
isInitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitializedThis, Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '<init>', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitialized(Address), Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClassName, '<init>',
Descriptor, NextStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, _, Descriptor),
isInitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitialized(Address), Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClassName, '<init>',
Descriptor, NextStackFrame).
To compute what type the uninitialized argument's type needs to be rewritten to, there are two cases:
If we are initializing an object within its constructor, its type is initially uninitializedThis. This type will be rewritten to the type of the class of the <init> method.
The second case arises from initialization of an object created by new. The uninitialized arg type is rewritten to MethodClass, the type of the method holder of <init>. We check whether there really is a new instruction at Address.
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, MethodClass).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, class(ThisClassName, ThisLoader)),
superclassChain(ThisClassName, ThisLoader, [MethodClass | Rest]).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClass, MethodClass) :-
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClass)), Instructions).
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
From1 \= Old,
substitute(Old, New, FromRest, ToRest).
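The effect of these helpers on a type state can be seen in a short sketch. It models the state just after the receiver and arguments have been popped for a superclass constructor call inside a constructor of a hypothetical class Foo: local 0 and one remaining stack entry still hold uninitializedThis. The class name, the loader atom appLoader, and the predicate afterSuperInit are assumptions for illustration.

```prolog
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).

substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
    substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
    From1 \= Old,
    substitute(Old, New, FromRest, ToRest).

% Outgoing frame after a successful invokespecial of an instance
% initialization method on uninitializedThis (hypothetical class Foo).
afterSuperInit(frame(NextLocals, NextOperandStack, NextFlags)) :-
    Locals = [uninitializedThis, int],
    OperandStack = [uninitializedThis],   % a duplicate left below the receiver
    Flags = [flagThisUninit],
    This = class('Foo', appLoader),
    rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
    substitute(uninitializedThis, This, OperandStack, NextOperandStack),
    substitute(uninitializedThis, This, Locals, NextLocals).
```

Every occurrence of uninitializedThis, in both the locals and the operand stack, becomes class('Foo', appLoader), and the flag list becomes empty. For an object created by new, tracked as uninitialized(Address), the flags are passed through unchanged.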
The rule for invokespecial of an instance initialization method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object within its constructor, invokespecial can cause a superclass instance initialization method to be invoked, and that invocation could fail, leaving this uninitialized. This situation cannot be created using source code in the Java programming language, but can be created by programming in bytecode directly.
In this situation, the original frame holds an uninitialized object in local variable 0 and has flag flagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit flag. But if the invocation of an instance initialization method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit flag to a properly initialized object, so the object is permanently unusable.
If not for this situation, the flags of the exception stack frame would always be the same as the flags of the input stack frame.
An invokestatic instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor.
MethodName is not <init>.
CP does not reference an instance initialization method.
MethodName is not <clinit>.
CP does not reference a class or interface initialization method.
One can validly replace types matching the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokestatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(_MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokestatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(_, _, Descriptor),
\+ isInitReference(CP),
\+ isClinitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
An invokevirtual instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor that is a member of a class MethodClassName.
MethodName is not <init>.
CP does not reference an instance initialization method.
MethodName is not <clinit>.
CP does not reference a class or interface initialization method.
One can validly replace types matching the class MethodClassName and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
If the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, ArgList),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
canPop(StackFrame, ArgList, PoppedFrame),
passesProtectedCheck(Environment, MethodClassName, MethodName,
Descriptor, PoppedFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
\+ isInitReference(CP),
\+ isClinitReference(CP),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, ArgList),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
canPop(StackFrame, ArgList, PoppedFrame),
passesProtectedCheck(Environment, MethodClassName, MethodName,
Descriptor, PoppedFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
A return instruction is type safe if the enclosing method declares a void return type, and either:
The enclosing method is not an instance initialization method, or
this has already been completely initialized at the point where the instruction occurs.
instructionIsTypeSafe(return, Environment, _Offset, StackFrame,
afterGoto, ExceptionStackFrame) :-
thisMethodReturnType(Environment, void),
StackFrame = frame(_Locals, _OperandStack, Flags),
notMember(flagThisUninit, Flags),
exceptionStackFrame(StackFrame, ExceptionStackFrame).