This document describes changes to the Java Virtual Machine Specification, as modified by Format Checking Cleanup and Class Loading Cleanup, to simplify the process of validating attributes that are either closely tied to bytecode verification or non-essential to JVM semantics.
The following changes to implementation behavior are proposed:
- Ignoring the contents of the following standard attributes, rather than spending computation time validating them: Exceptions, InnerClasses, EnclosingMethod, Synthetic, Signature, SourceFile, LineNumberTable, LocalVariableTable, LocalVariableTypeTable, and Deprecated (4.7). (Duplicate occurrences of these attributes can still cause an error.)
- Delaying all validation of the exception_table of Code attributes until verification, when the code array will have been parsed (4.7.3, 4.9, 4.10).
- Throwing VerifyError rather than ClassFormatError whenever verification-time validation of exception_table and StackMapTable fails (4.10).
- Preventing fallback to verification by type inference in the corner case in which a version 50.0 class file contains an invalid Uninitialized_variable_info in a StackMapTable (4.9.1).
- Tightening the check that an interface named by invokespecial is a direct superinterface (4.10.1.9.invokespecial).
This document also shifts the specification of various validation checks for Code and StackMapTable attributes to verification time, consistent with longstanding behavior of the reference implementation (4.9, 4.10). It also addresses some bugs and redundancies in the verification rules.
Changes are described with respect to existing sections of the JVM Specification. New text is indicated **like this** and deleted text is indicated ~~like this~~. Explanation and discussion, as needed, is set aside in grey boxes.
Chapter 4: The class File Format
4.7 Attributes
Attributes are used in the ClassFile, field_info, method_info, and Code_attribute structures of the class file format (4.1, 4.5, 4.6, 4.7.3).
All attributes have the following general format:
attribute_info {
u2 attribute_name_index;
u4 attribute_length;
u1 info[attribute_length];
}
For all attributes, the attribute_name_index item must be a valid unsigned 16-bit index into the constant pool of the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8_info structure (4.4.7) which represents the name of the attribute. The value of the attribute_length item indicates the length of the subsequent information in bytes. The length does not include the initial six bytes that contain the attribute_name_index and attribute_length items.
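As an illustration of the general format above, the following is a minimal sketch of reading one attribute_info structure from a big-endian byte buffer. The class name AttributeInfoSketch and its fields are hypothetical and are not part of any JVM API; the sketch leaves the name unresolved, since resolving it requires the constant pool.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not a JVM API): raw contents of one attribute_info. */
final class AttributeInfoSketch {
    final int nameIndex; // u2 attribute_name_index, left unresolved here
    final byte[] info;   // u1 info[attribute_length]

    private AttributeInfoSketch(int nameIndex, byte[] info) {
        this.nameIndex = nameIndex;
        this.info = info;
    }

    /** Reads one attribute_info starting at the buffer's current position. */
    static AttributeInfoSketch read(ByteBuffer in) {
        // ByteBuffer is big-endian by default, matching the class file format.
        int nameIndex = in.getShort() & 0xFFFF;    // unsigned 16-bit index
        long length = in.getInt() & 0xFFFFFFFFL;   // unsigned 32-bit length
        // attribute_length counts only the info bytes, not the six header bytes.
        if (length > in.remaining()) {
            throw new IllegalArgumentException(
                "attribute_length " + length + " exceeds remaining bytes");
        }
        byte[] info = new byte[(int) length];
        in.get(info);
        return new AttributeInfoSketch(nameIndex, info);
    }
}
```

Because the length is carried in the header, a reader that does not recognize an attribute, or that chooses to ignore its contents, can skip exactly attribute_length bytes and continue with the next structure.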
28 attributes are predefined by this specification. They are listed three times, for ease of navigation:
- Table 4.7-A is ordered by the attributes' section numbers in this chapter. Each attribute is shown with the first version of the class file format in which it was defined. Also shown is the version of the Java SE Platform which introduced that version of the class file format (4.1).
- Table 4.7-B is ordered by the first version of the class file format in which each attribute was defined.
- Table 4.7-C is ordered by the location in a class file where each attribute is defined to appear.
Within the context of their use in this specification, that is, in the attributes tables of the structures of appropriately-versioned class files in which they appear, the names of these predefined attributes are reserved.
Any conditions on the presence of a predefined attribute in an attributes table are specified explicitly in the section which describes the attribute. If no conditions are specified, then the attribute may appear any number of times in an attributes table.
The predefined attributes are categorized into three groups according to their purpose:
- Nine attributes are critical to correct interpretation of the class file by the Java Virtual Machine:
  - ConstantValue
  - Code
  - StackMapTable
  - BootstrapMethods
  - Module
  - ModulePackages
  - ModuleMainClass
  - NestHost
  - NestMembers
  In a class file whose version number is v, each of these attributes must be recognized and correctly read by an implementation of the Java Virtual Machine if the implementation supports version v of the class file format, and the attribute was first defined in version v or earlier of the class file format, and the attribute appears in a location where it is defined to appear.
- ~~Ten attributes are not critical to correct interpretation of the class file by the Java Virtual Machine, but are either critical to correct interpretation of the class file by the class libraries of the Java SE Platform, or are useful for tools (in which case the section that specifies an attribute describes it as "optional"):~~
  - ~~Exceptions~~
  - ~~InnerClasses~~
  - ~~EnclosingMethod~~
  - ~~Synthetic~~
  - ~~Signature~~
  - ~~SourceFile~~
  - ~~LineNumberTable~~
  - ~~LocalVariableTable~~
  - ~~LocalVariableTypeTable~~
  - ~~Deprecated~~
  ~~In a class file whose version number is v, each of these attributes must be recognized and correctly read by an implementation of the Java Virtual Machine if the implementation supports version v of the class file format, and the attribute was first defined in version v or earlier of the class file format, and the attribute appears in a location where it is defined to appear.~~
- ~~Nine~~ **Nineteen** attributes are not critical to correct interpretation of the class file by the Java Virtual Machine, but contain metadata about the class file that is either exposed by the class libraries of the Java SE Platform, or made available by tools (in which case the section that specifies an attribute describes it as "optional"):
  - **Exceptions**
  - **InnerClasses**
  - **EnclosingMethod**
  - **Synthetic**
  - **Signature**
  - **SourceFile**
  - SourceDebugExtension
  - **LineNumberTable**
  - **LocalVariableTable**
  - **LocalVariableTypeTable**
  - **Deprecated**
  - RuntimeVisibleAnnotations
  - RuntimeInvisibleAnnotations
  - RuntimeVisibleParameterAnnotations
  - RuntimeInvisibleParameterAnnotations
  - RuntimeVisibleTypeAnnotations
  - RuntimeInvisibleTypeAnnotations
  - AnnotationDefault
  - MethodParameters
  An implementation of the Java Virtual Machine may use the information that these attributes contain, or otherwise must silently ignore these attributes.
The effect of grouping these two categories together is that the old group #2 is no longer subject to format checking: the contents of the attributes must be silently ignored (or silently observed) by the JVM.
Table 4.7-A. Predefined class file attributes (by section)
Attribute | Section | class file | Java SE |
---|---|---|---|
ConstantValue | 4.7.2 | 45.3 | 1.0.2 |
Code | 4.7.3 | 45.3 | 1.0.2 |
StackMapTable | 4.7.4 | 50.0 | 6 |
Exceptions | 4.7.5 | 45.3 | 1.0.2 |
InnerClasses | 4.7.6 | 45.3 | 1.1 |
EnclosingMethod | 4.7.7 | 49.0 | 5.0 |
Synthetic | 4.7.8 | 45.3 | 1.1 |
Signature | 4.7.9 | 49.0 | 5.0 |
SourceFile | 4.7.10 | 45.3 | 1.0.2 |
SourceDebugExtension | 4.7.11 | 49.0 | 5.0 |
LineNumberTable | 4.7.12 | 45.3 | 1.0.2 |
LocalVariableTable | 4.7.13 | 45.3 | 1.0.2 |
LocalVariableTypeTable | 4.7.14 | 49.0 | 5.0 |
Deprecated | 4.7.15 | 45.3 | 1.1 |
RuntimeVisibleAnnotations | 4.7.16 | 49.0 | 5.0 |
RuntimeInvisibleAnnotations | 4.7.17 | 49.0 | 5.0 |
RuntimeVisibleParameterAnnotations | 4.7.18 | 49.0 | 5.0 |
RuntimeInvisibleParameterAnnotations | 4.7.19 | 49.0 | 5.0 |
RuntimeVisibleTypeAnnotations | 4.7.20 | 52.0 | 8 |
RuntimeInvisibleTypeAnnotations | 4.7.21 | 52.0 | 8 |
AnnotationDefault | 4.7.22 | 49.0 | 5.0 |
BootstrapMethods | 4.7.23 | 51.0 | 7 |
MethodParameters | 4.7.24 | 52.0 | 8 |
Module | 4.7.25 | 53.0 | 9 |
ModulePackages | 4.7.26 | 53.0 | 9 |
ModuleMainClass | 4.7.27 | 53.0 | 9 |
NestHost | 4.7.28 | 55.0 | 11 |
NestMembers | 4.7.29 | 55.0 | 11 |
Table 4.7-B. Predefined class file attributes (by class file format)
Attribute | class file | Java SE | Section |
---|---|---|---|
ConstantValue | 45.3 | 1.0.2 | 4.7.2 |
Code | 45.3 | 1.0.2 | 4.7.3 |
Exceptions | 45.3 | 1.0.2 | 4.7.5 |
SourceFile | 45.3 | 1.0.2 | 4.7.10 |
LineNumberTable | 45.3 | 1.0.2 | 4.7.12 |
LocalVariableTable | 45.3 | 1.0.2 | 4.7.13 |
InnerClasses | 45.3 | 1.1 | 4.7.6 |
Synthetic | 45.3 | 1.1 | 4.7.8 |
Deprecated | 45.3 | 1.1 | 4.7.15 |
EnclosingMethod | 49.0 | 5.0 | 4.7.7 |
Signature | 49.0 | 5.0 | 4.7.9 |
SourceDebugExtension | 49.0 | 5.0 | 4.7.11 |
LocalVariableTypeTable | 49.0 | 5.0 | 4.7.14 |
RuntimeVisibleAnnotations | 49.0 | 5.0 | 4.7.16 |
RuntimeInvisibleAnnotations | 49.0 | 5.0 | 4.7.17 |
RuntimeVisibleParameterAnnotations | 49.0 | 5.0 | 4.7.18 |
RuntimeInvisibleParameterAnnotations | 49.0 | 5.0 | 4.7.19 |
AnnotationDefault | 49.0 | 5.0 | 4.7.22 |
StackMapTable | 50.0 | 6 | 4.7.4 |
BootstrapMethods | 51.0 | 7 | 4.7.23 |
RuntimeVisibleTypeAnnotations | 52.0 | 8 | 4.7.20 |
RuntimeInvisibleTypeAnnotations | 52.0 | 8 | 4.7.21 |
MethodParameters | 52.0 | 8 | 4.7.24 |
Module | 53.0 | 9 | 4.7.25 |
ModulePackages | 53.0 | 9 | 4.7.26 |
ModuleMainClass | 53.0 | 9 | 4.7.27 |
NestHost | 55.0 | 11 | 4.7.28 |
NestMembers | 55.0 | 11 | 4.7.29 |
Table 4.7-C. Predefined class file attributes (by location)
Attribute | Location | class file |
---|---|---|
SourceFile | ClassFile | 45.3 |
InnerClasses | ClassFile | 45.3 |
EnclosingMethod | ClassFile | 49.0 |
SourceDebugExtension | ClassFile | 49.0 |
BootstrapMethods | ClassFile | 51.0 |
Module, ModulePackages, ModuleMainClass | ClassFile | 53.0 |
NestHost, NestMembers | ClassFile | 55.0 |
ConstantValue | field_info | 45.3 |
Code | method_info | 45.3 |
Exceptions | method_info | 45.3 |
RuntimeVisibleParameterAnnotations, RuntimeInvisibleParameterAnnotations | method_info | 49.0 |
AnnotationDefault | method_info | 49.0 |
MethodParameters | method_info | 52.0 |
Synthetic | ClassFile, field_info, method_info | 45.3 |
Deprecated | ClassFile, field_info, method_info | 45.3 |
Signature | ClassFile, field_info, method_info | 49.0 |
RuntimeVisibleAnnotations, RuntimeInvisibleAnnotations | ClassFile, field_info, method_info | 49.0 |
LineNumberTable | Code | 45.3 |
LocalVariableTable | Code | 45.3 |
LocalVariableTypeTable | Code | 49.0 |
StackMapTable | Code | 50.0 |
RuntimeVisibleTypeAnnotations, RuntimeInvisibleTypeAnnotations | ClassFile, field_info, method_info, Code | 52.0 |
4.7.3 The Code Attribute
The Code attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). A Code attribute contains the Java Virtual Machine instructions and auxiliary information for a method, including an instance initialization method and a class or interface initialization method (2.9.1, 2.9.2).
If the method is either native or abstract, its method_info structure must not have a Code attribute in its attributes table. Otherwise, its method_info structure must have exactly one Code attribute in its attributes table.
The Code attribute must have the following format:
Code_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 max_stack;
u2 max_locals;
u4 code_length;
u1 code[code_length];
u2 exception_table_length;
{ u2 start_pc;
u2 end_pc;
u2 handler_pc;
u2 catch_type;
} exception_table[exception_table_length];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
The items of the Code_attribute structure are as follows:
- attribute_name_index
  The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Code".
- attribute_length
  The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
- max_stack
  The value of the max_stack item ~~gives~~ **should give** the maximum depth of the operand stack of this method (2.6.2) at any point during execution of the method.
- max_locals
  The value of the max_locals item ~~gives~~ **should give** the number of local variables in the local variable array allocated upon invocation of this method (2.6.1), including the local variables used to pass parameters to the method on its invocation.
  The greatest local variable index for a value of type long or double is max_locals - 2. The greatest local variable index for a value of any other type is max_locals - 1.
- code_length
  The value of the code_length item gives the number of bytes in the code array for this method.
  The value of code_length must be greater than zero (as the code array must not be empty) and less than 65536.
- code[]
  The code array gives the actual bytes of Java Virtual Machine code that implement the method.
  When the code array is read into memory on a byte-addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned. (Refer to the descriptions of those instructions for more information on the consequences of code array alignment.)
  ~~The detailed constraints on the contents of the code array are extensive and are given in a separate section (4.9).~~
  **The contents of the code array are subject to extensive constraints during verification (5.4.1), as described in 4.9.**
- exception_table_length
  The value of the exception_table_length item gives the number of entries in the exception_table table.
- exception_table[]
  Each entry in the exception_table array describes one exception handler in the code array. The order of the handlers in the exception_table array is significant (2.10).
  The contents of the exception_table array are validated during verification, along with the code array.
  Each exception_table entry contains the following four items:
  - start_pc, end_pc
    The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc ~~must~~ **should** be a valid index into the code array of the opcode of an instruction. The value of end_pc either ~~must~~ **should** be a valid index into the code array of the opcode of an instruction or ~~must~~ **should** be equal to code_length, the length of the code array. The value of start_pc ~~must~~ **should** be less than the value of end_pc.
    The start_pc is inclusive and end_pc is exclusive; that is, the exception handler ~~must~~ **will** be active while the program counter is within the interval [start_pc, end_pc).
    The fact that end_pc is exclusive is a historical mistake in the design of the Java Virtual Machine: if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is 1 byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.
  - handler_pc
    The value of the handler_pc item indicates the start of the exception handler. The value of the item ~~must~~ **should** be a valid index into the code array and ~~must~~ **should** be the index of the opcode of an instruction.
  - catch_type
    If the value of the catch_type item is nonzero, it ~~must~~ **should** be a valid index into the constant_pool table. The constant_pool entry at that index ~~must~~ **should** be a CONSTANT_Class_info structure (4.4.1) representing a class of exceptions that this exception handler is designated to catch. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.
    The verifier checks that the class is Throwable or a subclass of Throwable (4.9.2).
    If the value of the catch_type item is zero, this exception handler is called for all exceptions.
    This is used to implement finally (3.13).
In JDK 13, some of these assertions are checked during format checking, but assertions about opcode indices are delayed until verification. For simplicity, it makes more sense to perform all checks on the exception table during verification (see 4.9.1).
- attributes_count
  The value of the attributes_count item indicates the number of attributes of the Code attribute.
- attributes[]
  Each value of the attributes table must be an attribute_info structure (4.7).
  The attributes defined by this specification as appearing in the attributes table of a Code attribute are listed in Table 4.7-C.
  The rules concerning attributes defined to appear in the attributes table of a Code attribute are given in 4.7.
  The rules concerning nonstandard attributes in the attributes table of a Code attribute are given in 4.7.1.
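The index constraints on an exception_table entry described in this section can be sketched as a single check performed once the code array has been parsed. This is only an illustration: the class ExceptionTableCheck and the opcodeStart array (true at each offset where an instruction's opcode begins) are hypothetical, and the catch_type constraints are omitted because they require the constant pool.

```java
/** Hypothetical sketch (not a JVM API) of the exception_table index checks in 4.7.3. */
final class ExceptionTableCheck {
    /**
     * Returns true if start_pc, end_pc and handler_pc satisfy the index constraints.
     * opcodeStart.length plays the role of code_length; opcodeStart[i] is true
     * when offset i is the opcode of an instruction.
     */
    static boolean isValidEntry(int startPc, int endPc, int handlerPc, boolean[] opcodeStart) {
        int codeLength = opcodeStart.length;
        // start_pc: a valid index of the opcode of an instruction.
        boolean startOk = isOpcodeIndex(startPc, opcodeStart);
        // end_pc: either such an index, or exactly code_length (the range is exclusive).
        boolean endOk = endPc == codeLength || isOpcodeIndex(endPc, opcodeStart);
        // handler_pc: a valid index of the opcode of an instruction.
        boolean handlerOk = isOpcodeIndex(handlerPc, opcodeStart);
        // start_pc must be strictly less than end_pc.
        return startOk && endOk && handlerOk && startPc < endPc;
    }

    private static boolean isOpcodeIndex(int pc, boolean[] opcodeStart) {
        return pc >= 0 && pc < opcodeStart.length && opcodeStart[pc];
    }
}
```

Under the changes proposed in this document, a failure of a check like this would be reported during verification rather than during format checking.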
4.7.4 The StackMapTable Attribute
The StackMapTable attribute is a variable-length attribute in the attributes table of a Code attribute (4.7.3) in a version 50.0 or later class file. A StackMapTable attribute is used during the process of verification by type checking (4.10.1).
There must be no more than one StackMapTable attribute in the attributes table of a Code attribute.
In a class file whose version number is 50.0 or above, if a method's Code attribute does not have a StackMapTable attribute, it has an implicit stack map attribute (4.10.1). This implicit stack map attribute is equivalent to a StackMapTable attribute with number_of_entries equal to zero.
The StackMapTable attribute ~~must~~ **should** have the following format:
StackMapTable_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 number_of_entries;
stack_map_frame entries[number_of_entries];
}
The contents of the StackMapTable attribute are validated during verification, along with the code and exception_table arrays of the Code attribute (4.7.3).
The items of the StackMapTable_attribute structure are as follows:
- attribute_name_index
  The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "StackMapTable".
- attribute_length
  The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
- number_of_entries
  The value of the number_of_entries item ~~gives~~ **should give** the number of stack_map_frame entries in the entries table.
- entries[]
  Each entry in the entries table ~~describes~~ **should describe** one stack map frame of the method. The order of the stack map frames in the entries table is significant.
A stack map frame specifies (either explicitly or implicitly) the bytecode offset at which it applies, and the verification types of local variables and operand stack entries for that offset.
Each stack map frame described in the entries table relies on the previous frame for some of its semantics. The first stack map frame of a method is implicit, and computed from the method descriptor by the type checker (4.10.1.6). The stack_map_frame structure at entries[0] therefore describes the second stack map frame of the method.
The bytecode offset at which a stack map frame applies is calculated by taking the value offset_delta specified in the frame (either explicitly or implicitly), and adding offset_delta + 1 to the bytecode offset of the previous frame, unless the previous frame is the initial frame of the method. In that case, the bytecode offset at which the stack map frame applies is the value offset_delta specified in the frame.
By using an offset delta rather than storing the actual bytecode offset, we ensure, by definition, that stack map frames are in the correctly sorted order. Furthermore, by consistently using the formula offset_delta + 1 for all explicit frames (as opposed to the implicit first frame), we guarantee the absence of duplicates.
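The offset calculation described above can be sketched as follows. The class FrameOffsets is hypothetical; it takes the offset_delta values of the explicit frames in order and returns the bytecode offset at which each frame applies.

```java
/** Hypothetical sketch (not a JVM API) of the stack map frame offset rule. */
final class FrameOffsets {
    /** Maps the offset_delta of each explicit frame to its bytecode offset. */
    static int[] offsets(int[] offsetDeltas) {
        int[] result = new int[offsetDeltas.length];
        for (int i = 0; i < offsetDeltas.length; i++) {
            if (i == 0) {
                // The previous frame is the implicit initial frame:
                // the offset is offset_delta itself.
                result[i] = offsetDeltas[i];
            } else {
                // Otherwise add offset_delta + 1 to the previous frame's offset.
                result[i] = result[i - 1] + offsetDeltas[i] + 1;
            }
        }
        return result;
    }
}
```

For example, the deltas 5, 0, 3 yield the offsets 5, 6, 10. Since every offset_delta is non-negative, each explicit frame after the first lands at a strictly greater offset than its predecessor, which is the sorted-and-distinct property noted above.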
We say that an instruction in the bytecode has a corresponding stack map frame if the instruction starts at offset i in the code array of a Code attribute, and the Code attribute has a StackMapTable attribute whose entries array contains a stack map frame that applies at bytecode offset i.
A verification type specifies the type of either one or two locations, where a location is either a single local variable or a single operand stack entry. A verification type is represented by a discriminated union, verification_type_info, that consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.
union verification_type_info {
Top_variable_info;
Integer_variable_info;
Float_variable_info;
Long_variable_info;
Double_variable_info;
Null_variable_info;
UninitializedThis_variable_info;
Object_variable_info;
Uninitialized_variable_info;
}
A verification type that specifies one location in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:
- The Top_variable_info item indicates that the local variable has the verification type top.
  Top_variable_info {
      u1 tag = ITEM_Top; /* 0 */
  }
- The Integer_variable_info item indicates that the location has the verification type int.
  Integer_variable_info {
      u1 tag = ITEM_Integer; /* 1 */
  }
- The Float_variable_info item indicates that the location has the verification type float.
  Float_variable_info {
      u1 tag = ITEM_Float; /* 2 */
  }
- The Null_variable_info type indicates that the location has the verification type null.
  Null_variable_info {
      u1 tag = ITEM_Null; /* 5 */
  }
- The UninitializedThis_variable_info item indicates that the location has the verification type uninitializedThis.
  UninitializedThis_variable_info {
      u1 tag = ITEM_UninitializedThis; /* 6 */
  }
- The Object_variable_info item indicates that the location has the verification type which is the class represented by the CONSTANT_Class_info structure (4.4.1) found in the constant_pool table at the index given by cpool_index.
  Object_variable_info {
      u1 tag = ITEM_Object; /* 7 */
      u2 cpool_index;
  }
- The Uninitialized_variable_info item indicates that the location has the verification type uninitialized(Offset). The Offset item indicates the offset, in the code array of the Code attribute that contains this StackMapTable attribute, of the new instruction (6.5.new) that created the object being stored in the location.
  Uninitialized_variable_info {
      u1 tag = ITEM_Uninitialized; /* 8 */
      u2 offset;
  }
A verification type that specifies two locations in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:
- The Long_variable_info item indicates that the first of two locations has the verification type long.
  Long_variable_info {
      u1 tag = ITEM_Long; /* 4 */
  }
- The Double_variable_info item indicates that the first of two locations has the verification type double.
  Double_variable_info {
      u1 tag = ITEM_Double; /* 3 */
  }
- The Long_variable_info and Double_variable_info items indicate the verification type of the second of two locations as follows:
  - If the first of the two locations is a local variable, then:
    - It ~~must~~ **should** not be the local variable with the highest index.
    - The next higher numbered local variable has the verification type top.
  - If the first of the two locations is an operand stack entry, then:
    - It ~~must~~ **should** not be the topmost location of the operand stack.
    - The next location closer to the top of the operand stack has the verification type top.
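To illustrate how one- and two-location verification types determine which local variable each entry of a locals array describes, here is a minimal sketch. The class VerificationTypeSlots and its method names are hypothetical; the tag values are the ITEM_* values shown above.

```java
/** Hypothetical sketch (not a JVM API) of location counting for verification types. */
final class VerificationTypeSlots {
    static final int ITEM_DOUBLE = 3; // Double_variable_info tag
    static final int ITEM_LONG = 4;   // Long_variable_info tag

    /** Number of locations covered by a verification_type_info with this tag. */
    static int slots(int tag) {
        if (tag < 0 || tag > 8) {
            throw new IllegalArgumentException("unknown verification type tag: " + tag);
        }
        // long and double cover two locations; the second has type top.
        return (tag == ITEM_LONG || tag == ITEM_DOUBLE) ? 2 : 1;
    }

    /**
     * For each entry of a locals array (given as tags), returns the index of
     * the local variable it represents, starting from firstIndex.
     */
    static int[] localIndices(int[] tags, int firstIndex) {
        int[] indices = new int[tags.length];
        int next = firstIndex;
        for (int m = 0; m < tags.length; m++) {
            indices[m] = next;
            next += slots(tags[m]); // N+1 for one-location types, N+2 for long/double
        }
        return indices;
    }
}
```

For the tags int, long, Object, double starting at local variable 0, the entries describe local variables 0, 1, 3 and 4.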
A stack map frame is represented by a discriminated union, stack_map_frame, which consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.
union stack_map_frame {
same_frame;
same_locals_1_stack_item_frame;
same_locals_1_stack_item_frame_extended;
chop_frame;
same_frame_extended;
append_frame;
full_frame;
}
The tag indicates the frame type of the stack map frame:
- The frame type same_frame is represented by tags in the range [0-63]. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack is empty. The offset_delta value for the frame is the value of the tag item, frame_type.
  same_frame {
      u1 frame_type = SAME; /* 0-63 */
  }
- The frame type same_locals_1_stack_item_frame is represented by tags in the range [64-127]. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack has one entry. The offset_delta value for the frame is given by the formula frame_type - 64. The verification type of the one stack entry appears after the frame type.
  same_locals_1_stack_item_frame {
      u1 frame_type = SAME_LOCALS_1_STACK_ITEM; /* 64-127 */
      verification_type_info stack[1];
  }
- Tags in the range [128-246] are reserved for future use.
- The frame type same_locals_1_stack_item_frame_extended is represented by the tag 247. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack has one entry. The offset_delta value for the frame is given explicitly, unlike in the frame type same_locals_1_stack_item_frame. The verification type of the one stack entry appears after offset_delta.
  same_locals_1_stack_item_frame_extended {
      u1 frame_type = SAME_LOCALS_1_STACK_ITEM_EXTENDED; /* 247 */
      u2 offset_delta;
      verification_type_info stack[1];
  }
- The frame type chop_frame is represented by tags in the range [248-250]. This frame type indicates that the frame has the same local variables as the previous frame except that the last k local variables are absent, and that the operand stack is empty. The value of k is given by the formula 251 - frame_type. The offset_delta value for the frame is given explicitly.
  chop_frame {
      u1 frame_type = CHOP; /* 248-250 */
      u2 offset_delta;
  }
  Assume the verification types of local variables in the previous frame are given by locals, an array structured as in the full_frame frame type. If locals[M-1] in the previous frame represented local variable X and locals[M] represented local variable Y, then the effect of removing one local variable is that locals[M-1] in the new frame represents local variable X and locals[M] is undefined.
  It is an error if k is larger than the number of local variables in locals for the previous frame, that is, if the number of local variables in the new frame would be less than zero. This is checked by verification.
- The frame type same_frame_extended is represented by the tag 251. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack is empty. The offset_delta value for the frame is given explicitly, unlike in the frame type same_frame.
  same_frame_extended {
      u1 frame_type = SAME_FRAME_EXTENDED; /* 251 */
      u2 offset_delta;
  }
- The frame type append_frame is represented by tags in the range [252-254]. This frame type indicates that the frame has the same locals as the previous frame except that k additional locals are defined, and that the operand stack is empty. The value of k is given by the formula frame_type - 251. The offset_delta value for the frame is given explicitly.
  append_frame {
      u1 frame_type = APPEND; /* 252-254 */
      u2 offset_delta;
      verification_type_info locals[frame_type - 251];
  }
  The 0th entry in locals represents the verification type of the first additional local variable. If locals[M] represents local variable N, then:
  - locals[M+1] represents local variable N+1 if locals[M] is one of Top_variable_info, Integer_variable_info, Float_variable_info, Null_variable_info, UninitializedThis_variable_info, Object_variable_info, or Uninitialized_variable_info; and
  - locals[M+1] represents local variable N+2 if locals[M] is either Long_variable_info or Double_variable_info.
  It is an error if, for any index i, locals[i] represents a local variable whose index is greater than the maximum number of local variables for the method. This is checked by verification.
- The frame type full_frame is represented by the tag 255. The offset_delta value for the frame is given explicitly.
  full_frame {
      u1 frame_type = FULL_FRAME; /* 255 */
      u2 offset_delta;
      u2 number_of_locals;
      verification_type_info locals[number_of_locals];
      u2 number_of_stack_items;
      verification_type_info stack[number_of_stack_items];
  }
  The 0th entry in locals represents the verification type of local variable 0. If locals[M] represents local variable N, then:
  - locals[M+1] represents local variable N+1 if locals[M] is one of Top_variable_info, Integer_variable_info, Float_variable_info, Null_variable_info, UninitializedThis_variable_info, Object_variable_info, or Uninitialized_variable_info; and
  - locals[M+1] represents local variable N+2 if locals[M] is either Long_variable_info or Double_variable_info.
  It is an error if, for any index i, locals[i] represents a local variable whose index is greater than the maximum number of local variables for the method. This is checked by verification.
  The 0th entry in stack represents the verification type of the bottom of the operand stack, and subsequent entries in stack represent the verification types of stack entries closer to the top of the operand stack. We refer to the bottom of the operand stack as stack entry 0, and to subsequent entries of the operand stack as stack entry 1, 2, etc. If stack[M] represents stack entry N, then:
  - stack[M+1] represents stack entry N+1 if stack[M] is one of Top_variable_info, Integer_variable_info, Float_variable_info, Null_variable_info, UninitializedThis_variable_info, Object_variable_info, or Uninitialized_variable_info; and
  - stack[M+1] represents stack entry N+2 if stack[M] is either Long_variable_info or Double_variable_info.
  It is an error if, for any index i, stack[i] represents a stack entry whose index is greater than the maximum operand stack size for the method. This is checked by verification.
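The mapping from the one-byte frame_type tag to the frame types of the stack_map_frame union, together with the two formulas for k, can be sketched as below. The class FrameTypes and its method names are hypothetical; the returned strings are simply the union item names used in this section.

```java
/** Hypothetical sketch (not a JVM API) of stack_map_frame tag classification. */
final class FrameTypes {
    /** Returns the name of the frame type selected by a frame_type tag (0-255). */
    static String frameType(int tag) {
        if (tag < 0 || tag > 255) {
            throw new IllegalArgumentException("frame_type is a u1: " + tag);
        }
        if (tag <= 63) return "same_frame";                        // offset_delta = tag
        if (tag <= 127) return "same_locals_1_stack_item_frame";   // offset_delta = tag - 64
        if (tag <= 246) return "reserved";                         // reserved for future use
        if (tag == 247) return "same_locals_1_stack_item_frame_extended";
        if (tag <= 250) return "chop_frame";
        if (tag == 251) return "same_frame_extended";
        if (tag <= 254) return "append_frame";
        return "full_frame";
    }

    /** k for chop_frame: the number of trailing locals removed (251 - frame_type). */
    static int choppedLocals(int tag) {
        if (tag < 248 || tag > 250) throw new IllegalArgumentException("not a chop_frame: " + tag);
        return 251 - tag;
    }

    /** k for append_frame: the number of additional locals (frame_type - 251). */
    static int appendedLocals(int tag) {
        if (tag < 252 || tag > 254) throw new IllegalArgumentException("not an append_frame: " + tag);
        return tag - 251;
    }
}
```

A reader would typically dispatch on this classification first and then read the remaining items (offset_delta, locals, stack) that the selected frame type defines.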
4.7.5 The Exceptions Attribute
The Exceptions attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). The Exceptions attribute indicates which checked exceptions a method may throw.
There must be no more than one Exceptions attribute in the attributes table of a method_info structure.
The Exceptions attribute ~~must~~ **should** have the following format:
Exceptions_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 number_of_exceptions;
u2 exception_index_table[number_of_exceptions];
}
The items of the Exceptions_attribute structure are as follows:
- attribute_name_index
  The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Exceptions".
- attribute_length
  The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
- number_of_exceptions
  The value of the number_of_exceptions item ~~indicates~~ **should indicate** the number of entries in the exception_index_table.
- exception_index_table[]
  Each value in the exception_index_table array ~~must~~ **should** be a valid index into the constant_pool table. The constant_pool entry at that index ~~must~~ **should** be a CONSTANT_Class_info structure (4.4.1) representing a class type that this method is declared to throw.
A method should throw an exception only if at least one of the following three criteria is met:
- The exception is an instance of RuntimeException or one of its subclasses.
- The exception is an instance of Error or one of its subclasses.
- The exception is an instance of one of the exception classes specified in the exception_index_table just described, or one of their subclasses.
These requirements are not enforced in the Java Virtual Machine; they are enforced only at compile time.
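The three criteria can be expressed as a small predicate. This is a sketch of the compile-time rule only, since the Java Virtual Machine does not enforce it; the class ThrowsCheck and the declared parameter (standing in for the classes named by exception_index_table) are hypothetical.

```java
/** Hypothetical sketch (not a JVM API) of the "may throw" criteria for a method. */
final class ThrowsCheck {
    /**
     * Returns true if an exception of class thrown meets at least one criterion:
     * it is a RuntimeException, an Error, or an instance of a declared class.
     */
    static boolean mayThrow(Class<? extends Throwable> thrown, Class<?>... declared) {
        // Criterion 1: RuntimeException or one of its subclasses.
        if (RuntimeException.class.isAssignableFrom(thrown)) return true;
        // Criterion 2: Error or one of its subclasses.
        if (Error.class.isAssignableFrom(thrown)) return true;
        // Criterion 3: a class listed in exception_index_table, or a subclass of one.
        for (Class<?> d : declared) {
            if (d.isAssignableFrom(thrown)) return true;
        }
        return false;
    }
}
```

For instance, java.io.FileNotFoundException qualifies when java.io.IOException is declared, while java.io.IOException does not qualify for a method that declares nothing.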
4.7.6 The InnerClasses Attribute
The InnerClasses attribute is a variable-length attribute in the attributes table of a ClassFile structure (4.1).
There must be no more than one InnerClasses attribute in the attributes table of a ClassFile structure.
The InnerClasses attribute ~~must~~ **should** have the following format:
InnerClasses_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 number_of_classes;
{ u2 inner_class_info_index;
u2 outer_class_info_index;
u2 inner_name_index;
u2 inner_class_access_flags;
} classes[number_of_classes];
}
The items of the InnerClasses_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "InnerClasses
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes.- number_of_classes
The value of the
number_of_classes
itemindicatesshould indicate the number of entries in theclasses
array.- classes[]
Every
CONSTANT_Class_info
entry (4.4.1) in theconstant_pool
table which represents a class or interface C that is not a package member should have exactly one corresponding entry in theclasses
array.If a class or interface has members that are classes or interfaces, its
constant_pool
table (and hence itsInnerClasses
attribute) should refer to each such member (JLS §13.1), even if that member is not otherwise mentioned by the class.In addition, the
constant_pool
table of every nested class and nested interface should refer to its enclosing class, so altogether, every nested class and nested interface will haveInnerClasses
information for each enclosing class and for each of its own nested classes and interfaces.Each entry in the
classes
arraycontainsshould contain the following four items:- inner_class_info_index
The value of the
inner_class_info_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould be aCONSTANT_Class_info
structure representing C.- outer_class_info_index
If C is not a member of a class or an interface - that is, if C is a top-level class or interface (JLS §7.6) or a local class (JLS §14.3) or an anonymous class (JLS §15.9.5) - then the value of the
outer_class_info_index
item must should be zero.
Otherwise, the value of the
outer_class_info_index
item must should be a valid index into the constant_pool
table, and the entry at that index must should be a CONSTANT_Class_info
structure representing the class or interface of which C is a member. The value of the outer_class_info_index
item must should not equal the value of the inner_class_info_index
item.
- inner_name_index
If C is anonymous (JLS §15.9.5), the value of the
inner_name_index
itemmustshould be zero.Otherwise, the value of the
inner_name_index
itemmustshould be a valid index into theconstant_pool
table, and the entry at that indexmustshould be aCONSTANT_Utf8_info
structure that represents the original simple name of C, as given in the source code from which thisclass
file was compiled.- inner_class_access_flags
The value of the
inner_class_access_flags
item is a mask of flags used to denote access permissions to and properties of class or interface C as declared in the source code from which thisclass
file was compiled. It is used by a compiler to recover the original information when source code is not available. The flags are specified in Table 4.7.6-A.
Table 4.7.6-A. Nested class access and property flags
Flag Name        Value   Interpretation
ACC_PUBLIC       0x0001  Marked or implicitly public in source.
ACC_PRIVATE      0x0002  Marked private in source.
ACC_PROTECTED    0x0004  Marked protected in source.
ACC_STATIC       0x0008  Marked or implicitly static in source.
ACC_FINAL        0x0010  Marked or implicitly final in source.
ACC_INTERFACE    0x0200  Was an interface in source.
ACC_ABSTRACT     0x0400  Marked or implicitly abstract in source.
ACC_SYNTHETIC    0x1000  Declared synthetic; not present in the source code.
ACC_ANNOTATION   0x2000  Declared as an annotation type.
ACC_ENUM         0x4000  Declared as an enum type.
All bits of the
inner_class_access_flags
item not assigned in Table 4.7.6-A are reserved for future use. They should be set to zero in generated class
files and should be ignored by Java Virtual Machine implementations.
The entire attribute is ignored by Java Virtual Machine implementations.
If a class
file has a version number that is 51.0 or above, and has an InnerClasses
attribute in its attributes
table, then for all entries in the classes
array of the InnerClasses
attribute, the value of the outer_class_info_index
item must be zero if the value of the inner_name_index
item is zero.
This is implied by the descriptions above, so the only reason to restate it is to clarify a specific validation rule. Anyway, it does not appear that HotSpot has ever enforced this rule.
The Java Virtual Machine does not check the consistency of an
InnerClasses
attribute against aclass
file representing a class or interface referenced by the attribute.
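Because the attribute is ignored by the Java Virtual Machine, interpreting inner_class_access_flags is left to compilers and tools. A minimal sketch of such a decoder follows, using the flag values from Table 4.7.6-A. The class and method names (InnerClassFlags, decode) are hypothetical, and silently skipping unassigned bits is an assumption based on the "reserved for future use" wording.

```java
import java.util.ArrayList;
import java.util.List;

public class InnerClassFlags {
    // Flag values and names as listed in Table 4.7.6-A.
    private static final int[] VALUES = {
        0x0001, 0x0002, 0x0004, 0x0008, 0x0010,
        0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_PRIVATE", "ACC_PROTECTED", "ACC_STATIC", "ACC_FINAL",
        "ACC_INTERFACE", "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    // Returns the names of all assigned flags set in the mask.
    // Bits not assigned in the table are skipped rather than rejected.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((flags & VALUES[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x000A));
    }
}
```

For example, a mask of 0x000A decodes to ACC_PRIVATE and ACC_STATIC, while an unassigned bit such as 0x8000 contributes nothing to the result.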
4.7.7 The EnclosingMethod
Attribute
The EnclosingMethod
attribute is a fixed-length attribute in the attributes
table of a ClassFile
structure (4.1) in a version 49.0 or later class
file. A class or interface should have an EnclosingMethod
attribute if and only if it represents a local class or an anonymous class (JLS §14.3, JLS §15.9.5).
There must be no more than one EnclosingMethod
attribute in the attributes
table of a ClassFile
structure representing a class or interface. There must not be an EnclosingMethod
attribute in the attributes
table of a ClassFile
structure representing a module.
The EnclosingMethod
attribute must should have the following format:
EnclosingMethod_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 class_index;
u2 method_index;
}
The items of the EnclosingMethod_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "EnclosingMethod
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes, andmustshould be four.- class_index
The value of the
class_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould be aCONSTANT_Class_info
structure (4.4.1) representing the innermost class that encloses the declaration of the current class.- method_index
If the current class is not immediately enclosed by a method or constructor, then the value of the
method_index
itemmustshould be zero.In particular,
method_index
mustshould be zero if the current class was immediately enclosed in source code by an instance initializer, static initializer, instance variable initializer, or class variable initializer. (The first two concern both local classes and anonymous classes, while the last two concern anonymous classes declared on the right hand side of a field assignment.)Otherwise, the value of the
method_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould be aCONSTANT_NameAndType_info
structure (4.4.6) representing the name and type of a method in the class referenced by theclass_index
attribute above.It is the responsibility of a Java compiler to ensure that the method identified via the
method_index
is indeed the closest lexically enclosing method of the class that contains thisEnclosingMethod
attribute.In JDK 13, no check is performed to ensure
method_index
references a valid method name and descriptor.
4.7.8 The Synthetic
Attribute
The Synthetic
attribute is a fixed-length attribute in the attributes
table of a ClassFile
, field_info
, or method_info
structure (4.1, 4.5, 4.6). A class member that does not appear in the source code should be marked using a Synthetic
attribute, or else it should have its ACC_SYNTHETIC
flag set. The only exceptions to this requirement are compiler-generated methods which are not considered implementation artifacts, namely the instance initialization method representing a default constructor of the Java programming language (2.9.1), the class or interface initialization method (2.9.2), and the Enum.values()
and Enum.valueOf()
methods.
The
Synthetic
attribute was introduced in JDK 1.1 to support nested classes and interfaces.
There must not be a Synthetic
attribute in the attributes
table of a ClassFile
structure representing a module.
The Synthetic
attribute must should have the following format:
Synthetic_attribute {
u2 attribute_name_index;
u4 attribute_length;
}
The items of the Synthetic_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "Synthetic
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes, andmustshould be zero.
The effect of these changes is that a Synthetic
attribute will not cause an error if it has a non-empty payload.
4.7.9 The Signature
Attribute
The Signature
attribute is a fixed-length attribute in the attributes
table of a ClassFile
, field_info
, or method_info
structure (4.1, 4.5, 4.6) in a version 49.0 or later class
file. A Signature
attribute records a signature (4.7.9.1) for a class, interface, constructor, method, or field whose declaration in the Java programming language uses type variables or parameterized types. See The Java Language Specification, Java SE 12 Edition for details about these constructs.
There must be no more than one Signature
attribute in the attributes
table of a ClassFile
, field_info
, or method_info
structure. There must not be a Signature
attribute in the attributes
table of a ClassFile
structure representing a module.
The Signature
attribute must should have the following format:
Signature_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 signature_index;
}
The items of the Signature_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "Signature
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes, andmustshould be two.- signature_index
The value of the
signature_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould be aCONSTANT_Utf8_info
structure (4.4.7) representing a class signature if thisSignature
attribute is an attribute of aClassFile
structure; a method signature if thisSignature
attribute is an attribute of amethod_info
structure; or a field signature otherwise.
The Java Virtual Machine does not check the well-formedness of
Signature
attributes during class loading or linking. Instead,Signature
attributes are checked by methods of the Java SE Platform class libraries which expose generic signatures of classes, interfaces, constructors, methods, and fields. Examples includegetGenericSuperclass
inClass
andtoGenericString
injava.lang.reflect.Executable
.
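The library-level use of Signature information can be illustrated with a short sketch. It assumes a compiler that emits a Signature attribute for a class whose superclass is a parameterized type; the class names SignatureDemo and StringList are hypothetical.

```java
import java.lang.reflect.Type;
import java.util.ArrayList;

public class SignatureDemo {
    // The superclass is a parameterized type, so the compiler records
    // a class signature for StringList.
    static class StringList extends ArrayList<String> {
        private static final long serialVersionUID = 1L;
    }

    // Reads the generic signature through the reflection library.
    static String genericSuperclass() {
        Type t = StringList.class.getGenericSuperclass();
        return t.getTypeName();
    }

    // Reads the erased superclass, which does not depend on the Signature attribute.
    static String erasedSuperclass() {
        return StringList.class.getSuperclass().getName();
    }

    public static void main(String[] args) {
        System.out.println(genericSuperclass());
        System.out.println(erasedSuperclass());
    }
}
```

The generic view reports java.util.ArrayList<java.lang.String>, while the erased view reports java.util.ArrayList. Any parsing of the signature string happens inside the getGenericSuperclass call, not during class loading or linking.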
4.7.10 The SourceFile
Attribute
The SourceFile
attribute is an optional fixed-length attribute in the attributes
table of a ClassFile
structure (4.1).
There must be no more than one SourceFile
attribute in the attributes
table of a ClassFile
structure.
The SourceFile
attribute must should have the following format:
SourceFile_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 sourcefile_index;
}
The items of the SourceFile_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "SourceFile
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes, andmustshould be two.- sourcefile_index
The value of the
sourcefile_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould be aCONSTANT_Utf8_info
structure representing a string.The string referenced by the
sourcefile_index
item will be interpreted as indicating the name of the source file from which thisclass
file was compiled. It will not be interpreted as indicating the name of a directory containing the file or an absolute path name for the file; such platform-specific additional information should be supplied by the run-time interpreter or development tool at the time the file name is actually used.
4.7.12 The LineNumberTable
Attribute
The LineNumberTable
attribute is an optional variable-length attribute in the attributes
table of a Code
attribute (4.7.3). It may be used by debuggers to determine which part of the code
array corresponds to a given line number in the original source file.
Multiple
LineNumberTable
attributes may be present in theattributes
table of aCode
attribute, and may appear in any order. MultipleLineNumberTable
attributes may together represent a given line of a source file, and need not be one-to-one with source lines.
The LineNumberTable
attribute must should have the following format:
LineNumberTable_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 line_number_table_length;
{ u2 start_pc;
u2 line_number;
} line_number_table[line_number_table_length];
}
The items of the LineNumberTable_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "LineNumberTable
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes.- line_number_table_length
The value of the
line_number_table_length
itemindicatesshould indicate the number of entries in theline_number_table
array.- line_number_table[]
Each entry in the
line_number_table
array indicates that the line number in the original source file changes at a given point in thecode
array. Eachline_number_table
entrymustshould contain the following two items:- start_pc
The value of the
start_pc
itemmustshould be a valid index into thecode
array of thisCode
attribute. The item indicates the index into thecode
array at which the code for a new line in the original source file begins.
JDK 13 ensures
start_pc
is in range, but does not ensure that it refers to an opcode.
- line_number
The value of the
line_number
itemgivesshould give the corresponding line number in the original source file.
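A debugger-style lookup over line_number_table entries might look like the following sketch. The names LineLookup and lineFor are hypothetical, the int[][] representation of entries is an assumption made for brevity, and returning -1 when no entry applies is a choice of this sketch rather than anything the specification prescribes. Entries are not assumed to be sorted, since the attribute may be split across several tables in any order.

```java
public class LineLookup {
    // Each row is {start_pc, line_number}. Returns the line number of the
    // entry with the greatest start_pc that does not exceed pc, or -1 if
    // no entry starts at or before pc.
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (int[] entry : table) {
            int startPc = entry[0];
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                bestLine = entry[1];
            }
        }
        return bestLine;
    }

    public static void main(String[] args) {
        int[][] table = { {0, 10}, {9, 13}, {4, 11} };
        System.out.println(lineFor(table, 5));
    }
}
```

With entries starting at offsets 0, 4, and 9, an instruction at offset 5 maps to the line recorded for offset 4.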
4.7.13 The LocalVariableTable
Attribute
The LocalVariableTable
attribute is an optional variable-length attribute in the attributes
table of a Code
attribute (4.7.3). It may be used by debuggers to determine the value of a given local variable during the execution of a method.
There may be no more than one LocalVariableTable
attribute per local variable in the attributes
table of a Code
attribute.
It's not entirely clear what this rule means, but it won't be enforced, and the note below captures the same idea.
In JDK 13, the same combination of (start
, length
, name
, index
) is not allowed to appear twice in a single LocalVariableTable
attribute or across multiple LocalVariableTable
attributes.
Multiple
LocalVariableTable
attributes may be present in theattributes
table of aCode
attribute, and they may appear in any order. Each local variable in the source code should appear at most once in one of theLocalVariableTable
attributes.
The LocalVariableTable
attribute must should have the following format:
LocalVariableTable_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 local_variable_table_length;
{ u2 start_pc;
u2 length;
u2 name_index;
u2 descriptor_index;
u2 index;
} local_variable_table[local_variable_table_length];
}
The items of the LocalVariableTable_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "LocalVariableTable
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes.- local_variable_table_length
The value of the
local_variable_table_length
itemindicatesshould indicate the number of entries in thelocal_variable_table
array.- local_variable_table[]
Each entry in the
local_variable_table
arrayindicatesshould indicate a range ofcode
array offsets within which a local variable has a value, andindicatesshould indicate the index into the local variable array of the current frame at which that local variable can be found. Each entrymustshould contain the following five items:- start_pc, length
The value of the
start_pc
itemmustshould be a valid index into thecode
array of thisCode
attribute andmustshould be the index of the opcode of an instruction.The value of
start_pc + length
mustshould either be a valid index into thecode
array of thisCode
attribute and be the index of the opcode of an instruction, or itmustshould be the first index beyond the end of thatcode
array.The
start_pc
andlength
items indicate that the given local variable has a value at indices into thecode
array in the interval [start_pc
,start_pc + length
), that is, betweenstart_pc
inclusive andstart_pc + length
exclusive.JDK 13 ensures that
start_pc
andstart_pc
+length
are in range, and then checks at verification time that they refer to opcodes.- name_index
The value of the
name_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould contain aCONSTANT_Utf8_info
structure representing a valid unqualified name denoting a local variable (4.2.2).- descriptor_index
The value of the
descriptor_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould contain aCONSTANT_Utf8_info
structure representing a field descriptor which encodes the type of a local variable in the source program (4.3.2).- index
The value of the
index
itemmustshould be a valid index into the local variable array of the current frame. The given local variableisshould be atindex
in the local variable array of the current frame.If the given local variable is of type
double
orlong
, itoccupiesshould occupy bothindex
andindex + 1
.JDK 13 permits
index
values that refer to the second half of adouble
orlong
.
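The range and slot conventions for a local_variable_table entry can be sketched as simple predicates. This is a tool-side illustration under stated assumptions, not the reference implementation's logic; the names LocalVarEntry, inScope, rangeOk, and slotOk are hypothetical, and the u2 items are assumed to have been read as non-negative int values.

```java
public class LocalVarEntry {
    // True if pc lies in the half-open interval [start_pc, start_pc + length).
    static boolean inScope(int startPc, int length, int pc) {
        return pc >= startPc && pc < startPc + length;
    }

    // start_pc should index the code array; start_pc + length should either
    // index the code array or be the first index beyond its end.
    static boolean rangeOk(int startPc, int length, int codeLength) {
        return startPc < codeLength && startPc + length <= codeLength;
    }

    // A long (descriptor "J") or double (descriptor "D") occupies
    // index and index + 1; every other type occupies index alone.
    static boolean slotOk(int index, String descriptor, int maxLocals) {
        int width = (descriptor.equals("J") || descriptor.equals("D")) ? 2 : 1;
        return index + width <= maxLocals;
    }

    public static void main(String[] args) {
        System.out.println(inScope(4, 6, 9));
        System.out.println(slotOk(3, "J", 4));
    }
}
```

The opcode-boundary requirement on start_pc and start_pc + length is deliberately omitted here, because checking it requires the code array to have been parsed into instructions.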
4.7.14 The LocalVariableTypeTable
Attribute
The LocalVariableTypeTable
attribute is an optional variable-length attribute in the attributes
table of a Code
attribute (4.7.3) in a version 49.0 or later class
file. It may be used by debuggers to determine the value of a given local variable during the execution of a method.
There may be no more than one LocalVariableTypeTable
attribute per local variable in the attributes
table of a Code
attribute.
It's not entirely clear what this rule means, but it won't be enforced, and the note below captures the same idea.
In JDK 13, the same combination of (start
, length
, name
, index
) can appear twice in a single LocalVariableTable
attribute.
Multiple
LocalVariableTypeTable
attributes may be present in theattributes
table of aCode
attribute, and they may appear in any order. Each local variable in the source code should appear at most once in one of theLocalVariableTypeTable
attributes.
The
LocalVariableTypeTable
attribute differs from theLocalVariableTable
attribute (4.7.13) in that it provides signature information rather than descriptor information. This difference is only significant for variables whose type uses a type variable or parameterized type. Such variables should appear in both tables, while variables of other types might appear only inLocalVariableTable
.
The LocalVariableTypeTable
attribute must should have the following format:
LocalVariableTypeTable_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 local_variable_type_table_length;
{ u2 start_pc;
u2 length;
u2 name_index;
u2 signature_index;
u2 index;
} local_variable_type_table[local_variable_type_table_length];
}
The items of the LocalVariableTypeTable_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "LocalVariableTypeTable
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes.- local_variable_type_table_length
The value of the
local_variable_type_table_length
itemindicatesshould indicate the number of entries in thelocal_variable_type_table
array.- local_variable_type_table[]
Each entry in the
local_variable_type_table
arrayindicatesshould indicate a range ofcode
array offsets within which a local variable has a value, andindicatesshould indicate the index into the local variable array of the current frame at which that local variable can be found. Each entrymustshould contain the following five items:- start_pc, length
The value of the
start_pc
itemmustshould be a valid index into thecode
array of thisCode
attribute andmustshould be the index of the opcode of an instruction.The value of
start_pc + length
mustshould either be a valid index into thecode
array of thisCode
attribute and be the index of the opcode of an instruction, or itmustshould be the first index beyond the end of thatcode
array.The
start_pc
andlength
items indicate that the given local variable has a value at indices into thecode
array in the interval [start_pc
,start_pc + length
), that is, betweenstart_pc
inclusive andstart_pc + length
exclusive.JDK 13 ensures that
start_pc
andstart_pc
+length
are in range, and then checks at verification time that they refer to opcodes.- name_index
The value of the
name_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould contain aCONSTANT_Utf8_info
structure representing a valid unqualified name denoting a local variable (4.2.2).- signature_index
The value of the
signature_index
itemmustshould be a valid index into theconstant_pool
table. Theconstant_pool
entry at that indexmustshould contain aCONSTANT_Utf8_info
structure representing a field signature which encodes the type of a local variable in the source program (4.7.9.1).In JDK 13, the
signature_index
item is allowed to represent a method signature.- index
The value of the
index
itemmustshould be a valid index into the local variable array of the current frame. The given local variableisshould be atindex
in the local variable array of the current frame.If the given local variable is of type
double
orlong
, itoccupiesshould occupy bothindex
andindex + 1
.
4.7.15 The Deprecated
Attribute
The Deprecated
attribute is an optional fixed-length attribute in the attributes
table of a ClassFile
, field_info
, or method_info
structure (4.1, 4.5, 4.6). A class, interface, method, or field may be marked using a Deprecated
attribute to indicate that the class, interface, method, or field has been superseded.
There must not be a Deprecated
attribute in the attributes
table of a ClassFile
structure representing a module.
A run-time interpreter or tool that reads the class
file format, such as a compiler, can use this marking to advise the user that a superseded class, interface, method, or field is being referred to. The presence of a Deprecated
attribute does not alter the semantics of a class or interface.
The Deprecated
attribute must should have the following format:
Deprecated_attribute {
u2 attribute_name_index;
u4 attribute_length;
}
The items of the Deprecated_attribute
structure are as follows:
- attribute_name_index
The value of the
attribute_name_index
item is an index into theconstant_pool
table. Theconstant_pool
entry at that index is aCONSTANT_Utf8_info
structure (4.4.7) representing the string "Deprecated
".- attribute_length
The value of the
attribute_length
item indicates the length of the attribute, excluding the initial six bytes, andmustshould be zero.
The effect of these changes is that a Deprecated
attribute will not cause an error if it has a non-empty payload.
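One way a class file parser could ignore an attribute's contents is to rely only on attribute_length and step over the payload. The following is a minimal sketch under that assumption; the names AttributeSkipper and nextAttributeOffset are hypothetical, and the byte-array interface is chosen for brevity.

```java
public class AttributeSkipper {
    // Given the offset of an attribute_info structure, returns the offset of
    // the structure that follows it. The payload itself is never inspected.
    static int nextAttributeOffset(byte[] data, int offset) {
        if (offset < 0 || offset + 6 > data.length) {
            throw new IllegalArgumentException("truncated attribute header");
        }
        // Bytes 0-1 hold attribute_name_index; bytes 2-5 hold the
        // unsigned 32-bit attribute_length (big-endian).
        long length = ((data[offset + 2] & 0xFFL) << 24)
                    | ((data[offset + 3] & 0xFFL) << 16)
                    | ((data[offset + 4] & 0xFFL) << 8)
                    |  (data[offset + 5] & 0xFFL);
        // attribute_length excludes the initial six bytes.
        long next = offset + 6L + length;
        if (next > data.length) {
            throw new IllegalArgumentException("attribute_length runs past end of data");
        }
        return (int) next;
    }

    public static void main(String[] args) {
        // An attribute with a two-byte payload, followed by an empty one.
        byte[] data = {0, 5, 0, 0, 0, 2, (byte) 0xAA, (byte) 0xBB, 0, 6, 0, 0, 0, 0};
        System.out.println(nextAttributeOffset(data, 0));
    }
}
```

Under this approach, a Deprecated attribute with a non-empty payload is stepped over in the same way as one with the expected zero length.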
4.9 Constraints on Java Virtual Machine Code
The code for a method, instance initialization method (2.9.1), or class or interface initialization method (2.9.2) is stored in the code
array of the Code
attribute of a method_info
structure of a class
file (4.7.3). This section describes the constraints associated with the contents of the Code_attribute
structure and associated StackMapTable
structures.
Initialization methods are methods. No need to call them out separately.
These constraints are enforced by verification (4.10).
4.9.1 Static Constraints
The static constraints on a class
file are those defining the well-formedness of the file. These constraints have been given in the previous sections, except for static constraints on the code in the class
file.
The static constraints on the code in a class
file specify how Java Virtual Machine instructions must be laid out in the code
array, and what the operands of individual instructions must be, and how exception_table
and StackMapTable
entries reference the code
array and the constant pool.
The deleted sentences unhelpfully blur the distinction between format checking and verification.
The static constraints on the instructions in the code
array are as follows:
Only instances of the instructions documented in 6.5 may appear in the
code
array. Instances of instructions using the reserved opcodes (6.2) or any opcodes not documented in this specification must not appear in thecode
array.If the
class
file version number is 51.0 or above, then neither the jsr opcode nor the jsr_w opcode may appear in the code
array.The opcode of the first instruction in the
code
array begins at index0
.For each instruction in the
code
array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its operands.The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.
The last byte of the last instruction in the
code
array must be the byte at indexcode_length - 1
.
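The layout constraints above amount to a single forward walk over the code array. The sketch below handles only a handful of fixed-length opcodes to stay short; a real verifier needs the full instruction set of chapter 6, including wide and the variable-length switch instructions. The names InstructionWalk, lengthOf, and opcodeStarts are hypothetical.

```java
public class InstructionWalk {
    // Total instruction length (opcode plus operands) for a small
    // illustrative subset of opcodes; -1 means "not handled by this sketch".
    static int lengthOf(int opcode) {
        switch (opcode) {
            case 0x00: // nop
            case 0x03: // iconst_0
            case 0xac: // ireturn
            case 0xb1: // return
                return 1;
            case 0x10: // bipush
            case 0x15: // iload
                return 2;
            case 0x11: // sipush
            case 0xa7: // goto
                return 3;
            default:
                return -1;
        }
    }

    // Marks every index that holds the opcode of an instruction. The first
    // opcode is at index 0, each following opcode is at the previous index
    // plus the previous length, and the last instruction must end exactly
    // at index code_length - 1.
    static boolean[] opcodeStarts(byte[] code) {
        boolean[] starts = new boolean[code.length];
        int pc = 0;
        while (pc < code.length) {
            int len = lengthOf(code[pc] & 0xFF);
            if (len < 0) {
                throw new IllegalArgumentException("unsupported opcode at index " + pc);
            }
            if (pc + len > code.length) {
                throw new IllegalArgumentException("instruction at index " + pc + " runs past code_length");
            }
            starts[pc] = true;
            pc += len;
        }
        return starts;
    }

    public static void main(String[] args) {
        byte[] code = {0x10, 5, (byte) 0xac}; // bipush 5; ireturn
        System.out.println(java.util.Arrays.toString(opcodeStarts(code)));
    }
}
```

The resulting array of opcode positions is what later checks need in order to decide whether an offset, such as a branch target, refers to the opcode of an instruction.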
The static constraints on the operands of instructions in the code
array are as follows:
The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, ifle, iflt, ifge, ifgt, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmple, if_icmplt, if_icmpge, if_icmpgt, if_acmpeq, if_acmpne) must be the opcode of an instruction within this method.
The target of a jump or branch instruction must never be the opcode used to specify the operation to be modified by a wide instruction; a jump or branch target may be the wide instruction itself.
Each target, including the default, of each tableswitch instruction must be the opcode of an instruction within this method.
Each tableswitch instruction must have a number of entries in its jump table that is consistent with the value of its low and high jump table operands, and its low value must be less than or equal to its high value.
No target of a tableswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.
Each target, including the default, of each lookupswitch instruction must be the opcode of an instruction within this method.
Each lookupswitch instruction must have a number of match-offset pairs that is consistent with the value of its npairs operand. The match-offset pairs must be sorted in increasing numerical order by signed match value.
No target of a lookupswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a wide instruction itself.
The operands of each ldc instruction and each ldc_w instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be loadable (4.4), and not any of the following:An entry of kind
CONSTANT_Long
orCONSTANT_Double
.An entry of kind
CONSTANT_Dynamic
that references aCONSTANT_NameAndType_info
structure which indicates a descriptor ofJ
(denotinglong
) orD
(denotingdouble
).
The operands of each ldc2_w instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be loadable, and in particular one of the following:An entry of kind
CONSTANT_Long
orCONSTANT_Double
.An entry of kind
CONSTANT_Dynamic
that references aCONSTANT_NameAndType_info
structure which indicates a descriptor ofJ
(denotinglong
) orD
(denotingdouble
).
The subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.
I don't know what this constraint means; perhaps something to do with two-slot constants (compare 4.4.5)? This is not the place to enforce constant pool layout rules.
The operands of each getfield, putfield, getstatic, and putstatic instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_Fieldref
.The indexbyte operands of each invokevirtual instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_Methodref
.The indexbyte operands of each invokespecial and invokestatic instruction must represent a valid index into the
constant_pool
table. If theclass
file version number is less than 52.0, the constant pool entry referenced by that index must be of kindCONSTANT_Methodref
; if theclass
file version number is 52.0 or above, the constant pool entry referenced by that index must be of kindCONSTANT_Methodref
orCONSTANT_InterfaceMethodref
.The indexbyte operands of each invokeinterface instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_InterfaceMethodref
.The value of the count operand of each invokeinterface instruction must reflect the number of local variables necessary to store the arguments to be passed to the interface method, as implied by the descriptor of the
CONSTANT_NameAndType_info
structure referenced by theCONSTANT_InterfaceMethodref
constant pool entry.The fourth operand byte of each invokeinterface instruction must have the value zero.
The indexbyte operands of each invokedynamic instruction must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_InvokeDynamic
.The third and fourth operand bytes of each invokedynamic instruction must have the value zero.
Only the invokespecial instruction is allowed to invoke an instance initialization method (2.9.1).
The method name of the
CONSTANT_Methodref
or CONSTANT_InterfaceMethodref
referenced by one of the instructions invokevirtual, invokestatic, or invokeinterface must not be <init>
.
No other method whose name begins with the character '<
' ('\u003c
') may be called by the method invocation instructions. In particular, the class or interface initialization method specially named <clinit>
is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
Only the invokespecial instruction is allowed to invoke an instance initialization method (2.9.1). No instruction is allowed to invoke a class initialization method, because it cannot be referenced (4.4.2); such methods are only invoked implicitly by the Java Virtual Machine (5.5).
The operands of each instanceof, checkcast, new, and anewarray instruction, and the indexbyte operands of each multianewarray instruction, must represent a valid index into the
constant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_Class
.No new instruction may reference a constant pool entry of kind
CONSTANT_Class
that represents an array type (4.3.2). The new instruction cannot be used to create an array.
No anewarray instruction may be used to create an array of more than 255 dimensions.
A multianewarray instruction must be used only to create an array of a type that has at least as many dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not required to create all of the dimensions of the array type referenced by its indexbyte operands, it must not attempt to create more dimensions than are in the array type.
The dimensions operand of each multianewarray instruction must not be zero.
The atype operand of each newarray instruction must take one of the values
T_BOOLEAN
(4),T_CHAR
(5),T_FLOAT
(6),T_DOUBLE
(7),T_BYTE
(8),T_SHORT
(9),T_INT
(10), orT_LONG
(11).The index operand of each iload, fload, aload, istore, fstore, astore, iinc, and ret instruction must be a non-negative integer no greater than
max_locals - 1
.The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and astore_<n> instruction must be no greater than
max_locals - 1
.The index operand of each lload, dload, lstore, and dstore instruction must be no greater than
max_locals - 2
.The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be no greater than
max_locals - 2
.The indexbyte operands of each wide instruction modifying an iload, fload, aload, istore, fstore, astore, iinc, or ret instruction must represent a non-negative integer no greater than
max_locals - 1
.The indexbyte operands of each wide instruction modifying an lload, dload, lstore, or dstore instruction must represent a non-negative integer no greater than
max_locals - 2
.
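The local variable index limits above reduce to one comparison once it is known whether an instruction accesses one slot or two. The following is a minimal sketch under stated assumptions, not the reference implementation: the class `LocalIndexCheck`, the method `indexIsValid`, and the use of mnemonic strings to identify instructions are all hypothetical names chosen for illustration. Implicit-index forms such as lload_<n> are assumed to be passed as the base mnemonic with n as the index.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of any JVM): models the static constraint
// on local variable indices described in the list above.
final class LocalIndexCheck {
    // Instructions that access a long or double use two consecutive slots.
    private static final Set<String> TWO_SLOT = new HashSet<>(
            Arrays.asList("lload", "dload", "lstore", "dstore"));

    // Returns true if `index` is a legal operand for `mnemonic` in a method
    // whose Code attribute declares `maxLocals` local variables.
    static boolean indexIsValid(String mnemonic, int index, int maxLocals) {
        int slots = TWO_SLOT.contains(mnemonic) ? 2 : 1;
        // One-slot accesses need index <= max_locals - 1;
        // two-slot accesses need index <= max_locals - 2.
        return index >= 0 && index <= maxLocals - slots;
    }
}
```

With `max_locals` equal to 4, for example, this sketch accepts index 3 for iload but rejects it for lload, because the second slot of the pair would fall outside the local variable array.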
The static constraints on the entries in the exception_table
array are as follows:
The
start_pc
item of each entry must be a valid index into thecode
array and must be the index of the opcode of an instruction.The
end_pc
item of each entry must either be a valid index into thecode
array or must equal thecode_length
item. Ifend_pc
is a valid index into thecode
array, it must be the index of the opcode of an instruction. The value ofend_pc
must be greater than the corresponding value ofstart_pc
.The
handler_pc
item of each entry must be a valid index into thecode
array and must be the index of the opcode of an instruction.The
catch_type
item of each entry must either be 0 or represent a valid index into theconstant_pool
table. Ifcatch_type
is nonzero, the constant pool entry referenced by that index must be of kindCONSTANT_Class
.
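The four constraints on an exception_table entry listed above can be sketched as a single predicate. This is an illustrative sketch only: the class `ExceptionTableCheck`, its method, and the two precomputed sets are assumptions, standing in for whatever a real verifier derives when it parses the code array and the constant pool. The set of opcode offsets is assumed to contain only valid indices into the code array.

```java
import java.util.Set;

// Hypothetical model of the static checks on one exception_table entry.
final class ExceptionTableCheck {
    // opcodeOffsets: indices in the code array at which an instruction begins.
    // classEntries:  constant pool indices whose entries are CONSTANT_Class.
    static boolean entryIsValid(int startPc, int endPc, int handlerPc, int catchType,
                                int codeLength, Set<Integer> opcodeOffsets,
                                Set<Integer> classEntries) {
        // start_pc and handler_pc must each be the index of an opcode.
        boolean startOk = opcodeOffsets.contains(startPc);
        boolean handlerOk = opcodeOffsets.contains(handlerPc);
        // end_pc is either code_length or the index of an opcode,
        // and must be greater than start_pc.
        boolean endOk = (endPc == codeLength || opcodeOffsets.contains(endPc))
                && endPc > startPc;
        // catch_type is 0 or refers to a CONSTANT_Class entry.
        boolean catchOk = catchType == 0 || classEntries.contains(catchType);
        return startOk && endOk && handlerOk && catchOk;
    }
}
```

Because `opcodeOffsets` is only available once the code array has been parsed, a check of this shape cannot run during format checking, which is the motivation for treating all of these constraints as verification-time checks.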
The static constraints on the entries in the entries
table of a StackMapTable
attribute (4.7.4) are as follows:
Each table entry must match one of the items of the
stack_map_frame
discriminated union.The number of entries in the
entries
table must equal thenumber_of_entries
item of theStackMapTable
attribute.For each verification type in the
locals
orstack
table of any frame type:If the verification type has the form
Object_variable_info
, it must have acpool_index
item that is a valid index into theconstant_pool
table. The constant pool entry referenced by that index must be of kindCONSTANT_Class
.If the verification type has the form
Uninitialized_variable_info
, it must have anoffset
item that is a valid index into thecode
array of theCode
attribute to which thisStackMapTable
belongs. Theoffset
must be the index of the opcode of anew
instruction.
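The two verification type checks above can be sketched as follows. The class and method names are hypothetical, and the sets of opcode offsets and CONSTANT_Class indices are assumed to have been computed while parsing the code array and constant pool. The only fact taken from outside this section is that the opcode of the new instruction is 187 (0xbb).

```java
import java.util.Set;

// Hypothetical sketch of the static checks on verification types
// in a StackMapTable; not the reference implementation.
final class VerificationTypeCheck {
    static final int OPCODE_NEW = 0xbb; // opcode of the new instruction

    // Object_variable_info: cpool_index must refer to a CONSTANT_Class entry.
    static boolean objectInfoIsValid(int cpoolIndex, Set<Integer> classEntries) {
        return classEntries.contains(cpoolIndex);
    }

    // Uninitialized_variable_info: offset must be a valid index into the
    // code array and must be the index of the opcode of a new instruction.
    static boolean uninitializedInfoIsValid(int offset, byte[] code,
                                            Set<Integer> opcodeOffsets) {
        return offset >= 0 && offset < code.length
                && opcodeOffsets.contains(offset)
                && (code[offset] & 0xff) == OPCODE_NEW;
    }
}
```

Note that the byte value alone is not enough: an operand byte that happens to equal 0xbb does not satisfy the constraint, which is why the sketch also consults the set of opcode offsets.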
In JDK 13, the constraints on exception_table
are enforced both at class loading time and verification time. (Specifically, constraints that require parsing a code array are delayed until verification time.) This specification makes them all verification time checks.
All constraints on StackMapTable
are enforced by JDK 13 at verification time, despite previously being specified as format checks (4.7.4).
The Uninitialized_variable_info
constraint is treated by JDK 13 as a structural constraint—in version 50.0 class files, an invalid Object_variable_info
causes a failure, but an invalid Uninitialized_variable_info
can be recovered from by falling back to verification by type inference. There's not a strong argument for this discrepancy, so we propose changing the behavior, making both unrecoverable failures. (This occurs naturally when the checks are treated as static constraints—see 4.10 and 4.10.1 for details.)
4.9.2 Structural Constraints
The structural constraints structural constraints on the code in a code
arrayclass
file specify constraints on relationships between Java Virtual Machine instructions, exception handlers, and stack maps. The structural constraints are as follows:
For each stack map table entry, the bytecode offset specified by the frame must be a valid index into the
Code
array, and must be the index of the opcode of an instruction.Bug fix: this rule was not specified previously.
This seems like a static constraint, but due to the encoding—a delta applied to the offset of the previous entry—it makes sense to categorize it as a structural constraint.
In JDK 13, a violation of this constraint is reported as a
VerifyError
and can be recovered from in version 50.0 class files (where verification by type inference can be performed).Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.
An instruction operating on values of type
int
is also permitted to operate on values of typeboolean
,byte
,char
, andshort
.(As noted in 2.3.4 and 2.11.1, the Java Virtual Machine internally converts values of types
boolean
,byte
,short
, andchar
to typeint
.)If an instruction can be executed along several different execution paths, the operand stack must have the same depth (2.6.2) prior to the execution of the instruction, regardless of the path taken.
At no point during execution can the operand stack grow to a depth greater than that implied by the
max_stack
item.At no point during execution can more values be popped from the operand stack than it contains.
No stack map table entry may represent a stack frame with a stack type array of size greater than that implied by the
max_stack
item.No stack map table entry may represent a stack frame with a local variable type array of size less than 0 or greater than that implied by the
max_locals
item.Most assertions formerly specified in 4.7.4 can be treated as static constraints. However, these constraints on stack type and local variable type arrays may require contextual knowledge about the size of the previous stack frame, so it seems more appropriate to treat them as structural constraints.
In JDK 13, consistent with this specification, constraints on the stack and local variable sizes of stack map table entries are recoverable when falling back to verification by type inference in version 50.0 class files.
At no point during execution can the order of the local variable pair holding a value of type
long
ordouble
be reversed or the pair split up. At no point can the local variables of such a pair be operated on individually.No local variable (or local variable pair, in the case of a value of type
long
ordouble
) can be accessed before it is assigned a value.Each invokespecial instruction must name one of the following:
an instance initialization method (2.9.1)
a method in the current class or interface
a method in a superclass of the current class
a method in a direct superinterface of the current class or interface
a method in
Object
If an invokespecial instruction names an instance initialization method, then the target reference on the operand stack must be an uninitialized class instance. An instance initialization method must never be invoked on an initialized class instance. In addition:
If the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.
If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.
If an invokespecial instruction names a method which is not an instance initialization method, then the target reference on the operand stack must be a class instance whose type is assignment compatible with the current class (JLS §5.2).
The general rule for invokespecial is that the class or interface named by invokespecial must be "above" the caller class or interface, while the receiver object targeted by invokespecial must be "at" or "below" the caller class or interface. The latter clause is especially important: a class or interface can only perform invokespecial on its own objects. See 4.10.1.9.invokespecial for an explanation of how the latter clause is implemented in Prolog.
Each instance initialization method, except for the instance initialization method derived from the constructor of class
Object
, must call either another instance initialization method ofthis
or an instance initialization method of its direct superclasssuper
before its instance members are accessed.However, instance fields of
this
that are declared in the current class may be assigned by putfield before calling any instance initialization method.When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.
- If there is an uninitialized class instance in a local variable in code protected by an exception handler, then (i) if the handler is inside an
<init>
method, the handler must throw an exception or loop forever; and (ii) if the handler is not inside an
<init>
method, the uninitialized class instance must remain uninitialized.
There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.
The type of every class instance that is the target of a method invocation instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class or interface type specified in the instruction.
The types of the arguments to each method invocation must be method invocation compatible with the method descriptor (JLS §5.3, 4.3.3).
Each return instruction must match its method's return type:
If the method returns a
boolean
,byte
,char
,short
, orint
, only the ireturn instruction may be used.If the method returns a
float
,long
, ordouble
, only an freturn, lreturn, or dreturn instruction, respectively, may be used.If the method returns a
reference
type, only an areturn instruction may be used, and the type of the returned value must be assignment compatible with the return descriptor of the method (4.3.3).All
instance initialization methods, class or interface initialization methods, andmethods declared to returnvoid
must use only the return instruction.
The type of every class instance accessed by a getfield instruction or modified by a putfield instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class type specified in the instruction.
The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor of the field (4.3.2) of the class instance or class being stored into:
If the descriptor type is
boolean
,byte
,char
,short
, orint
, then the value must be anint
.If the descriptor type is
float
,long
, ordouble
, then the value must be afloat
,long
, ordouble
, respectively.If the descriptor type is a
reference
type, then the value must be of a type that is assignment compatible with the descriptor type.
The type of every value stored into an array by an aastore instruction must be a
reference
type.The component type of the array being stored into by the aastore instruction must also be a
reference
type.Each athrow instruction must throw only values that are instances of class
Throwable
or of subclasses ofThrowable
.Each class mentioned in acatch_type
item of theexception_table
array of the method'sCode_attribute
structure must beThrowable
or a subclass ofThrowable
.Each type mentioned in a
catch_type
item of theexception_table
array must beThrowable
or a subtype ofThrowable
.This rule is unrelated to any particular athrow instruction. Better to present it independently.
If getfield or putfield is used to access a
protected
field declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.If invokevirtual or invokespecial is used to access a
protected
method declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.Execution never falls off the bottom of the
code
array.No return address (a value of type
returnAddress
) may be loaded from a local variable.The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.
No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using
try
-finally
constructs from within afinally
clause.)Each instance of type
returnAddress
can be returned to at most once.If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type
returnAddress
, then that instance can never be used as a return address.
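Among the structural constraints above, the rule that each return instruction must match its method's return type is simple enough to sketch directly. The helper below is hypothetical and assumes a well-formed method descriptor (4.3.3) in which no class name contains the character ')'. It covers only the choice of instruction, not the additional assignment compatibility check that applies to areturn.

```java
// Hypothetical helper: maps a method descriptor to the only return
// instruction that the structural constraints permit for that method.
final class ReturnCheck {
    static String requiredReturn(String methodDescriptor) {
        // The return descriptor follows the closing parenthesis (4.3.3).
        String ret = methodDescriptor.substring(methodDescriptor.indexOf(')') + 1);
        switch (ret.charAt(0)) {
            case 'Z': case 'B': case 'C': case 'S': case 'I':
                return "ireturn";   // boolean, byte, char, short, int
            case 'F':
                return "freturn";
            case 'J':
                return "lreturn";
            case 'D':
                return "dreturn";
            case 'L': case '[':
                return "areturn";   // reference types, including arrays
            case 'V':
                return "return";    // methods declared to return void
            default:
                throw new IllegalArgumentException(
                        "malformed descriptor: " + methodDescriptor);
        }
    }
}
```

Instance initialization methods and class or interface initialization methods need no special case in this sketch, since their descriptors already declare a void return.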
4.10 Verification of class
Files
In this revision, static constraints (4.9.1) are explicitly checked as a first step of verification, before one of the two verification algorithms is invoked. Assertions related to static constraints can be removed from the typing algorithms.
Even though a compiler for the Java programming language must only produce class
files that satisfy all the static and structural constraints in the previous sections of Section 4.9, the Java Virtual Machine has no guarantee that any file it is asked to load was generated by that compiler or is properly formed. Applications such as web browsers do not download source code, which they then compile; these applications download already-compiled class
files. The browser needs to determine whether the class
file was produced by a trustworthy compiler or by an adversary attempting to exploit the Java Virtual Machine.
For example, applications such as web browsers do not download source code, which they then compile; these applications download already-compiled
class
files. The browser needs to determine whether theclass
file was produced by a trustworthy compiler or by an adversary attempting to exploit the Java Virtual Machine.
This example is better treated as non-normative explanatory text.
An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, sayPurchaseStockOptions
, to be a subclass ofTradingClass
. But the definition ofTradingClass
might have changed since the time the class was compiled in a way that is not compatible with pre-existing binaries. Methods might have been deleted or had their return types or modifiers changed. Fields might have changed types or changed from instance variables to class variables. The access modifiers of a method or variable may have changed frompublic
toprivate
. For a discussion of these issues, see Chapter 13, "Binary Compatibility," in The Java Language Specification, Java SE 12 Edition.
This example is not relevant to verification. If we don't want to delete it, perhaps it could be moved to 5.4.3?
Because of these potential problems, the Java Virtual Machine needs to verify for itself that the desired constraints are satisfied by the class
files it attempts to incorporate. However, it may cause unwanted delays to check these constraints during format checking (4.8), which occurs as each class is loaded.
A Thus, a Java Virtual Machine implementation verifies that each class
file satisfies the necessary constraints on Code
attributes (4.7.3) and StackMapTable
attributes (4.7.4) at linking time (5.4).
Link-time verification enhances the performance of the run-time interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:
There are no operand stack overflows or underflows.
All local variable uses and stores are valid.
The arguments to all the Java Virtual Machine instructions are of valid types.
This performance discussion is better treated as non-normative explanatory text.
For each Code
attribute in a class
file, verification proceeds in two steps.
The
code
andexception_table
arrays are parsed and checked, according to the static constraints described in 4.9.1. If theattributes
table contains aStackMapTable
attribute, it is also parsed and checked, as described in 4.9.1.If any static constraints are violated, verification fails with a
VerifyError
.Historically, this step of verification has been overlooked, implicitly handled by, e.g., assumptions that a Prolog predicate will not succeed if it's impossible to encode an instruction.
Note that a version 50.0 class file may fail verification due to a static constraint violation in a
StackMapTable
attribute, even if it would have succeeded verification by type inference if the attribute had been absent. The ability to recover from an error by falling back to verification by type inference is only available in step 2 (see 4.10.1).The structural constraints described in 4.9.2 are checked.
There are two strategies that Java Virtual Machine implementations may use for structural constraint verification:
Verification by type checking must be used to verify
class
files whose version number is greater than or equal to 50.0.Verification by type inference must be supported by all Java Virtual Machine implementations, except those conforming to the Java ME CLDC and Java Card profiles, in order to verify
class
files whose version number is less than 50.0.Verification on Java Virtual Machine implementations supporting the Java ME CLDC and Java Card profiles is governed by their respective specifications.
In either case, if the specified algorithms are unable to prove that all structural constraints hold, verification fails with a
VerifyError
.
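The two steps and the choice of strategy described above can be summarized as a small decision procedure. This is a sketch under assumptions, not a description of any implementation: the class `VerificationPlan` and its parameters are hypothetical, the results of each check are passed in as booleans rather than computed, minor versions are ignored (major version 50 stands for version 50.0), and the Java ME CLDC and Java Card profiles are out of scope. The optional failover for version 50.0 is the one described in 4.10.1.

```java
// Hypothetical model of the two-step verification of one Code attribute.
final class VerificationPlan {
    enum Outcome { VERIFIED, VERIFY_ERROR }

    // staticOk:        result of step 1 (static constraints, 4.9.1)
    // typeCheckOk:     result verification by type checking would give
    // typeInferenceOk: result verification by type inference would give
    // allowFailover:   whether this implementation falls back for 50.0
    static Outcome verify(int majorVersion, boolean staticOk,
                          boolean typeCheckOk, boolean typeInferenceOk,
                          boolean allowFailover) {
        // Step 1: a static constraint violation is never recoverable.
        if (!staticOk) {
            return Outcome.VERIFY_ERROR;
        }
        // Step 2: structural constraints, by the strategy the version requires.
        if (majorVersion < 50) {
            return typeInferenceOk ? Outcome.VERIFIED : Outcome.VERIFY_ERROR;
        }
        if (typeCheckOk) {
            return Outcome.VERIFIED;
        }
        // Only version 50.0 class files may fall back to type inference.
        if (majorVersion == 50 && allowFailover && typeInferenceOk) {
            return Outcome.VERIFIED;
        }
        return Outcome.VERIFY_ERROR;
    }
}
```

The first branch captures the behavioral change proposed here: a version 50.0 class file with an invalid StackMapTable entry fails in step 1, even if type inference would have succeeded.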
In JDK 13, constraint failures involving the exception_table
array or the StackMapTable
attribute result in a verification-time ClassFormatError
. This behavior is confusing, and we propose throwing a VerifyError
instead.
4.10.1 Verification by Type Checking
A class
file whose version number is 50.0 or above (4.1) must be verified verify structural constraints using the type checking rules given in this section.
If, and only if, a class
file's version number equals 50.0, then if the type checking fails, a Java Virtual Machine implementation may choose to attempt to perform verification by type inference (4.10.2). In this case, structural constraints on the StackMapTable
attribute are not enforced.
This is a pragmatic adjustment, designed to ease the transition to the new verification discipline. Many tools that manipulate
class
files may alter the bytecodes of a method in a manner that requires adjustment of the method's stack map frames. If a tool does not make the necessary adjustments to the stack map frames, type checking may fail even though the bytecode is in principle valid (and would consequently verify under the old type inference scheme). To allow implementors time to adapt their tools, Java Virtual Machine implementations may fall back to the older verification discipline, but only for a limited time.
In cases where type checking fails but type inference is invoked and succeeds, a certain performance penalty is expected. Such a penalty is unavoidable. It also should serve as a signal to tool vendors that their output needs to be adjusted, and provides vendors with additional incentive to make these adjustments.
In summary, failover to verification by type inference supports both the gradual addition of stack map frames to the Java SE Platform (if they are not present in a version 50.0
class
file, failover is allowed) and the gradual removal of the jsr and jsr_w instructions from the Java SE Platform (if they are present in a version 50.0class
file, failover is allowed).
If a Java Virtual Machine implementation ever attempts to perform verification by type inference on version 50.0 class files, it must do so in all cases where verification by type checking fails.
This means that a Java Virtual Machine implementation cannot choose to resort to type inference in one case and not in another. It must either reject
class
files that do not verify via type checking, or else consistently failover to the type inferencing verifier whenever type checking fails.
The type checker enforces type rules that are specified by means of Prolog clauses. English language text is used to describe the type rules in an informal way, while the Prolog clauses provide a formal specification.
The type checker requires a list of stack map frames for each method with a Code
attribute (4.7.3). A list of stack map frames is given by the StackMapTable
attribute (4.7.4) of a Code
attribute. The intent is that a stack map frame must appear at the beginning of each basic block in a method. The stack map frame specifies the verification type of each operand stack entry and of each local variable at the start of each basic block. The type checker reads the stack map frames for each method with a Code
attribute and uses these maps to generate a proof of the type safety of the instructions in the Code
attribute.
The Prolog predicate methodIsTypeSafe
(4.10.1.6) determines whether the Code
attribute of a particular method of a particular class or interface conforms to the structural constraints described in 4.9.2. The structural constraints are satisfied by the method if and only if the methodIsTypeSafe
predicate is true.
The rest of this section explains the process of type checking in detail:
First, we give Prolog predicates for core Java Virtual Machine artifacts like classes and methods (4.10.1.1).
Second, we specify the type system known to the type checker (4.10.1.2).
Third, we specify the Prolog representation of instructions and stack map frames (4.10.1.3, 4.10.1.4).
Fourth, we specify how a method is type checked (4.10.1.6).
Fifth, we discuss type checking issues common to all load and store instructions (4.10.1.7), and also issues of access to
protected
members (4.10.1.8).Finally, we specify the rules to type check each instruction (4.10.1.9).
4.10.1.1 Accessors for Java Virtual Machine Artifacts
...
- parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)
Extracts the instruction stream,
ParsedCode
, of the methodMethod
inClass
, as well as the maximum operand stack size,MaxStack
, the maximal number of local variables,FrameSize
, the exception handlers,Handlers
, and the stack mapStackMap
.The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.
If a
StackMapTable
attribute is absent, the implicit stack map attribute is used (4.7.4).If an entry in the stack map table violates structural constraints (4.9.2) because of an invalid
Code
array offset, an invalid stack type array size, or an invalid local variable type array size, theparseCodeAttribute
predicate fails.
...
4.10.1.3 Instruction Representation
Bug fix: member references and instructions like checkcast
and ldc
can refer to array types, so CONSTANT_Class
structures need to be encoded as types, not class names.
Individual bytecode instructions are represented in Prolog as terms whose functor is the name of the instruction and whose arguments are its parsed operands.
For example, an aload instruction is represented as the term
aload(N)
, which includes the indexN
that is the operand of the instruction.
The instructions as a whole are represented as a list of terms of the form:
instruction(Offset, AnInstruction)
For example,
instruction(21, aload(1))
.
The order of instructions in this list must be the same as in the class
file.
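To make the term notation concrete, the sketch below renders parsed instructions as text in the form described above. It is purely illustrative: the verifier is specified in Prolog, and the class `InstructionTerms` with its two methods is a hypothetical aid for reading the notation, not part of the specification.

```java
// Hypothetical renderer: prints parsed instructions in the Prolog term
// notation used by this section.
final class InstructionTerms {
    // Builds a term such as aload(1) from a functor and its parsed operands.
    // An instruction without operands is rendered as a bare atom.
    static String term(String functor, Object... operands) {
        if (operands.length == 0) {
            return functor;
        }
        StringBuilder sb = new StringBuilder(functor).append('(');
        for (int i = 0; i < operands.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(operands[i]);
        }
        return sb.append(')').toString();
    }

    // Pairs an instruction term with its offset in the code array.
    static String instruction(int offset, String instructionTerm) {
        return term("instruction", offset, instructionTerm);
    }
}
```

Calling `InstructionTerms.instruction(21, InstructionTerms.term("aload", 1))` yields the text `instruction(21, aload(1))`, matching the example above.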
Some instructions have operands that refer to entries in the constant_pool
table representing fields, methods, and dynamically-computed call sites. Such entries are represented as functor applications of the form:
class(N, L)
orarrayOf(T)
for a constant pool entry that is aCONSTANT_Class_info
structure (4.4.1).These are verification types, as described in 4.10.1.2.
If the
name_index
item of the structure gives the name of a class or interface,N
is that name, andL
is the class loader of the class or interface containing the constant pool.If the
name_index
item of the structure gives an array type,T
is the array component type.
field(
FieldClassName
FieldClass
, FieldName, FieldDescriptor)
for a constant pool entry that is aCONSTANT_Fieldref_info
structure (4.4.2).FieldClassName
is the name of the classFieldClass
is the verification type of the class, interface, or array type referenced by theclass_index
item in the structure.FieldName
andFieldDescriptor
correspond to the name and field descriptor referenced by thename_and_type_index
item of the structure.method(
MethodClassName
MethodClass
, MethodName, MethodDescriptor)
for a constant pool entry that is aCONSTANT_Methodref_info
structure (4.4.2).MethodClassName
is the name of the classMethodClass
is the verification type of the class, interface, or array type referenced by theclass_index
item of the structure.MethodName
andMethodDescriptor
correspond to the name and method descriptor referenced by thename_and_type_index
item of the structure.imethod(
MethodClassName
MethodClass
, MethodName, MethodDescriptor)
for a constant pool entry that is aCONSTANT_InterfaceMethodref_info
structure (4.4.2).MethodIntfName
is the name of the interfaceMethodClass
is the verification type of the class, interface, or array type referenced by theclass_index
item of the structure.MethodName
andMethodDescriptor
correspond to the name and method descriptor referenced by thename_and_type_index
item of the structure.
string(Value)
for a constant pool entry that is aCONSTANT_String_info
structure (4.4.3).Value
is the string referenced by thestring_index
item of the structure.int(Value)
for a constant pool entry that is aCONSTANT_Integer_info
structure (4.4.4).Value
is theint
constant represented by thebytes
item of the structure.float(Value)
for a constant pool entry that is aCONSTANT_Float_info
structure (4.4.4).Value
is thefloat
constant represented by thebytes
item of the structure.long(Value)
for a constant pool entry that is aCONSTANT_Long_info
structure (4.4.5).Value
is thelong
constant represented by thehigh_bytes
andlow_bytes
items of the structure.double(Value)
for a constant pool entry that is aCONSTANT_Double_info
structure (4.4.5).Value
is thedouble
constant represented by thehigh_bytes
andlow_bytes
items of the structure.methodHandle(Kind, Reference)
for a constant pool entry that is aCONSTANT_MethodHandle_info
structure (4.4.8).Kind
is the value of thereference_kind
item of the structure.Reference
is the value of thereference_index
item of the structure.methodType(MethodDescriptor)
for a constant pool entry that is aCONSTANT_MethodType_info
structure (4.4.9).MethodDescriptor
is the method descriptor referenced by thedescriptor_index
item of the structure.dconstant(ConstantName, FieldDescriptor)
for a constant pool entry that is aCONSTANT_Dynamic_info
structure (4.4.10).ConstantName
andFieldDescriptor
correspond to the name and field descriptor referenced by thename_and_type_index
item of the structure. (Thebootstrap_method_attr_index
item is irrelevant to verification.)
dmethod(CallSiteName, MethodDescriptor)
for a constant pool entry that is aCONSTANT_InvokeDynamic_info
structure (4.4.10).CallSiteName
andMethodDescriptor
correspond to the name and method descriptor referenced by thename_and_type_index
item of the structure. (Thebootstrap_method_attr_index
item is irrelevant to verification.)
We've combined the two lists of constant forms into one. Because CONSTANT_Class_info
is relevant to both lists, it's awkward to keep the two separate. It would require, say, a forward reference from the first list to the second.
Note that, since static constraints have already been checked, there's no particular reason to make a distinction between method
and imethod
. It would simplify some of the rules if we just used method
in both cases instead. (On the other hand, this would introduce an anomaly, being the only case in which multiple constant pool structures map to the same Prolog functor.)
For clarity, we assume that field and method descriptors (4.3.2, 4.3.3) are mapped into more readable names: the leading L
and trailing ;
are dropped from class names, and the BaseType characters used for primitive types are mapped to the names of those types.
The descriptor should always be processed with parseFieldDescriptor, so its format doesn't need to be specified.
For example, a getfield instruction whose operand refers to a constant pool entry representing a field
foo
of typeF
in classBar
would be represented asgetfield(field('Bar', 'foo', 'F'))
getfield(field(class('Bar', L), 'foo', 'F'))
, whereL
is the class loader of the class containing the instruction. An ldc instruction for loading theint
constant 91 would be represented asldc(int(91))
.
The ldc instruction, among others, has an operand that refers to a loadable entry in the constant_pool table. There are nine kinds of loadable entry (see Table 4.4-C), represented by functor applications of the following forms:
int(Value) for a constant pool entry that is a CONSTANT_Integer_info structure (4.4.4). Value is the int constant represented by the bytes item of the structure. For example, an ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).
float(Value) for a constant pool entry that is a CONSTANT_Float_info structure (4.4.4). Value is the float constant represented by the bytes item of the structure.
long(Value) for a constant pool entry that is a CONSTANT_Long_info structure (4.4.5). Value is the long constant represented by the high_bytes and low_bytes items of the structure.
double(Value) for a constant pool entry that is a CONSTANT_Double_info structure (4.4.5). Value is the double constant represented by the high_bytes and low_bytes items of the structure.
class(ClassName) for a constant pool entry that is a CONSTANT_Class_info structure (4.4.1). ClassName is the name of the class or interface referenced by the name_index item in the structure.
string(Value) for a constant pool entry that is a CONSTANT_String_info structure (4.4.3). Value is the string referenced by the string_index item of the structure.
methodHandle(Kind, Reference) for a constant pool entry that is a CONSTANT_MethodHandle_info structure (4.4.8). Kind is the value of the reference_kind item of the structure. Reference is the value of the reference_index item of the structure.
methodType(MethodDescriptor) for a constant pool entry that is a CONSTANT_MethodType_info structure (4.4.9). MethodDescriptor is the method descriptor referenced by the descriptor_index item of the structure.
dconstant(ConstantName, FieldDescriptor) for a constant pool entry that is a CONSTANT_Dynamic_info structure (4.4.10). ConstantName and FieldDescriptor correspond to the name and field descriptor referenced by the name_and_type_index item of the structure. (The bootstrap_method_attr_index item is irrelevant to verification.)
4.10.1.6 Type Checking Methods
A method with a Code attribute is type safe if it is possible to merge the code and the stack map frames into a single stream such that each stack map frame precedes the instruction it corresponds to, and the merged stream is type correct. The method's exception handlers, if any, must also be legal.
methodIsTypeSafe(Class, Method) :-
parseCodeAttribute(Class, Method, FrameSize, MaxStack,
ParsedCode, Handlers, StackMap),
mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
Environment = environment(Class, Method, ReturnType, MergedCode,
MaxStack, Handlers),
handlersAreLegal(Environment),
mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).
Let us consider exception handlers first.
An exception handler is represented by a functor application of the form:
handler(Start, End, Target, ClassName)
whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.
An exception handler is legal if its start (Start) is less than its end (End), there exists an instruction whose offset is equal to Start, there exists an instruction whose offset equals End, there is a stack frame at the start of the handler code (Target), and the handler's exception class is assignable to the class Throwable. The exception class of a handler is Throwable if the handler's class entry is 0, otherwise it is the class named in the handler.
An additional requirement exists for a handler inside an <init> method if one of the instructions covered by the handler is invokespecial of an <init> method. In this case, the fact that a handler is running means the object under construction is likely broken, so it is important that the handler does not swallow the exception and allow the enclosing <init> method to return normally to the caller. Accordingly, the handler is required to either complete abruptly by throwing an exception to the caller of the enclosing <init> method, or to loop forever.
handlersAreLegal(Environment) :-
exceptionHandlers(Environment, Handlers),
checklist(handlerIsLegal(Environment), Handlers).
handlerIsLegal(Environment, Handler) :-
Handler = handler(Start, End, Target, _),
Start < End,
allInstructions(Environment, Instructions),
member(instruction(Start, _), Instructions),
offsetStackFrame(Environment, Target, _),
instructionsIncludeEnd(Instructions, End),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
isBootstrapLoader(BL),
isAssignable(ExceptionClass, class('java/lang/Throwable', BL)),
initHandlerIsLegal(Environment, Handler).
instructionsIncludeEnd(Instructions, End) :-
member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
member(endOfCode(End), Instructions).
handlerExceptionClass(handler(_, _, _, 0),
class('java/lang/Throwable', BL), _) :-
isBootstrapLoader(BL).
handlerExceptionClass(handler(_, _, _, Name),
class(Name, L), L) :-
Name \= 0.
initHandlerIsLegal(Environment, Handler) :-
notInitHandler(Environment, Handler).
notInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isNotInit(Method).
notInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method),
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>'.
initHandlerIsLegal(Environment, Handler) :-
isInitHandler(Environment, Handler),
sublist(isApplicableInstruction(Target), Instructions,
HandlerInstructions),
noAttemptToReturnNormally(HandlerInstructions).
isInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method),
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, '<init>', Descriptor).
isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
Offset >= HandlerStart.
noAttemptToReturnNormally(Instructions) :-
notMember(instruction(_, return), Instructions).
noAttemptToReturnNormally(Instructions) :-
member(instruction(_, athrow), Instructions).
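As an illustration of the last three predicates, the following self-contained sketch applies them to two hypothetical <init> bodies. It assumes SWI-Prolog, substitutes include/3 for sublist/3 and \+ member(...) for notMember, and uses made-up offsets and instructions. The helper handlerBodyIsLegal and the two sample bodies are introduced only for this example and are not part of the specification.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
    Offset >= HandlerStart.

noAttemptToReturnNormally(Instructions) :-
    \+ member(instruction(_, return), Instructions).
noAttemptToReturnNormally(Instructions) :-
    member(instruction(_, athrow), Instructions).

% Hypothetical helper: collect the instructions at or after Target and
% check them, following the shape of the second initHandlerIsLegal clause.
handlerBodyIsLegal(Target, Instructions) :-
    include(isApplicableInstruction(Target), Instructions, HandlerInstructions),
    noAttemptToReturnNormally(HandlerInstructions).

% Hypothetical <init> body whose handler at offset 5 rethrows.
rethrowingBody([instruction(0, aload(0)),
                instruction(1, invokespecial(superInit)),
                instruction(4, return),
                instruction(5, athrow),
                endOfCode(6)]).

% Hypothetical <init> body whose handler at offset 5 swallows the exception.
swallowingBody([instruction(0, aload(0)),
                instruction(1, invokespecial(superInit)),
                instruction(4, return),
                instruction(5, pop),
                instruction(6, return),
                endOfCode(7)]).

% ?- rethrowingBody(B), handlerBodyIsLegal(5, B).   succeeds
% ?- swallowingBody(B), handlerBodyIsLegal(5, B).   fails
```

The return at offset 4 does not affect either result, because only instructions at or after the handler target are examined. Note also that, as the clauses are written, a handler body containing both return and athrow is accepted by the second noAttemptToReturnNormally clause.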
Let us now turn to the stream of instructions and stack map frames.
Merging instructions and stack map frames into a single stream involves four cases:
Merging an empty StackMap and a list of instructions yields the original list of instructions.
mergeStackMapAndCode([], CodeList, CodeList).
Given a list of stack map frames beginning with the type state for the instruction at Offset, and a list of instructions beginning at Offset, the merged list is the head of the stack map frame list, followed by the head of the instruction list, followed by the merge of the tails of the two lists.
mergeStackMapAndCode([stackMap(Offset, Map) | RestMap],
[instruction(Offset, Parse) | RestCode],
[stackMap(Offset, Map),
instruction(Offset, Parse) | RestMerge]) :-
mergeStackMapAndCode(RestMap, RestCode, RestMerge).
Otherwise, given a list of stack map frames beginning with the type state for the instruction at OffsetM, and a list of instructions beginning at OffsetP, then, if OffsetP < OffsetM, the merged list consists of the head of the instruction list, followed by the merge of the stack map frame list and the tail of the instruction list.
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
[instruction(OffsetP, Parse) | RestCode],
[instruction(OffsetP, Parse) | RestMerge]) :-
OffsetP < OffsetM,
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap], RestCode, RestMerge).
Otherwise, the merge of the two lists is undefined. Since the instruction list has monotonically increasing offsets, the merge of the two lists is not defined unless every stack map frame offset has a corresponding instruction offset and the stack map frames are in monotonically increasing order.
The stack map frames will always be in monotonically increasing order, due to their encoding (4.7.4).
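The merge can be tried out directly. The sketch below reproduces the three mergeStackMapAndCode clauses so that it is self-contained, and runs them on a hypothetical four-element instruction list with a single stack map frame. The sample instructions, the offsets, and the placeholder frames frameA and frameB are assumptions made for illustration.

```prolog
mergeStackMapAndCode([], CodeList, CodeList).
mergeStackMapAndCode([stackMap(Offset, Map) | RestMap],
                     [instruction(Offset, Parse) | RestCode],
                     [stackMap(Offset, Map),
                      instruction(Offset, Parse) | RestMerge]) :-
    mergeStackMapAndCode(RestMap, RestCode, RestMerge).
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
                     [instruction(OffsetP, Parse) | RestCode],
                     [instruction(OffsetP, Parse) | RestMerge]) :-
    OffsetP < OffsetM,
    mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap], RestCode, RestMerge).

% Hypothetical parsed code: the branch at offset 1 targets offset 4.
sampleCode([instruction(0, iload(1)),
            instruction(1, ifeq(4)),
            instruction(4, return),
            endOfCode(5)]).

% Hypothetical stack map with one frame, for the instruction at offset 4.
sampleFrames([stackMap(4, frameA)]).

% ?- sampleFrames(F), sampleCode(C), mergeStackMapAndCode(F, C, M).
% M = [instruction(0, iload(1)), instruction(1, ifeq(4)),
%      stackMap(4, frameA), instruction(4, return), endOfCode(5)].
%
% A frame at an offset with no instruction leaves the merge undefined:
% ?- sampleCode(C), mergeStackMapAndCode([stackMap(2, frameB)], C, M).
% false.
```

The second query fails because no clause applies once the instruction offset (4) has passed the frame offset (2), which is how the "undefined" case shows up in practice.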
To determine if the merged stream for a method is type correct, we first infer the method's initial type state.
The initial type state of a method consists of an empty operand stack and local variable types derived from the type of this and the arguments, as well as the appropriate flag, depending on whether this is an <init> method.
methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
ReturnType):-
methodDescriptor(Method, Descriptor),
parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
expandTypeList(RawArgs, Args),
methodInitialThisType(Class, Method, ThisList),
flags(ThisList, Flags),
append(ThisList, Args, ThisArgs),
expandToLength(ThisArgs, FrameSize, top, Locals).
Given a list of types, the following clause produces a list where every type of size 2 has been substituted by two entries: one for itself, and one top entry. The result then corresponds to the representation of the list as 32-bit words in the Java Virtual Machine.
expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
sizeOf(Item, 1),
expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
sizeOf(Item, 2),
expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].
expandToLength(List, Size, _Filler, List) :-
length(List, Size).
expandToLength(List, Size, Filler, Result) :-
length(List, ListLength),
ListLength < Size,
Delta is Size - ListLength,
length(Extra, Delta),
checklist(=(Filler), Extra),
append(List, Extra, Result).
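A worked example may help here. The sketch below assumes SWI-Prolog, supplies a small hypothetical set of sizeOf facts, and uses maplist/2 in place of checklist/2. The helper initialLocals, the class name 'Foo', and the loader atom appLoader are illustrative assumptions. It builds the local variable list for an instance method with descriptor (IJ)V and a frame size of 6.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% Assumed size facts for a few verification types (illustrative subset).
sizeOf(int, 1).
sizeOf(float, 1).
sizeOf(long, 2).
sizeOf(double, 2).
sizeOf(class(_, _), 1).

expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
    sizeOf(Item, 1),
    expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
    sizeOf(Item, 2),
    expandTypeList(List, Result).

expandToLength(List, Size, _Filler, List) :-
    length(List, Size).
expandToLength(List, Size, Filler, Result) :-
    length(List, ListLength),
    ListLength < Size,
    Delta is Size - ListLength,
    length(Extra, Delta),
    maplist(=(Filler), Extra),   % maplist/2 stands in for checklist/2
    append(List, Extra, Result).

% Hypothetical helper combining the steps used by methodInitialStackFrame.
initialLocals(ThisList, RawArgs, FrameSize, Locals) :-
    expandTypeList(RawArgs, Args),
    append(ThisList, Args, ThisArgs),
    expandToLength(ThisArgs, FrameSize, top, Locals).

% ?- initialLocals([class('Foo', appLoader)], [int, long], 6, Locals).
% Locals = [class('Foo', appLoader), int, long, top, top, top].
```

The long argument occupies two entries (long followed by top), and the remaining two local variable slots are filled with top.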
For the initial type state of an instance method, we compute the type of this and put it in a list. The type of this in the <init> method of Object is Object; in other <init> methods, the type of this is uninitializedThis; otherwise, the type of this in an instance method is class(N, L) where N is the name of the class containing the method and L is its defining class loader.
For the initial type state of a static method, this is irrelevant, so the list is empty.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags),
methodName(Method, MethodName),
MethodName \= '<init>'.
methodInitialThisType(Class, Method, [This]) :-
methodAccessFlags(Method, AccessFlags),
notMember(static, AccessFlags),
instanceMethodInitialThisType(Class, Method, This).
instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
methodName(Method, '<init>'),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classClassName(Class, 'java/lang/Object').
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '<init>'),
classClassName(Class, ClassName),
classDefiningLoader(Class, CurrentLoader),
superclassChain(ClassName, CurrentLoader, Chain),
Chain \= [].
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
methodName(Method, MethodName),
MethodName \= '<init>',
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
We now compute whether the merged stream for a method is type correct, using the method's initial type state:
If we have a stack map frame and an incoming type state, the type state must be assignable to the one in the stack map frame. We may then proceed to type check the rest of the stream with the type state given in the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
frameIsAssignable(frame(Locals, OperandStack, Flags), MapFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
A merged code stream is type safe relative to an incoming type state T if it begins with an instruction I that is type safe relative to T, and I satisfies its exception handlers (see below), and the tail of the stream is type safe given the type state following that execution of I.
NextStackFrame indicates what falls through to the following instruction. For an unconditional branch instruction, it will have the special value afterGoto. ExceptionStackFrame indicates what is passed to exception handlers.
mergedCodeIsTypeSafe(Environment, [instruction(Offset, Parse) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
instructionIsTypeSafe(Parse, Environment, Offset,
frame(Locals, OperandStack, Flags),
NextStackFrame, ExceptionStackFrame),
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, NextStackFrame).
After an unconditional branch (indicated by an incoming type state of afterGoto), if we have a stack map frame giving the type state for the following instructions, we can proceed and type check them using the type state provided by the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
afterGoto) :-
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
It is illegal to have code after an unconditional branch without a stack map frame being provided for it.
mergedCodeIsTypeSafe(_Environment, [instruction(_, _) | _MoreCode],
afterGoto) :-
write_ln('No stack frame after unconditional branch'),
fail.
If we have an unconditional branch at the end of the code, stop.
mergedCodeIsTypeSafe(_Environment, [endOfCode(Offset)], afterGoto).
Branching to a target is type safe if the target has an associated stack frame, Frame, and the current stack frame, StackFrame, is assignable to Frame.
targetIsTypeSafe(Environment, StackFrame, Target) :-
offsetStackFrame(Environment, Target, Frame),
frameIsAssignable(StackFrame, Frame).
An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
exceptionHandlers(Environment, Handlers),
sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
ApplicableHandlers).
An exception handler is applicable to an instruction if the offset of the instruction is greater than or equal to the start of the handler's range and less than the end of the handler's range.
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
Offset >= Start,
Offset < End.
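The half-open range can be checked with a small query. The sketch below assumes SWI-Prolog and uses include/3 where the rules above use sublist/3. The exception table in sampleHandlers and the helper applicableHandlers are hypothetical.

```prolog
:- use_module(library(apply)).

isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
    Offset >= Start,
    Offset < End.

% Hypothetical exception table with three entries.
sampleHandlers([handler(0, 10, 20, 'java/io/IOException'),
                handler(4, 8, 30, 0),
                handler(10, 16, 40, 0)]).

% Hypothetical helper: the handlers applicable to the instruction at Offset.
applicableHandlers(Offset, Applicable) :-
    sampleHandlers(Handlers),
    include(isApplicableHandler(Offset), Handlers, Applicable).

% ?- applicableHandlers(4, A).
% A = [handler(0, 10, 20, 'java/io/IOException'), handler(4, 8, 30, 0)].
% ?- applicableHandlers(10, A).
% A = [handler(10, 16, 40, 0)].
```

An instruction at offset 10 is not covered by the first handler, because the end of a handler's range is exclusive.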
An instruction satisfies an exception handler if the instruction's outgoing type state is ExcStackFrame, and the handler's target (the initial instruction of the handler code) is type safe assuming an incoming type state T. The type state T is derived from ExcStackFrame by replacing the operand stack with a stack whose sole element is the handler's exception class.
instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
Handler = handler(_, _, Target, _),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
/* The stack consists of just the exception. */
ExcStackFrame = frame(Locals, _, Flags),
TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
operandStackHasLegalLength(Environment, TrueExcStackFrame),
targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
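The derivation of the handler's incoming type state can be followed step by step. In the sketch below, exceptionStackFrame is the clause given in 4.10.1.9, while handlerEntryFrame is a hypothetical helper that isolates the frame construction performed inside instructionSatisfiesHandler. The sample frame and the loader atom appLoader are assumptions.

```prolog
% Type state after abrupt completion: same locals and flags, empty stack.
exceptionStackFrame(StackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, _OperandStack, Flags),
    ExceptionStackFrame = frame(Locals, [], Flags).

% Hypothetical helper: the type state seen by the handler's first
% instruction, whose operand stack holds only the exception class.
handlerEntryFrame(ExcStackFrame, ExceptionClass, TrueExcStackFrame) :-
    ExcStackFrame = frame(Locals, _, Flags),
    TrueExcStackFrame = frame(Locals, [ExceptionClass], Flags).

% ?- exceptionStackFrame(frame([int, top], [int, int], []), E),
%    handlerEntryFrame(E, class('java/io/IOException', appLoader), T).
% E = frame([int, top], [], []),
% T = frame([int, top], [class('java/io/IOException', appLoader)], []).
```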
4.10.1.8 Type Checking for protected Members
Member references can refer to array types, so the rules in this section need to be defined in terms of a type, not a class name.
All instructions that access members must contend with the rules concerning protected members. This section describes the protected check that corresponds to JLS §6.6.2.1.
The protected check applies only to protected members of superclasses of the current class. protected members in other classes will be caught by the access checking done at resolution (5.4.4). There are four cases:
If the [deleted: name of a class is not the name of any superclass] [new: referenced type is not a class type with the same name as a superclass], it cannot be a superclass, and so it can safely be ignored.
% deleted clause head: passesProtectedCheck(Environment, MemberClassName, MemberName,
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
notMember(class(MemberClassName, _), Chain).
passesProtectedCheck(Environment, arrayOf(_), MemberName, MemberDescriptor, StackFrame).
If the MemberClassName is the same as the name of a superclass, the class being resolved may indeed be a superclass. In this case, if no superclass named MemberClassName in a different run-time package has a protected member named MemberName with descriptor MemberDescriptor, the protected check does not apply.
This is because the actual class being resolved will either be one of these superclasses, in which case we know that it is either in the same run-time package, and the access is legal; or the member in question is not protected and the check does not apply; or it will be a subclass, in which case the check would succeed anyway; or it will be some other class in the same run-time package, in which case the access is legal and the check need not take place; or the verifier need not flag this as a problem, since it will be caught anyway because resolution will perforce fail.
% deleted clause head: passesProtectedCheck(Environment, MemberClassName, MemberName,
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, []).
If there does exist a protected superclass member in a different run-time package, then load MemberClassName; if the member in question is not protected, the check does not apply. (Using a superclass member that is not protected is trivially correct.)
% deleted clause head: passesProtectedCheck(Environment, MemberClassName, MemberName,
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, List),
List \= [],
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isNotProtected(ReferencedClass, MemberName, MemberDescriptor).
Otherwise, use of a member of an object of type Target requires that Target be assignable to the type of the current class.
% deleted clause head: passesProtectedCheck(Environment, MemberClassName, MemberName,
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, List),
List \= [],
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isProtected(ReferencedClass, MemberName, MemberDescriptor),
isAssignable(Target, class(CurrentClassName, CurrentLoader)).
The predicate classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Chain, List) is true if List is the set of classes in Chain with name MemberClassName that are in a different run-time package than Class which have a protected member named MemberName with descriptor MemberDescriptor.
classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
[class(MemberClassName, L) | T]) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
T) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
T) :-
sameRuntimePackage(Class, class(MemberClassName, L)),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
sameRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L),
classDefiningLoader(Class2, L),
samePackageName(Class1, Class2).
differentRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L1),
classDefiningLoader(Class2, L2),
L1 \= L2.
differentRuntimePackage(Class1, Class2) :-
differentPackageName(Class1, Class2).
4.10.1.9 Type Checking Instructions
In general, the type rule for an instruction is given relative to an environment Environment that defines the class and method in which the instruction occurs (4.10.1.1), and the offset Offset within the method at which the instruction occurs. The rule states that if the incoming type state StackFrame fulfills certain requirements, then:
The instruction is type safe.
It is provable that the type state after the instruction completes normally has a particular form given by NextStackFrame, and that the type state after the instruction completes abruptly is given by ExceptionStackFrame.
The type state after an instruction completes abruptly is the same as the incoming type state, except that the operand stack is empty.
exceptionStackFrame(StackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, _OperandStack, Flags),
ExceptionStackFrame = frame(Locals, [], Flags).
Many instructions have type rules that are completely isomorphic to the rules for other instructions. If an instruction b1 is isomorphic to another instruction b2, then the type rule for b1 is the same as the type rule for b2.
instructionIsTypeSafe(Instruction, Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
instructionHasEquivalentTypeRule(Instruction, IsomorphicInstruction),
instructionIsTypeSafe(IsomorphicInstruction, Environment, Offset,
StackFrame, NextStackFrame,
ExceptionStackFrame).
The English language description of each rule is intended to be readable, intuitive, and concise. As such, the description avoids repeating all the contextual assumptions given above. In particular:
The description does not explicitly mention the environment.
When the description speaks of the operand stack or local variables in the following, it is referring to the operand stack and local variable components of a type state: either the incoming type state or the outgoing one.
The type state after the instruction completes abruptly is almost always identical to the incoming type state. The description only discusses the type state after the instruction completes abruptly when that is not the case.
The description speaks of popping and pushing types onto the operand stack, and does not explicitly discuss issues of stack underflow or overflow. The description assumes these operations can be completed successfully, but the Prolog clauses for operand stack manipulation ensure that the necessary checks are made.
The description discusses only the manipulation of logical types. In practice, some types take more than one word. The description abstracts from these representation details, but the Prolog clauses that manipulate data do not.
Any ambiguities can be resolved by referring to the formal Prolog clauses.
anewarray
An anewarray instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a class, interface, or array type, and one can legally replace a type matching int on the incoming operand stack with an array with component type CP yielding the outgoing type state.
instructionIsTypeSafe(anewarray(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
validTypeTransition(Environment, [int], arrayOf(CP),
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
checkcast
A checkcast instruction with operand CP is type safe iff CP refers to a constant pool entry denoting either a class or an array, and one can validly replace the type Object on top of the incoming operand stack with the type denoted by CP yielding the outgoing type state.
instructionIsTypeSafe(checkcast(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
getfield
A getfield instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, declared in a [deleted: class FieldClassName] [new: type FieldClass], and one can validly replace a type matching [deleted: FieldClassName] [new: FieldClass] with type FieldType on the incoming operand stack yielding the outgoing type state. [deleted: FieldClassName must not be an array type.] protected fields are subject to additional checks (4.10.1.8).
Array types are allowed here (4.4.2).
instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, StackFrame),
currentClassLoader(Environment, CurrentLoader),
validTypeTransition(Environment,
[class(FieldClassName, CurrentLoader)], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClass, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
passesProtectedCheck(Environment, FieldClass, FieldName,
FieldDescriptor, StackFrame),
currentClassLoader(Environment, CurrentLoader),
validTypeTransition(Environment, [FieldClass], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
getstatic
A getstatic instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, and one can validly push FieldType on the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(getstatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(_FieldClassName, _FieldName, FieldDescriptor),
CP = field(_FieldClass, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
validTypeTransition(Environment, [], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instanceof
An instanceof instruction with operand CP is type safe iff CP refers to a constant pool entry denoting either a class or an array, and one can validly replace the type Object on top of the incoming operand stack with type int yielding the outgoing type state.
instructionIsTypeSafe(instanceof(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], int,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
invokedynamic
An invokedynamic instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a dynamic call site [deleted: with name CallSiteName] with descriptor Descriptor.
[deleted: CallSiteName is not <init>.]
[deleted: CallSiteName is not <clinit>.]
One can validly replace types matching the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokedynamic(CP,0,0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = dmethod(CallSiteName, Descriptor),
CallSiteName \= '<init>',
CallSiteName \= '<clinit>',
CP = dmethod(_CallSiteName, Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Method names are checked by format checking (4.4.10).
invokeinterface
An invokeinterface instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting an interface method [deleted: named MethodName] with descriptor Descriptor that is a member of [deleted: an interface MethodIntfName] [new: a type MethodClass].
[deleted: MethodName is not <init>.]
[deleted: MethodName is not <clinit>.]
[deleted: Its second operand, Count, is a valid count operand (see below).]
One can validly replace types matching the type [deleted: MethodIntfName] [new: MethodClass] and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodIntfName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
CP = imethod(MethodClass, _MethodName, Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodIntfName, CurrentLoader) | OperandArgList],
StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
countIsValid(Count, StackFrame, TempFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokeinterface(CP, _Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodClass, _MethodName, Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse([MethodClass | OperandArgList], StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Array types are allowed here (4.4.2).
The <clinit> method name is prohibited by format checking (4.4.2). The <init> method name is prohibited by static constraints (4.9.1, 4.10).
The Count operand of an invokeinterface instruction is valid if it equals the size of the arguments to the instruction. This is equal to the difference between the size of InputFrame and OutputFrame.
countIsValid(Count, InputFrame, OutputFrame) :-
InputFrame = frame(_Locals1, OperandStack1, _Flags1),
OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
length(OperandStack1, Length1),
length(OperandStack2, Length2),
Count =:= Length1 - Length2.
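As a worked example of the countIsValid clause as written above, consider a hypothetical invokeinterface of a method with descriptor (IJ)V. The receiver and the int argument take one operand stack entry each and the long argument takes two, so the expected count is 4. The frames below are made-up sample data, and the receiver class name and loader atom are assumptions.

```prolog
countIsValid(Count, InputFrame, OutputFrame) :-
    InputFrame = frame(_Locals1, OperandStack1, _Flags1),
    OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
    length(OperandStack1, Length1),
    length(OperandStack2, Length2),
    Count =:= Length1 - Length2.

% Hypothetical frame before the call: the long argument (two entries),
% the int argument and the receiver sit above one unrelated int.
beforeCall(frame([], [top, long, int, class('Shape', appLoader), int], [])).

% Hypothetical frame after popping the receiver and the arguments.
afterPop(frame([], [int], [])).

% ?- beforeCall(In), afterPop(Out), countIsValid(4, In, Out).   succeeds
% ?- beforeCall(In), afterPop(Out), countIsValid(3, In, Out).   fails
```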
invokespecial
An invokespecial instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor that is a member of a [deleted: class MethodClassName] [new: type MethodClass].
Either:
MethodName is not <init>.
[deleted: MethodName is not <clinit>.]
[new: MethodClass is the current class, a superclass of the current class, or a direct superinterface of the current class.]
One can validly replace types matching the current class and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
[deleted: One can validly replace types matching the class MethodClassName and the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor.]
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
isAssignable(class(CurrentClassName, CurrentLoader),
class(MethodClassName, CurrentLoader)),
reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClass, MethodName, Descriptor) ;
CP = imethod(MethodClass, MethodName, Descriptor)),
MethodName \= '<init>',
validSpecialMethodClass(Environment, MethodClass),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, CurrentClass),
reverse([CurrentClass | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
validSpecialMethodClass(Environment, MethodClass) :-
thisClass(Environment, MethodClass).
validSpecialMethodClass(Environment, class(MethodClassName, L)) :-
loadedClass(MethodClassName, L, LoadedMethodClass),
\+ classIsInterface(LoadedMethodClass),
thisClass(Environment, CurrentClass),
isAssignable(CurrentClass, class(MethodClassName, L)).
validSpecialMethodClass(Environment, class(MethodClassName, L)) :-
loadedClass(MethodClassName, L, LoadedMethodClass),
classIsInterface(LoadedMethodClass),
Environment = environment(CurrentClass, _, _, _, _, _),
classInterfaces(CurrentClass, InterfaceNames),
member(MethodClassName, InterfaceNames).
Interface methods are allowed here (4.9.1).
The <clinit>
method name is prohibited by format checking (4.4.2).
The old rules attempt to enforce the constraints on MethodClassName
via an isAssignable
call, deferring some complexity to subtype testing. Unfortunately, this isn't correct for interfaces: every reference type is "assignable" to every interface type.
In practice, HotSpot appears to perform the following check:
If a Methodref is used, test for isAssignable.
If an InterfaceMethodref is used, make sure it is named as a direct superinterface.
This allows, for example, the name of a valid interface that is not related to the current class to appear in a Methodref
. Verification succeeds, and no error occurs until resolution of the Methodref
.
That's some unnecessary complexity that doesn't quite align with 4.9.2. Instead, these new rules directly test for a valid class/interface name: the current class, a superclass, or a direct superinterface. The rules do some class loading, but note that the same loading occurred before in the isAssignable
test.
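As an informal illustration of the three validSpecialMethodClass clauses, the following Python sketch checks a named class or interface against a toy class table. The table contents and the function names are assumptions made for this example, and class loaders are ignored entirely.

```python
# Toy class table (hypothetical): name -> (is_interface, superclass, direct superinterfaces).
CLASSES = {
    "Object":     (False, None,     ()),
    "Base":       (False, "Object", ("Runnable",)),
    "Derived":    (False, "Base",   ("Comparable",)),
    "Runnable":   (True,  "Object", ()),
    "Comparable": (True,  "Object", ()),
}


def superclasses(name):
    """Yield the proper superclasses of `name`, nearest first."""
    parent = CLASSES[name][1]
    while parent is not None:
        yield parent
        parent = CLASSES[parent][1]


def valid_special_method_class(current, named):
    """True if `named` is the current class, a superclass, or a direct superinterface."""
    if named == current:
        return True
    if CLASSES[named][0]:
        # An interface must be listed directly by the current class.
        return named in CLASSES[current][2]
    # A class must appear on the superclass chain.
    return named in superclasses(current)
```

In this toy table, `Derived` may name `Base`, `Object`, and `Comparable`, but not `Runnable`, which it inherits only indirectly through `Base`.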
Another problem with the old rules is a redundant subtyping check via validTypeTransition
. Given that CurrentClass
<: MethodClass
and the stack operand <: CurrentClass
, there's no need to also check that the stack operand <: MethodClass
.
Array types are syntactically allowed here (4.4.2), but the validSpecialMethodClass
clause will reject them.
The
isAssignable
validSpecialMethodClass
clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.
The first validTypeTransition clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame. The effect of validTypeTransition is to pop the first type from the operand stack in StackFrame and check it is a subtype of the first term of StackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.
A sharp-eyed reader might notice that enforcing this structural constraint supersedes the structural constraint pertaining to invokespecial of a protected method. Thus, the Prolog code above makes no reference to passesProtectedCheck (4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck to ensure the actual receiver type is compatible with the current class when certain protected instance initialization methods are named.
The second validTypeTransition clause enforces the structural constraint that any method invocation instruction must target a receiver object whose type is compatible with the type named by the instruction. To see why, consider that StackArgList2 simulates the list of types on the operand stack expected by the method, starting with the type named by the instruction. Again, the actual types on the operand stack are in StackFrame, and the effect of validTypeTransition is to check the actual receiver type in StackFrame is compatible with the type named by the instruction in StackArgList2.
Or:
MethodName is <init>.
Descriptor specifies a void return type.
One can validly pop types matching the argument types given in Descriptor and an uninitialized type, UninitializedArg, off the incoming operand stack, yielding OperandStack.
The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with OperandStack and then replacing all instances of UninitializedArg with the type of instance being initialized.
If the instruction calls an instance initialization method on a class instance created by an earlier new instruction, and the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '`<init>`', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitializedThis, Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClass, '`<init>`', Descriptor) ;
CP = imethod(MethodClass, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '`<init>`', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitialized(Address), Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClassName, '`<init>`',
Descriptor, NextStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClass, '`<init>`', Descriptor) ;
CP = imethod(MethodClass, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClass, This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClass, '`<init>`',
Descriptor, NextStackFrame).
Array types are syntactically allowed here (4.4.2), but the rewrittenUninitializedType
clause will reject them.
The return type must be void
, per 4.4.2.
I've confirmed that HotSpot behavior in JDK 13 is to allow interface method references named <init>
, delaying any errors until resolution/runtime.
To compute what type the uninitialized argument's type needs to be rewritten to, there are two cases:
If we are initializing an object within its constructor, its type is initially uninitializedThis. This type will be rewritten to the type of the class of the <init> method.
The second case arises from initialization of an object created by new. The uninitialized arg type is rewritten to MethodClass, the type of the method holder of <init>. We check whether there really is a new instruction at Address.
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, MethodClass).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, class(ThisClassName, ThisLoader)),
superclassChain(ThisClassName, ThisLoader, [MethodClass | Rest]).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClass, MethodClass) :-
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClass)), Instructions).
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
From1 \= Old,
substitute(Old, New, FromRest, ToRest).
The rule for invokespecial of an <init> method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object within its constructor, invokespecial can cause a superclass <init> method to be invoked, and that invocation could fail, leaving this uninitialized. This situation cannot be created using source code in the Java programming language, but can be created by programming in bytecode directly.
In this situation, the original frame holds an uninitialized object in local variable 0 and has flag flagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit flag. But if the invocation of an <init> method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit flag to a properly initialized object, so the object is permanently unusable.
If not for this situation, the flags of the exception stack frame would always be the same as the flags of the input stack frame.
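To make the two result frames concrete, here is a minimal Python sketch of the uninitializedThis case. It mirrors the shape of the Prolog clauses (a normal frame built with substitute, and an exception frame that keeps the incoming locals and flags with an empty operand stack). The function names and the string encoding of types are assumptions for this example, and argument types are compared by equality rather than assignability.

```python
UNINIT_THIS = "uninitializedThis"
FLAG_THIS_UNINIT = "flagThisUninit"


def substitute(old, new, types):
    """Return a copy of `types` with every occurrence of `old` replaced by `new`."""
    return [new if t == old else t for t in types]


def invokespecial_init_this(local_vars, stack, flags, arg_types, this_type):
    """Model an <init> call on uninitializedThis.

    `stack` lists types with the top of the stack first.
    Returns (next_frame, exception_frame), each a (locals, stack, flags) triple.
    """
    args_on_stack = list(reversed(arg_types))
    n = len(args_on_stack)
    # Simplification: require exact type matches instead of assignability.
    if stack[:n] != args_on_stack or stack[n:n + 1] != [UNINIT_THIS]:
        raise ValueError("operand stack does not match the <init> descriptor")
    rest = stack[n + 1:]
    # Normal completion: `this` is initialized everywhere and the flags are cleared.
    next_frame = (substitute(UNINIT_THIS, this_type, local_vars),
                  substitute(UNINIT_THIS, this_type, rest),
                  [])
    # Exceptional completion: incoming locals and flags, empty operand stack.
    exception_frame = (list(local_vars), [], list(flags))
    return next_frame, exception_frame
```

For example, calling it with locals `[UNINIT_THIS, "int"]`, stack `["int", UNINIT_THIS]`, flags `[FLAG_THIS_UNINIT]`, one `int` argument, and a target type `"Foo"` yields a normal frame with no flags and an exception frame that still carries `flagThisUninit`.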
invokestatic
An invokestatic instruction is type safe iff all of the following are true:
Its first operand, CP, refers to a constant pool entry denoting a method named MethodName with descriptor Descriptor.
MethodName is not <init>.
MethodName is not <clinit>.
One can validly replace types matching the argument types given in Descriptor on the incoming operand stack with the return type given in Descriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokestatic(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(_MethodClassName, MethodName, Descriptor),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
(CP = method(_MethodClass, _MethodName, Descriptor) ;
CP = imethod(_MethodClass, _MethodName, Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The <clinit>
method name is prohibited by format checking (4.4.2). The <init>
method name is prohibited by static constraints (4.9.1, 4.10).
Interface methods are allowed here (4.9.1).
invokevirtual
An invokevirtual instruction is type safe iff all of the following are true:
Its first operand,
CP
, refers to a constant pool entry denoting a methodnamedwith descriptorMethodName
Descriptor
that is a member of aclasstypeMethodClassName
MethodClass
.MethodName
is not<init>
.MethodName
is not<clinit>
.One can validly replace types matching
the classMethodClassName
MethodClass
and the argument types given inDescriptor
on the incoming operand stack with the return type given inDescriptor
, yielding the outgoing type state.
If the method is protected, the usage conforms to the special rules governing access to protected members (4.10.1.8).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, ArgList),
currentClassLoader(Environment, CurrentLoader),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
canPop(StackFrame, ArgList, PoppedFrame),
passesProtectedCheck(Environment, MethodClassName, MethodName,
Descriptor, PoppedFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClass, _MethodName, Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse(OperandArgList, ArgList),
reverse([MethodClass | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
canPop(StackFrame, ArgList, PoppedFrame),
passesProtectedCheck(Environment, MethodClass, MethodName,
Descriptor, PoppedFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The <clinit>
method name is prohibited by format checking (4.4.2). The <init>
method name is prohibited by static constraints (4.9.1, 4.10).
ldc, ldc_w, ldc2_w
An ldc instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is loadable (4.4), but not long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
Type \= long,
Type \= double,
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
loadableConstant(CP, Type) :-
member([CP, Type], [
[int(_), int],
[float(_), float],
[long(_), long],
[double(_), double]
]).
loadableConstant(CP, Type) :-
isBootstrapLoader(BL),
member([CP, Type], [
[class(_), class('java/lang/Class', BL)],
[class(_,_), class('java/lang/Class', BL)],
[arrayOf(_), class('java/lang/Class', BL)],
[string(_), class('java/lang/String', BL)],
[methodHandle(_,_), class('java/lang/invoke/MethodHandle', BL)],
[methodType(_,_), class('java/lang/invoke/MethodType', BL)]
]).
loadableConstant(CP, Type) :-
CP = dconstant(_, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, Type).
An ldc_w or ldc2_w instruction is type safe iff the equivalent ldc instruction is type safe.
instructionHasEquivalentTypeRule(ldc_w(CP), ldc(CP)).
instructionHasEquivalentTypeRule(ldc2_w(CP), ldc(CP)).
An ldc2_w instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is either long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc2_w(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
(Type = long ; Type = double),
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
lookupswitch
A lookupswitch instruction is type safe if its keys are sorted, one can validly pop int
off the incoming operand stack yielding a new type state BranchStackFrame
, and all of the instruction's targets are valid branch targets assuming BranchStackFrame
as their incoming type state.
instructionIsTypeSafe(lookupswitch(Targets, Keys), Environment, _, StackFrame,
afterGoto, ExceptionStackFrame) :-
sort(Keys, Keys),
canPop(StackFrame, [int], BranchStackFrame),
checklist(targetIsTypeSafe(Environment, BranchStackFrame), Targets),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
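The sort(Keys, Keys) goal succeeds only when sorting leaves the key list unchanged. In standard Prolog, sort/2 also removes duplicates, so the goal amounts to requiring strictly ascending keys. A minimal Python sketch of the same check, with a function name chosen only for this example:

```python
def keys_are_sorted(keys):
    """Mirror sort(Keys, Keys): ascending order with no duplicate keys."""
    return list(keys) == sorted(set(keys))
```

Under this reading, `[1, 5, 9]` passes, while `[3, 2]` and `[1, 1, 2]` do not.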
multianewarray
A multianewarray instruction with operands CP
and Dim
is type safe iff CP
refers to a constant pool entry denoting an array type whose dimension is greater than or equal to Dim
, Dim
is strictly positive, and one can validly replace Dim
int
types on the incoming operand stack with the type denoted by CP
yielding the outgoing type state.
instructionIsTypeSafe(multianewarray(CP, Dim), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = arrayOf(_),
classDimension(CP, Dimension),
Dimension >= Dim,
Dim > 0,
/* Make a list of Dim ints */
findall(int, between(1, Dim, _), IntList),
validTypeTransition(Environment, IntList, CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The dimension of an array type whose component type is also an array type is one more than the dimension of its component type.
classDimension(arrayOf(X), Dimension) :-
classDimension(X, Dimension1),
Dimension is Dimension1 + 1.
classDimension(_, Dimension) :-
Dimension = 0.
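The dimension computation and the bounds check of the multianewarray rule can be sketched in Python as follows. This is an illustrative model only: array types are encoded as nested `("arrayOf", component)` tuples, the operand stack is a list with the top of the stack first, and both function names are assumptions for this example. The sketch checks the preconditions but does not build the outgoing frame.

```python
def class_dimension(t):
    """Count the nested ("arrayOf", component) wrappers around a type."""
    dims = 0
    while isinstance(t, tuple) and len(t) == 2 and t[0] == "arrayOf":
        dims += 1
        t = t[1]
    return dims


def multianewarray_ok(cp, dim, stack):
    """Check Dimension >= Dim > 0 and that `dim` ints are on top of `stack`."""
    if not (class_dimension(cp) >= dim > 0):
        return False
    return stack[:dim] == ["int"] * dim
```

For a two-dimensional `int` array type, a `dim` of 1 or 2 is accepted when enough `int` values are on the stack, while 0 and 3 are rejected.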
new
A new instruction with operand CP
at offset Offset
is type safe iff CP
refers to a constant pool entry denoting a class or interface type, the type uninitialized(Offset)
does not appear in the incoming operand stack, and one can validly push uninitialized(Offset)
onto the incoming operand stack and replace uninitialized(Offset)
with top
in the incoming local variables yielding the outgoing type state.
instructionIsTypeSafe(new(CP), Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, OperandStack, Flags),
CP = class(_, _),
NewItem = uninitialized(Offset),
notMember(NewItem, OperandStack),
substitute(NewItem, top, Locals, NewLocals),
validTypeTransition(Environment, [], NewItem,
frame(NewLocals, OperandStack, Flags),
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The
substitute
predicate is defined in the rule for invokespecial (4.10.1.9.invokespecial).
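As a rough illustration of the rule for new, the Python sketch below models the frame update. It assumes that types are encoded as strings or tuples, that the operand stack lists the top of the stack first, and that the stack-size limit is passed in as a hypothetical `max_stack` parameter. The function name is likewise an assumption.

```python
def type_check_new(offset, local_vars, stack, max_stack):
    """Return (new_locals, new_stack) after a `new` instruction at `offset`."""
    new_item = ("uninitialized", offset)
    # The same uninitialized type must not already be on the operand stack.
    if new_item in stack:
        raise ValueError("uninitialized(%d) is already on the operand stack" % offset)
    if len(stack) + 1 > max_stack:
        raise ValueError("operand stack overflow")
    # Stale copies in the local variables are replaced with top.
    new_locals = ["top" if t == new_item else t for t in local_vars]
    return new_locals, [new_item] + stack
```

Replacing stale copies in the locals with `top` keeps an object left over from an earlier execution of the same new instruction from being confused with the freshly created one.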
putfield
A putfield instruction with operand CP
is type safe iff all of the following are true:
Its first operand,
CP
, refers to a constant pool entry denoting a field whose declared type isFieldType
, declared in aclasstypeFieldClassName
FieldClass
.FieldClassName
must not be an array type.Array types are allowed here (4.4.2).
Either:
One can validly pop types matching
FieldType
andFieldClassName
FieldClass
off the incoming operand stack yielding the outgoing type state.protected
fields are subject to additional checks (4.10.1.8).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], PoppedFrame),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, PoppedFrame),
currentClassLoader(Environment, CurrentLoader),
canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)],
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClass, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], PoppedFrame),
passesProtectedCheck(Environment, FieldClass, FieldName,
FieldDescriptor, PoppedFrame),
canPop(StackFrame, [FieldType, FieldClass], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Or:
- If the instruction occurs in an instance initialization method of the class
FieldClassName
FieldClass
, then one can validly pop types matching FieldType and uninitializedThis off the incoming operand stack yielding the outgoing type state. This allows instance fields of this that are declared in the current class to be assigned prior to complete initialization of this.
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
CurrentClass = class(FieldClassName, _),
isInit(CurrentMethod),
canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClass, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
Environment = environment(FieldClass, CurrentMethod, _, _, _, _),
isInit(CurrentMethod),
canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
putstatic
A putstatic instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting a field whose declared type is FieldType
, and one can validly pop a type matching FieldType
off the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(putstatic(CP), _Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(_FieldClassName, _FieldName, FieldDescriptor),
CP = field(_FieldClass, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
tableswitch
A tableswitch instruction is type safe if its keys are sorted, one can validly pop int
off the incoming operand stack yielding a new type state BranchStackFrame
, and all of the instruction's targets are valid branch targets assuming BranchStackFrame
as their incoming type state.
instructionIsTypeSafe(tableswitch(Targets, Keys), Environment, _Offset,
StackFrame, afterGoto, ExceptionStackFrame) :-
sort(Keys, Keys),
canPop(StackFrame, [int], BranchStackFrame),
checklist(targetIsTypeSafe(Environment, BranchStackFrame), Targets),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The keys are implicit, and so are guaranteed to be sorted.
4.10.2 Verification by Type Inference
A class
file that does not contain a StackMapTable
attribute (which necessarily has a version number of 49.0 or below) must have its structural constraints verified using type inference.
4.10.2.1 The Process of Verification by Type Inference
During linking, the verifier checks the code
array of the Code
attribute for each method of the class
file by performing data-flow analysis on each method. The verifier ensures that at any given point in the program, no matter what code path is taken to reach that point, all of the following are true:
The operand stack is always the same size and contains the same types of values.
No local variable is accessed unless it is known to contain a value of an appropriate type.
Methods are invoked with the appropriate arguments.
Fields are assigned only using values of appropriate types.
All opcodes have appropriately typed arguments on the operand stack and in the local variable array.
For efficiency reasons, certain tests that could in principle be performed by the verifier are delayed until the first time the code for the method is actually invoked. In so doing, the verifier avoids loading class
files unless it has to.
For example, if a method invokes another method that returns an instance of class A, and that instance is assigned only to a field of the same type, the verifier does not bother to check if the class A actually exists. However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure that A is a subclass of B.
This discussion doesn't belong here; it's covered by 5.4.
4.10.2.2 The Bytecode Verifier
The code for each method is verified independently. First, the bytes that make up the code are broken up into a sequence of instructions, and the index into the code
array of the start of each instruction is placed in an array. The verifier then goes through the code a second time and parses the instructions. During this pass a data structure is built to hold information about each Java Virtual Machine instruction in the method. The operands, if any, of each instruction are checked to make sure they are valid. For instance:
Branches must be within the bounds of the code array for the method.
The targets of all control-flow instructions are each the start of an instruction. In the case of a wide instruction, the wide opcode is considered the start of the instruction, and the opcode giving the operation modified by that wide instruction is not considered to start an instruction. Branches into the middle of an instruction are disallowed.
No instruction can access or modify a local variable at an index greater than or equal to the number of local variables that its method indicates it allocates.
All references to the constant pool must be to an entry of the appropriate type. (For example, the instruction getfield must reference a field.)
The code does not end in the middle of an instruction.
Execution cannot fall off the end of the code.
For each exception handler, the starting and ending point of code protected by the handler must be at the beginning of an instruction or, in the case of the ending point, immediately past the end of the code. The starting point must be before the ending point. The exception handler code must start at a valid instruction, and it must not start at an opcode being modified by the wide instruction.
These restrictions are enforced by static constraints (4.9.1) before attempting verification by type inference (4.10).
For each instruction of the method, the verifier records the contents of the operand stack and the contents of the local variable array prior to the execution of that instruction. For the operand stack, it needs to know the stack height and the type of each value on it. For each local variable, it needs to know either the type of the contents of that local variable or that the local variable contains an unusable or unknown value (it might be uninitialized). The bytecode verifier does not need to distinguish between the integral types (e.g., byte
, short
, char
) when determining the value types on the operand stack.
Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables that represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty. All other local variables contain an illegal value. For the other instructions, which have not been examined yet, no information is available regarding the operand stack or local variables.
Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction needs to be looked at. Initially, the "changed" bit is set only for the first instruction. The data-flow analyzer executes the following loop:
Select a Java Virtual Machine instruction whose "changed" bit is set. If no instruction remains whose "changed" bit is set, the method has successfully been verified. Otherwise, turn off the "changed" bit of the selected instruction.
Model the effect of the instruction on the operand stack and local variable array by doing the following:
If the instruction uses values from the operand stack, ensure that there are a sufficient number of values on the stack and that the top values on the stack are of an appropriate type. Otherwise, verification fails.
If the instruction uses a local variable, ensure that the specified local variable contains a value of the appropriate type. Otherwise, verification fails.
If the instruction pushes values onto the operand stack, ensure that there is sufficient room on the operand stack for the new values. Add the indicated types to the top of the modeled operand stack.
If the instruction modifies a local variable, record that the local variable now contains the new type.
Determine the instructions that can follow the current instruction. Successor instructions can be one of the following:
The next instruction, if the current instruction is not an unconditional control transfer instruction (for instance, goto, return, or athrow). Verification fails if it is possible to "fall off" the last instruction of the method.
The target(s) of a conditional or unconditional branch or switch.
Any exception handlers for this instruction.
Merge the state of the operand stack and local variable array at the end of the execution of the current instruction into each of the successor instructions, as follows:
If this is the first time the successor instruction has been visited, record that the operand stack and local variable values calculated in step 2 are the state of the operand stack and local variable array prior to executing the successor instruction. Set the "changed" bit for the successor instruction.
If the successor instruction has been seen before, merge the operand stack and local variable values calculated in step 2 into the values already there. Set the "changed" bit if there is any modification to the values.
In the special case of control transfer to an exception handler:
Record that a single object, of the exception type indicated by the exception handler, is the state of the operand stack prior to executing the successor instruction. There must be sufficient room on the operand stack for this single value, as if an instruction had pushed it.
Record that the local variable values from immediately before step 2 are the state of the local variable array prior to executing the successor instruction. The local variable values calculated in step 2 are irrelevant.
Continue at step 1.
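The loop above can be sketched in Python over a toy instruction set. Everything here is an assumption for illustration: the tuple encoding of instructions, the `UNUSABLE` marker, and the function names. The sketch omits exception handlers and reference types, requires operand stacks to match exactly when merging, and merges differing local variable types to an unusable value.

```python
UNUSABLE = "unusable"


def step(op, pc, stack, local_vars, max_stack):
    """Model one instruction in place (step 2) and return its successors (step 3)."""
    kind = op[0]
    if kind in ("push", "load"):
        if kind == "load" and local_vars[op[1]] != op[2]:
            raise ValueError("local %d is not %s at %d" % (op[1], op[2], pc))
        if len(stack) >= max_stack:
            raise ValueError("operand stack overflow at %d" % pc)
        stack.append(op[1] if kind == "push" else op[2])
    elif kind in ("pop", "store", "if"):
        expected = "int" if kind == "if" else op[-1]
        if not stack or stack[-1] != expected:
            raise ValueError("expected %s on the stack at %d" % (expected, pc))
        stack.pop()
        if kind == "store":
            local_vars[op[1]] = expected
    if kind == "goto":
        return [op[1]]
    if kind == "return":
        return []
    if kind == "if":
        return [pc + 1, op[1]]
    return [pc + 1]


def verify(code, num_locals, max_stack):
    """Run the worklist loop; return the state recorded before each instruction."""
    states = {0: ((), (UNUSABLE,) * num_locals)}
    changed = {0}                      # instructions whose "changed" bit is set
    while changed:
        pc = changed.pop()             # step 1: select an instruction, clear its bit
        stack, local_vars = list(states[pc][0]), list(states[pc][1])
        for succ in step(code[pc], pc, stack, local_vars, max_stack):
            if succ >= len(code):
                raise ValueError("execution can fall off the end of the code")
            out = (tuple(stack), tuple(local_vars))
            if succ not in states:     # step 4: first visit records the state
                states[succ] = out
                changed.add(succ)
                continue
            old_stack, old_locals = states[succ]
            if old_stack != out[0]:
                raise ValueError("cannot merge operand stacks at %d" % succ)
            merged = tuple(a if a == b else UNUSABLE
                           for a, b in zip(old_locals, out[1]))
            if merged != old_locals:   # step 4: a modified state is revisited
                states[succ] = (old_stack, merged)
                changed.add(succ)
    return states
```

As a usage example, a method that stores an `int` in local 0 on one path and a `float` on another reaches the join point with local 0 marked unusable, so a later `load` of that local would fail verification.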
To merge two operand stacks, the number of values on each stack must be identical. Then, corresponding values on the two stacks are compared and the value on the merged stack is computed, as follows:
If one value is a primitive type, then the corresponding value must be the same primitive type. The merged value is the primitive type.
If one value is a non-array reference type, then the corresponding value must be a reference type (array or non-array). The merged value is a reference to an instance of the first common supertype of the two reference types. (Such a reference type always exists because the type Object is a supertype of all class, interface, and array types.)
For example, Object and String can be merged; the result is Object. Similarly, Object and String[] can be merged; the result is again Object. Even Object and int[] can be merged, or String and int[]; the result is Object for both.
If corresponding values are both array reference types, then their dimensions are examined. If the array types have the same dimensions, then the merged value is a reference to an instance of an array type which is the first common supertype of both array types. (If either or both of the array types has a primitive element type, then Object is used as the element type instead.) If the array types have different dimensions, then the merged value is a reference to an instance of an array type whose dimension is the smaller of the two; the element type is Cloneable or java.io.Serializable if the smaller array type was Cloneable or java.io.Serializable, and Object otherwise.
For example, Object[] and String[] can be merged; the result is Object[]. Cloneable[] and String[] can be merged, or java.io.Serializable[] and String[]; the result is Cloneable[] and java.io.Serializable[] respectively. Even int[] and String[] can be merged; the result is Object[], because Object is used instead of int when computing the first common supertype.
Since the array types can have different dimensions, Object[] and String[][] can be merged, or Object[][] and String[]; in both cases the result is Object[]. Cloneable[] and String[][] can be merged; the result is Cloneable[]. Finally, Cloneable[][] and String[] can be merged; the result is Object[].
If the operand stacks cannot be merged, verification of the method fails.
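The merge rules for individual stack values can be sketched in Python as shown below. This is a simplified model under stated assumptions: a type is an `(element name, dimensions)` pair, the superclass table is a hypothetical toy hierarchy, and interfaces are handled only in the different-dimension rule, where the prose names Cloneable and java.io.Serializable explicitly. The same-dimension case for interface element types, such as Cloneable[] with String[], is not modeled.

```python
PRIMITIVES = {"int", "float", "long", "double"}
SPECIAL = {"Cloneable", "java.io.Serializable"}
# Hypothetical toy hierarchy: class name -> direct superclass.
SUPERCLASS = {"Object": None, "String": "Object",
              "Number": "Object", "Integer": "Number"}


def first_common_superclass(a, b):
    """Walk both superclass chains and return the nearest shared class."""
    ancestors = []
    while a is not None:
        ancestors.append(a)
        a = SUPERCLASS[a]
    while b not in ancestors:
        b = SUPERCLASS[b]
    return b


def merge(t1, t2):
    """Merge two value types given as (element name, dimensions) pairs."""
    if t1 == t2:
        return t1
    (n1, d1), (n2, d2) = t1, t2
    if (d1 == 0 and n1 in PRIMITIVES) or (d2 == 0 and n2 in PRIMITIVES):
        raise ValueError("primitive types must match exactly")
    if d1 == 0 and d2 == 0:
        return (first_common_superclass(n1, n2), 0)
    if d1 == 0 or d2 == 0:
        # A class merged with an array: Object is the only shared class.
        return ("Object", 0)
    if d1 == d2:
        if n1 in PRIMITIVES or n2 in PRIMITIVES:
            return ("Object", d1)
        return (first_common_superclass(n1, n2), d1)
    # Different dimensions: keep the smaller dimension.
    name, dims = t1 if d1 < d2 else t2
    return (name if name in SPECIAL else "Object", dims)
```

With this encoding, `merge(("Object", 1), ("String", 2))` gives `("Object", 1)`, matching the Object[] and String[][] example, and `merge(("Cloneable", 2), ("String", 1))` likewise gives `("Object", 1)`.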
To merge two local variable array states, corresponding pairs of local variables are compared. The value of the merged local variable is computed using the rules above, except that the corresponding values are permitted to be different primitive types. In that case, the verifier records that the merged local variable contains an unusable value.
If the data-flow analyzer runs on a method without reporting a verification failure, then the method has been successfully verified by the class
file verifier.
Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more detail.