This specification is not final and is subject to change. Use is subject to license terms.

Simplified Attribute Validation

Changes to the Java® Virtual Machine Specification • Version 14-internal+0-adhoc.dlsmith.20191107

This document describes changes to the Java Virtual Machine Specification, as modified by Format Checking Cleanup and Class Loading Cleanup, to simplify the process of validating attributes that are either closely tied to bytecode verification or non-essential to JVM semantics.

The following changes to implementation behavior are proposed:

This document also shifts the specification of various validation checks for Code and StackMapTable attributes to verification time, consistent with longstanding behavior of the reference implementation (4.9, 4.10). And it addresses some bugs and redundancies in the verification rules.

Changes are described with respect to existing sections of the JVM Specification. New text is indicated like this and deleted text is indicated like this. Explanation and discussion, as needed, is set aside in grey boxes.

Chapter 4: The class File Format

4.7 Attributes

Attributes are used in the ClassFile, field_info, method_info, and Code_attribute structures of the class file format (4.1, 4.5, 4.6, 4.7.3).

All attributes have the following general format:

attribute_info {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 info[attribute_length];
}

For all attributes, the attribute_name_index item must be a valid unsigned 16-bit index into the constant pool of the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8_info structure (4.4.7) which represents the name of the attribute. The value of the attribute_length item indicates the length of the subsequent information in bytes. The length does not include the initial six bytes that contain the attribute_name_index and attribute_length items.

28 attributes are predefined by this specification. They are listed three times, for ease of navigation:

Within the context of their use in this specification, that is, in the attributes tables of the structures of appropriately-versioned class files in which they appear, the names of these predefined attributes are reserved.

Any conditions on the presence of a predefined attribute in an attributes table are specified explicitly in the section which describes the attribute. If no conditions are specified, then the attribute may appear any number of times in an attributes table.

The predefined attributes are categorized into three groups according to their purpose:

  1. Nine attributes are critical to correct interpretation of the class file by the Java Virtual Machine:

    • ConstantValue
    • Code
    • StackMapTable
    • BootstrapMethods
    • Module
    • ModulePackages
    • ModuleMainClass
    • NestHost
    • NestMembers

    In a class file whose version number is v, each of these attributes must be recognized and correctly read by an implementation of the Java Virtual Machine if the implementation supports version v of the class file format, and the attribute was first defined in version v or earlier of the class file format, and the attribute appears in a location where it is defined to appear.

  1. Ten attributes are not critical to correct interpretation of the class file by the Java Virtual Machine, but are either critical to correct interpretation of the class file by the class libraries of the Java SE Platform, or are useful for tools (in which case the section that specifies an attribute describes it as "optional"):

    • Exceptions
    • InnerClasses
    • EnclosingMethod
    • Synthetic
    • Signature
    • SourceFile
    • LineNumberTable
    • LocalVariableTable
    • LocalVariableTypeTable
    • Deprecated

    In a class file whose version number is v, each of these attributes must be recognized and correctly read by an implementation of the Java Virtual Machine if the implementation supports version v of the class file format, and the attribute was first defined in version v or earlier of the class file format, and the attribute appears in a location where it is defined to appear.

  1. Nine Nineteen attributes are not critical to correct interpretation of the class file by the Java Virtual Machine, but contain metadata about the class file that is either exposed by the class libraries of the Java SE Platform, or made available by tools (in which case the section that specifies an attribute describes it as "optional"):

    • Exceptions
    • InnerClasses
    • EnclosingMethod
    • Synthetic
    • Signature
    • SourceFile
    • SourceDebugExtension
    • LineNumberTable
    • LocalVariableTable
    • LocalVariableTypeTable
    • Deprecated
    • RuntimeVisibleAnnotations
    • RuntimeInvisibleAnnotations
    • RuntimeVisibleParameterAnnotations
    • RuntimeInvisibleParameterAnnotations
    • RuntimeVisibleTypeAnnotations
    • RuntimeInvisibleTypeAnnotations
    • AnnotationDefault
    • MethodParameters

    An implementation of the Java Virtual Machine may use the information that these attributes contain, or otherwise must silently ignore these attributes.

    The effect of grouping these two categories together is that the old group #2 is no longer subject to format checking—the contents of the attributes must be silently ignored (or silently observed) by the JVM.

Table 4.7-A. Predefined class file attributes (by section)

Attribute Section class file Java SE
ConstantValue 4.7.2 45.3 1.0.2
Code 4.7.3 45.3 1.0.2
StackMapTable 4.7.4 50.0 6
Exceptions 4.7.5 45.3 1.0.2
InnerClasses 4.7.6 45.3 1.1
EnclosingMethod 4.7.7 49.0 5.0
Synthetic 4.7.8 45.3 1.1
Signature 4.7.9 49.0 5.0
SourceFile 4.7.10 45.3 1.0.2
SourceDebugExtension 4.7.11 49.0 5.0
LineNumberTable 4.7.12 45.3 1.0.2
LocalVariableTable 4.7.13 45.3 1.0.2
LocalVariableTypeTable 4.7.14 49.0 5.0
Deprecated 4.7.15 45.3 1.1
RuntimeVisibleAnnotations 4.7.16 49.0 5.0
RuntimeInvisibleAnnotations 4.7.17 49.0 5.0
RuntimeVisibleParameterAnnotations 4.7.18 49.0 5.0
RuntimeInvisibleParameterAnnotations 4.7.19 49.0 5.0
RuntimeVisibleTypeAnnotations 4.7.20 52.0 8
RuntimeInvisibleTypeAnnotations 4.7.21 52.0 8
AnnotationDefault 4.7.22 49.0 5.0
BootstrapMethods 4.7.23 51.0 7
MethodParameters 4.7.24 52.0 8
Module 4.7.25 53.0 9
ModulePackages 4.7.26 53.0 9
ModuleMainClass 4.7.27 53.0 9
NestHost 4.7.28 55.0 11
NestMembers 4.7.29 55.0 11

Table 4.7-B. Predefined class file attributes (by class file format)

Attribute class file Java SE Section
ConstantValue 45.3 1.0.2 4.7.2
Code 45.3 1.0.2 4.7.3
Exceptions 45.3 1.0.2 4.7.5
SourceFile 45.3 1.0.2 4.7.10
LineNumberTable 45.3 1.0.2 4.7.12
LocalVariableTable 45.3 1.0.2 4.7.13
InnerClasses 45.3 1.1 4.7.6
Synthetic 45.3 1.1 4.7.8
Deprecated 45.3 1.1 4.7.15
EnclosingMethod 49.0 5.0 4.7.7
Signature 49.0 5.0 4.7.9
SourceDebugExtension 49.0 5.0 4.7.11
LocalVariableTypeTable 49.0 5.0 4.7.14
RuntimeVisibleAnnotations 49.0 5.0 4.7.16
RuntimeInvisibleAnnotations 49.0 5.0 4.7.17
RuntimeVisibleParameterAnnotations 49.0 5.0 4.7.18
RuntimeInvisibleParameterAnnotations 49.0 5.0 4.7.19
AnnotationDefault 49.0 5.0 4.7.22
StackMapTable 50.0 6 4.7.4
BootstrapMethods 51.0 7 4.7.23
RuntimeVisibleTypeAnnotations 52.0 8 4.7.20
RuntimeInvisibleTypeAnnotations 52.0 8 4.7.21
MethodParameters 52.0 8 4.7.24
Module 53.0 9 4.7.25
ModulePackages 53.0 9 4.7.26
ModuleMainClass 53.0 9 4.7.27
NestHost 55.0 11 4.7.28
NestMembers 55.0 11 4.7.29

Table 4.7-C. Predefined class file attributes (by location)

Attribute Location class file
SourceFile ClassFile 45.3
InnerClasses ClassFile 45.3
EnclosingMethod ClassFile 49.0
SourceDebugExtension ClassFile 49.0
BootstrapMethods ClassFile 51.0
Module, ModulePackages, ModuleMainClass ClassFile 53.0
NestHost, NestMembers ClassFile 55.0
ConstantValue field_info 45.3
Code method_info 45.3
Exceptions method_info 45.3
RuntimeVisibleParameterAnnotations, RuntimeInvisibleParameterAnnotations method_info 49.0
AnnotationDefault method_info 49.0
MethodParameters method_info 52.0
Synthetic ClassFile, field_info, method_info 45.3
Deprecated ClassFile, field_info, method_info 45.3
Signature ClassFile, field_info, method_info 49.0
RuntimeVisibleAnnotations, RuntimeInvisibleAnnotations ClassFile, field_info, method_info 49.0
LineNumberTable Code 45.3
LocalVariableTable Code 45.3
LocalVariableTypeTable Code 49.0
StackMapTable Code 50.0
RuntimeVisibleTypeAnnotations, RuntimeInvisibleTypeAnnotations ClassFile, field_info, method_info, Code 52.0

4.7.3 The Code Attribute

The Code attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). A Code attribute contains the Java Virtual Machine instructions and auxiliary information for a method, including an instance initialization method and a class or interface initialization method (2.9.1, 2.9.2).

If the method is either native or abstract, its method_info structure must not have a Code attribute in its attributes table. Otherwise, its method_info structure must have exactly one Code attribute in its attributes table.

The Code attribute must have the following format:

Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The items of the Code_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Code".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

max_stack

The value of the max_stack item gives should give the maximum depth of the operand stack of this method (2.6.2) at any point during execution of the method.

max_locals

The value of the max_locals item gives should give the number of local variables in the local variable array allocated upon invocation of this method (2.6.1), including the local variables used to pass parameters to the method on its invocation.

The greatest local variable index for a value of type long or double is max_locals - 2. The greatest local variable index for a value of any other type is max_locals - 1.

code_length

The value of the code_length item gives the number of bytes in the code array for this method.

The value of code_length must be greater than zero (as the code array must not be empty) and less than 65536.

code[]

The code array gives the actual bytes of Java Virtual Machine code that implement the method.

When the code array is read into memory on a byte-addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned. (Refer to the descriptions of those instructions for more information on the consequences of code array alignment.)

The detailed constraints on the contents of the code array are extensive and are given in a separate section (4.9).

The contents of the code array are subject to extensive constraints during verification (5.4.1), as described in 4.9.

exception_table_length

The value of the exception_table_length item gives the number of entries in the exception_table table.

exception_table[]

Each entry in the exception_table array describes one exception handler in the code array. The order of the handlers in the exception_table array is significant (2.10).

The contents of the exception_table array are validated during verification, along with the code array.

Each exception_table entry contains the following four items:

start_pc, end_pc

The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc must should be a valid index into the code array of the opcode of an instruction. The value of end_pc either must should be a valid index into the code array of the opcode of an instruction or must should be equal to code_length, the length of the code array. The value of start_pc must should be less than the value of end_pc.

The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must will be active while the program counter is within the interval [start_pc, end_pc).

The fact that end_pc is exclusive is a historical mistake in the design of the Java Virtual Machine: if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is 1 byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.

handler_pc

The value of the handler_pc item indicates the start of the exception handler. The value of the item must should be a valid index into the code array and must should be the index of the opcode of an instruction.

catch_type

If the value of the catch_type item is nonzero, it must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Class_info structure (4.4.1) representing a class of exceptions that this exception handler is designated to catch. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.

The verifier checks that the class is Throwable or a subclass of Throwable (4.9.2).

If the value of the catch_type item is zero, this exception handler is called for all exceptions.

This is used to implement finally (3.13).

In JDK 13, some of these assertions are checked during format checking, but assertions about opcode indices are delayed until verification. For simplicity, it makes more sense to perform all checks on the exception table during verification (see 4.9.1).

attributes_count

The value of the attributes_count item indicates the number of attributes of the Code attribute.

attributes[]

Each value of the attributes table must be an attribute_info structure (4.7).

The attributes defined by this specification as appearing in the attributes table of a Code attribute are listed in Table 4.7-C.

The rules concerning attributes defined to appear in the attributes table of a Code attribute are given in 4.7.

The rules concerning nonstandard attributes in the attributes table of a Code attribute are given in 4.7.1.

4.7.4 The StackMapTable Attribute

The StackMapTable attribute is a variable-length attribute in the attributes table of a Code attribute (4.7.3) in a version 50.0 or later class file. A StackMapTable attribute is used during the process of verification by type checking (4.10.1).

There must be no more than one StackMapTable attribute in the attributes table of a Code attribute.

In a class file whose version number is 50.0 or above, if a method's Code attribute does not have a StackMapTable attribute, it has an implicit stack map attribute (4.10.1). This implicit stack map attribute is equivalent to a StackMapTable attribute with number_of_entries equal to zero.

The StackMapTable attribute must should have the following format:

StackMapTable_attribute {
    u2              attribute_name_index;
    u4              attribute_length;
    u2              number_of_entries;
    stack_map_frame entries[number_of_entries];
}

The contents of the StackMapTable attribute are validated during verification, along with the code and exception_table arrays of the Code attribute (4.7.3).

The items of the StackMapTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "StackMapTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

number_of_entries

The value of the number_of_entries item gives should give the number of stack_map_frame entries in the entries table.

entries[]

Each entry in the entries table describes should describe one stack map frame of the method. The order of the stack map frames in the entries table is significant.

A stack map frame specifies (either explicitly or implicitly) the bytecode offset at which it applies, and the verification types of local variables and operand stack entries for that offset.

Each stack map frame described in the entries table relies on the previous frame for some of its semantics. The first stack map frame of a method is implicit, and computed from the method descriptor by the type checker (4.10.1.6). The stack_map_frame structure at entries[0] therefore describes the second stack map frame of the method.

The bytecode offset at which a stack map frame applies is calculated by taking the value offset_delta specified in the frame (either explicitly or implicitly), and adding offset_delta + 1 to the bytecode offset of the previous frame, unless the previous frame is the initial frame of the method. In that case, the bytecode offset at which the stack map frame applies is the value offset_delta specified in the frame.

By using an offset delta rather than storing the actual bytecode offset, we ensure, by definition, that stack map frames are in the correctly sorted order. Furthermore, by consistently using the formula offset_delta + 1 for all explicit frames (as opposed to the implicit first frame), we guarantee the absence of duplicates.

We say that an instruction in the bytecode has a corresponding stack map frame if the instruction starts at offset i in the code array of a Code attribute, and the Code attribute has a StackMapTable attribute whose entries array contains a stack map frame that applies at bytecode offset i.

A verification type specifies the type of either one or two locations, where a location is either a single local variable or a single operand stack entry. A verification type is represented by a discriminated union, verification_type_info, that consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.

union verification_type_info {
    Top_variable_info;
    Integer_variable_info;
    Float_variable_info;
    Long_variable_info;
    Double_variable_info;
    Null_variable_info;
    UninitializedThis_variable_info;
    Object_variable_info;
    Uninitialized_variable_info;
}

A verification type that specifies one location in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:

A verification type that specifies two locations in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:

A stack map frame is represented by a discriminated union, stack_map_frame, which consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.

union stack_map_frame {
    same_frame;
    same_locals_1_stack_item_frame;
    same_locals_1_stack_item_frame_extended;
    chop_frame;
    same_frame_extended;
    append_frame;
    full_frame;
}

The tag indicates the frame type of the stack map frame:

4.7.5 The Exceptions Attribute

The Exceptions attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). The Exceptions attribute indicates which checked exceptions a method may throw.

There must be no more than one Exceptions attribute in the attributes table of a method_info structure.

The Exceptions attribute must should have the following format:

Exceptions_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
}

The items of the Exceptions_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Exceptions".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

number_of_exceptions

The value of the number_of_exceptions item indicates should indicate the number of entries in the exception_index_table.

exception_index_table[]

Each value in the exception_index_table array must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Class_info structure (4.4.1) representing a class type that this method is declared to throw.

A method should throw an exception only if at least one of the following three criteria is met:

These requirements are not enforced in the Java Virtual Machine; they are enforced only at compile time.

4.7.6 The InnerClasses Attribute

The InnerClasses attribute is a variable-length attribute in the attributes table of a ClassFile structure (4.1).

There must be no more than one InnerClasses attribute in the attributes table of a ClassFile structure.

The InnerClasses attribute must should have the following format:

InnerClasses_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_classes;
    {   u2 inner_class_info_index;
        u2 outer_class_info_index;
        u2 inner_name_index;
        u2 inner_class_access_flags;
    } classes[number_of_classes];
}

The items of the InnerClasses_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "InnerClasses".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

number_of_classes

The value of the number_of_classes item indicates should indicate the number of entries in the classes array.

classes[]

Every CONSTANT_Class_info entry (4.4.1) in the constant_pool table which represents a class or interface C that is not a package member should have exactly one corresponding entry in the classes array.

If a class or interface has members that are classes or interfaces, its constant_pool table (and hence its InnerClasses attribute) should refer to each such member (JLS §13.1), even if that member is not otherwise mentioned by the class.

In addition, the constant_pool table of every nested class and nested interface should refer to its enclosing class, so altogether, every nested class and nested interface will have InnerClasses information for each enclosing class and for each of its own nested classes and interfaces.

Each entry in the classes array contains should contain the following four items:

inner_class_info_index

The value of the inner_class_info_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Class_info structure representing C.

outer_class_info_index

If C is not a member of a class or an interface - that is, if C is a top-level class or interface (JLS §7.6) or a local class (JLS §14.3) or an anonymous class (JLS §15.9.5) - then the value of the outer_class_info_index item must should be zero.

Otherwise, the value of the outer_class_info_index item must should be a valid index into the constant_pool table, and the entry at that index must should be a CONSTANT_Class_info structure representing the class or interface of which C is a member. The value of the outer_class_info_index item must should not equal the the value of the inner_class_info_index item.

inner_name_index

If C is anonymous (JLS §15.9.5), the value of the inner_name_index item must should be zero.

Otherwise, the value of the inner_name_index item must should be a valid index into the constant_pool table, and the entry at that index must should be a CONSTANT_Utf8_info structure that represents the original simple name of C, as given in the source code from which this class file was compiled.

inner_class_access_flags

The value of the inner_class_access_flags item is a mask of flags used to denote access permissions to and properties of class or interface C as declared in the source code from which this class file was compiled. It is used by a compiler to recover the original information when source code is not available. The flags are specified in Table 4.7.6-A.

Table 4.7.6-A. Nested class access and property flags

Flag Name Value Interpretation
ACC_PUBLIC 0x0001 Marked or implicitly public in source.
ACC_PRIVATE 0x0002 Marked private in source.
ACC_PROTECTED 0x0004 Marked protected in source.
ACC_STATIC 0x0008 Marked or implicitly static in source.
ACC_FINAL 0x0010 Marked or implicitly final in source.
ACC_INTERFACE 0x0200 Was an interface in source.
ACC_ABSTRACT 0x0400 Marked or implicitly abstract in source.
ACC_SYNTHETIC 0x1000 Declared synthetic; not present in the source code.
ACC_ANNOTATION 0x2000 Declared as an annotation type.
ACC_ENUM 0x4000 Declared as an enum type.

All bits of the inner_class_access_flags item not assigned in Table 4.7.6-A are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.

The entire attribute is ignored by Java Virtual Machine implementations.

If a class file has a version number that is 51.0 or above, and has an InnerClasses attribute in its attributes table, then for all entries in the classes array of the InnerClasses attribute, the value of the outer_class_info_index item must be zero if the value of the inner_name_index item is zero.

This is implied by the descriptions above, so the only reason to restate it is to clarify a specific validation rule. Anyway, it does not appear that Hotspot has ever enforced this rule.

The Java Virtual Machine does not check the consistency of an InnerClasses attribute against a class file representing a class or interface referenced by the attribute.

4.7.7 The EnclosingMethod Attribute

The EnclosingMethod attribute is a fixed-length attribute in the attributes table of a ClassFile structure (4.1) in a version 49.0 or later class file. A class or interface should have an EnclosingMethod attribute if and only if it represents a local class or an anonymous class (JLS §14.3, JLS §15.9.5).

There must be no more than one EnclosingMethod attribute in the attributes table of a ClassFile structure representing a class or interface. There must not be an EnclosingMethod attribute in the attributes table of a ClassFile structure representing a module.

The EnclosingMethod attribute must should have the following format:

EnclosingMethod_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 class_index;
    u2 method_index;
}

The items of the EnclosingMethod_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "EnclosingMethod".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes, and must should be four.

class_index

The value of the class_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Class_info structure (4.4.1) representing the innermost class that encloses the declaration of the current class.

method_index

If the current class is not immediately enclosed by a method or constructor, then the value of the method_index item must should be zero.

In particular, method_index must should be zero if the current class was immediately enclosed in source code by an instance initializer, static initializer, instance variable initializer, or class variable initializer. (The first two concern both local classes and anonymous classes, while the last two concern anonymous classes declared on the right hand side of a field assignment.)

Otherwise, the value of the method_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_NameAndType_info structure (4.4.6) representing the name and type of a method in the class referenced by the class_index attribute above.

It is the responsibility of a Java compiler to ensure that the method identified via the method_index is indeed the closest lexically enclosing method of the class that contains this EnclosingMethod attribute.

In JDK 13, no check is performed to ensure method_index references a valid method name and descriptor.

4.7.8 The Synthetic Attribute

The Synthetic attribute is a fixed-length attribute in the attributes table of a ClassFile, field_info, or method_info structure (4.1, 4.5, 4.6). A class member that does not appear in the source code should be marked using a Synthetic attribute, or else it should have its ACC_SYNTHETIC flag set. The only exceptions to this requirement are compiler-generated methods which are not considered implementation artifacts, namely the instance initialization method representing a default constructor of the Java programming language (2.9.1), the class or interface initialization method (2.9.2), and the Enum.values() and Enum.valueOf() methods.

The Synthetic attribute was introduced in JDK 1.1 to support nested classes and interfaces.

There must not be a Synthetic attribute in the attributes table of a ClassFile structure representing a module.

The Synthetic attribute must should have the following format:

Synthetic_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
}

The items of the Synthetic_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Synthetic".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes, and must should be zero.

The effect of these changes is that a Synthetic attribute will not cause an error if it has a non-empty payload.

4.7.9 The Signature Attribute

The Signature attribute is a fixed-length attribute in the attributes table of a ClassFile, field_info, or method_info structure (4.1, 4.5, 4.6) in a version 49.0 or later class file. A Signature attribute records a signature (4.7.9.1) for a class, interface, constructor, method, or field whose declaration in the Java programming language uses type variables or parameterized types. See The Java Language Specification, Java SE 12 Edition for details about these constructs.

There must be no more than one Signature attribute in the attributes table of a ClassFile, field_info, or method_info structure. There must not be a Signature attribute in the attributes table of a ClassFile structure representing a module.

The Signature attribute must should have the following format:

Signature_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 signature_index;
}

The items of the Signature_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Signature".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes, and must should be two.

signature_index

The value of the signature_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Utf8_info structure (4.4.7) representing a class signature if this Signature attribute is an attribute of a ClassFile structure; a method signature if this Signature attribute is an attribute of a method_info structure; or a field signature otherwise.

The Java Virtual Machine does not check the well-formedness of Signature attributes during class loading or linking. Instead, Signature attributes are checked by methods of the Java SE Platform class libraries which expose generic signatures of classes, interfaces, constructors, methods, and fields. Examples include getGenericSuperclass in Class and toGenericString in java.lang.reflect.Executable.

4.7.10 The SourceFile Attribute

The SourceFile attribute is an optional fixed-length attribute in the attributes table of a ClassFile structure (4.1).

There must be no more than one SourceFile attribute in the attributes table of a ClassFile structure.

The SourceFile attribute must should have the following format:

SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}

The items of the SourceFile_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "SourceFile".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes, and must should be two.

sourcefile_index

The value of the sourcefile_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should be a CONSTANT_Utf8_info structure representing a string.

The string referenced by the sourcefile_index item will be interpreted as indicating the name of the source file from which this class file was compiled. It will not be interpreted as indicating the name of a directory containing the file or an absolute path name for the file; such platform-specific additional information should be supplied by the run-time interpreter or development tool at the time the file name is actually used.

4.7.12 The LineNumberTable Attribute

The LineNumberTable attribute is an optional variable-length attribute in the attributes table of a Code attribute (4.7.3). It may be used by debuggers to determine which part of the code array corresponds to a given line number in the original source file.

Multiple LineNumberTable attributes may be present in the attributes table of a Code attribute, and may appear in any order. Multiple LineNumberTable attributes may together represent a given line of a source file, and need not be one-to-one with source lines.

The LineNumberTable attribute must should have the following format:

LineNumberTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}

The items of the LineNumberTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "LineNumberTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

line_number_table_length

The value of the line_number_table_length item indicates should indicate the number of entries in the line_number_table array.

line_number_table[]

Each entry in the line_number_table array indicates that the line number in the original source file changes at a given point in the code array. Each line_number_table entry must should contain the following two items:

start_pc

The value of the start_pc item must should be a valid index into the code array of this Code attribute. The item indicates the index into the code array at which the code for a new line in the original source file begins.

JDK 13 ensures start_pc is in range, but does not ensure that it refers to an opcode.

line_number

The value of the line_number item gives should give the corresponding line number in the original source file.

4.7.13 The LocalVariableTable Attribute

The LocalVariableTable attribute is an optional variable-length attribute in the attributes table of a Code attribute (4.7.3). It may be used by debuggers to determine the value of a given local variable during the execution of a method.

There may be no more than one LocalVariableTable attribute per local variable in the attributes table of a Code attribute.

It's not entirely clear what this rule means, but it won't be enforced, and the below note captures the same idea.

In JDK 13, the same combination of (start, length, name, index) is not allowed to appear twice in a single LocalVariableTable attribute or across multiple LocalVariableTable attributes.

Multiple LocalVariableTable attributes may be present in the attributes table of a Code attribute, and they may appear in any order. Each local variable in the source code should appear at most once in one of the LocalVariableTable attributes.

The LocalVariableTable attribute must should have the following format:

LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}

The items of the LocalVariableTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "LocalVariableTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

local_variable_table_length

The value of the local_variable_table_length item indicates should indicate the number of entries in the local_variable_table array.

local_variable_table[]

Each entry in the local_variable_table array indicates should indicate a range of code array offsets within which a local variable has a value, and indicates should indicate the index into the local variable array of the current frame at which that local variable can be found. Each entry must should contain the following five items:

start_pc, length

The value of the start_pc item must should be a valid index into the code array of this Code attribute and must should be the index of the opcode of an instruction.

The value of start_pc + length must should either be a valid index into the code array of this Code attribute and be the index of the opcode of an instruction, or it must should be the first index beyond the end of that code array.

The start_pc and length items indicate that the given local variable has a value at indices into the code array in the interval [start_pc, start_pc + length), that is, between start_pc inclusive and start_pc + length exclusive.

JDK 13 ensures that start_pc and start_pc + length are in range, and then checks at verification time that they refer to opcodes.

name_index

The value of the name_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should contain a CONSTANT_Utf8_info structure representing a valid unqualified name denoting a local variable (4.2.2).

descriptor_index

The value of the descriptor_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should contain a CONSTANT_Utf8_info structure representing a field descriptor which encodes the type of a local variable in the source program (4.3.2).

index

The value of the index item must should be a valid index into the local variable array of the current frame. The given local variable is should be at index in the local variable array of the current frame.

If the given local variable is of type double or long, it occupies should occupy both index and index + 1.

JDK 13 permits index values that refer to the second half of a double or long.

4.7.14 The LocalVariableTypeTable Attribute

The LocalVariableTypeTable attribute is an optional variable-length attribute in the attributes table of a Code attribute (4.7.3) in a version 49.0 or later class file. It may be used by debuggers to determine the value of a given local variable during the execution of a method.

There may be no more than one LocalVariableTypeTable attribute per local variable in the attributes table of a Code attribute.

It's not entirely clear what this rule means, but it won't be enforced, and the below note captures the same idea.

In JDK 13, the same combination of (start, length, name, index) can appear twice in a single LocalVariableTable attribute.

Multiple LocalVariableTypeTable attributes may be present in the attributes table of a Code attribute, and they may appear in any order. Each local variable in the source code should appear at most once in one of the LocalVariableTypeTable attributes.

The LocalVariableTypeTable attribute differs from the LocalVariableTable attribute (4.7.13) in that it provides signature information rather than descriptor information. This difference is only significant for variables whose type uses a type variable or parameterized type. Such variables should appear in both tables, while variables of other types might appear only in LocalVariableTable.

The LocalVariableTypeTable attribute must should have the following format:

LocalVariableTypeTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_type_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 signature_index;
        u2 index;
    } local_variable_type_table[local_variable_type_table_length];
}

The items of the LocalVariableTypeTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "LocalVariableTypeTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

local_variable_type_table_length

The value of the local_variable_type_table_length item indicates should indicate the number of entries in the local_variable_type_table array.

local_variable_type_table[]

Each entry in the local_variable_type_table array indicates should indicate a range of code array offsets within which a local variable has a value, and indicates should indicate the index into the local variable array of the current frame at which that local variable can be found. Each entry must should contain the following five items:

start_pc, length

The value of the start_pc item must should be a valid index into the code array of this Code attribute and must should be the index of the opcode of an instruction.

The value of start_pc + length must should either be a valid index into the code array of this Code attribute and be the index of the opcode of an instruction, or it must should be the first index beyond the end of that code array.

The start_pc and length items indicate that the given local variable has a value at indices into the code array in the interval [start_pc, start_pc + length), that is, between start_pc inclusive and start_pc + length exclusive.

JDK 13 ensures that start_pc and start_pc + length are in range, and then checks at verification time that they refer to opcodes.

name_index

The value of the name_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should contain a CONSTANT_Utf8_info structure representing a valid unqualified name denoting a local variable (4.2.2).

signature_index

The value of the signature_index item must should be a valid index into the constant_pool table. The constant_pool entry at that index must should contain a CONSTANT_Utf8_info structure representing a field signature which encodes the type of a local variable in the source program (4.7.9.1).

In JDK 13, the signature_index item is allowed to represent a method signature.

index

The value of the index item must should be a valid index into the local variable array of the current frame. The given local variable is should be at index in the local variable array of the current frame.

If the given local variable is of type double or long, it occupies should occupy both index and index + 1.

4.7.15 The Deprecated Attribute

The Deprecated attribute is an optional fixed-length attribute in the attributes table of a ClassFile, field_info, or method_info structure (4.1, 4.5, 4.6). A class, interface, method, or field may be marked using a Deprecated attribute to indicate that the class, interface, method, or field has been superseded.

There must not be a Deprecated attribute in the attributes table of a ClassFile structure representing a module.

A run-time interpreter or tool that reads the class file format, such as a compiler, can use this marking to advise the user that a superseded class, interface, method, or field is being referred to. The presence of a Deprecated attribute does not alter the semantics of a class or interface.

The Deprecated attribute must should have the following format:

Deprecated_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
}

The items of the Deprecated_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item is an index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Utf8_info structure (4.4.7) representing the string "Deprecated".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes, and must should be zero.

The effect of these changes is that a Deprecated attribute will not cause an error if it has a non-empty payload.

4.9 Constraints on Java Virtual Machine Code

The code for a method, instance initialization method (2.9.1), or class or interface initialization method (2.9.2) is stored in the code array of the Code attribute of a method_info structure of a class file (4.7.3). This section describes the constraints associated with the contents of the Code_attribute structure and associated StackMapTable structures.

Initialization methods are methods. No need to call them out separately.

These constraints are enforced by verification (4.10).

4.9.1 Static Constraints

The static constraints on a class file are those defining the well-formedness of the file. These constraints have been given in the previous sections, except for static constraints on the code in the class file. The static constraints static constraints on the code in a class file specify how Java Virtual Machine instructions must be laid out in the code array, and what the operands of individual instructions must be, and how exception_table and StackMapTable entries reference the code array and the constant pool.

The deleted sentences unhelpfully blur the distinction between format checking and verification.

The static constraints on the instructions in the code array are as follows:

The static constraints on the operands of instructions in the code array are as follows:

The static constraints on the entries in the exception_table array are as follows:

The static constraints on the entries in the the entries table of a StackMapTable attribute (4.7.4) are as follows:

In JDK 13, the constraints on exception_table are enforced both at class loading time and verification time. (Specifically, constraints that require parsing a code array are delayed until verification time.) This specification makes them all verification time checks.

All constraints on StackMapTable are enforced by JDK 13 at verification time, despite previously being specified as format checks (4.7.4).

The Uninitialized_variable_info constraint is treated by JDK 13 as a structural constraint—in version 50.0 class files, an invalid Object_variable_info causes a failure, but an invalid Uninitialized_variable_info can be recovered from by falling back to verification by type inference. There's not a strong argument for this discrepancy, so we propose changing the behavior, making both unrecoverable failures. (This occurs naturally when the checks are treated as static constraints—see 4.10 and 4.10.1 for details.)

4.9.2 Structural Constraints

The structural constraints structural constraints on the code array code in a class file specify constraints on relationships between Java Virtual Machine instructions, exception handlers, and stack maps. The structural constraints are as follows:

4.10 Verification of class Files

In this revision, static constraints (4.9.1) are explicitly checked as a first step of verification, before one of the two verification algorithms are invoked. Assertions related to static constraints can be removed from the typing algorithms.

Even though a compiler for the Java programming language must only produce class files that satisfy all the static and structural constraints in the previous sections of Section 4.9, the Java Virtual Machine has no guarantee that any file it is asked to load was generated by that compiler or is properly formed. Applications such as web browsers do not download source code, which they then compile; these applications download already-compiled class files. The browser needs to determine whether the class file was produced by a trustworthy compiler or by an adversary attempting to exploit the Java Virtual Machine.

For example, applications such as web browsers do not download source code, which they then compile; these applications download already-compiled class files. The browser needs to determine whether the class file was produced by a trustworthy compiler or by an adversary attempting to exploit the Java Virtual Machine.

This example is better treated as non-normative explanatory text.

An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, say PurchaseStockOptions, to be a subclass of TradingClass. But the definition of TradingClass might have changed since the time the class was compiled in a way that is not compatible with pre-existing binaries. Methods might have been deleted or had their return types or modifiers changed. Fields might have changed types or changed from instance variables to class variables. The access modifiers of a method or variable may have changed from public to private. For a discussion of these issues, see Chapter 13, "Binary Compatibility," in The Java Language Specification, Java SE 12 Edition.

This example is not relevant to verification. If we don't want to delete it, perhaps it could be moved to 5.4.3?

Because of these potential problems, the Java Virtual Machine needs to verify for itself that the desired constraints are satisfied by the class files it attempts to incorporate. However, it may cause unwanted delays to check these constraints during format checking (4.8), which occurs as each class is loaded.

A Thus, a Java Virtual Machine implementation verifies that each class file satisfies the necessary constraints on Code attributes (4.7.3) and StackMapTable attributes (4.7.4) at linking time (5.4).

Link-time verification enhances the performance of the run-time interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:

Link-time verification enhances the performance of the run-time interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:

  • There are no operand stack overflows or underflows.

  • All local variable uses and stores are valid.

  • The arguments to all the Java Virtual Machine instructions are of valid types.

This performance discussion is better treated as non-normative explanatory text.

For each Code attribute in a class file, verification proceeds in two steps.

  1. The code and exception_table arrays are parsed and checked, according to the static constraints described in 4.9.1. If the attributes table contains a StackMapTable attribute, it is also parsed and checked, as described in 4.9.1.

    If any static constraints are violated, verification fails with a VerifyError.

    Historically, this step of verification has been overlooked, implicitly handled by, e.g., assumptions that a Prolog predicate will not succeed if it's impossible to encode an instruction.

    Note that a version 50.0 class file may fail verification due to a static constraint violation in a StackMapTable attribute, even if it would have succeeded verification by type inference if the attribute had been absent. The ability to recover from an error by falling back to verification by type inference is only available in step 2 (see 4.10.1).

  2. The structural constraints described in 4.9.2 are checked.

    There are two strategies that Java Virtual Machine implementations may use for structural constraint verification:

    • Verification by type checking must be used to verify class files whose version number is greater than or equal to 50.0.

    • Verification by type inference must be supported by all Java Virtual Machine implementations, except those conforming to the Java ME CLDC and Java Card profiles, in order to verify class files whose version number is less than 50.0.

      Verification on Java Virtual Machine implementations supporting the Java ME CLDC and Java Card profiles is governed by their respective specifications.

    In either case, if the specified algorithms are unable to prove that all structural constraints hold, verification fails with a VerifyError.

In JDK 13, constraint failures involving the exceptions table or the StackMapTable attribute result in a verification-time ClassFormatError. This behavior is confusing, and we propose throwing a VerifyError instead.

4.10.1 Verification by Type Checking

A class file whose version number is 50.0 or above (4.1) must be verified verify structural constraints using the type checking rules given in this section.

If, and only if, a class file's version number equals 50.0, then if the type checking fails, a Java Virtual Machine implementation may choose to attempt to perform verification by type inference (4.10.2). In this case, structural constraints on the StackMapTable attribute are not enforced.

This is a pragmatic adjustment, designed to ease the transition to the new verification discipline. Many tools that manipulate class files may alter the bytecodes of a method in a manner that requires adjustment of the method's stack map frames. If a tool does not make the necessary adjustments to the stack map frames, type checking may fail even though the bytecode is in principle valid (and would consequently verify under the old type inference scheme). To allow implementors time to adapt their tools, Java Virtual Machine implementations may fall back to the older verification discipline, but only for a limited time.

In cases where type checking fails but type inference is invoked and succeeds, a certain performance penalty is expected. Such a penalty is unavoidable. It also should serve as a signal to tool vendors that their output needs to be adjusted, and provides vendors with additional incentive to make these adjustments.

In summary, failover to verification by type inference supports both the gradual addition of stack map frames to the Java SE Platform (if they are not present in a version 50.0 class file, failover is allowed) and the gradual removal of the jsr and jsr_w instructions from the Java SE Platform (if they are present in a version 50.0 class file, failover is allowed).

If a Java Virtual Machine implementation ever attempts to perform verification by type inference on version 50.0 class files, it must do so in all cases where verification by type checking fails.

This means that a Java Virtual Machine implementation cannot choose to resort to type inference in once case and not in another. It must either reject class files that do not verify via type checking, or else consistently failover to the type inferencing verifier whenever type checking fails.

The type checker enforces type rules that are specified by means of Prolog clauses. English language text is used to describe the type rules in an informal way, while the Prolog clauses provide a formal specification.

The type checker requires a list of stack map frames for each method with a Code attribute (4.7.3). A list of stack map frames is given by the StackMapTable attribute (4.7.4) of a Code attribute. The intent is that a stack map frame must appear at the beginning of each basic block in a method. The stack map frame specifies the verification type of each operand stack entry and of each local variable at the start of each basic block. The type checker reads the stack map frames for each method with a Code attribute and uses these maps to generate a proof of the type safety of the instructions in the Code attribute.

The Prolog predicate methodIsTypeSafe (4.10.1.6) determines whether the Code attribute of a particular method of a particular class or interface conforms to the structural constraints described in 4.9.2. The structural constraints are satisfied by the method if and only if the methodIsTypeSafe predicate is true.

The rest of this section explains the process of type checking in detail:

4.10.1.1 Accessors for Java Virtual Machine Artifacts

...

parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)

Extracts the instruction stream, ParsedCode, of the method Method in Class, as well as the maximum operand stack size, MaxStack, the maximal number of local variables, FrameSize, the exception handlers, Handlers, and the stack map StackMap.

The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.

If a StackMapTable attribute is absent, the implicit stack map attribute is used (4.7.4).

If an entry in the stack map table violates structural constraints (4.9.2) because of an invalid Code array offset, an invalid stack type array size, or an invalid local variable type array size, the parseCodeAttribute predicate fails.

...

4.10.1.3 Instruction Representation

Bug fix: member references and instructions like checkcast and ldc can refer to array types, so CONSTANT_Class structures need to be encoded as types, not class names.

Individual bytecode instructions are represented in Prolog as terms whose functor is the name of the instruction and whose arguments are its parsed operands.

For example, an aload instruction is represented as the term aload(N), which includes the index N that is the operand of the instruction.

The instructions as a whole are represented as a list of terms of the form:

instruction(Offset, AnInstruction)

For example, instruction(21, aload(1)).

The order of instructions in this list must be the same as in the class file.

Some instructions have operands that refer to entries in the constant_pool table representing fields, methods, and dynamically-computed call sites. Such entries are represented as functor applications of the form:

We've combined the two lists of constant forms into one. Because CONSTANT_Class_info is relevant to both lists, it's awkward to keep the two separate. It would require, say, a forward reference from the first list to the second.

Note that, since static constraints have already been checked, there's no particular reason to make a disinction between method and imethod. It would simplify some of the rules if we just used method in both casees instead. (On the other hand, this would introduce an anomaly, being the only case in which multiple constant pool structures map to the same Prolog functor.)

For clarity, we assume that field and method descriptors (4.3.2, 4.3.3) are mapped into more readable names: the leading L and trailing ; are dropped from class names, and the BaseType characters used for primitive types are mapped to the names of those types.

The descriptor should always be processed with parseFieldDescriptor, so its format doesn't need to be specified.

For example, a getfield instruction whose operand refers to a constant pool entry representing a field foo of type F in class Bar would be represented as getfield(field('Bar', 'foo', 'F')) getfield(field(class('Bar', L), 'foo', 'F')), where L is the class loader of the class containing the instruction. An ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).

The ldc instruction, among others, has an operand that refers to a loadable entry in the constant_pool table. There are nine kinds of loadable entry (see Table 4.4-C), represented by functor applications of the following forms:

4.10.1.6 Type Checking Methods

A method with a Code attribute is type safe if it is possible to merge the code and the stack map frames into a single stream such that each stack map frame precedes the instruction it corresponds to, and the merged stream is type correct. The method's exception handlers, if any, must also be legal.

methodIsTypeSafe(Class, Method) :-
    parseCodeAttribute(Class, Method, FrameSize, MaxStack,
                       ParsedCode, Handlers, StackMap),
    mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
    methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
    Environment = environment(Class, Method, ReturnType, MergedCode,
                              MaxStack, Handlers),
    handlersAreLegal(Environment),
    mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).

Let us consider exception handlers first.

An exception handler is represented by a functor application of the form:

handler(Start, End, Target, ClassName)

whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.

An exception handler is legal if its start (Start) is less than its end (End), there exists an instruction whose offset is equal to Start, there exists an instruction whose offset equals End, there is a stack frame at the start of the handler code (Target) and the handler's exception class is assignable to the class Throwable. The exception class of a handler is Throwable if the handler's class entry is 0, otherwise it is the class named in the handler.

The constraints on Start and End are already checked as static constraints (4.9.1, 4.10).

An additional requirement exists for a handler inside an <init> method if one of the instructions covered by the handler is invokespecial of an <init> method. In this case, the fact that a handler is running means the object under construction is likely broken, so it is important that the handler does not swallow the exception and allow the enclosing <init> method to return normally to the caller. Accordingly, the handler is required to either complete abruptly by throwing an exception to the caller of the enclosing <init> method, or to loop forever.

handlersAreLegal(Environment) :-
    exceptionHandlers(Environment, Handlers),
    checklist(handlerIsLegal(Environment), Handlers).
handlerIsLegal(Environment, Handler) :-
    Handler = handler(Start, End, Target, _),
    Start < End,
    allInstructions(Environment, Instructions),
    member(instruction(Start, _), Instructions),
    offsetStackFrame(Environment, Target, _),
    instructionsIncludeEnd(Instructions, End),
    currentClassLoader(Environment, CurrentLoader),
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    isBootstrapLoader(BL),
    isAssignable(ExceptionClass, class('java/lang/Throwable', BL)),
    initHandlerIsLegal(Environment, Handler).
instructionsIncludeEnd(Instructions, End) :-
    member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
    member(endOfCode(End), Instructions).
handlerExceptionClass(handler(_, _, _, 0),
                      class('java/lang/Throwable', BL), _) :-
    isBootstrapLoader(BL).

handlerExceptionClass(handler(_, _, _, Name),
                      class(Name, L), L) :-
    Name \= 0.
initHandlerIsLegal(Environment, Handler) :-
    notInitHandler(Environment, Handler).

notInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isNotInit(Method).

notInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method),
    member(instruction(_, invokespecial(CP)), Instructions),
    CP = method(MethodClassName, MethodName, Descriptor),
    MethodName \= '`<init>`'.


initHandlerIsLegal(Environment, Handler) :-
    isInitHandler(Environment, Handler),
    sublist(isApplicableInstruction(Target), Instructions,
            HandlerInstructions),
    noAttemptToReturnNormally(HandlerInstructions).

isInitHandler(Environment, Handler) :-
    Environment = environment(_Class, Method, _, Instructions, _, _),
    isInit(Method).
    member(instruction(_, invokespecial(CP)), Instructions),
    CP = method(MethodClassName, '`<init>`', Descriptor).

isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
    Offset >= HandlerStart.

noAttemptToReturnNormally(Instructions) :-
    notMember(instruction(_, return), Instructions).

noAttemptToReturnNormally(Instructions) :-
    member(instruction(_, athrow), Instructions).

Let us now turn to the stream of instructions and stack map frames.

Merging instructions and stack map frames into a single stream involves four cases:

To determine if the merged stream for a method is type correct, we first infer the method's initial type state.

The initial type state of a method consists of an empty operand stack and local variable types derived from the type of this and the arguments, as well as the appropriate flag, depending on whether this is an <init> method.

methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
                        ReturnType):-
    methodDescriptor(Method, Descriptor),
    parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
    expandTypeList(RawArgs, Args),
    methodInitialThisType(Class, Method, ThisList),
    flags(ThisList, Flags),
    append(ThisList, Args, ThisArgs),
    expandToLength(ThisArgs, FrameSize, top, Locals).

Given a list of types, the following clause produces a list where every type of size 2 has been substituted by two entries: one for itself, and one top entry. The result then corresponds to the representation of the list as 32-bit words in the Java Virtual Machine.

expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
    sizeOf(Item, 1),
    expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
    sizeOf(Item, 2),
    expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].

expandToLength(List, Size, _Filler, List) :-
    length(List, Size).
expandToLength(List, Size, Filler, Result) :-
    length(List, ListLength),
    ListLength < Size,
    Delta is Size - ListLength,
    length(Extra, Delta),
    checklist(=(Filler), Extra),
    append(List, Extra, Result).

For the initial type state of an instance method, we compute the type of this and put it in a list. The type of this in the <init> method of Object is Object; in other <init> methods, the type of this is uninitializedThis; otherwise, the type of this in an instance method is class(N, L) where N is the name of the class containing the method and L is its defining class loader.

For the initial type state of a static method, this is irrelevant, so the list is empty.

methodInitialThisType(_Class, Method, []) :-
    methodAccessFlags(Method, AccessFlags),
    member(static, AccessFlags),
    methodName(Method, MethodName),
    MethodName \= '`<init>`'.

methodInitialThisType(Class, Method, [This]) :-
    methodAccessFlags(Method, AccessFlags),
    notMember(static, AccessFlags),
    instanceMethodInitialThisType(Class, Method, This).

instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
    methodName(Method, '`<init>`'),
    classDefiningLoader(Class, L),
    isBootstrapLoader(L),
    classClassName(Class, 'java/lang/Object').

instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
    methodName(Method, '`<init>`'),
    classClassName(Class, ClassName),
    classDefiningLoader(Class, CurrentLoader),
    superclassChain(ClassName, CurrentLoader, Chain),
    Chain \= [].

instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
    methodName(Method, MethodName),
    MethodName \= '`<init>`',
    classDefiningLoader(Class, L),
    classClassName(Class, ClassName).

We now compute whether the merged stream for a method is type correct, using the method's initial type state:

Branching to a target is type safe if the target has an associated stack frame, Frame, and the current stack frame, StackFrame, is assignable to Frame.

targetIsTypeSafe(Environment, StackFrame, Target) :-
    offsetStackFrame(Environment, Target, Frame),
    frameIsAssignable(StackFrame, Frame).

An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.

instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
    exceptionHandlers(Environment, Handlers),
    sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
    checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
              ApplicableHandlers).

An exception handler is applicable to an instruction if the offset of the instruction is greater or equal to the start of the handler's range and less than the end of the handler's range.

isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
    Offset >= Start,
    Offset < End.

An instruction satisfies an exception handler if the instructions's outgoing type state is ExcStackFrame, and the handler's target (the initial instruction of the handler code) is type safe assuming an incoming type state T. The type state T is derived from ExcStackFrame by replacing the operand stack with a stack whose sole element is the handler's exception class.

instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
    Handler = handler(_, _, Target, _),
    currentClassLoader(Environment, CurrentLoader),
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    /* The stack consists of just the exception. */
    ExcStackFrame = frame(Locals, _, Flags),
    TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
    operandStackHasLegalLength(Environment, TrueExcStackFrame),
    targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
4.10.1.8 Type Checking for protected Members

Member references can refer to array types, so the rules in this section need to be defined in terms of a type, not a class name.

All instructions that access members must contend with the rules concerning protected members. This section describes the protected check that corresponds to JLS §6.6.2.1.

The protected check applies only to protected members of superclasses of the current class. protected members in other classes will be caught by the access checking done at resolution (5.4.4). There are four cases:

The predicate classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Chain, List) is true if List is the set of classes in Chain with name MemberClassName that are in a different run-time package than Class which have a protected member named MemberName with descriptor MemberDescriptor.

classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [class(MemberClassName, L) | Tail],
                                     [class(MemberClassName, L) | T]) :-
    differentRuntimePackage(Class, class(MemberClassName, L)),
    loadedClass(MemberClassName, L, Super),
    isProtected(Super, MemberName, MemberDescriptor),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [class(MemberClassName, L) | Tail],
                                     T) :-
    differentRuntimePackage(Class, class(MemberClassName, L)),
    loadedClass(MemberClassName, L, Super),
    isNotProtected(Super, MemberName, MemberDescriptor),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [class(MemberClassName, L) | Tail],
                                     T] :-
    sameRuntimePackage(Class, class(MemberClassName, L)),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

sameRuntimePackage(Class1, Class2) :-
    classDefiningLoader(Class1, L),
    classDefiningLoader(Class2, L),
    samePackageName(Class1, Class2).

differentRuntimePackage(Class1, Class2) :-
    classDefiningLoader(Class1, L1),
    classDefiningLoader(Class2, L2),
    L1 \= L2.

differentRuntimePackage(Class1, Class2) :-
    differentPackageName(Class1, Class2).
4.10.1.9 Type Checking Instructions

In general, the type rule for an instruction is given relative to an environment Environment that defines the class and method in which the instruction occurs (4.10.1.1), and the offset Offset within the method at which the instruction occurs. The rule states that if the incoming type state StackFrame fulfills certain requirements, then:

Many instructions have type rules that are completely isomorphic to the rules for other instructions. If an instruction b1 is isomorphic to another instruction b2, then the type rule for b1 is the same as the type rule for b2.

instructionIsTypeSafe(Instruction, Environment, Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    instructionHasEquivalentTypeRule(Instruction, IsomorphicInstruction),
    instructionIsTypeSafe(IsomorphicInstruction, Environment, Offset,
                          StackFrame, NextStackFrame,
                          ExceptionStackFrame).

The English language description of each rule is intended to be readable, intuitive, and concise. As such, the description avoids repeating all the contextual assumptions given above. In particular:

Any ambiguities can be resolved by referring to the formal Prolog clauses.

anewarray

An anewarray instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a class, interface, or array type, and one can legally replace a type matching int on the incoming operand stack with an array with component type CP yielding the outgoing type state.

instructionIsTypeSafe(anewarray(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = class(_, _) ; CP = arrayOf(_)),
    validTypeTransition(Environment, [int], arrayOf(CP),
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The form of the constant pool entry is checked by static constraints (4.9.1, 4.10).

checkcast

A checkcast instruction with operand CP is type safe iff CP refers to a constant pool entry denoting either a class or an array, and one can validly replace the type Object on top of the incoming operand stack with the type denoted by CP yielding the outgoing type state.

instructionIsTypeSafe(checkcast(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = class(_, _) ; CP = arrayOf(_)),
    isBootstrapLoader(BL),
    validTypeTransition(Environment, [class('java/lang/Object', BL)], CP,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The form of the constant pool entry is checked by static constraints (4.9.1, 4.10).

getfield

A getfield instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, declared in a class FieldClassName type FieldClass, and one can validly replace a type matching FieldClassName FieldClass with type FieldType on the incoming operand stack yielding the outgoing type state. FieldClassName must not be an array type. protected fields are subject to additional checks (4.10.1.8).

Array types are allowed here (4.4.2).

instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClassName, FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    passesProtectedCheck(Environment, FieldClassName, FieldName,
                         FieldDescriptor, StackFrame),
    currentClassLoader(Environment, CurrentLoader),
    validTypeTransition(Environment,
                        [class(FieldClassName, CurrentLoader)], FieldType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClass, FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    passesProtectedCheck(Environment, FieldClass, FieldName,
                         FieldDescriptor, StackFrame),
    currentClassLoader(Environment, CurrentLoader),
    validTypeTransition(Environment, [FieldClass], FieldType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
getstatic

A getstatic instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, and one can validly push FieldType on the incoming operand stack yielding the outgoing type state.

instructionIsTypeSafe(getstatic(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(_FieldClassName, _FieldName, FieldDescriptor),
    CP = field(_FieldClass, _FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    validTypeTransition(Environment, [], FieldType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instanceof

An instanceof instruction with operand CP is type safe iff CP refers to a constant pool entry denoting either a class or an array, and one can validly replace the type Object on top of the incoming operand stack with type int yielding the outgoing type state.

instructionIsTypeSafe(instanceof(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = class(_, _) ; CP = arrayOf(_)),
    isBootstrapLoader(BL),
    validTypeTransition(Environment, [class('java/lang/Object', BL)], int,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The form of the constant pool entry is checked by static constraints (4.9.1, 4.10).

invokedynamic

An invokedynamic instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokedynamic(CP,0,0), Environment, _Offset,
                      StackFrame, NextStackFrame, ExceptionStackFrame) :-
    CP = dmethod(CallSiteName, Descriptor),
    CallSiteName \= '`<init>`',
    CallSiteName \= '`<clinit>`',
    CP = dmethod(_CallSiteName, Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse(OperandArgList, StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

Method names are checked by format checking (4.4.10).

invokeinterface

An invokeinterface instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
                      StackFrame, NextStackFrame, ExceptionStackFrame) :-
    CP = imethod(MethodIntfName, MethodName, Descriptor),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    CP = imethod(MethodClass, _MethodName, Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    currentClassLoader(Environment, CurrentLoader),
    reverse([class(MethodIntfName, CurrentLoader) | OperandArgList],
            StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    validTypeTransition(Environment, [], ReturnType,
                        TempFrame, NextStackFrame),
    countIsValid(Count, StackFrame, TempFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokeinterface(CP, _Count, 0), Environment, _Offset,
                      StackFrame, NextStackFrame, ExceptionStackFrame) :-
    CP = imethod(MethodClass, _MethodName, Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse(MethodClass | OperandArgList], StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    validTypeTransition(Environment, [], ReturnType,
                        TempFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

Array types are allowed here (4.4.2).

The <clinit> method name is prohibited by format checking (4.4.2). The <init> method name is prohibited by static constraints (4.9.1, 4.10).

The Count operand of an invokeinterface instruction is valid if it equals the size of the arguments to the instruction. This is equal to the difference between the size of InputFrame and OutputFrame.

countIsValid(Count, InputFrame, OutputFrame) :-
    InputFrame = frame(_Locals1, OperandStack1, _Flags1),
    OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
    length(OperandStack1, Length1),
    length(OperandStack2, Length2),
    Count =:= Length1 - Length2.

The Count operand is checked by static constraints (4.9.1, 4.10).

invokespecial

An invokespecial instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(MethodClassName, MethodName, Descriptor),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    thisClass(Environment, class(CurrentClassName, CurrentLoader)),
    isAssignable(class(CurrentClassName, CurrentLoader),
                 class(MethodClassName,  CurrentLoader)),
    reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
            StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
            StackArgList2),
    validTypeTransition(Environment, StackArgList2, ReturnType,
                        StackFrame, _ResultStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClass, MethodName, Descriptor) ;
     CP = imethod(MethodClass, MethodName, Descriptor)),
    MethodName \= '`<init>`',
    validSpecialMethodClass(Environment, MethodClass),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    thisClass(Environment, CurrentClass),
    reverse([CurrentClass | OperandArgList], StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

validSpecialMethodClass(Environment, MethodClass) :-
    thisClass(Environment, MethodClass).

validSpecialMethodClass(Environment, class(MethodClassName, L)) :-
    loadedClass(MethodClassName, L, LoadedMethodClass),
    \+ classIsInterface(LoadedMethodClass),
    thisClass(Environment, CurrentClass),
    isAssignable(CurrentClass, class(MethodClassName, L)).

validSpecialMethodClass(Environment, class(MethodClassName, L)) :-
    loadedClass(MethodClassName, L, LoadedMethodClass),
    classIsInterface(LoadedMethodClass),
    Environment = environment(CurrentClass, _, _, _, _, _),
    classInterfaces(CurrentClass, InterfaceNames),
    member(MethodClassName, InterfaceNames).

Interface methods are allowed here (4.9.1).

The <clinit> method name is prohibited by format checking (4.4.2).

The old rules attempt to enforce the constraints on MethodClassName via an isAssignable call, deferring some complexity to subtype testing. Unfortunately, this isn't correct for interfaces: every reference type is "assignable" to every interface type.

In practice, Hotspot appears to perform the following check:

This allows, for example, the name of a valid interface that is not related to the current class to appear in a Methodref. Verification succeeds, and no error occurs until resolution of the Methodref.

That's some unnecessary complexity that doesn't quite align with 4.9.2. Instead, these new rules directly test for a valid class/interface name: the current class, a superclass, or a direct superinterface. The rules do some class loading, but note that the same loading occurred before in the isAssignable test.

Another problem with the old rules is a redundant subtyping check via validTypeTransition. Given that CurrentClass <: MethodClass and the stack operand <: CurrentClass, there's no need to also check that the stack operand <: MethodClass.

Array types are syntactically allowed here (4.4.2), but the validSpecialMethodClass clause will reject them.

The isAssignable validSpecialMethodClass clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.

The first validTypeTransition clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame. The effect of validTypeTransition is to pop the first type from the operand stack in StackFrame and check it is a subtype of the first term of StackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.

A sharp-eyed reader might notice that enforcing this structural constraint supercedes the structural constraint pertaining to invokespecial of a protected method. Thus, the Prolog code above makes no reference to passesProtectedCheck (4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck to ensure the actual receiver type is compatible with the current class when certain protected instance initialization methods are named.

The second validTypeTransition clause enforces the structural constraint that any method invocation instruction must target a receiver object whose type is compatible with the type named by the instruction. To see why, consider that StackArgList2 simulates the list of types on the operand stack expected by the method, starting with the type named by the instruction. Again, the actual types on the operand stack are in StackFrame, and the effect of validTypeTransition is to check the actual receiver type in StackFrame is compatible with the type named by the instruction in StackArgList2.

instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(MethodClassName, '`<init>`', Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse(OperandArgList, StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
    currentClassLoader(Environment, CurrentLoader),
    rewrittenUninitializedType(uninitializedThis, Environment,
                               class(MethodClassName, CurrentLoader), This),
    rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
    substitute(uninitializedThis, This, OperandStack, NextOperandStack),
    substitute(uninitializedThis, This, Locals, NextLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
    ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClass, '`<init>`', Descriptor) ;
     CP = imethod(MethodClass, '`<init>`', Descriptor)),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse(OperandArgList, StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
    rewrittenUninitializedType(uninitializedThis, Environment,
                               MethodClass, This),
    rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
    substitute(uninitializedThis, This, OperandStack, NextOperandStack),
    substitute(uninitializedThis, This, Locals, NextLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
    ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(MethodClassName, '`<init>`', Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse(OperandArgList, StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
    currentClassLoader(Environment, CurrentLoader),
    rewrittenUninitializedType(uninitialized(Address), Environment,
                               class(MethodClassName, CurrentLoader), This),
    rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
    substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
    substitute(uninitialized(Address), This, Locals, NextLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
    ExceptionStackFrame = frame(Locals, [], Flags),
    passesProtectedCheck(Environment, MethodClassName, '`<init>`',
                         Descriptor, NextStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClass, '`<init>`', Descriptor) ;
     CP = imethod(MethodClass, '`<init>`', Descriptor)),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse(OperandArgList, StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
    rewrittenUninitializedType(uninitialized(Address), Environment,
                               MethodClass, This),
    rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
    substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
    substitute(uninitialized(Address), This, Locals, NextLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
    ExceptionStackFrame = frame(Locals, [], Flags),
    passesProtectedCheck(Environment, MethodClass, '`<init>`',
                         Descriptor, NextStackFrame).

Array types are syntactically allowed here (4.4.2), but the rewrittenUninitializedType clause will reject them.

The return type must be void, per 4.4.2.

I've confirmed that Hotspot behavior in JDK 13 is to allow interface method references named <init>, delaying any errors until resolution/runtime.

To compute what type the uninitialized argument's type needs to be rewritten to, there are two cases:

rewrittenUninitializedType(uninitializedThis, Environment,
                           MethodClass, MethodClass) :-
    MethodClass = class(MethodClassName, CurrentLoader),
    thisClass(Environment, MethodClass).
rewrittenUninitializedType(uninitializedThis, Environment,
                           MethodClass, MethodClass) :-
    MethodClass = class(MethodClassName, CurrentLoader),
    thisClass(Environment, class(thisClassName, thisLoader)),
    superclassChain(thisClassName, thisLoader, [MethodClass | Rest]).
rewrittenUninitializedType(uninitialized(Address), Environment,
                           MethodClass, MethodClass) :-
    allInstructions(Environment, Instructions),
    member(instruction(Address, new(MethodClass)), Instructions).
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).

substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
    substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
    From1 \= Old,
    substitute(Old, New, FromRest, ToRest).

The rule for invokespecial of an <init> method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object within its constructor, invokespecial can cause a superclass <init> method to be invoked, and that invocation could fail, leaving this uninitialized. This situation cannot be created using source code in the Java programming language, but can be created by programming in bytecode directly.

In this situation, the original frame holds an uninitialized object in local variable 0 and has flag flagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit flag. But if the invocation of an <init> method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit flag to a properly initialized object, so the object is permanently unusable.

If not for this situation, the flags of the exception stack frame would always be the same as the flags of the input stack frame.

invokestatic

An invokestatic instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokestatic(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(_MethodClassName, MethodName, Descriptor),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    (CP = method(_MethodClass, _MethodName, Descriptor) ;
     CP = imethod(_MethodClass, _MethodName, Descriptor)),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse(OperandArgList, StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The <clinit> method name is prohibited by format checking (4.4.2). The <init> method name is prohibited by static constraints (4.9.1, 4.10).

Interface methods are allowed here (4.9.1).

invokevirtual

An invokevirtual instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(MethodClassName, MethodName, Descriptor),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse(OperandArgList, ArgList),
    currentClassLoader(Environment, CurrentLoader),
    reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
            StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    canPop(StackFrame, ArgList, PoppedFrame),
    passesProtectedCheck(Environment, MethodClassName, MethodName,
                         Descriptor, PoppedFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokevirtual(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = method(MethodClass, _MethodName, Descriptor),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse(OperandArgList, ArgList),
    reverse([MethodClass | OperandArgList], StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    canPop(StackFrame, ArgList, PoppedFrame),
    passesProtectedCheck(Environment, MethodClass, MethodName,
                         Descriptor, PoppedFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The <clinit> method name is prohibited by format checking (4.4.2). The <init> method name is prohibited by static constraints (4.9.1, 4.10).

ldc, ldc_w, ldc2_w

An ldc instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is loadable (4.4), but not long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.

instructionIsTypeSafe(ldc(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    loadableConstant(CP, Type),
    Type \= long,
    Type \= double,
    validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

loadableConstant(CP, Type) :-
    member([CP, Type], [
        [int(_),    int],
        [float(_),  float],
        [long(_),   long],
        [double(_), double]
    ]).

loadableConstant(CP, Type) :-
    isBootstrapLoader(BL),
    member([CP, Type], [
        [class(_),          class('java/lang/Class', BL)],
        [class(_,_),        class('java/lang/Class', BL)],
        [arrayOf(_),        class('java/lang/Class', BL)],
        [string(_),         class('java/lang/String', BL)],
        [methodHandle(_,_), class('java/lang/invoke/MethodHandle', BL)],
        [methodType(_,_),   class('java/lang/invoke/MethodType', BL)]
    ]).

loadableConstant(CP, Type) :-
    CP = dconstant(_, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, Type).

An ldc_w or ldc2_w instruction is type safe iff the equivalent ldc instruction is type safe.

instructionHasEquivalentTypeRule(ldc_w(CP), ldc(CP))
instructionHasEquivalentTypeRule(ldc2_w(CP), ldc(CP))

An ldc2_w instruction with operand CP is type safe iff CP refers to a constant pool entry denoting an entity of type Type, where Type is either long or double, and one can validly push Type onto the incoming operand stack yielding the outgoing type state.

instructionIsTypeSafe(ldc2_w(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    loadableConstant(CP, Type),
    (Type = long ; Type = double),
    validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The types of constants are checked by static constraints (4.9.1, 4.10).

lookupswitch

A lookupswitch instruction is type safe if its keys are sorted, one can validly pop int off the incoming operand stack yielding a new type state BranchStackFrame, and all of the instruction's targets are valid branch targets assuming BranchStackFrame as their incoming type state.

instructionIsTypeSafe(lookupswitch(Targets, Keys), Environment, _, StackFrame,
                      afterGoto, ExceptionStackFrame) :-
    sort(Keys, Keys),
    canPop(StackFrame, [int], BranchStackFrame),
    checklist(targetIsTypeSafe(Environment, BranchStackFrame), Targets),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The order of keys is checked by static constraints (4.9.1, 4.10).

multianewarray

A multianewarray instruction with operands CP and Dim is type safe iff CP refers to a constant pool entry denoting an array type whose dimension is greater or equal to Dim, Dim is strictly positive, and one can validly replace Dim int types on the incoming operand stack with the type denoted by CP yielding the outgoing type state.

instructionIsTypeSafe(multianewarray(CP, Dim), Environment, _Offset,
                      StackFrame, NextStackFrame, ExceptionStackFrame) :-
    CP = arrayOf(_),
    classDimension(CP, Dimension),
    Dimension >= Dim,
    Dim > 0,
    /* Make a list of Dim ints */
    findall(int, between(1, Dim, _), IntList),
    validTypeTransition(Environment, IntList, CP,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The form of the constant pool entry is checked by static constraints (4.9.1, 4.10).

The dimension of an array type whose component type is also an array type is one more than the dimension of its component type.

classDimension(arrayOf(X), Dimension) :-
    classDimension(X, Dimension1),
    Dimension is Dimension1 + 1.

classDimension(_, Dimension) :-
    Dimension = 0.
new

A new instruction with operand CP at offset Offset is type safe iff CP refers to a constant pool entry denoting a class or interface type, the type uninitialized(Offset) does not appear in the incoming operand stack, and one can validly push uninitialized(Offset) onto the incoming operand stack and replace uninitialized(Offset) with top in the incoming local variables yielding the outgoing type state.

instructionIsTypeSafe(new(CP), Environment, Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, OperandStack, Flags),
    CP = class(_, _),
    NewItem = uninitialized(Offset),
    notMember(NewItem, OperandStack),
    substitute(NewItem, top, Locals, NewLocals),
    validTypeTransition(Environment, [], NewItem,
                        frame(NewLocals, OperandStack, Flags),
                        NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The substitute predicate is defined in the rule for invokespecial (4.10.1.9.invokespecial).

The form of the constant pool entry is checked by static constraints (4.9.1, 4.10).

putfield

A putfield instruction with operand CP is type safe iff all of the following are true:

instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClassName, FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    canPop(StackFrame, [FieldType], PoppedFrame),
    passesProtectedCheck(Environment, FieldClassName, FieldName,
                         FieldDescriptor, PoppedFrame),
    currentClassLoader(Environment, CurrentLoader),
    canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)],
           NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClass, FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    canPop(StackFrame, [FieldType], PoppedFrame),
    passesProtectedCheck(Environment, FieldClass, FieldName,
                         FieldDescriptor, PoppedFrame),
    canPop(StackFrame, [FieldType, FieldClass], NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClassName, _FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
    CurrentClass = class(FieldClassName, _),
    isInit(CurrentMethod),
    canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClass, _FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    Environment = environment(FieldClass, CurrentMethod, _, _, _, _),
    isInit(CurrentMethod),
    canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
putstatic

A putstatic instruction with operand CP is type safe iff CP refers to a constant pool entry denoting a field whose declared type is FieldType, and one can validly pop a type matching FieldType off the incoming operand stack yielding the outgoing type state.

instructionIsTypeSafe(putstatic(CP), _Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(_FieldClassName, _FieldName, FieldDescriptor),
    CP = field(_FieldClass, _FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    canPop(StackFrame, [FieldType], NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
tableswitch

A tableswitch instruction is type safe if its keys are sorted, one can validly pop int off the incoming operand stack yielding a new type state BranchStackFrame, and all of the instruction's targets are valid branch targets assuming BranchStackFrame as their incoming type state.

instructionIsTypeSafe(tableswitch(Targets, Keys), Environment, _Offset,
                      StackFrame, afterGoto, ExceptionStackFrame) :-
    sort(Keys, Keys),
    canPop(StackFrame, [int], BranchStackFrame),
    checklist(targetIsTypeSafe(Environment, BranchStackFrame), Targets),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The keys are implicit, and so are guaranteed to be sorted.

4.10.2 Verification by Type Inference

A class file that does not contain a StackMapTable attribute (which necessarily has a version number of 49.0 or below) must be verified have its structural constraints verified using type inference.

4.10.2.1 The Process of Verification by Type Inference

During linking, the verifier checks the code array of the Code attribute for each method of the class file by performing data-flow analysis on each method. The verifier ensures that at any given point in the program, no matter what code path is taken to reach that point, all of the following are true:

For efficiency reasons, certain tests that could in principle be performed by the verifier are delayed until the first time the code for the method is actually invoked. In so doing, the verifier avoids loading class files unless it has to.

For example, if a method invokes another method that returns an instance of class A, and that instance is assigned only to a field of the same type, the verifier does not bother to check if the class A actually exists. However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure that A is a subclass of B.

This discussion doesn't belong here; it's covered by 5.4.

4.10.2.2 The Bytecode Verifier

The code for each method is verified independently. First, the bytes that make up the code are broken up into a sequence of instructions, and the index into the code array of the start of each instruction is placed in an array. The verifier then goes through the code a second time and parses the instructions. During this pass a data structure is built to hold information about each Java Virtual Machine instruction in the method. The operands, if any, of each instruction are checked to make sure they are valid. For instance:

These restrictions are enforced by static constraints (4.9.1) before attempting verification by type inference (4.10).

For each instruction of the method, the verifier records the contents of the operand stack and the contents of the local variable array prior to the execution of that instruction. For the operand stack, it needs to know the stack height and the type of each value on it. For each local variable, it needs to know either the type of the contents of that local variable or that the local variable contains an unusable or unknown value (it might be uninitialized). The bytecode verifier does not need to distinguish between the integral types (e.g., byte, short, char) when determining the value types on the operand stack.

Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables that represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty. All other local variables contain an illegal value. For the other instructions, which have not been examined yet, no information is available regarding the operand stack or local variables.

Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction needs to be looked at. Initially, the "changed" bit is set only for the first instruction. The data-flow analyzer executes the following loop:

  1. Select a Java Virtual Machine instruction whose "changed" bit is set. If no instruction remains whose "changed" bit is set, the method has successfully been verified. Otherwise, turn off the "changed" bit of the selected instruction.

  2. Model the effect of the instruction on the operand stack and local variable array by doing the following:

    • If the instruction uses values from the operand stack, ensure that there are a sufficient number of values on the stack and that the top values on the stack are of an appropriate type. Otherwise, verification fails.

    • If the instruction uses a local variable, ensure that the specified local variable contains a value of the appropriate type. Otherwise, verification fails.

    • If the instruction pushes values onto the operand stack, ensure that there is sufficient room on the operand stack for the new values. Add the indicated types to the top of the modeled operand stack.

    • If the instruction modifies a local variable, record that the local variable now contains the new type.

  3. Determine the instructions that can follow the current instruction. Successor instructions can be one of the following:

    • The next instruction, if the current instruction is not an unconditional control transfer instruction (for instance, goto, return, or athrow). Verification fails if it is possible to "fall off" the last instruction of the method.

    • The target(s) of a conditional or unconditional branch or switch.

    • Any exception handlers for this instruction.

  4. Merge the state of the operand stack and local variable array at the end of the execution of the current instruction into each of the successor instructions, as follows:

    • If this is the first time the successor instruction has been visited, record that the operand stack and local variable values calculated in step 2 are the state of the operand stack and local variable array prior to executing the successor instruction. Set the "changed" bit for the successor instruction.

    • If the successor instruction has been seen before, merge the operand stack and local variable values calculated in step 2 into the values already there. Set the "changed" bit if there is any modification to the values.

    In the special case of control transfer to an exception handler:

    • Record that a single object, of the exception type indicated by the exception handler, is the state of the operand stack prior to executing the successor instruction. There must be sufficient room on the operand stack for this single value, as if an instruction had pushed it.

    • Record that the local variable values from immediately before step 2 are the state of the local variable array prior to executing the successor instruction. The local variable values calculated in step 2 are irrelevant.

  5. Continue at step 1.

To merge two operand stacks, the number of values on each stack must be identical. Then, corresponding values on the two stacks are compared and the value on the merged stack is computed, as follows:

If the operand stacks cannot be merged, verification of the method fails.

To merge two local variable array states, corresponding pairs of local variables are compared. The value of the merged local variable is computed using the rules above, except that the corresponding values are permitted to be different primitive types. In that case, the verifier records that the merged local variable contains an unusable value.

If the data-flow analyzer runs on a method without reporting a verification failure, then the method has been successfully verified by the class file verifier.

Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more detail.