Here is a detailed application of the rules for value-based classes to the concept of extending “classic” Java arrays with immutability. The basic idea is to define a method (a.freeze()) which produces an immutable copy of the array referenced by a. As with cloning (a.clone()), the operation does not change its operand a in any way.
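Since today's arrays carry no frozen state, the behavior of a.freeze() can only be approximated in library code. The following is a minimal sketch, assuming a hypothetical wrapper class FrozenIntArray (not part of the JDK or of this proposal) that stands in for a frozen int[]: freezing copies the operand, leaves it untouched, and the copy rejects stores.

```java
/** Hypothetical stand-in for a frozen int[]; real Java arrays carry no frozen state. */
final class FrozenIntArray {
    private final int[] elements;   // private copy, never handed out

    private FrozenIntArray(int[] copy) {
        this.elements = copy;
    }

    /** Plays the role of a.freeze(): reads the operand and copies it, without modifying it. */
    static FrozenIntArray freeze(int[] a) {
        return new FrozenIntArray(a.clone());
    }

    int length() {
        return elements.length;
    }

    int get(int i) {
        return elements[i];
    }

    /** A store into a frozen array fails instead of updating a component. */
    void set(int i, int value) {
        throw new UnsupportedOperationException("array is frozen");
    }
}
```

Later writes to the original array are not visible through the frozen copy, which is the same independence that a.clone() already provides.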
To quote from the JDK 8 definition, instances of value-based classes:
are [final] and [immutable] (though may contain references to mutable objects);
have implementations of [equals], [hashCode], and [toString] which are computed solely from the instance’s state and not from its identity or the state of any other object or variable;
make no use of identity-sensitive operations such as [acmp] reference equality (==) between instances, [idhash] identity hash code of instances, or [sync] synchronization on an instance’s intrinsic lock;
are considered equal solely based on equals(), not based on reference equality (==);
do not have accessible constructors, but are instead instantiated through [factory] methods which make no commitment as to the [identity] of returned instances;
are freely [substitutable] when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior.
The term “value-based” is defined to apply evenly to whole classes. But in the case of frozen arrays, there is no whole class to call “value-based”. Rather, individual frozen arrays must be value-based instances of regular types like
Note: The standard array types
Object will, of course, never be value-based classes, since many of their instances are mutable. We could attempt to introduce new types for frozen (and/or mutable) arrays, along the lines of
int @Mutable, but this appears to be needlessly disruptive to existing code bases.
Therefore, a few of the rules for value-based classes do not apply. Some (but not all) object arrays are frozen, but the following do not occur:
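[final] The fields of array types and of Object do not become final. (Obviously, since arrays don’t have fields.)
[equals, hashCode, toString] The equals, hashCode, and toString methods do not change for non-frozen arrays. For compatibility, those arrays continue to inherit those behaviors from Object.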
[immutable] A regular array can have any of its components updated (using aastore, etc.), but a frozen array will instead throw an exception (FrozenArrayStoreException or the like).
[equals] Array types need an equals method which respects frozen-ness by consulting Arrays.equals. For compatibility, non-frozen arrays must continue to use reference equality.
[hashCode] Array types need a hashCode method which respects frozen-ness by consulting Arrays.hashCode. For compatibility, non-frozen arrays must continue to use the identity hash code.
[toString] Array types need a toString method which respects frozen-ness by consulting Arrays.toString. For compatibility, non-frozen arrays must continue to use the simple string produced by Object.toString.
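The gap between the inherited Object behavior and the content-based behavior proposed for frozen arrays can be seen with existing JDK calls. The sketch below only uses java.util.Arrays as it exists today; the class name ArrayObjectMethodsDemo and its helper methods are illustrative, and nothing here implements frozen-ness itself.

```java
import java.util.Arrays;

final class ArrayObjectMethodsDemo {
    /** Inherited Object.equals: true only when both references denote the same array. */
    static boolean identityEquals(int[] x, int[] y) {
        return x.equals(y);
    }

    /** Content-based comparison, as a frozen array's equals would consult. */
    static boolean contentEquals(int[] x, int[] y) {
        return Arrays.equals(x, y);
    }

    /** Content-based hash, as a frozen array's hashCode would consult. */
    static int contentHash(int[] x) {
        return Arrays.hashCode(x);
    }

    /** Content-based string, as a frozen array's toString would consult. */
    static String contentString(int[] x) {
        return Arrays.toString(x);
    }
}
```

For two distinct arrays that both hold 1, 2, 3, identityEquals reports false while contentEquals reports true, the content hashes agree, and contentString yields "[1, 2, 3]" rather than the class-name-and-hash form inherited from Object.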
[acmp] The acmp instruction (reference equality operator) could be modified for frozen arrays, but I believe this is a bridge too far. The value-based doc carefully avoids going there. Instead, the system must guide coders away from relying on identity comparisons on frozen arrays. JDK methods which operate on arrays must be decoupled from identity comparisons on them, as appropriate.
Note: There are many occurrences of reference equality checks in the JDK, but most are backed up by calls to Object.equals which cover up any indeterminacy in acmp that might be caused by value-based instance semantics. Some comparisons will be inherently problematic, and we will need to use static analysis tools (like FindBugs) to find and amend them.
[idhash] Calls to System.identityHashCode on a frozen array are just as problematic as pointer comparisons. The safest thing to do is throw an exception (UnsupportedOperationException) when a frozen array (or any value-based object) is encountered. This means that a few library types (like IdentityHashMap) will fail when presented with frozen arrays, and will need to be upgraded to support them.
Note: Alternatively, the call could return a hash code, either instance-based as before, or content-based from Arrays.hashCode. We would need to issue caveats that are parallel to the caveats on pointer comparison. As with reference equality, there may be some low-level uses for identity hash code even on value-based objects, although users are told to have no expectations about the result.
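To see why identity-based library types such as IdentityHashMap are sensitive to this choice, it helps to compare one with an ordinary HashMap on keys that are equal but distinct. The sketch below uses java.util.List instances as stand-ins for frozen arrays, since no frozen arrays exist today; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class IdentityMapDemo {
    /** Inserts two equal-but-distinct keys and reports how many entries the map holds. */
    static int entriesAfterTwoEqualKeys(Map<List<Integer>, String> map) {
        List<Integer> k1 = Arrays.asList(1, 2, 3);   // two separate objects
        List<Integer> k2 = Arrays.asList(1, 2, 3);   // with equal contents
        map.put(k1, "first");
        map.put(k2, "second");
        return map.size();
    }
}
```

A HashMap ends up with one entry because it relies on equals and hashCode, while an IdentityHashMap ends up with two because it relies on == and System.identityHashCode. If equal frozen arrays may be substituted for one another, the second count is no longer predictable.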
identityHashCode, so is doubly bad for frozen arrays. In any case, using
Object.toString calls to frozen arrays will encourage users to adopt the arrays, since
Object.toString is nearly useless on non-frozen arrays.
[sync] You can’t synchronize on a frozen array; an attempt to do so will throw IllegalMonitorStateException (or the like). Alternatively, the synchronization could be displaced to a coarsened monitor shared by many or all frozen arrays.
[factory] Freezing a non-frozen array reads all of its components and preserves them permanently in a fresh immutable copy of the array.
[identity] The JVM is free to use caching or any other means to provide previously frozen arrays to satisfy new freezing requests, if the previously frozen arrays have the same (==) components. In particular, freezing an already-frozen array, or any copy thereof, can return the original frozen array.
Note: Both expressions a.freeze() and a.clone() on all array types produce results with contents identical to the original. Unlike clone, freeze may return the same object more than once, as long as the contents are the same.
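One way a factory can exploit this freedom is to keep a cache keyed by content and hand back a previously frozen instance when one exists. The following is a minimal sketch under that assumption; the class InternedFrozenInts and its unbounded cache are hypothetical, and the text only permits such caching, it does not require it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical frozen int[] stand-in whose factory reuses equal instances. */
final class InternedFrozenInts {
    // Unbounded cache for illustration only; a real system would need weak references.
    private static final Map<InternedFrozenInts, InternedFrozenInts> CACHE = new HashMap<>();

    private final int[] elements;

    private InternedFrozenInts(int[] copy) {
        this.elements = copy;
    }

    /** Freezes a mutable array, returning a cached instance when the contents match. */
    static synchronized InternedFrozenInts freeze(int[] a) {
        InternedFrozenInts candidate = new InternedFrozenInts(a.clone());
        InternedFrozenInts previous = CACHE.putIfAbsent(candidate, candidate);
        return previous != null ? previous : candidate;
    }

    /** Freezing an already-frozen instance can simply return it. */
    InternedFrozenInts freeze() {
        return this;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof InternedFrozenInts
                && Arrays.equals(elements, ((InternedFrozenInts) o).elements);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(elements);
    }
}
```

In this sketch two freeze calls on equal contents happen to return the same object, but callers should not depend on that, which is exactly the "no commitment as to identity" clause of the value-based rules.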
[substitutable] With the exception of reference equality and identity hash code (and classes like IdentityHashMap which use them), all operations on arrays treat frozen arrays of identical content as identical values. The JVM may perform optimizations that cause some reference-sensitive codes to produce unpredictable answers.
Note: Substitutability is the hardest part of the value-based contract to specify clearly. In the most extreme form of this rule, we could make the JIT and GC free to run around commoning up equivalent value-based objects, at any time. Getting the corner cases to behave well enough may require a complicated set of design compromises. For example, it might be best to amend IdentityHashMap with special handling of value-based objects, which suggests the need for a general query System.isFrozen, for library codes to use if they need to adjust to value-based instances.
[dimensionality] If an array has two or more dimensions, its frozen status is logically independent from the status of any of its sub-arrays. Thus, an assignment a[i][j] = x might fail because the component a[i] is null or because the sub-array a[i] is frozen, but it will not fail merely because the array a itself is frozen. A frozen array can contain non-frozen sub-arrays, and a non-frozen array can contain frozen sub-arrays. Also, individual sub-arrays can be frozen or non-frozen, independently of each other. On the other hand, it is plausible that if the language were (in the future) to support direct declaration of frozen arrays, the freezing would typically apply equally to all sub-arrays.
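The independence of outer and inner frozen status mirrors the shallow behavior that a.clone() already has on multi-dimensional arrays. The sketch below illustrates it with a hypothetical wrapper, ShallowFrozenRows, that freezes only the outer level of an int[][]; the names are assumptions made for this example.

```java
/** Hypothetical stand-in for a frozen int[][] whose sub-arrays are not frozen. */
final class ShallowFrozenRows {
    private final int[][] rows;   // copied outer array; the row arrays themselves are shared

    private ShallowFrozenRows(int[][] outerCopy) {
        this.rows = outerCopy;
    }

    /** Freezes only the outer array: clone() on int[][] copies row references, not rows. */
    static ShallowFrozenRows freeze(int[][] a) {
        return new ShallowFrozenRows(a.clone());
    }

    /** Returns the (possibly still mutable) sub-array a[i]. */
    int[] row(int i) {
        return rows[i];
    }

    /** Models the store a[i] = r, which the frozen outer array rejects. */
    void setRow(int i, int[] r) {
        throw new UnsupportedOperationException("outer array is frozen");
    }
}
```

With this wrapper a store of the form a[i][j] = x still succeeds through row(i), because the sub-array was never frozen, while replacing a whole row fails.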
Note: The Java Language Specification uses the term component to refer to a variable in an array which is reached by indexing the array once (e.g., a[i]). Such a variable is a sub-array of the array if its dimension is greater than one. The term element is reserved for a variable which is reached by indexing D times, where D is the rank of the array (e.g., a[i][j] if a has two dimensions).
[null] Since the null reference does not refer to an array, it cannot be frozen. Thus, an expression Arrays.freeze(a) is likely to elicit a NullPointerException when the operand is null, just as
Note: In some contexts it will be reasonable to pretend that the result of freezing a null reference is the same (and unique) null reference. It is possible that if we introduce an operation Arrays.deepFreeze, it will pass over null components (and perhaps any other non-array references) without changing them.
[bytecode] The array store bytecodes (iastore, etc.) need to be adjusted to throw the appropriate exception if the operand array is frozen. This check must be coordinated with the pre-existing checks (null reference, array index range, reference store check). It seems reasonable to order the check after index range and before any other store check.
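The suggested ordering of checks can be written out as ordinary Java. This is only a model: the frozen flag is passed as a parameter because arrays have no such bit today, FrozenArrayStoreException is a hypothetical class named after the suggestion in the text, and the final reference store check is the one the JVM already performs on the assignment.

```java
/** Hypothetical exception; no class of this name exists in the JDK. */
final class FrozenArrayStoreException extends RuntimeException {
    FrozenArrayStoreException(String message) {
        super(message);
    }
}

final class CheckedStore {
    /** Models aastore with the frozen check placed after the index range check. */
    static void aastore(Object[] array, boolean frozen, int index, Object value) {
        if (array == null) {
            throw new NullPointerException("array reference is null");      // check 1
        }
        if (index < 0 || index >= array.length) {
            throw new ArrayIndexOutOfBoundsException(index);                // check 2
        }
        if (frozen) {
            throw new FrozenArrayStoreException("array is frozen");         // check 3 (new)
        }
        array[index] = value;   // check 4: the JVM may throw ArrayStoreException here
    }
}
```

Under this ordering an out-of-range index on a frozen array still reports the index error, while a type-incompatible store into a frozen array reports the frozen error rather than ArrayStoreException.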
Note: Since frozen-ness is a property of array instances, not array references, bytecodes which copy references (such as astore_1, etc.) do not affect frozen-ness. All references to the same array refer either to a frozen array or a non-frozen array.
[reflection] Reflective APIs must respect frozen-ness. (jlr.Array.set needs to perform the same checks as the bytecode.)
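For orientation, java.lang.reflect.Array.set already performs null, index, and type checks of its own, so a frozen check would presumably join that list. The sketch below only probes the existing behavior; the helper class ReflectiveStoreChecks is illustrative and does not model frozen-ness.

```java
import java.lang.reflect.Array;

final class ReflectiveStoreChecks {
    /** Attempts Array.set and returns the simple name of the exception thrown, or "ok". */
    static String trySet(Object array, int index, Object value) {
        try {
            Array.set(array, index, value);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

Note that the reflective API reports a type mismatch as IllegalArgumentException, not ArrayStoreException, so "the same checks as the bytecode" need not mean the same exception types.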
[jni] Native APIs must respect frozen-ness. (There must be a way to protect against mutations from JNI code. The existing conventions for throwing errors are sufficient. The JNI support code must make the same checks as the bytecode.)
[unsafe] Any system codes that use Unsafe, such as deserialization and method handles, must be adjusted to respect frozen-ness.
Unsafe is not documented as being able to “stomp” on object headers or metadata, so there is no documented way for Unsafe to affect the frozen-ness of an array. Using Unsafe to set elements of a frozen array will have unpredictable consequences.
[serialization] The effect of serialization on frozen arrays must be defined. It is likely that all deserialized arrays will be mutable clones, although an immutable array option might be attractive to some users.
[language] None of the present points about the JVM have any direct bearing on any changes to the Java language which might support frozen arrays. Strictly speaking, no changes at all are needed. Although the notation a.freeze() appears to impinge either on the language or the class Object, it could be restated as a static method such as Arrays.freeze(a).
[debugging] In order to assess the viability of converting existing codes to use frozen instances, it may be desirable to implement a JVM mode which can assist the user in detecting and diagnosing code which violates the rules for value-based classes and instances. Specifically, dangerous uses of identityHashCode can be diagnosed.
[optimization] The system (and particularly the JIT) gets some extra freedom of action with frozen arrays, if value-based semantics are applied. Generally speaking, a chain of freeze operations can be collapsed up to the oldest frozen operand. Of course, a double freeze can be the identity operation on the first frozen operand, but if intermediate non-frozen operands in such a chain are non-escaping and not modified, they can also be treated as frozen. The user model is that the sooner you freeze an array that won’t be further modified, the more optimizations the system can make. Also, re-freezing is desirable: it is cheap, and has the effect of narrowing the scope of any stray mutable copies of a copy-chain.
Note: Much of this logic is applicable to other legacy types, such as String. Experiments are required to assess whether these types could be made value-based, either fully, or (if public constructors are retained) on an instance-by-instance basis. See JEP 169 for more discussion.