value types and struct tearing

Published 03/06/2014, revised 12/18/2019.

This note explains how a notion of “value types” for the VM should protect the integrity of that value type’s invariants, and points out important differences, in memory effects, between “struct-like” and “persistent” designs for values.

[2019 Note:] Value types are the subject of Project Valhalla, and are now known as inline classes.

First, the running example

Consider this two-variable data structure, a pair of encapsulated integral coordinates, with a couple of access functions:

class XY {
  private int x = 1, y = 0;
  public XY() { }  // make a blank one
  public boolean step(int p, int q) {
    int x1 = x + p, y1 = y + q;
    if (x1 == 0 && y1 == 0)  return false;
    x = x1; /*racing*/ y = y1;
    return true;
  }
  public double measure() {  // inverse radius
    return Math.pow((double)x*x + (double)y*y, -0.5);
  }
  public void copyFrom(XY that) {
    this.x = that.x; this.y = that.y;
  }
}

The reader function measure produces a real number, but will fail to do so if both values x, y are zero at the same time. The writer function step updates the two coordinates (incrementally, as if in a random walk) except when they would both be zero.

The details are not very important; the main point is that the two values are mostly independent, but are coupled by an invariant that can be violated by an uncoordinated change to either value alone.

Off to the races

What could go wrong? Why, nothing, if one of the following conditions is satisfied:

  1. If the object is confined to one thread only.

  2. If multiple threads are updating the object, but some sort of synchronization is preventing calls to step and measure from overlapping.

  3. If there is some assurance that an XY cannot change and has been safely published. Stashing a copy of the object is the basic idea; that is why we might need methods like copyFrom.

Side note 1: Even a fully confined XY object could in principle be broken by an asynchronous interrupt which cancels an assignment to the y field, but such interrupts are not now part of the Java landscape.

Side note 2: General descriptions of these sharing options, and more, may be found in Chapter 3 of Java Concurrency in Practice.

The easy way to cover all the cases is to mark all the methods as synchronized. This is the design of the old Java class java.util.Vector, which synchronizes all of its access functions. Since synchronization is usually not free, this design choice pushes a cost onto all users. This cost can sometimes be reduced by using transactions under the hood, such as Intel supports, but since the language and bytecode set do not directly express the transactions, there is always a possibility that the JVM will have to create a real critical section, so the optimization is not reliable.
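As a minimal sketch of this "synchronize everything" option, the running example might be locked down as follows. The class name SyncXY is hypothetical, chosen only to keep it apart from the XY in the text, and the two-phase copyFrom is one possible choice rather than the only one.

```java
// Hypothetical variant of the running example with every access under a lock.
final class SyncXY {
    private int x = 1, y = 0;

    public synchronized boolean step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0) return false;
        x = x1; y = y1;  // no longer racing: readers must take the same lock
        return true;
    }

    public synchronized double measure() {  // inverse radius
        return Math.pow((double) x * x + (double) y * y, -0.5);
    }

    public void copyFrom(SyncXY that) {
        int tx, ty;
        synchronized (that) { tx = that.x; ty = that.y; }  // consistent snapshot
        synchronized (this) { this.x = tx; this.y = ty; }  // consistent store
    }
}
```

The two locks in copyFrom are taken one after the other, never nested, so two threads copying in opposite directions cannot deadlock. Every caller pays for the locking, whether or not the object is actually shared.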

This is why many programmers (including those who designed newer standard classes like ArrayList) omit the synchronization and instead push the responsibility for locking onto the user. This allows users to balance the cost of locking against the danger of races.

The user of an ArrayList must take responsibility to avoid publishing a reference to the list without proper mutual exclusion between threads. Often this is easy enough. If the data is shared, it is shared between threads that are programmed together under common design rules and tested well enough to catch simple bugs.

Insecurity complex

The responsibility of the user increases if the list will contain sensitive data, such as parameters to a privileged operation. In that case, the programmer must ensure that a reference to the shared list cannot leak to uncontrolled code, even if the program is abused in some way. Otherwise, an attacker may create or obtain a shared list reference, hand it to privileged code, and then at the same time mutate it. The mutations will appear to the victim code via data races. Most such mutations will be harmless, since the privileged code will validity-check the arguments and throw an exception. But with enough attempts from the attacker, a race can happen which puts the list into a state which confuses the privileged code into doing something unpredicted–except by the attacker. This kind of weakness, known as a TOCTTOU bug, is a risk whenever privilege checks are based on mutable data objects.
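One common defense, sketched here under assumptions, is to stash a private copy before checking anything, so that the check and the use see the same data. The class PrivilegedOp, the method sumChecked, and the "no negative arguments" rule are all hypothetical stand-ins for a real privileged operation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical privileged entry point; the validity rule is only an illustration.
final class PrivilegedOp {
    static int sumChecked(List<Integer> args) {
        // Copy first: the snapshot is private to this call, so later
        // mutations of the caller's list cannot reach it.
        List<Integer> snapshot = new ArrayList<>(args);

        // Then check the copy, never the original.
        for (int a : snapshot) {
            if (a < 0) throw new IllegalArgumentException("negative argument: " + a);
        }

        // Then use the same copy that was checked.
        int sum = 0;
        for (int a : snapshot) sum += a;
        return sum;
    }
}
```

If the attacker mutates the list while the copy is being made, the snapshot may hold an odd mixture of elements, but whatever it holds is what gets validated and used.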

It is important to note that marking a variable as private does not protect against inconsistent updates from data races. In the simple example above, where all the state is private, a data race can create a disallowed state as follows:

final XY xy = new XY();
Thread racer = new Thread(() -> {
    for (int i = 0; i < 1e6; i++)
        { xy.step(-1, 1); xy.step(1, -1); }
});
racer.start();  // the racer must be running for the race to occur
for (int i = 0; i < 1e6; i++) {
    double m = xy.measure();
    if (m > 1) throw new AssertionError(m+"@"+i);
    // => AssertionError: Infinity@293
}

This code can fail its assertion quickly, often after a few hundred tries. The victim is observed in a state where both fields are zero. The location where this state is observed is marked /*racing*/ in the method XY.step above. Locally valid updates to single fields of the victim can cause this violation. The sequence of updates that leads the victim to that disallowed state, plus the existence of additional surprising states, is an exercise for the reader.

Side note: Since measure reads each field twice, it is allowed to pick up different values for the two reads of one field. This is rare, because optimizers tend to merge the reads, but it cannot be excluded. Inconsistent reads of the same variable can therefore be a source of bugs, and this is true even for one-field objects. We could call this a single-variable-double-read hazard.
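The double-read hazard, taken by itself, can be avoided by reading each field exactly once into a local. The sketch below uses a hypothetical class name, ReadOnce, and otherwise follows the struct-like XY.

```java
// Hypothetical variant of the struct-like XY whose reader loads each field once.
final class ReadOnce {
    private int x = 1, y = 0;

    public boolean step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0) return false;
        x = x1; /*still racing*/ y = y1;
        return true;
    }

    public double measure() {  // inverse radius
        int x0 = x, y0 = y;    // exactly one read of each field
        return Math.pow((double) x0 * x0 + (double) y0 * y0, -0.5);
    }
}
```

This removes only the double read. The pair (x0, y0) can still be a torn mixture of two different states, so the zero-zero observation remains possible.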

The invariant on XY fails when the two fields lose their mutual coherence because of race conditions. Let’s call this loss of coherence struct tearing, because it looks like the victim is a struct which has been torn, fieldwise, into parts and then reassembled from other parts.

The JVM (both in its original documentation and in the current Java Memory Model) forbids “word tearing” of individual values, such as integers and references. As an exception, 64-bit primitives (long and double) are allowed to be treated as structs of two 32-bit fields, subject to independent updates. Thus, 64-bit primitives may be torn apart by races, much like the two 32-bit fields in our XY example.

The JVM specification encourages implementations not to tear 64-bit primitives. Generally speaking, only 32-bit CPUs which lack 64-bit atomics exhibit this behavior, and they are a vanishing breed. But value types easily grow beyond 64 bits, and thus revisit the same question of struct tearing.
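For 64-bit primitives the language already offers an opt-out: the Java Language Specification (section 17.7) guarantees that reads and writes of volatile long and double values are always atomic. A small sketch, with the hypothetical class name Stamp:

```java
// Hypothetical holder contrasting a plain long with a volatile long.
final class Stamp {
    private long plain;            // may legally be torn into two 32-bit halves
    private volatile long atomic;  // reads and writes are always atomic

    void set(long v) {
        plain = v;
        atomic = v;
    }

    long getPlain()  { return plain; }
    long getAtomic() { return atomic; }
}
```

No comparable modifier exists today for a pair of int fields, which is why the XY example has to reach for locks or for a persistent design.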

Persistence pays off

In our simple example, another way to protect the data is to formulate it using final variables, changing the design to be persistent.

Side note: These days we are recycling the term “persistent” as an alternative and refinement to the older term “immutable”. A problem with the older term is evident whenever you have a conversation that stalls on the audible similarity of phrases like “immutable object” and “a mutable object”. Mumblable “immutable” is semblable to “mutable”, say that ten times fast.

class XY {
  final private int x, y;
  private XY(int x, int y) { this.x = x; this.y = y; }
  public static final XY BLANK = new XY(1, 0);
  public XY step(int p, int q) {
    int x1 = x + p, y1 = y + q;
    if (x1 == 0 && y1 == 0)  return null;
    return new XY(x1, y1);
  }
  public double measure() {  // inverse radius
    return Math.pow((double)x*x + (double)y*y, -0.5);
  }
  //XY copyFrom(XY that) { return that; }  // no special copy
}

In contrast to this persistent version of XY, we can call the first version struct-like. Despite the encapsulation of the fields, it interacts with memory like a C struct.

The persistent version does not suffer from races on its x and y variables. A persistent XY object state can be captured or published simply by moving a reference. There only needs to be one public “blank” uninitialized value, and the constructor itself is hidden inside the capsule.

In exchange for this new stability, users must update their references whenever they call the updater function step. Crucially, it is now impossible to observe an XY object in a disallowed state. The class has full control over its internal invariants, even in the face of deviously attacking racer threads. There are still races possible, but they are on the references. And since updating a reference is one memory operation, there is no issue of multiple fields parting ways. The single-variable-double-read bug can still happen, rarely.
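What "updating the reference" might look like when several threads share one position is sketched below. The names PXY (a copy of the persistent XY) and Walker are hypothetical; AtomicReference and its compareAndSet method are standard java.util.concurrent.atomic API.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical copy of the persistent XY, renamed to keep the sketch self-contained.
final class PXY {
    private final int x, y;
    private PXY(int x, int y) { this.x = x; this.y = y; }
    public static final PXY BLANK = new PXY(1, 0);

    public PXY step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        return (x1 == 0 && y1 == 0) ? null : new PXY(x1, y1);
    }

    public double measure() {  // inverse radius
        return Math.pow((double) x * x + (double) y * y, -0.5);
    }
}

// Hypothetical shared holder: all races happen on one reference.
final class Walker {
    private final AtomicReference<PXY> pos = new AtomicReference<>(PXY.BLANK);

    public boolean step(int p, int q) {
        while (true) {
            PXY cur = pos.get();
            PXY next = cur.step(p, q);
            if (next == null) return false;                 // step refused
            if (pos.compareAndSet(cur, next)) return true;  // else retry on fresh value
        }
    }

    public double measure() {
        return pos.get().measure();  // a single read of the reference
    }
}
```

A racing writer can make the loop retry, but no reader can ever see a PXY whose fields disagree with each other.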

It seems there is always a cost for stability and security. In this case, the cost is allocating a new XY object for each distinct state. This is required because the class insists on creating a new XY object to represent each new position in the coordinate space. Put another way, the user is forbidden to use the optimization of re-using XY objects to represent multiple values over time. This is a reasonable restriction, since that optimization can lead to race conditions, as described above.

Side note: Since the XY constructor is private, the class might cache values like java.lang.Integer does, but this has its own sometimes surprising costs, since the caching logic can interfere with optimizations such as escape analysis. In any case, making the constructor private is a helpful move.

Where flattery gets us

But, suppose the JVM had an optimization to flatten instances of XY, in either the mutable or the persistent form. A flattened instance of XY would have both of its fields stored directly in some containing object (or array). There would be no need to have a separate XY object on the heap to hold the fields. Of course, if references to XY were required for some uses, these references could be created temporarily and then discarded.

Obviously, this sort of thing is what we are calling value types. In a nutshell, a value type encapsulates a group of component values, and can be efficiently stored in any type of Java variable.

Would value types provide significantly better options for designing XY and similar classes than the current ones sketched above? The answer is a qualified “yes”.

Perhaps the biggest advantage would be better use of memory. Overhead for the distinct XY object (header+padding) would vanish. And code would access the x and y fields directly within a containing object, using static offset arithmetic rather than a dynamic and cache-busting pointer chase. This is most evident if many pairs are stored in an array: Hardware can get clever about sequential accesses to the fields.
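To make the layout concrete, here is what programmers sometimes do by hand today to get the same effect: store the pairs side by side in one primitive array. The class FlatXYArray and its accessors are hypothetical, and the sketch says nothing about how a JVM would actually lay out flattened values.

```java
// Hypothetical hand-flattened array of XY pairs: x at index 2*i, y at 2*i + 1.
final class FlatXYArray {
    private final int[] data;

    FlatXYArray(int length) {
        data = new int[2 * length];
        for (int i = 0; i < length; i++) data[2 * i] = 1;  // every pair starts as (1, 0)
    }

    int x(int i) { return data[2 * i]; }
    int y(int i) { return data[2 * i + 1]; }

    void set(int i, int x, int y) {
        if (x == 0 && y == 0) throw new IllegalArgumentException("origin not allowed");
        data[2 * i] = x; /*racing*/ data[2 * i + 1] = y;
    }

    double measure(int i) {  // inverse radius of pair i
        double x = x(i), y = y(i);
        return Math.pow(x * x + y * y, -0.5);
    }
}
```

There is no per-element header and no pointer to chase, but the encapsulation now belongs to the array wrapper rather than to the pair, and the two-word store tears exactly like the struct-like XY.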

The other main advantage would be better support for methods which pass and return composite values. A method could receive or return two or more values inside a single flattened value “on the stack” (or in registers, usually). Of course, locals could also contain flattened values. So complex numbers and vectors get practical, along with a host of other small but useful types.

There are more advantages that may accrue from value types, but this is not the place to go into the details.

Memory vs. method

The two main advantages–better use of memory and better method types–are at some tension with each other. If value types are conceived as a memory layout mechanism only, then their primary usage will be via a Java pointer. The flattened realization in registers (in method types) will be a fiction to be uneasily maintained, at the cost of confusing various JVM optimizations.

On the other hand, value types will fail to deliver effective use of memory if they are conceived as only a clever way of loading registers (again, holding method arguments, locals, and return values). This is because their representation in other variables (fields and array elements) must live in memory, and as such is sensitive to quality of memory layout.

Naturally, we want both advantages, with good representations in both register-based and memory-based variables. To the extent the system materializes Java references that point to flattened values, those references must be easily and routinely optimized away. And such materialized references should be rare. Therefore, a good design for value types needs a primary representation that is pointer-free, even if there is a secondary “boxed” representation which uses references.

Climbing into the capsule

We also want encapsulation, which means privacy of component fields and of methods. Any credible value type design for the JVM must allow a value type to restrict access and enforce invariants.

But encapsulation is incomplete–and therefore a dangerous illusion–if the protected invariants can be subverted by race conditions, since those race conditions are available to anyone who can perform a value type assignment.

This brings us back to the defects of the struct-like programming style for XY in the first example above. As seen above, the Java memory model permits structure tearing, and requires the author of a class to clearly state its contract, allowing the user to take defensive action if the class does not adequately defend itself against races.

But I wish to use value types, someday, to contain security critical values, such as 96-bit timestamps or encapsulated native pointers. The persistent-style design of String is integral to its usability, outside of any container, for carrying security critical values. Struct-like value types will, I think, be difficult or impossible to secure to the same degree, because of their weaker encapsulation. They will have to be put inside a container to manage safe access (a bit like an ArrayList inside Collections.synchronizedList), and this will tend to cancel the advantages of flattening them in the first place.

[2019 Note:] Value types will be crucial to the performance APIs defined in Project Panama. These include encapsulated native pointers and native SIMD values (such as the various Intel AVX vector types).

Assignment as an act of violence

There are a few specific reasons the persistent design provides better encapsulation and safer APIs than the struct-like design.

First, since value types can be stored in all sorts of variables, not just memory locations, it will be extremely common to assign them from place to place. Note that assignment is a JVM primitive (in the current JVM design) and cannot be customized by a class. This is of no consequence for existing Java APIs, but if it is simply generalized to a racy componentwise copy (as in C), then every assignment statement which operates on a value type will be at risk of structure tearing.

But, Java programmers do not expect assignments to corrupt any internal structure of the assigned quantity. (The sole exception of 64-bit primitives on 32-bit machines is widely neglected.) Allowing a value type assignment to racily disturb the underlying value’s invariants is likely to introduce a new and persistent family of bugs to Java programs.

Fixing this problem for struct-like types would require reifying the assignment operation as an explicit method which could then perform synchronization on both the source and destination of the copied value. The code to do this would be complex and prone to errors and deadlocks. In practice, designers of value types would punt on the problem, pushing it (with wide-eyed trust and hope, doubtless) to their users. Their users, meanwhile, would have to learn the conventional distinctions between safe and unsafe versions of types, for example choosing between FastString and ThreadSafeString.
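A sketch of what such a reified assignment might involve, using today's objects as stand-ins for struct-like values. The class LockedXY and its id-based lock ordering are assumptions made for illustration; a real value type would not even have an identity to lock on.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical struct-like XY whose "assignment" locks both source and destination.
final class LockedXY {
    private static final AtomicLong NEXT_ID = new AtomicLong();
    private final long id = NEXT_ID.getAndIncrement();  // fixes a global lock order
    private int x = 1, y = 0;

    public synchronized boolean step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0) return false;
        x = x1; y = y1;
        return true;
    }

    public synchronized double measure() {  // inverse radius
        return Math.pow((double) x * x + (double) y * y, -0.5);
    }

    public void copyFrom(LockedXY that) {
        if (this == that) return;
        // Always lock the lower id first; taking the locks in argument order
        // would let a.copyFrom(b) and b.copyFrom(a) deadlock each other.
        LockedXY first  = (this.id < that.id) ? this : that;
        LockedXY second = (first == this) ? that : this;
        synchronized (first) {
            synchronized (second) {
                this.x = that.x;
                this.y = that.y;
            }
        }
    }
}
```

Even this small version needs a hidden ordering field and two nested locks for what the user wrote as a plain assignment, which suggests why designers would be tempted to punt.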

Publishing via finals

Second, and more subtly, safe publication is one of the least well understood aspects of the Java memory model, but it is crucial to the safe sharing of any non-persistent type. Safe publication requires either accurate synchronization on the object containing the mutable state, or publishing the state via a final variable. Both of these options require levels of indirection which are convenient enough today, but would tend to vanish as data structures are flattened.

The simplest (and thus safest) extension of safe publication patterns to value types is to declare that their component fields are final, so that when a value is assigned, its components are automatically published individually. This pattern comes out naturally from persistent-style values, but must be imposed on struct-like values.
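The guarantee that final fields give today can be sketched as follows. The names Snapshot and Board are hypothetical; the behavior relied on is the final field semantics of the Java Language Specification (section 17.5), which applies as long as the constructor does not leak this.

```java
// Hypothetical persistent pair: both fields are final and checked in the constructor.
final class Snapshot {
    private final int x, y;

    Snapshot(int x, int y) {
        if (x == 0 && y == 0) throw new IllegalArgumentException("origin not allowed");
        this.x = x;
        this.y = y;
    }

    double measure() {  // inverse radius
        return Math.pow((double) x * x + (double) y * y, -0.5);
    }
}

// Hypothetical publisher using a deliberately plain field: no volatile, no lock.
final class Board {
    private Snapshot latest = new Snapshot(1, 0);

    void publish(int x, int y) { latest = new Snapshot(x, y); }

    double measure() { return latest.measure(); }
}
```

A racing reader of latest may see a stale Snapshot, but never a half-initialized one. Had x and y been non-final, the same racy publication could expose default zero values. Note that the safety comes through the reference, which is exactly the level of indirection that flattening removes.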

In a sense, we are revisiting the old design decision in Java to make variables mutable by default. Recall that blank final fields were added in Java 1.1, enabling persistent-style types, and even today we don’t yet have frozen (immutable, persistent) arrays. It would be consistent with the oldest versions of Java to define the new composite types to be mutable unless requested otherwise. But more modern forms of inter-thread communication have been created outside of that original model, avoiding synchronization and adding (via the JMM) safe publication semantics to final variables. Although it would be a stretch to say, “Java should have been designed with immutability as the default”, I believe that consistency with Java 1.0 conventions for mutability could reasonably be traded away in order to gain modern thread safety.

Persistence costs

Those are the reasons why I think the persistent design pattern is preferable over the struct-like pattern. That is true despite the difficulties of the persistent design, which I will describe next.

First, the stability provided by final variables is not free. In the current JVM, they may require memory fences in constructors. In a JVM supporting persistent-style value types, every assignment to a memory location (not to registers) may require a similar handshake with the memory system, to ensure that writes of the component values are safe publications.

Correctness vs. throughput?

Second, the atomicity provided by single-reference updates goes away if objects are flattened, and it must be recovered some other way. A store to a persistent-style value type must be all or nothing: It must ensure that other threads either see the whole store, or none of it. This will require another kind of handshake with the memory system, along the lines of the Intel transactions mentioned above, or an atomic multi-word store instruction. Note that the example type XY fits in 64 bits, and so would be supported at no extra cost by all 64-bit processors. Processors which provide larger atomic vector store instructions will cheaply support any value that fits in their vectors. In the case of a jumbo value type which spills over multiple hardware vectors and/or cache lines, the JVM’s software will need to perform additional handshakes, perhaps including old fashioned boxing.
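The observation that XY fits in 64 bits can be tried out by hand today, by packing both coordinates into one long and updating it with a single atomic operation. PackedXY and its pack and unpack helpers are hypothetical names; AtomicLong is standard library API. This is only an emulation of what a JVM might do for a flattened value.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical emulation of a flattened, atomically stored XY in one 64-bit word.
final class PackedXY {
    private final AtomicLong bits = new AtomicLong(pack(1, 0));

    static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFFFFFFL);  // x in the high half, y in the low half
    }
    static int unpackX(long b) { return (int) (b >> 32); }
    static int unpackY(long b) { return (int) b; }

    public boolean step(int p, int q) {
        while (true) {
            long cur = bits.get();
            int x1 = unpackX(cur) + p, y1 = unpackY(cur) + q;
            if (x1 == 0 && y1 == 0) return false;
            // All-or-nothing store of both components; retry if another thread won.
            if (bits.compareAndSet(cur, pack(x1, y1))) return true;
        }
    }

    public double measure() {  // inverse radius
        long cur = bits.get();  // one atomic read yields a coherent pair
        double x = unpackX(cur), y = unpackY(cur);
        return Math.pow(x * x + y * y, -0.5);
    }
}
```

The trick stops working as soon as the value grows past the widest atomic store available, which is where the jumbo-value handshakes come in.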

For sophisticated users who know what they are doing, a non-transactional store operation, with the possibility of structure tearing, needs to be provided. Because of tearing, the library would have to provide additional interlocks to ensure proper confinement, immutability, and so forth.

In other words, the struct-like component-wise assignment operator will need to be made available as a privileged operation derived from the persistent pattern, and used only inside carefully designed concurrency-safe libraries.

[2019 Note:] In today’s prototypes, in-place updates to value types can be obtained most easily via appropriately constructed variable handles. A variable handle can be produced which updates a single field of a flattened value nested inside a containing instance or array. Such a handle is unsafe in general, but might be legitimately created with the cooperation of the inline class that defines the field, just as a final field can be updated with the cooperation of the defining class. In both cases, additional care must be taken to avoid races with threads which have read the previous value, which is a difficult topic we cannot discuss here.
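For readers who have not met variable handles, here is how one is obtained for an ordinary private field in Java 9 and later. The class Cell is hypothetical, and the sketch shows only the standard java.lang.invoke API on a normal object; it does not show the prototype handles for flattened inline-class fields, whose API is not described in this note.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical class that hands out field access through a VarHandle it created itself.
final class Cell {
    private int x = 1, y = 0;

    private static final VarHandle Y;
    static {
        try {
            // The lookup is made inside Cell, so the private field is accessible:
            // the defining class is cooperating in the creation of the handle.
            Y = MethodHandles.lookup().findVarHandle(Cell.class, "y", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void setY(int v) { Y.setRelease(this, v); }         // single-field, in-place update
    int  getY()      { return (int) Y.getAcquire(this); }
    boolean casY(int expected, int v) { return Y.compareAndSet(this, expected, v); }
}
```

The handle updates y in place and knows nothing about x, so any invariant that couples the two fields is once again the caller's problem.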

This is the flip side of the necessity for providing persistent containers for struct-like values, for safe publication. The key question, I think, is which should be the default (mutable or persistent) and under what circumstances the non-default mode should be supplied.

[2019 Note:] Today’s consensus is that, in order to maintain a reasonable performance profile, inline classes (by default) should be tearable, like yesterday’s longs and doubles. With apologies to Doug Lea, that compromise is “not too terrible”. On the other hand, inline classes may optionally be declared with a modifier (like volatile or alwaysatomic) which instructs the JVM to take all necessary steps to exclude struct tearing, even at the expense of performance. These expenses are likely to include on-heap allocation and/or software transactioning. Since values in registers and on stack are not subject to races, these expenses only apply to values which are actually stored in the heap. This option is required for multi-word inlines which will store security-sensitive information, such as native pointers.

Notation, notation, notation

A third downside to the persistent style of values is notational. It seems there are times when you want to say “just change the imaginary component of this number to zero, please”. If a value type is willing to expose its components and accept component-wise updates, it seems harsh to require the user to create a new value from scratch:

Complex c = ...;
c = new Complex(c.re, 0.0);  // rebuild from scratch
c = c.changeIm(0.0);  // maybe use a helper method
c.im = 0.0;  // but what I meant was this

This can be viewed as mainly a matter of syntax sugar, but there is also a deep connection to the hardware here. A multi-component value type is realized either as bits in a block of memory words or as bits in a collection of live registers. It is fundamentally reasonable for the user to ask to change just one of those low-level components in isolation, assuming the library designer allows the operation.
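For reference, the helper-method option might be written as below in today's Java. The class is a hypothetical persistent Complex; the field names re and im are assumptions, and changeIm is the helper named in the example above.

```java
// Hypothetical persistent complex number with a component-wise "update" helper.
final class Complex {
    final double re, im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Returns a new value that differs from this one only in its imaginary part.
    Complex changeIm(double newIm) {
        return new Complex(re, newIm);
    }
}
```

A call such as c = c.changeIm(0.0) leaves the old value untouched and rebinds c to the new one, which is the persistent reading of "just change the imaginary component".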

At the JVM level, it should be possible to render the bytecodes of a component-wise update fairly directly and without confusion to a register or memory write (adding memory handshakes as needed). Reconstructing a new value from the ground up, just because one component shifted, might create a bunch of noisy intermediate representation which could distract the JIT compiler from more important optimization work.

[2019 Note:] The consensus today is that in-place updates to value types are better expressed with variable handles, instead of special bytecodes.

Our race is now run

To conclude, I believe that the implementation challenges of persistent-style values are manageable, and that the corresponding opposing trade-offs for struct-like values are much more difficult to control. In the end it comes down to a safe and clean user model versus direct compilation to memory instructions. And when safety seems opposed to speed, I think we can agree that the JVM needs to lean towards designs that are safe by default, in the expectation that slowness is easier to fix than insecurity.

And in the end, I think we will get both safety and speed.