This specification is not final and is subject to change. Use is subject to license terms.

String Templates

Changes to the Java® Language Specification • Version 20-internal-adhoc.gbierman.20221115

This document describes changes to the Java Language Specification to support String Templates, a preview feature of Java SE 20. See JEP 430 for an overview of the feature.

Changes are described with respect to existing sections of the JLS. New text is indicated like this and deleted text is indicated like this. Explanation and discussion, as needed, is set aside in grey boxes.

Changelog:

2022-11-15: Second draft. Main changes surround details of lexical and syntactical grammars. New terminology introduced for templates.

2022-01-20: First draft released.

Chapter 3: Lexical Structure

3.10 Literals

3.10.7 Escape Sequences

In character literals, string literals, text blocks, and templates (3.10.4, 3.10.5, 3.10.6, 15.8.6), the escape sequences allow for the representation of some nongraphic characters without using Unicode escapes (3.3), as well as the single quote, double quote, and backslash characters.

EscapeSequence:
\ b (backspace BS, Unicode \u0008)
\ s (space SP, Unicode \u0020)
\ t (horizontal tab HT, Unicode \u0009)
\ n (linefeed LF, Unicode \u000a)
\ f (form feed FF, Unicode \u000c)
\ r (carriage return CR, Unicode \u000d)
\ LineTerminator (line continuation, no Unicode representation)
\ " (double quote ", Unicode \u0022)
\ ' (single quote ', Unicode \u0027)
\ \ (backslash \, Unicode \u005c)
OctalEscape (octal value, Unicode \u0000 to \u00ff)
OctalEscape:
\ OctalDigit
\ OctalDigit OctalDigit
\ ZeroToThree OctalDigit OctalDigit
OctalDigit:
(one of)
0 1 2 3 4 5 6 7
ZeroToThree:
(one of)
0 1 2 3

The OctalDigit production above comes from 3.10.1. Octal escapes are provided for compatibility with C, but can express only Unicode values \u0000 through \u00FF, so Unicode escapes are usually preferred.

It is a compile-time error if the character following a backslash in an escape sequence is not a LineTerminator or an ASCII b, s, t, n, f, r, ", ', \, 0, 1, 2, 3, 4, 5, 6 or 7.

An escape sequence in the content of a character literal, string literal, or text block is interpreted by replacing its \ and trailing character(s) with the single character denoted by the Unicode escape in the EscapeSequence grammar. The line continuation escape sequence has no corresponding Unicode escape, so is interpreted by replacing it with nothing.

The line continuation escape sequence can appear in a text block, but cannot appear in a character literal or a string literal because each disallows a LineTerminator.
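As an illustration of the interpretation rule above, the following sketch uses ordinary string literals and a text block only; it contains no template syntax. The class and method names are hypothetical, and the sketch assumes a compiler that supports the \s and line continuation escapes (Java SE 15 or later).

```java
// Hypothetical demo class; each method returns a value built from one kind of escape.
final class EscapeSequenceDemo {
    // \t is replaced by the single character U+0009, so the result has length 3.
    static String tab() { return "a\tb"; }

    // \s denotes a space (U+0020).
    static String space() { return "a\sb"; }

    // The octal escape \101 denotes U+0041, the letter A.
    static String octal() { return "\101"; }

    // In a text block, a backslash followed by a line terminator is a line
    // continuation: it is replaced by nothing, so no newline appears in the result.
    static String continued() {
        return """
            one \
            two""";
    }
}
```

Under these assumptions, continued() yields the single-line string "one two", and octal() yields "A".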

\{ is not an escape sequence, but can appear in a template (15.8.6) to prefix an embedded expression.

Chapter 7: Packages and Modules

7.3 Compilation Units

CompilationUnit is the goal symbol (2.1) for the syntactic grammar (2.3) of Java programs. It is defined by the following production:

CompilationUnit:
OrdinaryCompilationUnit
ModularCompilationUnit
OrdinaryCompilationUnit:
[PackageDeclaration] {ImportDeclaration} {TopLevelClassOrInterfaceDeclaration}
ModularCompilationUnit:
{ImportDeclaration} ModuleDeclaration

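An ordinary compilation unit consists of three parts, each of which is optional:

  1. A package declaration (7.4).
  2. Import declarations (7.5).
  3. Top level declarations of classes and interfaces (7.6).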

A modular compilation unit consists of a module declaration (7.7), optionally preceded by import declarations. The import declarations allow classes and interfaces from packages in this module and other modules, as well as static members of classes and interfaces, to be referred to using their simple names within the module declaration.

Every compilation unit implicitly imports the following:

  1. Every public class or interface declared in the predefined package java.lang, as if the declaration import java.lang.*; appeared at the beginning of each compilation unit immediately after any package declaration.
  2. The static member STR declared in the predefined class java.lang.template.StringTemplate, as if the declaration import static java.lang.template.StringTemplate.STR; appeared at the beginning of each compilation unit immediately after any package declaration.

As a result, the names of all those implicitly imported classes, interfaces, and static fields are available as simple names in every compilation unit.

The host system determines which compilation units are observable, except for the compilation units in the predefined package java and its subpackages lang and io, which are all always observable.

The rest of §7.3 is unchanged.

7.5 Import Declarations

7.5.3 Single-Static-Import Declarations

A single-static-import declaration imports all accessible static members with a given simple name from a class or interface. This makes these static members available under their simple name in the module, class, and interface declarations of the compilation unit in which the single-static-import declaration appears.

SingleStaticImportDeclaration:
import static TypeName . Identifier ;

The TypeName must be the canonical name (6.7) of a class or interface.

The class or interface must be either a member of a named package, or a member of a class or interface whose outermost lexically enclosing class or interface declaration (8.1.3) is a member of a named package, or a compile-time error occurs.

It is a compile-time error if the named class or interface is not accessible (6.6).

The Identifier must name at least one static member of the named class or interface. It is a compile-time error if there is no static member of that name, or if all of the named members are not accessible.

It is permissible for one single-static-import declaration to import several fields, classes, or interfaces with the same name, or several methods with the same name and signature. This occurs when the named class or interface inherits multiple fields, member classes, member interfaces, or methods, all with the same name, from its own supertypes.

It is permitted for a single-static-import declaration to redundantly import static members that are already implicitly imported.

If two single-static-import declarations in the same compilation unit attempt to import classes or interfaces with the same simple name, then a compile-time error occurs, unless the two classes or interfaces are the same, in which case the duplicate declaration is ignored.

If a single-static-import declaration imports a class or interface whose simple name is x, and the compilation unit also declares a top level class or interface (7.6) whose simple name is x, a compile-time error occurs.

If a compilation unit contains both a single-static-import declaration that imports a class or interface whose simple name is x, and a single-type-import declaration (7.5.1) that imports a class or interface whose simple name is x, a compile-time error occurs, unless the two classes or interfaces are the same, in which case the duplicate declaration is ignored.
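The rules in this section can be illustrated with a small sketch that does not depend on the preview feature. It imports static members of java.lang.Math; the class StaticImportDemo and its methods are hypothetical names chosen for the example. Under this draft, a declaration such as import static java.lang.template.StringTemplate.STR; would be redundant but permitted in the same way; it is mentioned only in a comment so that the sketch compiles without preview features enabled.

```java
// A single-static-import declaration imports every accessible static member
// with the given simple name, so all overloads of max become available.
import static java.lang.Math.max;
// This one imports the static field PI.
import static java.lang.Math.PI;
// Under this draft, the following would be redundant but permitted:
//   import static java.lang.template.StringTemplate.STR;

final class StaticImportDemo {
    // Resolves to Math.max(int, int).
    static int largest(int a, int b) { return max(a, b); }

    // Resolves to Math.max(double, double), imported by the same declaration.
    static double largest(double a, double b) { return max(a, b); }

    // Uses the imported field by its simple name.
    static double circleArea(double radius) { return PI * radius * radius; }
}
```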

7.5.4 Static-Import-on-Demand Declarations

A static-import-on-demand declaration allows all accessible static members of a named class or interface to be imported as needed.

StaticImportOnDemandDeclaration:
import static TypeName . * ;

The TypeName must be the canonical name (6.7) of a class or interface.

The class or interface must be either a member of a named package, or a member of a class or interface whose outermost lexically enclosing class or interface declaration (8.1.3) is a member of a named package, or a compile-time error occurs.

It is a compile-time error if the named class or interface is not accessible (6.6).

It is permitted for a static-import-on-demand declaration to redundantly import static members that are already implicitly imported.

Two or more static-import-on-demand declarations in the same compilation unit may name the same class or interface; the effect is as if there was exactly one such declaration.

The rest of §7.5.4 is unchanged.

Chapter 12: Execution

12.5 Creation of New Class Instances

A new class instance is explicitly created when evaluation of a class instance creation expression (15.9) causes a class to be instantiated.

A new class instance may be implicitly created in the following situations:

Each of these situations identifies a particular constructor (8.8) to be called with specified arguments (possibly none) as part of the class instance creation process.

Whenever a new class instance is created, memory space is allocated for it with room for all the instance variables declared in the class and all the instance variables declared in each superclass of the class, including all the instance variables that may be hidden (8.3).

If there is not sufficient space available to allocate memory for the object, then creation of the class instance completes abruptly with an OutOfMemoryError. Otherwise, all the instance variables in the new object, including those declared in superclasses, are initialized to their default values (4.12.5).

Just before a reference to the newly created object is returned as the result, the indicated constructor is processed to initialize the new object using the following procedure:

  1. Assign the arguments for the constructor to newly created parameter variables for this constructor invocation.

  2. If this constructor begins with an explicit constructor invocation (8.8.7.1) of another constructor in the same class (using this), then evaluate the arguments and process that constructor invocation recursively using these same five steps. If that constructor invocation completes abruptly, then this procedure completes abruptly for the same reason; otherwise, continue with step 5.

  3. This constructor does not begin with an explicit constructor invocation of another constructor in the same class (using this). If this constructor is for a class other than Object, then this constructor will begin with an explicit or implicit invocation of a superclass constructor (using super). Evaluate the arguments and process that superclass constructor invocation recursively using these same five steps. If that constructor invocation completes abruptly, then this procedure completes abruptly for the same reason. Otherwise, continue with step 4.

  4. Execute the instance initializers and instance variable initializers for this class, assigning the values of instance variable initializers to the corresponding instance variables, in the left-to-right order in which they appear textually in the source code for the class. If execution of any of these initializers results in an exception, then no further initializers are processed and this procedure completes abruptly with that same exception. Otherwise, continue with step 5.

  5. Execute the rest of the body of this constructor. If that execution completes abruptly, then this procedure completes abruptly for the same reason. Otherwise, this procedure completes normally.
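The five steps above can be observed with a small sketch that records each step in a log. The names InitOrderDemo, Base, Derived, and LOG are hypothetical; the no-argument Derived constructor begins with this(0), so it exercises step 2 as well as steps 3 to 5.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Records a message and returns the value, so it can be used in a field initializer.
    static int log(String message, int value) { LOG.add(message); return value; }

    static class Base {
        int b = log("Base field initializer", 1);
        { LOG.add("Base instance initializer"); }
        Base() { LOG.add("Base constructor body"); }   // implicit super() runs first (step 3)
    }

    static class Derived extends Base {
        int d = log("Derived field initializer", 2);
        // Step 2: explicit invocation of another constructor in the same class,
        // then straight on to step 5 (initializers are not run a second time).
        Derived() { this(0); LOG.add("Derived() body"); }
        // Steps 3, 4, 5: implicit super(), then initializers, then the body.
        Derived(int ignored) { LOG.add("Derived(int) body"); }
    }

    // Creates one Derived instance and returns the recorded order of events.
    static List<String> run() {
        LOG.clear();
        new Derived();
        return new ArrayList<>(LOG);
    }
}
```

Calling InitOrderDemo.run() returns the events in this order: Base field initializer, Base instance initializer, Base constructor body, Derived field initializer, Derived(int) body, Derived() body.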

Unlike C++, the Java programming language does not specify altered rules for method dispatch during the creation of a new class instance. If methods are invoked that are overridden in subclasses in the object being initialized, then these overriding methods are used, even before the new object is completely initialized.

Example 12.5-1. Evaluation of Instance Creation

class Point {
    int x, y;
    Point() { x = 1; y = 1; }
}
class ColoredPoint extends Point {
    int color = 0xFF00FF;
}
class Test {
    public static void main(String[] args) {
        ColoredPoint cp = new ColoredPoint();
        System.out.println(cp.color);
    }
}

Here, a new instance of ColoredPoint is created. First, space is allocated for the new ColoredPoint, to hold the fields x, y, and color. All these fields are then initialized to their default values (in this case, 0 for each field). Next, the ColoredPoint constructor with no arguments is invoked. Since ColoredPoint declares no constructors, a default constructor of the following form is implicitly declared:

ColoredPoint() { super(); }

This constructor then invokes the Point constructor with no arguments. The Point constructor does not begin with an invocation of a constructor, so the Java compiler provides an implicit invocation of its superclass constructor of no arguments, as though it had been written:

Point() { super(); x = 1; y = 1; }

Therefore, the constructor for Object which takes no arguments is invoked.

The class Object has no superclass, so the recursion terminates here. Next, any instance initializers and instance variable initializers of Object are invoked. Next, the body of the constructor of Object that takes no arguments is executed. No such constructor is declared in Object, so the Java compiler supplies a default one, which in this special case is:

Object() { }

This constructor executes without effect and returns.

Next, all initializers for the instance variables of class Point are executed. As it happens, the declarations of x and y do not provide any initialization expressions, so no action is required for this step of the example. Then the body of the Point constructor is executed, setting x to 1 and y to 1.

Next, the initializers for the instance variables of class ColoredPoint are executed. This step assigns the value 0xFF00FF to color. Finally, the rest of the body of the ColoredPoint constructor is executed (the part after the invocation of super); there happen to be no statements in the rest of the body, so no further action is required and initialization is complete.

Example 12.5-2. Dynamic Dispatch During Instance Creation

class Super {
    Super() { printThree(); }
    void printThree() { System.out.println("three"); }
}
class Test extends Super {
    int three = (int)Math.PI;  // That is, 3
    void printThree() { System.out.println(three); }

    public static void main(String[] args) {
        Test t = new Test();
        t.printThree();
    }
}

This program produces the output:

0
3

This shows that the invocation of printThree in the constructor for class Super does not invoke the definition of printThree in class Super, but rather invokes the overriding definition of printThree in class Test. This method therefore runs before the field initializers of Test have been executed, which is why the first value output is 0, the default value to which the field three of Test is initialized. The later invocation of printThree in method main invokes the same definition of printThree, but by that point the initializer for instance variable three has been executed, and so the value 3 is printed.

Chapter 15: Expressions

15.8 Primary Expressions

Primary expressions include most of the simplest kinds of expressions, from which all others are constructed: literals, object creations, field accesses, method invocations, method references, array accesses, and template expressions. A parenthesized expression is also treated syntactically as a primary expression.

Primary:
PrimaryNoNewArray
ArrayCreationExpression
PrimaryNoNewArray:
Literal
ClassLiteral
this
TypeName . this
( Expression )
ClassInstanceCreationExpression
FieldAccess
ArrayAccess
MethodInvocation
MethodReference
TemplateExpression

The rest of this section is unchanged.

15.8.6 Template Expressions

Template expressions provide a general means of combining literal text with the values of any embedded expressions in order to produce a result. This result is often a String, but is not limited to just this type. Template expressions subsume simple string interpolation.

TemplateExpression:
TemplateProcessor . ProcessorArgument
TemplateProcessor:
Expression
ProcessorArgument:
Template
StringLiteral
TextBlock
Template:
StringTemplate
TextBlockTemplate
StringTemplate:
" StringTemplateBody "
StringTemplateBody:
StringTemplateElement {StringTemplateElement} StringFragment
StringTemplateElement:
StringFragment EscapedExpression
StringFragment:
{StringCharacter}
TextBlockTemplate:
" " " {TextBlockWhiteSpace} LineTerminator TextBlockTemplateBody " " "
TextBlockTemplateBody:
TextBlockTemplateElement {TextBlockTemplateElement} TextBlockFragment
TextBlockTemplateElement:
TextBlockFragment EscapedExpression
TextBlockFragment:
{TextBlockCharacter}
EscapedExpression:
\ { [ Expression ] }

The following productions from 3.10.5 and 3.10.6 are shown here for convenience:

StringCharacter:
InputCharacter but not " or \
EscapeSequence
TextBlockCharacter:
InputCharacter but not \
EscapeSequence
LineTerminator

A template expression consists of a processor expression and an argument that is either a template, a string literal (3.10.5), or a text block (3.10.6).

A template resembles a string literal or a text block but consists of the strict alternate interleaving of two or more string literals or text blocks, known as fragment literals, with one or more embedded expressions. An embedded expression can be either empty or an expression.

The StringTemplate production makes use of the " terminal symbol and the StringCharacter nonterminal (and similarly for the TextBlockTemplate production that uses the TextBlockCharacter nonterminal). Strictly speaking, these are not tokens defined by the lexical grammar (2.2). However, they are constituents of the StringLiteral token of the lexical grammar, and so it is possible to view the parsing of a StringTemplate as if it is the parsing of an alternating sequence of StringLiteral and Expression nonterminals. Namely, a string template consisting of " StringFragment_1 \{ Expression_1 } StringFragment_2 ... \{ Expression_n } StringFragment_n+1 " is parsed as if it had the form StringLiteral_1 Expression_1 StringLiteral_2 ... Expression_n StringLiteral_n+1 (where StringLiteral_i is defined as " StringFragment_i "), and similarly for a text block template. Providing a lexical grammar that deals directly with such context-sensitive lexical processing (possibly making use of lexical modes) is an implementation detail beyond the scope of this specification.

Every template consists of the interleaving of at least two fragment literals (either string literals or text blocks) and at least one embedded expression. For example, the simple string template "\{42}" consists of the interleaving of the empty string literal, followed by the embedded expression 42, followed by the empty string literal. The string template "Forty Two \{}" consists of the interleaving of the string literal "Forty Two ", followed by an empty embedded expression, followed by the empty string literal. Finally, the string template "The answer \{"is"} \{42}" consists of the string literal "The answer ", followed by the embedded expression "is" (a string literal), followed by the string literal " ", followed by the expression 42 (an integer literal), followed by the empty string literal.
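The decomposition described above can be sketched with a deliberately naive scanner over the characters between the delimiters of a string template. This is an illustration only, not the lexical processing that the specification leaves to implementations: the class TemplateShape and its method split are hypothetical, and the scanner matches braces by counting, so braces inside nested string literals or comments in an embedded expression are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records the fragment texts and embedded expression texts.
final class TemplateShape {
    final List<String> fragments = new ArrayList<>();
    final List<String> expressions = new ArrayList<>();

    // body is the source text between the opening and closing double quotes.
    static TemplateShape split(String body) {
        TemplateShape shape = new TemplateShape();
        StringBuilder fragment = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            boolean hasNext = i + 1 < body.length();
            if (c == '\\' && hasNext && body.charAt(i + 1) == '{') {
                // Start of an embedded expression: find the matching closing brace.
                int depth = 1;
                int j = i + 2;
                while (j < body.length() && depth > 0) {
                    char d = body.charAt(j);
                    if (d == '{') depth++;
                    else if (d == '}') depth--;
                    j++;
                }
                if (depth != 0) {
                    throw new IllegalArgumentException("unterminated embedded expression");
                }
                shape.fragments.add(fragment.toString());
                fragment.setLength(0);
                shape.expressions.add(body.substring(i + 2, j - 1)); // may be empty
                i = j;
            } else if (c == '\\' && hasNext) {
                // An ordinary escape sequence stays inside the current fragment.
                fragment.append(c).append(body.charAt(i + 1));
                i += 2;
            } else {
                fragment.append(c);
                i++;
            }
        }
        shape.fragments.add(fragment.toString()); // the final fragment, possibly empty
        return shape;
    }
}
```

For the three examples in the text, this sketch yields one more fragment than embedded expressions in each case: two empty fragments around 42, the fragments "Forty Two " and the empty string around an empty expression, and the fragments "The answer ", " ", and the empty string around "is" and 42.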

The type of the TemplateProcessor expression must be a subtype of a type java.lang.template.ValidatingProcessor<R,E>, for some types R and E, otherwise a compile-time error occurs. The type of the template expression is then given by the type R.

There is no restriction on the type of any non-empty embedded expression appearing in a Template.

The static member STR is implicitly imported in every compilation unit (7.3). This member, whose type is a subtype of java.lang.template.ValidatingProcessor<String, RuntimeException>, implements simple string interpolation. The template expression STR."The answer is \{ 41+1 }" evaluates to the string "The answer is 42".

At run time, a template expression is evaluated as follows:

  1. The TemplateProcessor expression is evaluated. If the resulting value is null, then a NullPointerException is thrown and the entire template expression completes abruptly for that reason. If evaluation of the TemplateProcessor completes abruptly, the entire template expression completes abruptly for the same reason.

  2. If the ProcessorArgument is a StringLiteral or a TextBlock, then the result of this step is an instance of java.lang.template.StringTemplate, produced as if by invocation of the static method java.lang.template.StringTemplate.of with the argument ProcessorArgument.

    If the ProcessorArgument is a Template, then the embedded expressions e1, ..., en are evaluated to yield embedded values, v1, ..., vn. The embedded expressions are evaluated in the order that they appear in the Template, from left to right. If an embedded expression is empty, then the result of its evaluation is the null reference. If evaluation of any embedded expression completes abruptly, then the entire template expression completes abruptly for the same reason.

    Otherwise, the result of this step is a reference to an instance of a class with the following properties:

    • The class implements the java.lang.template.StringTemplate interface.

    • The instance method java.lang.template.StringTemplate.values returns the embedded values v1, ..., vn, in that order.

    • The instance method java.lang.template.StringTemplate.fragments returns exactly the fragment literals of the template, in the same order they appear in the template.

    • The instance method java.lang.template.StringTemplate.interpolate of the class instance returns the strict alternate interleaved string concatenation of (1) exactly the fragment literals in the same order they appear in the template, and (2) the embedded values v1, ..., vn, in that order, beginning with the first fragment literal.

  3. The result of evaluating the template expression is determined as if invoking the method process on the result of step 1, with the argument given by the result of step 2. If this method invocation completes abruptly, the entire template expression completes abruptly for the same reason.
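The three evaluation steps can be modelled in ordinary Java, without template syntax or preview features. Everything in the sketch below is a hypothetical stand-in: ModelTemplate plays the role of java.lang.template.StringTemplate, ModelProcessor plays the role of java.lang.template.ValidatingProcessor<R, E>, and each embedded expression is represented by a Supplier, with a null entry standing for an empty embedded expression. It is a model of the described behaviour under these assumptions, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for java.lang.template.StringTemplate.
final class ModelTemplate {
    private final List<String> fragments;
    private final List<Object> values;

    ModelTemplate(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        this.fragments = Collections.unmodifiableList(new ArrayList<>(fragments));
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    List<String> fragments() { return fragments; }
    List<Object> values() { return values; }

    // Strict alternation of fragments and values, beginning with the first fragment.
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }
}

// Hypothetical stand-in for java.lang.template.ValidatingProcessor<R, E>.
interface ModelProcessor<R, E extends Throwable> {
    R process(ModelTemplate template) throws E;
}

final class TemplateEvaluation {
    static <R, E extends Throwable> R evaluate(ModelProcessor<R, E> processor,
                                               List<String> fragments,
                                               List<Supplier<Object>> embedded) throws E {
        // Step 1: a null processor value results in a NullPointerException.
        Objects.requireNonNull(processor, "template processor");

        // Step 2: evaluate the embedded expressions from left to right.
        // A null entry models an empty embedded expression, whose value is null.
        List<Object> values = new ArrayList<>();
        for (Supplier<Object> expression : embedded) {
            values.add(expression == null ? null : expression.get());
        }
        ModelTemplate template = new ModelTemplate(fragments, values);

        // Step 3: the result is whatever process returns; its type is R.
        return processor.process(template);
    }
}
```

In this model, a processor defined as ModelTemplate::interpolate behaves like simple string interpolation, while a processor such as t -> t.values().size() shows that the result type R need not be String.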