Mark Reinhold
2011/12/20
Copyright © 2011 Oracle and/or its affiliates · All rights reserved
This document is an overview of the current state of Project Jigsaw, an exploratory effort to design and implement a module system for the Java SE Platform and to apply that system to the Platform itself and to the JDK.
This document is not yet complete. Additional sections covering compilation, packaging, libraries, repositories, the module-system API, and the modularization of the JDK are in preparation.
Every feature mentioned here has been implemented in the main Jigsaw repository unless otherwise noted.
Comments to: jigsaw dash dev at openjdk dot java dot net
The Jigsaw module system is designed to be both approachable and scalable: Approachable by all developers, yet sufficiently scalable to support the modularization of large legacy software systems in general and the JDK in particular. It aims to implement a set of general requirements; its detailed design has been further guided by the following principles:
Modularity is a language construct – The best way to support modular programming in a standard way in the Java platform is to extend the language itself to support modules. Developers already think about standard kinds of program components such as classes and interfaces in terms of the language; modules should be just another kind of program component.
Module boundaries should be strongly enforced – A class that is private to a module should be private in exactly the same way that a private field is private to a class. In other words, module boundaries should determine not just the visibility of classes and interfaces but also their accessibility. Without this guarantee it is impossible to construct modular systems capable of running untrusted code securely.
Static, single-version module resolution is usually sufficient – Most applications do not need to add or remove modules dynamically at run time, nor do they need to use multiple versions of the same module simultaneously. The module system should be optimized for common scenarios but also support narrowly-scoped forms of dynamic multi-version resolution motivated by actual use cases such as application servers, IDEs, and test harnesses.
A module is a collection of Java types (i.e., classes and interfaces) with a name, an optional version number, and a formal description of its relationships to other modules. In addition to Java types a module can include resource files, configuration files, native libraries, and native commands. A module can be cryptographically signed so that its authenticity can be validated.
The most important type of inter-module relationship is that of dependence, in which one module declares that it depends upon some other module by specifying that module’s name and possibly also a constraint upon the range of allowable versions.
A module dependence is not necessarily precise: Multiple modules with the same name but different version numbers might be available to satisfy it. Before a module can be used each of its dependences must be resolved to a specific module. Given an initial set of modules, resolution is the process of locating additional modules, as required, and constructing a superset of that set in which every dependence is optimally satisfied.
TODO: This compile-time preference for older versions is not yet implemented.
There are three principal phases in the lifetime of a module:
Compile time – A module’s dependences are resolved, its types are compiled from Java source files, and its other content is compiled or constructed as appropriate. The results are then packaged up for publication and distribution.
Install time – A module is inserted into a module library, i.e., a collection of previously-installed modules. If the module is invokable, i.e., it has an entry point, then it is made ready for use by resolving its dependences and storing the result of that computation in a persistent configuration.
Run time – An invokable module is loaded into a running Java virtual machine and linked up to the other modules upon which it depends as recorded in its configuration during installation.
The phase in which resolution is performed determines how dependences are satisfied. At compile time the oldest available version of a module satisfying a dependence is preferred, while in later phases the newest version is preferred.
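The phase-dependent preference described above can be illustrated with a minimal sketch. It is not the Jigsaw resolver: the class PhaseResolution, the Phase enum, and the use of plain integers as version numbers are all assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class PhaseResolution {
    // Hypothetical names for the three phases described in the text.
    enum Phase { COMPILE_TIME, INSTALL_TIME, RUN_TIME }

    // Picks one version from those satisfying the constraint: the oldest at
    // compile time, the newest in later phases. Versions are modelled as
    // plain integers purely to keep the sketch short.
    static Optional<Integer> select(List<Integer> available,
                                    Predicate<Integer> constraint,
                                    Phase phase) {
        Comparator<Integer> order = Comparator.naturalOrder();
        return phase == Phase.COMPILE_TIME
            ? available.stream().filter(constraint).min(order)
            : available.stream().filter(constraint).max(order);
    }

    public static void main(String[] args) {
        List<Integer> fooVersions = List.of(1, 2, 3);
        Predicate<Integer> atLeastTwo = v -> v >= 2;
        System.out.println(select(fooVersions, atLeastTwo, Phase.COMPILE_TIME)); // Optional[2]
        System.out.println(select(fooVersions, atLeastTwo, Phase.RUN_TIME));     // Optional[3]
    }
}
```

An empty result corresponds to a dependence that cannot be resolved in the given phase.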
When compiling a module, the Java compiler writes class files into a module-structured classes directory. In this module-path layout there is one top-level directory for each module; the content of each module directory is structured as a normal classes directory, i.e., a tree of decomposed Java package names. In order to support interactive development, the Java launcher can run a modular application directly from a module-path directory. When doing so it performs resolution before invoking the application’s entry point, although the resulting configuration is not stored for future use.
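As a rough illustration of the module-path layout, the sketch below computes where a class file would be expected. It assumes that each module directory is named after its module; the class ModulePathLayout and the directory name "modules" are hypothetical, and nested classes are ignored.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ModulePathLayout {
    // Maps a module name and a fully-qualified class name to the expected
    // class-file location: <module-path>/<module>/<package dirs>/<Class>.class
    static Path classFile(Path modulePath, String moduleName, String className) {
        String relative = className.replace('.', '/') + ".class";
        return modulePath.resolve(moduleName).resolve(relative);
    }

    public static void main(String[] args) {
        // "modules" is a hypothetical module-path directory.
        System.out.println(classFile(Paths.get("modules"), "foo", "foo.Main"));
    }
}
```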
The module system does not support general dynamic run-time resolution; i.e., it is not possible to add or remove dependences or modules after an application has started running. Sophisticated container-type programs such as application servers, IDEs, and test harnesses can achieve the effect of run-time resolution in a limited way by using the module-system API to install modules into a temporary module library and then run them from that library.
TODO: Finish implementing run-time module-path support.
TODO: Design and implement container support.
The Java programming language is extended to include module
declarations for the purpose of defining modules, their
content, and their relationships to other modules. A compilation unit
containing a module declaration is, by convention, stored in a file named
module-info.java and compiled into a file named module-info.class.
The simplest possible module declaration merely expresses the existence of a module with a specific name:
module foo { }
If a module has a version number then that is placed after the module
name, preceded by an @ character:
module foo @ 1.0 { }
A version number starts with a digit and thereafter consists of Java
identifier-part characters, periods ('.'), and dashes ('-').
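The version-number syntax just described can be checked with a few lines of Java. This is only a sketch: the class VersionSyntax is hypothetical, and treating "digit" as an ASCII digit is an assumption that the text does not settle.

```java
final class VersionSyntax {
    // True if the string starts with a digit and otherwise contains only
    // Java identifier-part characters, periods, and dashes.
    static boolean isValid(String version) {
        if (version.isEmpty()) {
            return false;
        }
        char first = version.charAt(0);
        if (first < '0' || first > '9') { // assumption: ASCII digits only
            return false;
        }
        for (int i = 1; i < version.length(); i++) {
            char c = version.charAt(i);
            if (!Character.isJavaIdentifierPart(c) && c != '.' && c != '-') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.0", "5.1a", "1.0-beta_2", "a1", "1.0 rc"}) {
            System.out.println(v + " -> " + isValid(v));
        }
    }
}
```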
Module names are qualified Java identifiers, just like Java package names:
module foo.bar { }
Module declarations cannot be annotated.
Source and class files for ordinary Java types do not specify the modules of which they are members.
An exports clause in a module declaration makes the public types in the
package it names available to other modules:
module foo {
exports foo; // Export all public types in the foo package
}
Here the foo module exports all of the public types in its foo
package, though not in any subpackages of foo. No other public types
declared in the foo module are exported. There is no requirement that
a module declare and export a package of the same name, though that is
conventional for simple modules. It is not possible to export non-public
types.
Multiple exports clauses are, of course, allowed:
module foo {
exports foo;
exports foo.spi;
exports foo.util;
}
A module’s exports declarations govern the accessibility of the
public types declared in the named packages. They are thus enforced at both
compile time, by the Java compiler, and at run time, by the virtual
machine.
TODO: Finish initial implementation.
ISSUE: Use package names that differ from the module names in these examples, to improve readability?
The requires clause expresses the dependence of one module upon
another:
module foo {
exports foo;
}
module bar {
requires foo;
}
Here the bar module depends upon the foo module, so at both compile
time and run time the exported types declared in foo are both visible
to and accessible by types declared in bar. If no foo module is
available then bar cannot be compiled, and if bar is invokable then
neither can it be installed or invoked.
At run time foo and bar will have distinct module class loaders, and
bar’s loader will use foo’s loader to load the types exported by
foo.
An exports clause in a module’s declaration only affects the
availability of types declared in that module; it cannot be used to
re-export types imported from other modules.
ISSUE: Do we need disjunctive dependences? Negative dependences?
When bar simply requires foo then the exported types in foo are
available to bar but not to other modules that depend upon bar and
not upon foo. Imported types can be re-exported via the public
modifier of the requires clause:
module foo {
exports foo;
}
module bar {
requires public foo; // Re-exports foo's exported types
}
module baz {
requires bar; // Can use foo's exported types
}
The public modifier makes the types imported into bar from foo
available to any other module that depends directly upon bar.
Types can be re-exported through a chain of modules:
module foo {
exports foo;
}
module bar {
requires public foo;
}
module baz {
requires public bar;
}
module buz {
requires baz; // Can also use foo's exported types
}
In this case any other module that depends upon either bar or baz
will be able to use public types exported by foo without depending upon
foo itself.
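The effect of chained re-export can be modelled as a small graph walk. The sketch below is not part of the module-system API; the class ReexportSketch, its Requires type, and the method names are hypothetical. It computes the set of modules whose exported types are available to a given module.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReexportSketch {
    // One requires clause: the target module and whether it is "requires public".
    static final class Requires {
        final String target;
        final boolean isPublic;
        Requires(String target, boolean isPublic) {
            this.target = target;
            this.isPublic = isPublic;
        }
    }

    // Modules whose exported types are available to the given module:
    // its direct dependences, plus anything those re-export transitively.
    static Set<String> visibleModules(String module, Map<String, List<Requires>> decls) {
        Set<String> result = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        for (Requires r : decls.getOrDefault(module, List.of())) {
            pending.push(r.target);
        }
        while (!pending.isEmpty()) {
            String m = pending.pop();
            if (!result.add(m)) {
                continue; // already visited
            }
            for (Requires r : decls.getOrDefault(m, List.of())) {
                if (r.isPublic) {
                    pending.push(r.target); // follow re-exports only
                }
            }
        }
        return result;
    }

    // The foo/bar/baz/buz chain from the example above.
    static Map<String, List<Requires>> chain() {
        Map<String, List<Requires>> decls = new HashMap<>();
        decls.put("foo", List.of());
        decls.put("bar", List.of(new Requires("foo", true)));
        decls.put("baz", List.of(new Requires("bar", true)));
        decls.put("buz", List.of(new Requires("baz", false)));
        return decls;
    }

    public static void main(String[] args) {
        System.out.println(visibleModules("buz", chain())); // [bar, baz, foo]
    }
}
```

If bar's dependence on foo were not public, the walk from buz would stop at bar and foo's types would not be reachable.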
A requires clause can include a version constraint:
module bar {
requires foo @ 1.0;
}
This dependence of bar upon foo can be satisfied only by a foo
module whose version is exactly 1.0. More-flexible constraints are
useful in practice, so a constraint can be specified in terms of an
exclusive or inclusive lower or upper bound:
module bar {
requires foo @ >= 1.0;
requires baz @ < 5.1a;
}
These dependences can be satisfied by any foo module with version 1.0
or later and any baz module with a version strictly less than 5.1a.
No specific semantics are imposed upon version numbers. Version numbers are compared using an algorithm similar to that of the Debian packaging system.
TODO: Support both lower and upper bounds in version constraints.
TODO: Re-examine the version-comparison algorithm.
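To make version constraints concrete, here is a simplified sketch of constraint checking. The comparison is only loosely in the spirit of Debian's: it alternates non-digit and digit runs and omits epochs, tildes, and Debian's special character ordering, so it should not be read as the Jigsaw algorithm. The class VersionOrder and the "=" operator name for an exact match are assumptions.

```java
import java.math.BigInteger;

final class VersionOrder {
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static BigInteger number(String digits) {
        // An empty digit run counts as zero.
        return digits.isEmpty() ? BigInteger.ZERO : new BigInteger(digits);
    }

    // Compares two versions by alternating runs: non-digit runs are compared
    // as plain strings, digit runs numerically.
    static int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() || j < b.length()) {
            int i0 = i, j0 = j;
            while (i < a.length() && !isDigit(a.charAt(i))) i++;
            while (j < b.length() && !isDigit(b.charAt(j))) j++;
            int c = a.substring(i0, i).compareTo(b.substring(j0, j));
            if (c != 0) return c;

            i0 = i;
            j0 = j;
            while (i < a.length() && isDigit(a.charAt(i))) i++;
            while (j < b.length() && isDigit(b.charAt(j))) j++;
            c = number(a.substring(i0, i)).compareTo(number(b.substring(j0, j)));
            if (c != 0) return c;
        }
        return 0;
    }

    // Checks a version against a single bound; "=" (hypothetical spelling)
    // stands for the exact match written without an operator in the text.
    static boolean satisfies(String version, String op, String bound) {
        int c = compare(version, bound);
        switch (op) {
            case "=":  return c == 0;
            case "<":  return c < 0;
            case "<=": return c <= 0;
            case ">":  return c > 0;
            case ">=": return c >= 0;
            default:   throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(satisfies("1.10", ">=", "1.0")); // foo @ >= 1.0
        System.out.println(satisfies("5.1a", "<", "5.1a")); // baz @ < 5.1a
    }
}
```

Under this ordering 1.10 sorts after 1.2, because digit runs are compared as numbers rather than as text.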
In large software systems it is often useful to restrict the set of
modules that can depend upon some other module. The permits clause
expresses such a constraint:
module foo {
exports foo;
permits bar;
permits baz;
}
Here the module foo can be required only by modules named bar or
baz. A dependence from a module of some other name upon foo will not
be resolvable at compile time, install time, or run time. If no
permits clauses are present then there are no such constraints.
The bar and baz modules can re-export foo’s exported types via
requires public clauses. Care must be taken, therefore, when writing
permits clauses.
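The permits rule amounts to a simple membership test, sketched below under the assumption that a module's permits clauses are available as a set of names. The class PermitsCheck and the module name qux are hypothetical.

```java
import java.util.Set;

final class PermitsCheck {
    // True if the requiring module may depend upon the target. An empty
    // permits set means the target places no restriction on its clients.
    static boolean mayRequire(String requirer, Set<String> permitsOfTarget) {
        return permitsOfTarget.isEmpty() || permitsOfTarget.contains(requirer);
    }

    public static void main(String[] args) {
        Set<String> fooPermits = Set.of("bar", "baz");
        System.out.println(mayRequire("bar", fooPermits)); // true
        System.out.println(mayRequire("qux", fooPermits)); // false
    }
}
```

Note that this check says nothing about re-export: a permitted module can still pass foo's types on via requires public, which is the caveat raised above.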
TODO: Controlling permits by module name alone is not sufficient, since an adversary can install a module of any given name. At the same time, for debugging the JDK itself it’s desirable to be able to install an experimental version of a JDK module into a local module library which delegates to the built-in module library of a pre-installed JDK, so simply limiting permitted modules to just those in the same module library won’t work in general. We need to explore more alternatives here.
ISSUE: Should it be possible to restrict a permitted module from re-exporting a permitting module’s exported types?
To support the refactoring of large modular systems, and also to allow
the separation of module names corresponding to well-defined standards
(e.g., java.base) from the names of modules implementing those
standards (e.g., jdk.base), the provides clause declares an alternate
name for a module:
module foo {
provides bar;
}
Given this declaration, any dependence upon bar can be satisfied by
foo. More than one provides clause can be present.
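Since aliases are not yet implemented, the following is only a sketch of how a lookup by name might take provides clauses into account. The class AliasLookup and its data model (a map from module name to declared alternate names) are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class AliasLookup {
    // Returns the modules that can satisfy a dependence upon the wanted name:
    // a module of that name, or any module declaring it as an alternate name.
    static List<String> candidates(String wanted, Map<String, Set<String>> providesByModule) {
        List<String> result = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for the sketch.
        for (Map.Entry<String, Set<String>> e : new TreeMap<>(providesByModule).entrySet()) {
            if (e.getKey().equals(wanted) || e.getValue().contains(wanted)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // module foo { provides bar; }
        Map<String, Set<String>> decls = Map.of("foo", Set.of("bar"));
        System.out.println(candidates("bar", decls)); // [foo]
    }
}
```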
TODO: Implement aliases.
ISSUE: Should aliases have version numbers? The syntax currently allows them. They appear to be necessary to support refactoring by aggregation. In popular native packaging systems, however, the natural mapping of a module alias is to a virtual package, and virtual packages don’t have version numbers.
If a module declares a class with a traditional public static void main
entry point then it can be made into an application module via the
class clause:
module foo {
class foo.Main; // Contains the main method
}
The java launcher can then be used to invoke the module:
$ java -m foo
in which case the foo.Main.main method is found and invoked in the
usual fashion. Any remaining command-line arguments are passed to the
main method as usual.
A module declaration can contain at most one class clause.
ISSUE: Should there be a way to suggest, if not specify, an external name for the entry point for use by external agents such as command shells?
ISSUE: Should entry points be expressed instead as services?
A dependence from one module to another can be declared optional:
module bar {
requires optional foo;
}
If no foo module is available then bar can still be installed and
invoked. Code in bar that uses types from foo must be written
defensively so that it operates properly when foo is not available.
A foo module must still be available when compiling bar since code in
bar can depend upon types declared in foo.
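One common way to write such defensive code in Java is to probe for a type reflectively before touching it. The sketch below uses the standard Class.forName call; the class OptionalDependence and the type name foo.Widget are hypothetical, and nothing here is specific to the Jigsaw API.

```java
final class OptionalDependence {
    // Returns true if the named class can be found by this class's loader.
    // The class is not initialized by the probe.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, OptionalDependence.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Chooses a code path depending on whether the optional module's
    // types are present. "foo.Widget" is a hypothetical type from foo.
    static String describe() {
        return isAvailable("foo.Widget")
            ? "foo present, using it"
            : "foo absent, using fallback";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Code that refers to foo's types directly is best kept in separate classes that are only loaded after such a check succeeds.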
A dependence from one module to another can be declared local:
module bar {
requires local foo;
}
To resolve this dependence, foo must explicitly permit bar.
A local dependence allows two modules to define types in the same Java package:
module foo {
permits bar;
exports p;
}
module bar {
requires local foo;
exports p;
}
Such multi-module packages, also called split packages, are sometimes required when modularizing large legacy systems.
With a local dependence, types declared in the same package in each module can make use of public, protected, and even package-private types and members declared in the same package in the other module. The public types exported by each module are implicitly re-exported by the other. At run time this is all achieved by using the same module class loader for both modules.
More than two modules can be related by local dependences:
module foo {
permits bar;
exports p;
}
module bar {
requires local foo;
permits baz;
exports p;
}
module baz {
requires local bar;
exports p;
}
In this case all three modules would, at run time, be loaded by the same module class loader.
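The grouping of modules onto shared class loaders can be pictured as computing connected components over local dependences. The following is a minimal sketch, not the Jigsaw implementation; the class LoaderGroups, its methods, and the unrelated module qux are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LoaderGroups {
    // Follows parent links until reaching a group's representative.
    private static String find(Map<String, String> parent, String module) {
        while (!parent.get(module).equals(module)) {
            module = parent.get(module);
        }
        return module;
    }

    // Maps every module to a representative of its loader group.
    // localDeps maps a module to the modules it "requires local".
    static Map<String, String> loaderOf(Map<String, List<String>> localDeps) {
        Map<String, String> parent = new HashMap<>();
        for (Map.Entry<String, List<String>> e : localDeps.entrySet()) {
            parent.putIfAbsent(e.getKey(), e.getKey());
            for (String target : e.getValue()) {
                parent.putIfAbsent(target, target);
                // Merge the groups of the two modules.
                parent.put(find(parent, e.getKey()), find(parent, target));
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (String module : parent.keySet()) {
            result.put(module, find(parent, module));
        }
        return result;
    }

    // The three-module example above, plus a hypothetical unrelated module.
    static Map<String, List<String>> example() {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("foo", List.of());
        deps.put("bar", List.of("foo"));
        deps.put("baz", List.of("bar"));
        deps.put("qux", List.of());
        return deps;
    }

    public static void main(String[] args) {
        // foo, bar, and baz map to the same representative; qux maps to itself.
        System.out.println(loaderOf(example()));
    }
}
```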
ISSUE: Should requires local public be illegal?
ISSUE: Should each module in a set of modules related by local dependence be required explicitly to permit all the other modules? That is not the case today, but it is arguably safer.
The bindings of a module are the types defined within it together with
those imported from other modules via requires clauses. The view of
a module is a subset of its bindings, namely the set of types that it
exports, via exports and requires public clauses, and the set of
modules to which those types are available, as constrained by any
permits clauses.
module bar {
requires foo;
exports bar;
}
This bar module binds types defined locally, e.g., on the module path
under the bar module directory, as well as all public types exported
from the module foo. It defines a single view which exports all public
types in the bar package to any other module.
In large software systems it is often useful to define multiple views of the same module. One view can, e.g., be declared for general use by any other module, while another provides access to internal interfaces intended only for use by a select set of closely-related modules.
A series of exports, requires public, and permits clauses at the
top syntactic level of a module declaration defines the module’s default
view. Further views of a module’s bindings can be defined using the
view construct, which specifies a view name together with a bracketed
list of exports and permits declarations:
module bar {
    requires foo;
    exports bar;
    view bar.internal {
        permits baz;
        exports bar.private;
    }
}
The bar module now defines two views. The default view, available by
referencing the module name bar, is the same as before—it’s as if the
declaration also said view bar { exports bar; }. The new view, named
bar.internal, is available only to the baz module. It exports all
public types in the bar.private package. It also exports all public
types in the bar package because the non-default views of a module
inherit the exports clauses of that module’s default view.
A non-default view never has requires clauses.
A non-default view cannot declare its version; it inherits the version, if any, of its containing module.
A non-default view does not inherit the permits clauses, if any, of its
containing module.
In addition to declaring exports and entry points, a non-default view can also declare aliases and services.
A non-default view can, finally, also declare an entry point different from that of its containing module’s default view, so a single module can define multiple related entry points. For example, the declaration
module commands {
    view cat {
        class org.foo.commands.Cat;
    }
    view find {
        class org.foo.commands.Find;
    }
    view ls {
        class org.foo.commands.List;
    }
}
defines three entry points: cat, find, and ls.
HISTORICAL NOTE: Module views are not a new idea. The concept proposed here is very similar to that of structures in the module systems of Scheme 48 and Standard ML.
TODO: Finish initial implementation.
ISSUE: Should a non-default view instead not inherit the types exported by the default view of its containing module declaration? If so, should there be a way to declare explicitly that a view inherits the exported types of the default view, or perhaps some other view?
ISSUE: How do views map to native packaging systems such as RPM or Debian? Treating a module view as a virtual package would probably work but might not scale well. Another possibility is to structure the names of non-default views so that they always include the names of their containing modules, but that turns views into second-class entities.
The module system assumes the existence of a foundational module named
java.base, which is the one module that must be present in every Java
SE implementation. It is the module upon which all others depend, either
implicitly or explicitly, somewhat akin to the implicit reference to the
java.lang package by every compilation unit.
If a module does not declare an explicit dependence upon a java.base
module, is not itself named java.base, and does not define an alias or
view named java.base, then at compile time a synthesized dependence
upon java.base is inserted into the compiled module declaration. The
version constraint in this dependence is of the form >= N, where N is
the version number given to the -target option of the Java compiler, if
any, or else the version number of the Java SE Platform Specification
implemented by the system of which the compiler is a part.
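The rule for synthesizing the java.base dependence can be restated as a short predicate. This is a sketch only: the class BaseDependence, its parameters, and the rendering of the clause as a string are assumptions, and the platformVersion argument stands for N as determined above.

```java
import java.util.Optional;
import java.util.Set;

final class BaseDependence {
    static final String BASE = "java.base";

    // Returns the synthesized clause, or empty if none is needed because the
    // module is java.base itself, already requires it, or defines an alias
    // or view of that name.
    static Optional<String> synthesize(String moduleName,
                                       Set<String> requiredModules,
                                       Set<String> aliasAndViewNames,
                                       String platformVersion) {
        boolean needed = !moduleName.equals(BASE)
            && !requiredModules.contains(BASE)
            && !aliasAndViewNames.contains(BASE);
        return needed
            ? Optional.of("requires " + BASE + " @ >= " + platformVersion + ";")
            : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(synthesize("foo", Set.of(), Set.of(), "8"));
        System.out.println(synthesize("bar", Set.of(BASE), Set.of(), "8"));
    }
}
```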
A module can declare that it provides a service:
module foo {
provides service mammals.Wombat with foo.WombatImpl;
}
Here the foo module declares that it implements the mammals.Wombat
service using the class foo.WombatImpl.
To make use of a service, a module must first declare a dependence upon it:
module bar {
requires service mammals.Wombat;
}
Code in the bar module can use an enhanced version of the
ServiceLoader API to access instances of the Wombat service.
The order in which instances are returned is not specified.
A module can declare a service dependence to be optional, in which case
it is possible to use the module even when no provider of the service is
available. As with optional module dependences, code in such modules
must be written defensively so that it operates properly when no
providers are present.
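Client code for a service might look roughly like the sketch below. It uses the existing java.util.ServiceLoader API rather than the enhanced version mentioned above, whose details are not given here. The class ServiceLookup and the nested Wombat interface are hypothetical stand-ins for mammals.Wombat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class ServiceLookup {
    // Hypothetical stand-in for the mammals.Wombat service interface.
    interface Wombat {
        String name();
    }

    // Collects all available providers of the given service. The order of
    // the result should not be relied upon.
    static <S> List<S> providers(Class<S> service) {
        List<S> found = new ArrayList<>();
        for (S instance : ServiceLoader.load(service)) {
            found.add(instance);
        }
        return found;
    }

    public static void main(String[] args) {
        List<Wombat> wombats = providers(Wombat.class);
        if (wombats.isEmpty()) {
            // Defensive path for an optional service dependence.
            System.out.println("No Wombat providers available");
        } else {
            for (Wombat w : wombats) {
                System.out.println("Found " + w.name());
            }
        }
    }
}
```

Handling the empty case explicitly is what allows a module with an optional service dependence to keep working when no provider is installed.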
Services are not themselves versioned. A service is defined by a specific interface or abstract class, hence it is implicitly versioned by the version of the module that declares that type.
If a module defining a service also exports some types then those types are available only to modules that have regular module dependences upon it, either directly or indirectly. Classes that implement services are not exported implicitly, nor do they need to be exported explicitly. A class that implements a service can therefore remain both invisible and inaccessible to the clients of that service.
TODO: Finish working out the design and implementation.
ISSUE: Should permits clauses affect service lookup?