
src/java.base/share/classes/java/lang/Character.java

rev 52374 : 8213046: Define Japanese new Era character
Reviewed-by:


  38  * {@code Character} contains a single field whose type is
  39  * {@code char}.
  40  * <p>
  41  * In addition, this class provides several methods for determining
  42  * a character's category (lowercase letter, digit, etc.) and for converting
  43  * characters from uppercase to lowercase and vice versa.
  44  * <p>
  45  * Character information is based on the Unicode Standard, version 10.0.0.
  46  * <p>
  47  * The methods and data of class {@code Character} are defined by
  48  * the information in the <i>UnicodeData</i> file that is part of the
  49  * Unicode Character Database maintained by the Unicode
  50  * Consortium. This file specifies various properties including name
  51  * and general category for every defined Unicode code point or
  52  * character range.
  53  * <p>
  54  * The file and its description are available from the Unicode Consortium at:
  55  * <ul>
  56  * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
  57  * </ul>
  58  *
  59  * <h3><a id="unicode">Unicode Character Representations</a></h3>
  60  *
  61  * <p>The {@code char} data type (and therefore the value that a
  62  * {@code Character} object encapsulates) are based on the
  63  * original Unicode specification, which defined characters as
  64  * fixed-width 16-bit entities. The Unicode Standard has since been
  65  * changed to allow for characters whose representation requires more
  66  * than 16 bits.  The range of legal <em>code point</em>s is now
  67  * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
  68  * (Refer to the <a
  69  * href="http://www.unicode.org/reports/tr27/#notation"><i>
  70  * definition</i></a> of the U+<i>n</i> notation in the Unicode
  71  * Standard.)
  72  *
  73  * <p><a id="BMP">The set of characters from U+0000 to U+FFFF</a> is
  74  * sometimes referred to as the <em>Basic Multilingual Plane (BMP)</em>.
  75  * <a id="supplementary">Characters</a> whose code points are greater
  76  * than U+FFFF are called <em>supplementary character</em>s.  The Java
  77  * platform uses the UTF-16 representation in {@code char} arrays and




  38  * {@code Character} contains a single field whose type is
  39  * {@code char}.
  40  * <p>
  41  * In addition, this class provides several methods for determining
  42  * a character's category (lowercase letter, digit, etc.) and for converting
  43  * characters from uppercase to lowercase and vice versa.
  44  * <p>
  45  * Character information is based on the Unicode Standard, version 10.0.0.
  46  * <p>
  47  * The methods and data of class {@code Character} are defined by
  48  * the information in the <i>UnicodeData</i> file that is part of the
  49  * Unicode Character Database maintained by the Unicode
  50  * Consortium. This file specifies various properties including name
  51  * and general category for every defined Unicode code point or
  52  * character range.
  53  * <p>
  54  * The file and its description are available from the Unicode Consortium at:
  55  * <ul>
  56  * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
  57  * </ul>
  58  * <p>
  59  * The code point, U+32FF, is reserved by the Unicode Consortium
  60  * to represent the Japanese square character for the new era that begins
  61  * May 2019. Relevant methods in the Character class return the same
  62  * properties as for the existing Japanese era characters (e.g., U+337E for
  63  * "Meizi"). For the details of the code point, refer to
  64  * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
  65  * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.
  66  *
  67  * <h3><a id="unicode">Unicode Character Representations</a></h3>
  68  *
  69  * <p>The {@code char} data type (and therefore the value that a
  70  * {@code Character} object encapsulates) are based on the
  71  * original Unicode specification, which defined characters as
  72  * fixed-width 16-bit entities. The Unicode Standard has since been
  73  * changed to allow for characters whose representation requires more
  74  * than 16 bits.  The range of legal <em>code point</em>s is now
  75  * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
  76  * (Refer to the <a
  77  * href="http://www.unicode.org/reports/tr27/#notation"><i>
  78  * definition</i></a> of the U+<i>n</i> notation in the Unicode
  79  * Standard.)
  80  *
  81  * <p><a id="BMP">The set of characters from U+0000 to U+FFFF</a> is
  82  * sometimes referred to as the <em>Basic Multilingual Plane (BMP)</em>.
  83  * <a id="supplementary">Characters</a> whose code points are greater
  84  * than U+FFFF are called <em>supplementary character</em>s.  The Java
  85  * platform uses the UTF-16 representation in {@code char} arrays and
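
The added paragraph states that the relevant Character methods return the same properties for U+32FF as for the existing era characters such as U+337E. A minimal Java sketch of how that claim could be checked is given here. The class name EraCodePointSketch and its helper methods are hypothetical and are not part of the patch, and the outcome of the category comparison for U+32FF is an assumption that holds only on a JDK that includes this change. The sketch also illustrates the BMP versus supplementary distinction from the surrounding text, using Character.charCount and Character.toChars.

```java
final class EraCodePointSketch {
    // U+32FF: the code point the patch describes as reserved for the new era character.
    static final int NEW_ERA = 0x32FF;
    // U+337E: the existing era character ("Meizi") that the patch names for comparison.
    static final int MEIZI = 0x337E;

    /** Number of UTF-16 char values needed for a code point: 1 in the BMP, 2 otherwise. */
    static int utf16Length(int codePoint) {
        return Character.charCount(codePoint);
    }

    /** True if both code points report the same general category on the running JDK. */
    static boolean sameCategory(int a, int b) {
        return Character.getType(a) == Character.getType(b);
    }

    public static void main(String[] args) {
        // U+32FF lies in the BMP, so it fits in a single char.
        System.out.println("U+32FF in BMP: " + Character.isBmpCodePoint(NEW_ERA));
        System.out.println("U+32FF UTF-16 length: " + utf16Length(NEW_ERA));

        // Assumption: this prints true only on a JDK that carries the change under review.
        System.out.println("U+32FF same category as U+337E: " + sameCategory(NEW_ERA, MEIZI));

        // A supplementary code point (greater than U+FFFF) needs a surrogate pair.
        int supplementary = 0x1F600;
        char[] units = Character.toChars(supplementary);
        System.out.println("U+1F600 supplementary: "
                + Character.isSupplementaryCodePoint(supplementary));
        System.out.println("U+1F600 UTF-16 length: " + units.length);
        System.out.println("surrogate pair: "
                + Character.isHighSurrogate(units[0]) + ", "
                + Character.isLowSurrogate(units[1]));
    }
}
```

Comparing Character.getType values keeps the check independent of the concrete category constant, so the same helper can be reused for any other property the patch is expected to align, such as Character.isDefined.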

