/*
 * Copyright (c) 2002, 2018, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

package java.lang;

import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
import java.util.Locale;

import jdk.internal.HotSpotIntrinsicCandidate;
import jdk.internal.misc.VM;

/**
 * The {@code Character} class wraps a value of the primitive
 * type {@code char} in an object. An object of type
 * {@code Character} contains a single field whose type is
 * {@code char}.
 * <p>
 * In addition, this class provides several methods for determining
 * a character's category (lowercase letter, digit, etc.) and for converting
 * characters from uppercase to lowercase and vice versa.
 * <p>
 * Character information is based on the Unicode Standard, version 11.0.0.
 * <p>
 * The methods and data of class {@code Character} are defined by
 * the information in the <i>UnicodeData</i> file that is part of the
 * Unicode Character Database maintained by the Unicode
 * Consortium. This file specifies various properties including name
 * and general category for every defined Unicode code point or
 * character range.
 * <p>
 * The file and its description are available from the Unicode Consortium at:
 * <ul>
 * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
 * </ul>
 * <p>
 * The code point, U+32FF, is reserved by the Unicode Consortium
 * to represent the Japanese square character for the new era that begins
 * May 2019. Relevant methods in the Character class return the same
 * properties as for the existing Japanese era characters (e.g., U+337E for
 * "Meizi"). For the details of the code point, refer to
 * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
 * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.
 *
 * <h3><a id="unicode">Unicode Character Representations</a></h3>
 *
 * <p>The {@code char} data type (and therefore the value that a
 * {@code Character} object encapsulates) are based on the
 * original Unicode specification, which defined characters as
 * fixed-width 16-bit entities. The Unicode Standard has since been
 * changed to allow for characters whose representation requires more
 * than 16 bits. The range of legal <em>code point</em>s is now
 * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
 * (Refer to the <a
 * href="http://www.unicode.org/reports/tr27/#notation"><i>
 * definition</i></a> of the U+<i>n</i> notation in the Unicode
 * Standard.)
 *
 * <p><a id="BMP">The set of characters from U+0000 to U+FFFF</a> is
 * sometimes referred to as the <em>Basic Multilingual Plane (BMP)</em>.
 * <a id="supplementary">Characters</a> whose code points are greater
 * than U+FFFF are called <em>supplementary character</em>s. The Java
 * platform uses the UTF-16 representation in {@code char} arrays and
|
/*
 * Copyright (c) 2002, 2019, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

package java.lang;

import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
import java.util.Locale;

import jdk.internal.HotSpotIntrinsicCandidate;
import jdk.internal.misc.VM;

/**
 * The {@code Character} class wraps a value of the primitive
 * type {@code char} in an object. An object of class
 * {@code Character} contains a single field whose type is
 * {@code char}.
 * <p>
 * In addition, this class provides a large number of static methods for
 * determining a character's category (lowercase letter, digit, etc.)
 * and for converting characters from uppercase to lowercase and vice
 * versa.
 *
 * <h3><a id="conformance">Unicode Conformance</a></h3>
 * <p>
 * The fields and methods of class {@code Character} are defined in terms
 * of character information from the Unicode Standard, specifically the
 * <i>UnicodeData</i> file that is part of the Unicode Character Database.
 * This file specifies properties including name and category for every
 * assigned Unicode code point or character range. The file is available
 * from the Unicode Consortium at
 * <a href="http://www.unicode.org">http://www.unicode.org</a>.
 * <p>
 * The Java SE 12 Platform uses character information from version 11.0
 * of the Unicode Standard, plus the Japanese Era code point,
 * {@code U+32FF}, from the first version of the Unicode Standard
 * after 11.0 that assigns the code point.
 *
 * <h3><a id="unicode">Unicode Character Representations</a></h3>
 *
 * <p>The {@code char} data type (and therefore the value that a
 * {@code Character} object encapsulates) are based on the
 * original Unicode specification, which defined characters as
 * fixed-width 16-bit entities. The Unicode Standard has since been
 * changed to allow for characters whose representation requires more
 * than 16 bits. The range of legal <em>code point</em>s is now
 * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
 * (Refer to the <a
 * href="http://www.unicode.org/reports/tr27/#notation"><i>
 * definition</i></a> of the U+<i>n</i> notation in the Unicode
 * Standard.)
 *
 * <p><a id="BMP">The set of characters from U+0000 to U+FFFF</a> is
 * sometimes referred to as the <em>Basic Multilingual Plane (BMP)</em>.
 * <a id="supplementary">Characters</a> whose code points are greater
 * than U+FFFF are called <em>supplementary character</em>s. The Java
 * platform uses the UTF-16 representation in {@code char} arrays and
|