/*
 * Copyright (c) 2002, 2018, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

package java.lang;

import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
import java.util.Locale;

import jdk.internal.HotSpotIntrinsicCandidate;
import jdk.internal.misc.VM;

/**
 * The {@code Character} class wraps a value of the primitive
 * type {@code char} in an object. An object of type
 * {@code Character} contains a single field whose type is
 * {@code char}.
 * <p>
 * In addition, this class provides several methods for determining
 * a character's category (lowercase letter, digit, etc.) and for converting
 * characters from uppercase to lowercase and vice versa.
 * <p>
 * Character information is based on the Unicode Standard, version 11.0.0.
 * <p>
 * The methods and data of class {@code Character} are defined by
 * the information in the <i>UnicodeData</i> file that is part of the
 * Unicode Character Database maintained by the Unicode
 * Consortium. This file specifies various properties including name
 * and general category for every defined Unicode code point or
 * character range.
 * <p>
 * The file and its description are available from the Unicode Consortium at:
 * <ul>
 * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
 * </ul>
 * <p>
 * The code point, U+32FF, is reserved by the Unicode Consortium
 * to represent the Japanese square character for the new era that begins
 * May 2019. Relevant methods in the Character class return the same
 * properties as for the existing Japanese era characters (e.g., U+337E for
 * "Meizi"). For the details of the code point, refer to
 * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
 * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.
 *
 * <h3><a id="unicode">Unicode Character Representations</a></h3>
 *
 * <p>The {@code char} data type (and therefore the value that a
 * {@code Character} object encapsulates) is based on the
 * original Unicode specification, which defined characters as
 * fixed-width 16-bit entities. The Unicode Standard has since been
 * changed to allow for characters whose representation requires more
 * than 16 bits. The range of legal <em>code point</em>s is now
 * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
 * (Refer to the <a
 * href="http://www.unicode.org/reports/tr27/#notation"><i>
 * definition</i></a> of the U+<i>n</i> notation in the Unicode
 * Standard.)
 *
 * <p><a id="BMP">The set of characters from U+0000 to U+FFFF</a> is
 * sometimes referred to as the <em>Basic Multilingual Plane (BMP)</em>.
 * <a id="supplementary">Characters</a> whose code points are greater
 * than U+FFFF are called <em>supplementary character</em>s. The Java
 * platform uses the UTF-16 representation in {@code char} arrays and
 * in the {@code String} and {@code StringBuffer} classes. In
 * this representation, supplementary characters are represented as a pair
 * of {@code char} values, the first from the <em>high-surrogates</em>
 * range (\uD800-\uDBFF), the second from the
 * <em>low-surrogates</em> range (\uDC00-\uDFFF).
 *
 * <p>A {@code char} value, therefore, represents Basic
 * Multilingual Plane (BMP) code points, including the surrogate
 * code points, or code units of the UTF-16 encoding. An
 * {@code int} value represents all Unicode code points,
 * including supplementary code points. The lower (least significant)
 * 21 bits of {@code int} are used to represent Unicode code
 * points and the upper (most significant) 11 bits must be zero.
 * Unless otherwise specified, the behavior with respect to
 * supplementary characters and surrogate {@code char} values is
 * as follows:
 *
 * <ul>
 * <li>The methods that only accept a {@code char} value cannot support
 * supplementary characters. They treat {@code char} values from the
 * surrogate ranges as undefined characters. For example,
 * {@code Character.isLetter('\u005CuD840')} returns {@code false}, even though
 * this specific value, if followed by any low-surrogate value in a string,
 * would represent a letter.
 *
 * <li>The methods that accept an {@code int} value support all
 * Unicode characters, including supplementary characters. For
 * example, {@code Character.isLetter(0x2F81A)} returns
 * {@code true} because the code point value represents a letter
 * (a CJK ideograph).
 * </ul>
 *
 * <p>In the Java SE API documentation, <em>Unicode code point</em> is
 * used for character values in the range between U+0000 and U+10FFFF,
 * and <em>Unicode code unit</em> is used for 16-bit
 * {@code char} values that are code units of the <em>UTF-16</em>
 * encoding. For more information on Unicode terminology, refer to the
 * <a href="http://www.unicode.org/glossary/">Unicode Glossary</a>.
 *
 * @author Lee Boynton
 * @author Guy Steele
 * @author Akira Tanaka
 * @author Martin Buchholz
 * @author Ulf Zibis
 * @since 1.0
 */
public final
class Character implements java.io.Serializable, Comparable<Character> {
    /**
     * The minimum radix available for conversion to and from strings.
     * The constant value of this field is the smallest value permitted
     * for the radix argument in radix-conversion methods such as the
     * {@code digit} method, the {@code forDigit} method, and the
     * {@code toString} method of class {@code Integer}.
     *
     * @see Character#digit(char, int)
     * @see Character#forDigit(int, int)
     * @see Integer#toString(int, int)
     * @see Integer#valueOf(String)
     */
    public static final int MIN_RADIX = 2;

    /**
     * The maximum radix available for conversion to and from strings.
     * The constant value of this field is the largest value permitted
     * for the radix argument in radix-conversion methods such as the
     * {@code digit} method, the {@code forDigit} method, and the
     * {@code toString} method of class {@code Integer}.
     *
     * @see Character#digit(char, int)
     * @see Character#forDigit(int, int)
     * @see Integer#toString(int, int)
     * @see Integer#valueOf(String)
     */
    public static final int MAX_RADIX = 36;

    /**
     * The constant value of this field is the smallest value of type
     * {@code char}, {@code '\u005Cu0000'}.
     *
     * @since 1.0.2
     */
    public static final char MIN_VALUE = '\u0000';

    /**
     * The constant value of this field is the largest value of type
     * {@code char}, {@code '\u005CuFFFF'}.
     *
     * @since 1.0.2
     */
    public static final char MAX_VALUE = '\uFFFF';

    /**
     * The {@code Class} instance representing the primitive type
     * {@code char}.
     *
     * @since 1.1
     */
    @SuppressWarnings("unchecked")
    public static final Class<Character> TYPE = (Class<Character>) Class.getPrimitiveClass("char");

    /*
     * Normative general types
     */

    /*
     * General character types
     */

    /**
     * General category "Cn" in the Unicode specification.
     * @since 1.1
     */
    public static final byte UNASSIGNED = 0;

    /**
     * General category "Lu" in the Unicode specification.
     * @since 1.1
     */
    public static final byte UPPERCASE_LETTER = 1;

    /**
     * General category "Ll" in the Unicode specification.
     * @since 1.1
     */
    public static final byte LOWERCASE_LETTER = 2;

    /**
     * General category "Lt" in the Unicode specification.
     * @since 1.1
     */
    public static final byte TITLECASE_LETTER = 3;

    /**
     * General category "Lm" in the Unicode specification.
     * @since 1.1
     */
    public static final byte MODIFIER_LETTER = 4;

    /**
     * General category "Lo" in the Unicode specification.
     * @since 1.1
     */
    public static final byte OTHER_LETTER = 5;

    /**
     * General category "Mn" in the Unicode specification.
     * @since 1.1
     */
    public static final byte NON_SPACING_MARK = 6;

    /**
     * General category "Me" in the Unicode specification.
     * @since 1.1
     */
    public static final byte ENCLOSING_MARK = 7;

    /**
     * General category "Mc" in the Unicode specification.
     * @since 1.1
     */
    public static final byte COMBINING_SPACING_MARK = 8;

    /**
     * General category "Nd" in the Unicode specification.
     * @since 1.1
     */
    public static final byte DECIMAL_DIGIT_NUMBER = 9;

    /**
     * General category "Nl" in the Unicode specification.
     * @since 1.1
     */
    public static final byte LETTER_NUMBER = 10;

    /**
     * General category "No" in the Unicode specification.
     * @since 1.1
     */
    public static final byte OTHER_NUMBER = 11;

    /**
     * General category "Zs" in the Unicode specification.
     * @since 1.1
     */
    public static final byte SPACE_SEPARATOR = 12;

    /**
     * General category "Zl" in the Unicode specification.
     * @since 1.1
     */
    public static final byte LINE_SEPARATOR = 13;

    /**
     * General category "Zp" in the Unicode specification.
     * @since 1.1
     */
    public static final byte PARAGRAPH_SEPARATOR = 14;

    /**
     * General category "Cc" in the Unicode specification.
     * @since 1.1
     */
    public static final byte CONTROL = 15;

    /**
     * General category "Cf" in the Unicode specification.
     * @since 1.1
     */
    public static final byte FORMAT = 16;

    /**
     * General category "Co" in the Unicode specification.
     * @since 1.1
     */
    public static final byte PRIVATE_USE = 18;

    /**
     * General category "Cs" in the Unicode specification.
     * @since 1.1
     */
    public static final byte SURROGATE = 19;

    /**
     * General category "Pd" in the Unicode specification.
     * @since 1.1
     */
    public static final byte DASH_PUNCTUATION = 20;

    /**
     * General category "Ps" in the Unicode specification.
     * @since 1.1
     */
    public static final byte START_PUNCTUATION = 21;

    /**
     * General category "Pe" in the Unicode specification.
     * @since 1.1
     */
    public static final byte END_PUNCTUATION = 22;

    /**
     * General category "Pc" in the Unicode specification.
     * @since 1.1
     */
    public static final byte CONNECTOR_PUNCTUATION = 23;

    /**
     * General category "Po" in the Unicode specification.
     * @since 1.1
     */
    public static final byte OTHER_PUNCTUATION = 24;

    /**
     * General category "Sm" in the Unicode specification.
     * @since 1.1
     */
    public static final byte MATH_SYMBOL = 25;

    /**
     * General category "Sc" in the Unicode specification.
     * @since 1.1
     */
    public static final byte CURRENCY_SYMBOL = 26;

    /**
     * General category "Sk" in the Unicode specification.
     * @since 1.1
     */
    public static final byte MODIFIER_SYMBOL = 27;

    /**
     * General category "So" in the Unicode specification.
     * @since 1.1
     */
    public static final byte OTHER_SYMBOL = 28;

    /**
     * General category "Pi" in the Unicode specification.
     * @since 1.4
     */
    public static final byte INITIAL_QUOTE_PUNCTUATION = 29;

    /**
     * General category "Pf" in the Unicode specification.
     * @since 1.4
     */
    public static final byte FINAL_QUOTE_PUNCTUATION = 30;

    /**
     * Error flag. Use int (code point) to avoid confusion with U+FFFF.
     */
    static final int ERROR = 0xFFFFFFFF;


    /**
     * Undefined bidirectional character type. Undefined {@code char}
     * values have undefined directionality in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_UNDEFINED = -1;

    /**
     * Strong bidirectional character type "L" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT = 0;

    /**
     * Strong bidirectional character type "R" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT = 1;

    /**
     * Strong bidirectional character type "AL" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC = 2;

    /**
     * Weak bidirectional character type "EN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER = 3;

    /**
     * Weak bidirectional character type "ES" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR = 4;

    /**
     * Weak bidirectional character type "ET" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR = 5;

    /**
     * Weak bidirectional character type "AN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_ARABIC_NUMBER = 6;

    /**
     * Weak bidirectional character type "CS" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR = 7;

    /**
     * Weak bidirectional character type "NSM" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_NONSPACING_MARK = 8;

    /**
     * Weak bidirectional character type "BN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL = 9;

    /**
     * Neutral bidirectional character type "B" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR = 10;

    /**
     * Neutral bidirectional character type "S" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR = 11;

    /**
     * Neutral bidirectional character type "WS" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_WHITESPACE = 12;

    /**
     * Neutral bidirectional character type "ON" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_OTHER_NEUTRALS = 13;

    /**
     * Strong bidirectional character type "LRE" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING = 14;

    /**
     * Strong bidirectional character type "LRO" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE = 15;

    /**
     * Strong bidirectional character type "RLE" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING = 16;

    /**
     * Strong bidirectional character type "RLO" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE = 17;

    /**
     * Weak bidirectional character type "PDF" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT = 18;

    /**
     * Weak bidirectional character type "LRI" in the Unicode specification.
     * @since 9
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_ISOLATE = 19;

    /**
     * Weak bidirectional character type "RLI" in the Unicode specification.
     * @since 9
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ISOLATE = 20;

    /**
     * Weak bidirectional character type "FSI" in the Unicode specification.
     * @since 9
     */
    public static final byte DIRECTIONALITY_FIRST_STRONG_ISOLATE = 21;

    /**
     * Weak bidirectional character type "PDI" in the Unicode specification.
     * @since 9
     */
    public static final byte DIRECTIONALITY_POP_DIRECTIONAL_ISOLATE = 22;

    /**
     * The minimum value of a
     * <a href="http://www.unicode.org/glossary/#high_surrogate_code_unit">
     * Unicode high-surrogate code unit</a>
     * in the UTF-16 encoding, constant {@code '\u005CuD800'}.
     * A high-surrogate is also known as a <i>leading-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MIN_HIGH_SURROGATE = '\uD800';

    /**
     * The maximum value of a
     * <a href="http://www.unicode.org/glossary/#high_surrogate_code_unit">
     * Unicode high-surrogate code unit</a>
     * in the UTF-16 encoding, constant {@code '\u005CuDBFF'}.
     * A high-surrogate is also known as a <i>leading-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MAX_HIGH_SURROGATE = '\uDBFF';

    /**
     * The minimum value of a
     * <a href="http://www.unicode.org/glossary/#low_surrogate_code_unit">
     * Unicode low-surrogate code unit</a>
     * in the UTF-16 encoding, constant {@code '\u005CuDC00'}.
     * A low-surrogate is also known as a <i>trailing-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MIN_LOW_SURROGATE = '\uDC00';

    /**
     * The maximum value of a
     * <a href="http://www.unicode.org/glossary/#low_surrogate_code_unit">
     * Unicode low-surrogate code unit</a>
     * in the UTF-16 encoding, constant {@code '\u005CuDFFF'}.
     * A low-surrogate is also known as a <i>trailing-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MAX_LOW_SURROGATE = '\uDFFF';

    /**
     * The minimum value of a Unicode surrogate code unit in the
     * UTF-16 encoding, constant {@code '\u005CuD800'}.
     *
     * @since 1.5
     */
    public static final char MIN_SURROGATE = MIN_HIGH_SURROGATE;

    /**
     * The maximum value of a Unicode surrogate code unit in the
     * UTF-16 encoding, constant {@code '\u005CuDFFF'}.
     *
     * @since 1.5
     */
    public static final char MAX_SURROGATE = MAX_LOW_SURROGATE;

    /**
     * The minimum value of a
     * <a href="http://www.unicode.org/glossary/#supplementary_code_point">
     * Unicode supplementary code point</a>, constant {@code U+10000}.
     *
     * @since 1.5
     */
    public static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;

    /**
     * The minimum value of a
     * <a href="http://www.unicode.org/glossary/#code_point">
     * Unicode code point</a>, constant {@code U+0000}.
     *
     * @since 1.5
     */
    public static final int MIN_CODE_POINT = 0x000000;

    /**
     * The maximum value of a
     * <a href="http://www.unicode.org/glossary/#code_point">
     * Unicode code point</a>, constant {@code U+10FFFF}.
     *
     * @since 1.5
     */
    public static final int MAX_CODE_POINT = 0X10FFFF;


    /**
     * Instances of this class represent particular subsets of the Unicode
     * character set. The only family of subsets defined in the
     * {@code Character} class is {@link Character.UnicodeBlock}.
     * Other portions of the Java API may define other subsets for their
     * own purposes.
     *
     * @since 1.2
     */
    public static class Subset {

        private String name;

        /**
         * Constructs a new {@code Subset} instance.
         *
         * @param name The name of this subset
         * @throws NullPointerException if name is {@code null}
         */
        protected Subset(String name) {
            if (name == null) {
                throw new NullPointerException("name");
            }
            this.name = name;
        }

        /**
         * Compares two {@code Subset} objects for equality.
         * This method returns {@code true} if and only if
         * {@code this} and the argument refer to the same
         * object; since this method is {@code final}, this
         * guarantee holds for all subclasses.
         */
        public final boolean equals(Object obj) {
            return (this == obj);
        }

        /**
         * Returns the standard hash code as defined by the
         * {@link Object#hashCode} method. This method
         * is {@code final} in order to ensure that the
         * {@code equals} and {@code hashCode} methods will
         * be consistent in all subclasses.
         */
        public final int hashCode() {
            return super.hashCode();
        }

        /**
         * Returns the name of this subset.
         */
        public final String toString() {
            return name;
        }
    }

    // See http://www.unicode.org/Public/UNIDATA/Blocks.txt
    // for the latest specification of Unicode Blocks.

    /**
     * A family of character subsets representing the character blocks in the
     * Unicode specification. Character blocks generally define characters
     * used for a specific script or purpose. A character is contained by
     * at most one Unicode block.
     *
     * @since 1.2
     */
    public static final class UnicodeBlock extends Subset {
        /**
         * 667 - the expected number of entities
         * 0.75 - the default load factor of HashMap
         */
        private static final int NUM_ENTITIES = 667;
        private static Map<String, UnicodeBlock> map =
                new HashMap<>((int)(NUM_ENTITIES / 0.75f + 1.0f));

        /**
         * Creates a UnicodeBlock with the given identifier name.
         * This name must be the same as the block identifier.
         */
        private UnicodeBlock(String idName) {
            super(idName);
            map.put(idName, this);
        }

        /**
         * Creates a UnicodeBlock with the given identifier name and
         * alias name.
         */
        private UnicodeBlock(String idName, String alias) {
            this(idName);
            map.put(alias, this);
        }

        /**
         * Creates a UnicodeBlock with the given identifier name and
         * alias names.
         */
        private UnicodeBlock(String idName, String... aliases) {
            this(idName);
            for (String alias : aliases)
                map.put(alias, this);
        }

        /**
         * Constant for the "Basic Latin" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BASIC_LATIN =
            new UnicodeBlock("BASIC_LATIN",
                             "BASIC LATIN",
                             "BASICLATIN");

        /**
         * Constant for the "Latin-1 Supplement" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_1_SUPPLEMENT =
            new UnicodeBlock("LATIN_1_SUPPLEMENT",
                             "LATIN-1 SUPPLEMENT",
                             "LATIN-1SUPPLEMENT");

        /**
         * Constant for the "Latin Extended-A" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_A =
            new UnicodeBlock("LATIN_EXTENDED_A",
                             "LATIN EXTENDED-A",
                             "LATINEXTENDED-A");

        /**
         * Constant for the "Latin Extended-B" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_B =
            new UnicodeBlock("LATIN_EXTENDED_B",
                             "LATIN EXTENDED-B",
                             "LATINEXTENDED-B");

        /**
         * Constant for the "IPA Extensions" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock IPA_EXTENSIONS =
            new UnicodeBlock("IPA_EXTENSIONS",
                             "IPA EXTENSIONS",
                             "IPAEXTENSIONS");

        /**
         * Constant for the "Spacing Modifier Letters" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SPACING_MODIFIER_LETTERS =
            new UnicodeBlock("SPACING_MODIFIER_LETTERS",
                             "SPACING MODIFIER LETTERS",
                             "SPACINGMODIFIERLETTERS");

        /**
         * Constant for the "Combining Diacritical Marks" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_DIACRITICAL_MARKS =
            new UnicodeBlock("COMBINING_DIACRITICAL_MARKS",
                             "COMBINING DIACRITICAL MARKS",
                             "COMBININGDIACRITICALMARKS");

        /**
         * Constant for the "Greek and Coptic" Unicode character block.
         * <p>
         * This block was previously known as the "Greek" block.
         *
         * @since 1.2
         */
        public static final UnicodeBlock GREEK =
            new UnicodeBlock("GREEK",
                             "GREEK AND COPTIC",
                             "GREEKANDCOPTIC");

        /**
         * Constant for the "Cyrillic" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CYRILLIC =
            new UnicodeBlock("CYRILLIC");

        /**
         * Constant for the "Armenian" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARMENIAN =
            new UnicodeBlock("ARMENIAN");

        /**
         * Constant for the "Hebrew" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HEBREW =
            new UnicodeBlock("HEBREW");

        /**
         * Constant for the "Arabic" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARABIC =
            new UnicodeBlock("ARABIC");

        /**
         * Constant for the "Devanagari" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock DEVANAGARI =
            new UnicodeBlock("DEVANAGARI");

        /**
         * Constant for the "Bengali" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BENGALI =
            new UnicodeBlock("BENGALI");

        /**
         * Constant for the "Gurmukhi" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GURMUKHI =
            new UnicodeBlock("GURMUKHI");

        /**
         * Constant for the "Gujarati" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GUJARATI =
            new UnicodeBlock("GUJARATI");

        /**
         * Constant for the "Oriya" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ORIYA =
            new UnicodeBlock("ORIYA");

        /**
         * Constant for the "Tamil" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TAMIL =
            new UnicodeBlock("TAMIL");

        /**
         * Constant for the "Telugu" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TELUGU =
            new UnicodeBlock("TELUGU");

        /**
         * Constant for the "Kannada" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KANNADA =
            new UnicodeBlock("KANNADA");

        /**
         * Constant for the "Malayalam" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MALAYALAM =
            new UnicodeBlock("MALAYALAM");

        /**
         * Constant for the "Thai" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock THAI =
            new UnicodeBlock("THAI");

        /**
         * Constant for the "Lao" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LAO =
            new UnicodeBlock("LAO");

        /**
         * Constant for the "Tibetan" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TIBETAN =
            new UnicodeBlock("TIBETAN");

        /**
         * Constant for the "Georgian" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GEORGIAN =
            new UnicodeBlock("GEORGIAN");

        /**
         * Constant for the "Hangul Jamo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_JAMO =
            new UnicodeBlock("HANGUL_JAMO",
                             "HANGUL JAMO",
                             "HANGULJAMO");

        /**
         * Constant for the "Latin Extended Additional" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_ADDITIONAL =
            new UnicodeBlock("LATIN_EXTENDED_ADDITIONAL",
                             "LATIN EXTENDED ADDITIONAL",
                             "LATINEXTENDEDADDITIONAL");

        /**
         * Constant for the "Greek Extended" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GREEK_EXTENDED =
            new UnicodeBlock("GREEK_EXTENDED",
                             "GREEK EXTENDED",
                             "GREEKEXTENDED");

        /**
         * Constant for the "General Punctuation" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GENERAL_PUNCTUATION =
            new UnicodeBlock("GENERAL_PUNCTUATION",
                             "GENERAL PUNCTUATION",
                             "GENERALPUNCTUATION");

        /**
         * Constant for the "Superscripts and Subscripts" Unicode character
         * block.
         * @since 1.2
         */
        public static final UnicodeBlock SUPERSCRIPTS_AND_SUBSCRIPTS =
            new UnicodeBlock("SUPERSCRIPTS_AND_SUBSCRIPTS",
                             "SUPERSCRIPTS AND SUBSCRIPTS",
                             "SUPERSCRIPTSANDSUBSCRIPTS");

        /**
         * Constant for the "Currency Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CURRENCY_SYMBOLS =
            new UnicodeBlock("CURRENCY_SYMBOLS",
                             "CURRENCY SYMBOLS",
                             "CURRENCYSYMBOLS");

        /**
         * Constant for the "Combining Diacritical Marks for Symbols" Unicode
         * character block.
         * <p>
         * This block was previously known as "Combining Marks for Symbols".
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_MARKS_FOR_SYMBOLS =
            new UnicodeBlock("COMBINING_MARKS_FOR_SYMBOLS",
                             "COMBINING DIACRITICAL MARKS FOR SYMBOLS",
                             "COMBININGDIACRITICALMARKSFORSYMBOLS",
                             "COMBINING MARKS FOR SYMBOLS",
                             "COMBININGMARKSFORSYMBOLS");

        /**
         * Constant for the "Letterlike Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LETTERLIKE_SYMBOLS =
            new UnicodeBlock("LETTERLIKE_SYMBOLS",
                             "LETTERLIKE SYMBOLS",
                             "LETTERLIKESYMBOLS");

        /**
         * Constant for the "Number Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock NUMBER_FORMS =
            new UnicodeBlock("NUMBER_FORMS",
                             "NUMBER FORMS",
                             "NUMBERFORMS");

        /**
         * Constant for the "Arrows" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARROWS =
            new UnicodeBlock("ARROWS");

        /**
         * Constant for the "Mathematical Operators" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MATHEMATICAL_OPERATORS =
            new UnicodeBlock("MATHEMATICAL_OPERATORS",
                             "MATHEMATICAL OPERATORS",
                             "MATHEMATICALOPERATORS");

        /**
         * Constant for the "Miscellaneous Technical" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MISCELLANEOUS_TECHNICAL =
            new UnicodeBlock("MISCELLANEOUS_TECHNICAL",
                             "MISCELLANEOUS TECHNICAL",
                             "MISCELLANEOUSTECHNICAL");

        /**
         * Constant for the "Control Pictures" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CONTROL_PICTURES =
            new UnicodeBlock("CONTROL_PICTURES",
                             "CONTROL PICTURES",
                             "CONTROLPICTURES");

        /**
         * Constant for the "Optical Character Recognition" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock OPTICAL_CHARACTER_RECOGNITION =
            new UnicodeBlock("OPTICAL_CHARACTER_RECOGNITION",
                             "OPTICAL CHARACTER RECOGNITION",
                             "OPTICALCHARACTERRECOGNITION");

        /**
         * Constant for the "Enclosed Alphanumerics" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ENCLOSED_ALPHANUMERICS =
            new UnicodeBlock("ENCLOSED_ALPHANUMERICS",
                             "ENCLOSED ALPHANUMERICS",
                             "ENCLOSEDALPHANUMERICS");

        /**
         * Constant for the "Box Drawing" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BOX_DRAWING =
            new UnicodeBlock("BOX_DRAWING",
                             "BOX DRAWING",
                             "BOXDRAWING");

        /**
         * Constant for the "Block Elements" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BLOCK_ELEMENTS =
            new UnicodeBlock("BLOCK_ELEMENTS",
                             "BLOCK ELEMENTS",
                             "BLOCKELEMENTS");

        /**
         * Constant for the "Geometric Shapes" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GEOMETRIC_SHAPES =
            new UnicodeBlock("GEOMETRIC_SHAPES",
                             "GEOMETRIC SHAPES",
                             "GEOMETRICSHAPES");

        /**
         * Constant for the "Miscellaneous Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MISCELLANEOUS_SYMBOLS =
            new UnicodeBlock("MISCELLANEOUS_SYMBOLS",
                             "MISCELLANEOUS SYMBOLS",
                             "MISCELLANEOUSSYMBOLS");

        /**
         * Constant for the "Dingbats" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock DINGBATS =
            new UnicodeBlock("DINGBATS");

        /**
         * Constant for the "CJK Symbols and Punctuation" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_SYMBOLS_AND_PUNCTUATION =
            new UnicodeBlock("CJK_SYMBOLS_AND_PUNCTUATION",
                             "CJK SYMBOLS AND PUNCTUATION",
                             "CJKSYMBOLSANDPUNCTUATION");

        /**
         * Constant for the "Hiragana" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HIRAGANA =
            new UnicodeBlock("HIRAGANA");

        /**
         * Constant for the "Katakana" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KATAKANA =
            new UnicodeBlock("KATAKANA");

        /**
         * Constant for the "Bopomofo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BOPOMOFO =
            new UnicodeBlock("BOPOMOFO");

        /**
         * Constant for the "Hangul Compatibility Jamo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_COMPATIBILITY_JAMO =
            new UnicodeBlock("HANGUL_COMPATIBILITY_JAMO",
                             "HANGUL COMPATIBILITY JAMO",
                             "HANGULCOMPATIBILITYJAMO");

        /**
         * Constant for the "Kanbun" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KANBUN =
            new UnicodeBlock("KANBUN");

        /**
         * Constant for the "Enclosed CJK Letters and Months" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ENCLOSED_CJK_LETTERS_AND_MONTHS =
            new UnicodeBlock("ENCLOSED_CJK_LETTERS_AND_MONTHS",
                             "ENCLOSED CJK LETTERS AND MONTHS",
                             "ENCLOSEDCJKLETTERSANDMONTHS");

        /**
         * Constant for the "CJK Compatibility" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY =
            new UnicodeBlock("CJK_COMPATIBILITY",
                             "CJK COMPATIBILITY",
                             "CJKCOMPATIBILITY");

        /**
         * Constant for the "CJK Unified Ideographs" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS",
                             "CJK UNIFIED IDEOGRAPHS",
                             "CJKUNIFIEDIDEOGRAPHS");

        /**
         * Constant for the "Hangul Syllables" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_SYLLABLES =
            new UnicodeBlock("HANGUL_SYLLABLES",
                             "HANGUL SYLLABLES",
                             "HANGULSYLLABLES");

        /**
         * Constant for the "Private Use Area" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock PRIVATE_USE_AREA =
            new UnicodeBlock("PRIVATE_USE_AREA",
                             "PRIVATE USE AREA",
                             "PRIVATEUSEAREA");

        /**
         * Constant for the "CJK Compatibility Ideographs" Unicode character
         * block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY_IDEOGRAPHS =
            new UnicodeBlock("CJK_COMPATIBILITY_IDEOGRAPHS",
                             "CJK COMPATIBILITY IDEOGRAPHS",
                             "CJKCOMPATIBILITYIDEOGRAPHS");

        /**
         * Constant for the "Alphabetic Presentation Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ALPHABETIC_PRESENTATION_FORMS =
            new UnicodeBlock("ALPHABETIC_PRESENTATION_FORMS",
                             "ALPHABETIC PRESENTATION FORMS",
                             "ALPHABETICPRESENTATIONFORMS");

        /**
         * Constant for the "Arabic Presentation Forms-A" Unicode character
         * block.
         * @since 1.2
         */
        public static final UnicodeBlock ARABIC_PRESENTATION_FORMS_A =
            new UnicodeBlock("ARABIC_PRESENTATION_FORMS_A",
                             "ARABIC PRESENTATION FORMS-A",
                             "ARABICPRESENTATIONFORMS-A");

        /**
         * Constant for the "Combining Half Marks" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_HALF_MARKS =
            new UnicodeBlock("COMBINING_HALF_MARKS",
                             "COMBINING HALF MARKS",
                             "COMBININGHALFMARKS");

        /**
         * Constant for the "CJK Compatibility Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY_FORMS =
            new UnicodeBlock("CJK_COMPATIBILITY_FORMS",
                             "CJK COMPATIBILITY FORMS",
                             "CJKCOMPATIBILITYFORMS");

        /**
         * Constant for the "Small Form Variants" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SMALL_FORM_VARIANTS =
            new UnicodeBlock("SMALL_FORM_VARIANTS",
                             "SMALL FORM VARIANTS",
                             "SMALLFORMVARIANTS");

        /**
         * Constant for the "Arabic Presentation Forms-B" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARABIC_PRESENTATION_FORMS_B =
            new UnicodeBlock("ARABIC_PRESENTATION_FORMS_B",
                             "ARABIC PRESENTATION FORMS-B",
                             "ARABICPRESENTATIONFORMS-B");

        /**
         * Constant for the "Halfwidth and Fullwidth Forms" Unicode character
         * block.
         * @since 1.2
         */
        public static final UnicodeBlock HALFWIDTH_AND_FULLWIDTH_FORMS =
            new UnicodeBlock("HALFWIDTH_AND_FULLWIDTH_FORMS",
                             "HALFWIDTH AND FULLWIDTH FORMS",
                             "HALFWIDTHANDFULLWIDTHFORMS");

        /**
         * Constant for the "Specials" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SPECIALS =
            new UnicodeBlock("SPECIALS");

        /**
         * @deprecated
         * Instead of {@code SURROGATES_AREA}, use {@link #HIGH_SURROGATES},
         * {@link #HIGH_PRIVATE_USE_SURROGATES}, and {@link #LOW_SURROGATES}.
         * These constants match the block definitions of the Unicode Standard.
         * The {@link #of(char)} and {@link #of(int)} methods return the
         * standard constants.
         */
        @Deprecated(since="1.5")
        public static final UnicodeBlock SURROGATES_AREA =
            new UnicodeBlock("SURROGATES_AREA");

        /**
         * Constant for the "Syriac" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock SYRIAC =
            new UnicodeBlock("SYRIAC");

        /**
         * Constant for the "Thaana" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock THAANA =
            new UnicodeBlock("THAANA");

        /**
         * Constant for the "Sinhala" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock SINHALA =
            new UnicodeBlock("SINHALA");

        /**
         * Constant for the "Myanmar" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock MYANMAR =
            new UnicodeBlock("MYANMAR");

        /**
         * Constant for the "Ethiopic" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock ETHIOPIC =
            new UnicodeBlock("ETHIOPIC");

        /**
         * Constant for the "Cherokee" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CHEROKEE =
            new UnicodeBlock("CHEROKEE");

        /**
         * Constant for the "Unified Canadian Aboriginal Syllabics" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS =
            new UnicodeBlock("UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS",
                             "UNIFIED CANADIAN ABORIGINAL SYLLABICS",
                             "UNIFIEDCANADIANABORIGINALSYLLABICS");

        /**
         * Constant for the "Ogham" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock OGHAM =
            new UnicodeBlock("OGHAM");

        /**
         * Constant for the "Runic" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock RUNIC =
            new UnicodeBlock("RUNIC");

        /**
         * Constant for the "Khmer" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock KHMER =
            new UnicodeBlock("KHMER");

        /**
         * Constant for the "Mongolian" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock MONGOLIAN =
            new UnicodeBlock("MONGOLIAN");

        /**
         * Constant for the "Braille Patterns" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock BRAILLE_PATTERNS =
            new UnicodeBlock("BRAILLE_PATTERNS",
                             "BRAILLE PATTERNS",
                             "BRAILLEPATTERNS");

        /**
         * Constant for the "CJK Radicals Supplement" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CJK_RADICALS_SUPPLEMENT =
            new UnicodeBlock("CJK_RADICALS_SUPPLEMENT",
                             "CJK RADICALS SUPPLEMENT",
                             "CJKRADICALSSUPPLEMENT");

        /**
         * Constant for the "Kangxi Radicals" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock KANGXI_RADICALS =
            new UnicodeBlock("KANGXI_RADICALS",
                             "KANGXI RADICALS",
                             "KANGXIRADICALS");

        /**
         * Constant for the "Ideographic Description Characters" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock IDEOGRAPHIC_DESCRIPTION_CHARACTERS =
            new UnicodeBlock("IDEOGRAPHIC_DESCRIPTION_CHARACTERS",
                             "IDEOGRAPHIC DESCRIPTION CHARACTERS",
                             "IDEOGRAPHICDESCRIPTIONCHARACTERS");

        /**
         * Constant for the "Bopomofo Extended" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock BOPOMOFO_EXTENDED =
            new UnicodeBlock("BOPOMOFO_EXTENDED",
                             "BOPOMOFO EXTENDED",
                             "BOPOMOFOEXTENDED");

        /**
         * Constant for the "CJK Unified Ideographs Extension A" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION A",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIONA");

        /**
         * Constant for the "Yi Syllables" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock YI_SYLLABLES =
            new UnicodeBlock("YI_SYLLABLES",
                             "YI SYLLABLES",
                             "YISYLLABLES");

        /**
         * Constant for the "Yi Radicals" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock YI_RADICALS =
            new UnicodeBlock("YI_RADICALS",
                             "YI RADICALS",
                             "YIRADICALS");

        /**
         * Constant for the "Cyrillic Supplement" Unicode character block.
         * This block was previously known as the "Cyrillic Supplementary" block.
         * @since 1.5
         */
        public static final UnicodeBlock CYRILLIC_SUPPLEMENTARY =
            new UnicodeBlock("CYRILLIC_SUPPLEMENTARY",
                             "CYRILLIC SUPPLEMENTARY",
                             "CYRILLICSUPPLEMENTARY",
                             "CYRILLIC SUPPLEMENT",
                             "CYRILLICSUPPLEMENT");

        /**
         * Constant for the "Tagalog" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAGALOG =
            new UnicodeBlock("TAGALOG");

        /**
         * Constant for the "Hanunoo" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock HANUNOO =
            new UnicodeBlock("HANUNOO");

        /**
         * Constant for the "Buhid" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock BUHID =
            new UnicodeBlock("BUHID");

        /**
         * Constant for the "Tagbanwa" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAGBANWA =
            new UnicodeBlock("TAGBANWA");

        /**
         * Constant for the "Limbu" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock LIMBU =
            new UnicodeBlock("LIMBU");

        /**
         * Constant for the "Tai Le" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAI_LE =
            new UnicodeBlock("TAI_LE",
                             "TAI LE",
                             "TAILE");

        /**
         * Constant for the "Khmer Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock KHMER_SYMBOLS =
            new UnicodeBlock("KHMER_SYMBOLS",
                             "KHMER SYMBOLS",
                             "KHMERSYMBOLS");

        /**
         * Constant for the "Phonetic Extensions" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock PHONETIC_EXTENSIONS =
            new UnicodeBlock("PHONETIC_EXTENSIONS",
                             "PHONETIC EXTENSIONS",
                             "PHONETICEXTENSIONS");

        /**
         * Constant for the "Miscellaneous Mathematical Symbols-A" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A =
            new UnicodeBlock("MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A",
                             "MISCELLANEOUS MATHEMATICAL SYMBOLS-A",
                             "MISCELLANEOUSMATHEMATICALSYMBOLS-A");

        /**
         * Constant for the "Supplemental Arrows-A" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock SUPPLEMENTAL_ARROWS_A =
            new UnicodeBlock("SUPPLEMENTAL_ARROWS_A",
                             "SUPPLEMENTAL ARROWS-A",
                             "SUPPLEMENTALARROWS-A");

        /**
         * Constant for the "Supplemental Arrows-B" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock SUPPLEMENTAL_ARROWS_B =
            new UnicodeBlock("SUPPLEMENTAL_ARROWS_B",
                             "SUPPLEMENTAL ARROWS-B",
                             "SUPPLEMENTALARROWS-B");

        /**
         * Constant for the "Miscellaneous Mathematical Symbols-B" Unicode
         * character block.
         * @since 1.5
         */
        public static final UnicodeBlock MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B =
            new UnicodeBlock("MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B",
                             "MISCELLANEOUS MATHEMATICAL SYMBOLS-B",
                             "MISCELLANEOUSMATHEMATICALSYMBOLS-B");

        /**
         * Constant for the "Supplemental Mathematical Operators" Unicode
         * character block.
         * @since 1.5
         */
        public static final UnicodeBlock SUPPLEMENTAL_MATHEMATICAL_OPERATORS =
            new UnicodeBlock("SUPPLEMENTAL_MATHEMATICAL_OPERATORS",
                             "SUPPLEMENTAL MATHEMATICAL OPERATORS",
                             "SUPPLEMENTALMATHEMATICALOPERATORS");

        /**
         * Constant for the "Miscellaneous Symbols and Arrows" Unicode character
         * block.
         * @since 1.5
         */
        public static final UnicodeBlock MISCELLANEOUS_SYMBOLS_AND_ARROWS =
            new UnicodeBlock("MISCELLANEOUS_SYMBOLS_AND_ARROWS",
                             "MISCELLANEOUS SYMBOLS AND ARROWS",
                             "MISCELLANEOUSSYMBOLSANDARROWS");

        /**
         * Constant for the "Katakana Phonetic Extensions" Unicode character
         * block.
         * @since 1.5
         */
        public static final UnicodeBlock KATAKANA_PHONETIC_EXTENSIONS =
            new UnicodeBlock("KATAKANA_PHONETIC_EXTENSIONS",
                             "KATAKANA PHONETIC EXTENSIONS",
                             "KATAKANAPHONETICEXTENSIONS");

        /**
         * Constant for the "Yijing Hexagram Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock YIJING_HEXAGRAM_SYMBOLS =
            new UnicodeBlock("YIJING_HEXAGRAM_SYMBOLS",
                             "YIJING HEXAGRAM SYMBOLS",
                             "YIJINGHEXAGRAMSYMBOLS");

        /**
         * Constant for the "Variation Selectors" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock VARIATION_SELECTORS =
            new UnicodeBlock("VARIATION_SELECTORS",
                             "VARIATION SELECTORS",
                             "VARIATIONSELECTORS");

        /**
         * Constant for the "Linear B Syllabary" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock LINEAR_B_SYLLABARY =
            new UnicodeBlock("LINEAR_B_SYLLABARY",
                             "LINEAR B SYLLABARY",
                             "LINEARBSYLLABARY");

        /**
         * Constant for the "Linear B Ideograms" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock LINEAR_B_IDEOGRAMS =
            new UnicodeBlock("LINEAR_B_IDEOGRAMS",
                             "LINEAR B IDEOGRAMS",
                             "LINEARBIDEOGRAMS");

        /**
         * Constant for the "Aegean Numbers" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock AEGEAN_NUMBERS =
            new UnicodeBlock("AEGEAN_NUMBERS",
                             "AEGEAN NUMBERS",
                             "AEGEANNUMBERS");

        /**
         * Constant for the "Old Italic" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock OLD_ITALIC =
            new UnicodeBlock("OLD_ITALIC",
                             "OLD ITALIC",
                             "OLDITALIC");

        /**
         * Constant for the "Gothic" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock GOTHIC =
            new UnicodeBlock("GOTHIC");

        /**
         * Constant for the "Ugaritic" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock UGARITIC =
            new UnicodeBlock("UGARITIC");

        /**
         * Constant for the "Deseret" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock DESERET =
            new UnicodeBlock("DESERET");

        /**
         * Constant for the "Shavian" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock SHAVIAN =
            new UnicodeBlock("SHAVIAN");

        /**
         * Constant for the "Osmanya" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock OSMANYA =
            new UnicodeBlock("OSMANYA");

        /**
         * Constant for the "Cypriot Syllabary" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock CYPRIOT_SYLLABARY =
            new UnicodeBlock("CYPRIOT_SYLLABARY",
                             "CYPRIOT SYLLABARY",
                             "CYPRIOTSYLLABARY");

        /**
         * Constant for the "Byzantine Musical Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock BYZANTINE_MUSICAL_SYMBOLS =
            new UnicodeBlock("BYZANTINE_MUSICAL_SYMBOLS",
                             "BYZANTINE MUSICAL SYMBOLS",
                             "BYZANTINEMUSICALSYMBOLS");

        /**
         * Constant for the "Musical Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock MUSICAL_SYMBOLS =
            new UnicodeBlock("MUSICAL_SYMBOLS",
                             "MUSICAL SYMBOLS",
                             "MUSICALSYMBOLS");

        /**
         * Constant for the "Tai Xuan Jing Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAI_XUAN_JING_SYMBOLS =
            new UnicodeBlock("TAI_XUAN_JING_SYMBOLS",
                             "TAI XUAN JING SYMBOLS",
                             "TAIXUANJINGSYMBOLS");

        /**
         * Constant for the "Mathematical Alphanumeric Symbols" Unicode
         * character block.
         * @since 1.5
         */
        public static final UnicodeBlock MATHEMATICAL_ALPHANUMERIC_SYMBOLS =
            new UnicodeBlock("MATHEMATICAL_ALPHANUMERIC_SYMBOLS",
                             "MATHEMATICAL ALPHANUMERIC SYMBOLS",
                             "MATHEMATICALALPHANUMERICSYMBOLS");

        /**
         * Constant for the "CJK Unified Ideographs Extension B" Unicode
         * character block.
         * @since 1.5
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION B",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIONB");

        /**
         * Constant for the "CJK Compatibility Ideographs Supplement" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT =
            new UnicodeBlock("CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT",
                             "CJK COMPATIBILITY IDEOGRAPHS SUPPLEMENT",
                             "CJKCOMPATIBILITYIDEOGRAPHSSUPPLEMENT");

        /**
         * Constant for the "Tags" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAGS =
            new UnicodeBlock("TAGS");

        /**
         * Constant for the "Variation Selectors Supplement" Unicode character
         * block.
         * @since 1.5
         */
        public static final UnicodeBlock VARIATION_SELECTORS_SUPPLEMENT =
            new UnicodeBlock("VARIATION_SELECTORS_SUPPLEMENT",
                             "VARIATION SELECTORS SUPPLEMENT",
                             "VARIATIONSELECTORSSUPPLEMENT");

        /**
         * Constant for the "Supplementary Private Use Area-A" Unicode character
         * block.
         * @since 1.5
         */
        public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
            new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_A",
                             "SUPPLEMENTARY PRIVATE USE AREA-A",
                             "SUPPLEMENTARYPRIVATEUSEAREA-A");

        /**
         * Constant for the "Supplementary Private Use Area-B" Unicode character
         * block.
         * @since 1.5
         */
        public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_B =
            new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_B",
                             "SUPPLEMENTARY PRIVATE USE AREA-B",
                             "SUPPLEMENTARYPRIVATEUSEAREA-B");

        /**
         * Constant for the "High Surrogates" Unicode character block.
         * This block represents code point values in the high surrogate
         * range: U+D800 through U+DB7F.
         *
         * @since 1.5
         */
        public static final UnicodeBlock HIGH_SURROGATES =
            new UnicodeBlock("HIGH_SURROGATES",
                             "HIGH SURROGATES",
                             "HIGHSURROGATES");

        /**
         * Constant for the "High Private Use Surrogates" Unicode character
         * block.
         * This block represents code point values in the private use high
         * surrogate range: U+DB80 through U+DBFF.
         *
         * @since 1.5
         */
        public static final UnicodeBlock HIGH_PRIVATE_USE_SURROGATES =
            new UnicodeBlock("HIGH_PRIVATE_USE_SURROGATES",
                             "HIGH PRIVATE USE SURROGATES",
                             "HIGHPRIVATEUSESURROGATES");

        /**
         * Constant for the "Low Surrogates" Unicode character block.
         * This block represents code point values in the low surrogate
         * range: U+DC00 through U+DFFF.
         *
         * @since 1.5
         */
        public static final UnicodeBlock LOW_SURROGATES =
            new UnicodeBlock("LOW_SURROGATES",
                             "LOW SURROGATES",
                             "LOWSURROGATES");

        /**
         * Constant for the "Arabic Supplement" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ARABIC_SUPPLEMENT =
            new UnicodeBlock("ARABIC_SUPPLEMENT",
                             "ARABIC SUPPLEMENT",
                             "ARABICSUPPLEMENT");

        /**
         * Constant for the "NKo" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock NKO =
            new UnicodeBlock("NKO");

        /**
         * Constant for the "Samaritan" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock SAMARITAN =
            new UnicodeBlock("SAMARITAN");

        /**
         * Constant for the "Mandaic" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock MANDAIC =
            new UnicodeBlock("MANDAIC");

        /**
         * Constant for the "Ethiopic Supplement" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ETHIOPIC_SUPPLEMENT =
            new UnicodeBlock("ETHIOPIC_SUPPLEMENT",
                             "ETHIOPIC SUPPLEMENT",
                             "ETHIOPICSUPPLEMENT");

        /**
         * Constant for the "Unified Canadian Aboriginal Syllabics Extended"
         * Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS_EXTENDED =
            new UnicodeBlock("UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS_EXTENDED",
                             "UNIFIED CANADIAN ABORIGINAL SYLLABICS EXTENDED",
                             "UNIFIEDCANADIANABORIGINALSYLLABICSEXTENDED");

        /**
         * Constant for the "New Tai Lue" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock NEW_TAI_LUE =
            new UnicodeBlock("NEW_TAI_LUE",
                             "NEW TAI LUE",
                             "NEWTAILUE");

        /**
         * Constant for the "Buginese" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BUGINESE =
            new UnicodeBlock("BUGINESE");

        /**
         * Constant for the "Tai Tham" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock TAI_THAM =
            new UnicodeBlock("TAI_THAM",
                             "TAI THAM",
                             "TAITHAM");

        /**
         * Constant for the "Balinese" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BALINESE =
            new UnicodeBlock("BALINESE");

        /**
         * Constant for the "Sundanese" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock SUNDANESE =
            new UnicodeBlock("SUNDANESE");

        /**
         * Constant for the "Batak" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BATAK =
            new UnicodeBlock("BATAK");

        /**
         * Constant for the "Lepcha" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LEPCHA =
            new UnicodeBlock("LEPCHA");

        /**
         * Constant for the "Ol Chiki" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock OL_CHIKI =
            new UnicodeBlock("OL_CHIKI",
                             "OL CHIKI",
                             "OLCHIKI");

        /**
         * Constant for the "Vedic Extensions" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock VEDIC_EXTENSIONS =
            new UnicodeBlock("VEDIC_EXTENSIONS",
                             "VEDIC EXTENSIONS",
                             "VEDICEXTENSIONS");

        /**
         * Constant for the "Phonetic Extensions Supplement" Unicode character
         * block.
         * @since 1.7
         */
        public static final UnicodeBlock PHONETIC_EXTENSIONS_SUPPLEMENT =
            new UnicodeBlock("PHONETIC_EXTENSIONS_SUPPLEMENT",
                             "PHONETIC EXTENSIONS SUPPLEMENT",
                             "PHONETICEXTENSIONSSUPPLEMENT");

        /**
         * Constant for the "Combining Diacritical Marks Supplement" Unicode
         * character block.
         * @since 1.7
         */
        public static final UnicodeBlock COMBINING_DIACRITICAL_MARKS_SUPPLEMENT =
            new UnicodeBlock("COMBINING_DIACRITICAL_MARKS_SUPPLEMENT",
                             "COMBINING DIACRITICAL MARKS SUPPLEMENT",
                             "COMBININGDIACRITICALMARKSSUPPLEMENT");

        /**
         * Constant for the "Glagolitic" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock GLAGOLITIC =
            new UnicodeBlock("GLAGOLITIC");

        /**
         * Constant for the "Latin Extended-C" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LATIN_EXTENDED_C =
            new UnicodeBlock("LATIN_EXTENDED_C",
                             "LATIN EXTENDED-C",
                             "LATINEXTENDED-C");

        /**
         * Constant for the "Coptic" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock COPTIC =
            new UnicodeBlock("COPTIC");

        /**
         * Constant for the "Georgian Supplement" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock GEORGIAN_SUPPLEMENT =
            new UnicodeBlock("GEORGIAN_SUPPLEMENT",
                             "GEORGIAN SUPPLEMENT",
                             "GEORGIANSUPPLEMENT");

        /**
         * Constant for the "Tifinagh" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock TIFINAGH =
            new UnicodeBlock("TIFINAGH");

        /**
         * Constant for the "Ethiopic Extended" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ETHIOPIC_EXTENDED =
            new UnicodeBlock("ETHIOPIC_EXTENDED",
                             "ETHIOPIC EXTENDED",
                             "ETHIOPICEXTENDED");

        /**
         * Constant for the "Cyrillic Extended-A" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CYRILLIC_EXTENDED_A =
            new UnicodeBlock("CYRILLIC_EXTENDED_A",
                             "CYRILLIC EXTENDED-A",
                             "CYRILLICEXTENDED-A");

        /**
         * Constant for the "Supplemental Punctuation" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock SUPPLEMENTAL_PUNCTUATION =
            new UnicodeBlock("SUPPLEMENTAL_PUNCTUATION",
                             "SUPPLEMENTAL PUNCTUATION",
                             "SUPPLEMENTALPUNCTUATION");

        /**
         * Constant for the "CJK Strokes" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CJK_STROKES =
            new UnicodeBlock("CJK_STROKES",
                             "CJK STROKES",
                             "CJKSTROKES");

        /**
         * Constant for the "Lisu" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LISU =
            new UnicodeBlock("LISU");

        /**
         * Constant for the "Vai" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock VAI =
            new UnicodeBlock("VAI");

        /**
         * Constant for the "Cyrillic Extended-B" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CYRILLIC_EXTENDED_B =
            new UnicodeBlock("CYRILLIC_EXTENDED_B",
                             "CYRILLIC EXTENDED-B",
                             "CYRILLICEXTENDED-B");

        /**
         * Constant for the "Bamum" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BAMUM =
            new UnicodeBlock("BAMUM");

        /**
         * Constant for the "Modifier Tone Letters" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock MODIFIER_TONE_LETTERS =
            new UnicodeBlock("MODIFIER_TONE_LETTERS",
                             "MODIFIER TONE LETTERS",
                             "MODIFIERTONELETTERS");

        /**
         * Constant for the "Latin Extended-D" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LATIN_EXTENDED_D =
            new UnicodeBlock("LATIN_EXTENDED_D",
                             "LATIN EXTENDED-D",
                             "LATINEXTENDED-D");

        /**
         * Constant for the "Syloti Nagri" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock SYLOTI_NAGRI =
            new UnicodeBlock("SYLOTI_NAGRI",
                             "SYLOTI NAGRI",
                             "SYLOTINAGRI");

        /**
         * Constant for the "Common Indic Number Forms" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock COMMON_INDIC_NUMBER_FORMS =
            new UnicodeBlock("COMMON_INDIC_NUMBER_FORMS",
                             "COMMON INDIC NUMBER FORMS",
                             "COMMONINDICNUMBERFORMS");

        /**
         * Constant for the "Phags-pa" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock PHAGS_PA =
            new UnicodeBlock("PHAGS_PA",
                             "PHAGS-PA");

        /**
         * Constant for the "Saurashtra" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock SAURASHTRA =
            new UnicodeBlock("SAURASHTRA");

        /**
         * Constant for the "Devanagari Extended" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock DEVANAGARI_EXTENDED =
            new UnicodeBlock("DEVANAGARI_EXTENDED",
                             "DEVANAGARI EXTENDED",
                             "DEVANAGARIEXTENDED");

        /**
         * Constant for the "Kayah Li" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock KAYAH_LI =
            new UnicodeBlock("KAYAH_LI",
                             "KAYAH LI",
                             "KAYAHLI");

        /**
         * Constant for the "Rejang" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock REJANG =
            new UnicodeBlock("REJANG");

        /**
         * Constant for the "Hangul Jamo Extended-A" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock HANGUL_JAMO_EXTENDED_A =
            new UnicodeBlock("HANGUL_JAMO_EXTENDED_A",
                             "HANGUL JAMO EXTENDED-A",
                             "HANGULJAMOEXTENDED-A");

        /**
         * Constant for the "Javanese" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock JAVANESE =
            new UnicodeBlock("JAVANESE");

        /**
         * Constant for the "Cham" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CHAM =
            new UnicodeBlock("CHAM");

        /**
         * Constant for the "Myanmar Extended-A" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock MYANMAR_EXTENDED_A =
            new UnicodeBlock("MYANMAR_EXTENDED_A",
                             "MYANMAR EXTENDED-A",
                             "MYANMAREXTENDED-A");

        /**
         * Constant for the "Tai Viet" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock TAI_VIET =
            new UnicodeBlock("TAI_VIET",
                             "TAI VIET",
                             "TAIVIET");

        /**
         * Constant for the "Ethiopic Extended-A" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ETHIOPIC_EXTENDED_A =
            new UnicodeBlock("ETHIOPIC_EXTENDED_A",
                             "ETHIOPIC EXTENDED-A",
                             "ETHIOPICEXTENDED-A");

        /**
         * Constant for the "Meetei Mayek" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock MEETEI_MAYEK =
            new UnicodeBlock("MEETEI_MAYEK",
                             "MEETEI MAYEK",
                             "MEETEIMAYEK");

        /**
         * Constant for the "Hangul Jamo Extended-B" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock HANGUL_JAMO_EXTENDED_B =
            new UnicodeBlock("HANGUL_JAMO_EXTENDED_B",
                             "HANGUL JAMO EXTENDED-B",
                             "HANGULJAMOEXTENDED-B");

        /**
         * Constant for the "Vertical Forms" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock VERTICAL_FORMS =
            new UnicodeBlock("VERTICAL_FORMS",
                             "VERTICAL FORMS",
                             "VERTICALFORMS");

        /**
         * Constant for the "Ancient Greek Numbers" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ANCIENT_GREEK_NUMBERS =
            new UnicodeBlock("ANCIENT_GREEK_NUMBERS",
                             "ANCIENT GREEK NUMBERS",
                             "ANCIENTGREEKNUMBERS");

        /**
         * Constant for the "Ancient Symbols" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ANCIENT_SYMBOLS =
            new UnicodeBlock("ANCIENT_SYMBOLS",
                             "ANCIENT SYMBOLS",
                             "ANCIENTSYMBOLS");

        /**
         * Constant for the "Phaistos Disc" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock PHAISTOS_DISC =
            new UnicodeBlock("PHAISTOS_DISC",
                             "PHAISTOS DISC",
                             "PHAISTOSDISC");

        /**
         * Constant for the "Lycian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LYCIAN =
            new UnicodeBlock("LYCIAN");

        /**
         * Constant for the "Carian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CARIAN =
            new UnicodeBlock("CARIAN");

        /**
         * Constant for the "Old Persian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock OLD_PERSIAN =
            new UnicodeBlock("OLD_PERSIAN",
                             "OLD PERSIAN",
                             "OLDPERSIAN");

        /**
         * Constant for the "Imperial Aramaic" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock IMPERIAL_ARAMAIC =
            new UnicodeBlock("IMPERIAL_ARAMAIC",
                             "IMPERIAL ARAMAIC",
                             "IMPERIALARAMAIC");

        /**
         * Constant for the "Phoenician" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock PHOENICIAN =
            new UnicodeBlock("PHOENICIAN");

        /**
         * Constant for the "Lydian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock LYDIAN =
            new UnicodeBlock("LYDIAN");

        /**
         * Constant for the "Kharoshthi" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock KHAROSHTHI =
            new UnicodeBlock("KHAROSHTHI");

        /**
         * Constant for the "Old South Arabian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock OLD_SOUTH_ARABIAN =
            new UnicodeBlock("OLD_SOUTH_ARABIAN",
                             "OLD SOUTH ARABIAN",
                             "OLDSOUTHARABIAN");

        /**
         * Constant for the "Avestan" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock AVESTAN =
            new UnicodeBlock("AVESTAN");

        /**
         * Constant for the "Inscriptional Parthian" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock INSCRIPTIONAL_PARTHIAN =
            new UnicodeBlock("INSCRIPTIONAL_PARTHIAN",
                             "INSCRIPTIONAL PARTHIAN",
                             "INSCRIPTIONALPARTHIAN");

        /**
         * Constant for the "Inscriptional Pahlavi" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock INSCRIPTIONAL_PAHLAVI =
            new UnicodeBlock("INSCRIPTIONAL_PAHLAVI",
                             "INSCRIPTIONAL PAHLAVI",
                             "INSCRIPTIONALPAHLAVI");

        /**
         * Constant for the "Old Turkic" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock OLD_TURKIC =
            new UnicodeBlock("OLD_TURKIC",
                             "OLD TURKIC",
                             "OLDTURKIC");

        /**
         * Constant for the "Rumi Numeral Symbols" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock RUMI_NUMERAL_SYMBOLS =
            new UnicodeBlock("RUMI_NUMERAL_SYMBOLS",
                             "RUMI NUMERAL SYMBOLS",
                             "RUMINUMERALSYMBOLS");

        /**
         * Constant for the "Brahmi" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BRAHMI =
            new UnicodeBlock("BRAHMI");

        /**
         * Constant for the "Kaithi" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock KAITHI =
            new UnicodeBlock("KAITHI");

        /**
         * Constant for the "Cuneiform" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock CUNEIFORM =
            new UnicodeBlock("CUNEIFORM");

        /**
         * Constant for the "Cuneiform Numbers and Punctuation" Unicode
         * character block.
         * @since 1.7
         */
        public static final UnicodeBlock CUNEIFORM_NUMBERS_AND_PUNCTUATION =
            new UnicodeBlock("CUNEIFORM_NUMBERS_AND_PUNCTUATION",
                             "CUNEIFORM NUMBERS AND PUNCTUATION",
                             "CUNEIFORMNUMBERSANDPUNCTUATION");

        /**
         * Constant for the "Egyptian Hieroglyphs" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock EGYPTIAN_HIEROGLYPHS =
            new UnicodeBlock("EGYPTIAN_HIEROGLYPHS",
                             "EGYPTIAN HIEROGLYPHS",
                             "EGYPTIANHIEROGLYPHS");

        /**
         * Constant for the "Bamum Supplement" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock BAMUM_SUPPLEMENT =
            new UnicodeBlock("BAMUM_SUPPLEMENT",
                             "BAMUM SUPPLEMENT",
                             "BAMUMSUPPLEMENT");

        /**
         * Constant for the "Kana Supplement" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock KANA_SUPPLEMENT =
            new UnicodeBlock("KANA_SUPPLEMENT",
                             "KANA SUPPLEMENT",
                             "KANASUPPLEMENT");

        /**
         * Constant for the "Ancient Greek Musical Notation" Unicode character
         * block.
         * @since 1.7
         */
        public static final UnicodeBlock ANCIENT_GREEK_MUSICAL_NOTATION =
            new UnicodeBlock("ANCIENT_GREEK_MUSICAL_NOTATION",
                             "ANCIENT GREEK MUSICAL NOTATION",
                             "ANCIENTGREEKMUSICALNOTATION");

        /**
         * Constant for the "Counting Rod Numerals" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock COUNTING_ROD_NUMERALS =
            new UnicodeBlock("COUNTING_ROD_NUMERALS",
                             "COUNTING ROD NUMERALS",
                             "COUNTINGRODNUMERALS");

        /**
         * Constant for the "Mahjong Tiles" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock MAHJONG_TILES =
            new UnicodeBlock("MAHJONG_TILES",
                             "MAHJONG TILES",
                             "MAHJONGTILES");

        /**
         * Constant for the "Domino Tiles" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock DOMINO_TILES =
            new UnicodeBlock("DOMINO_TILES",
                             "DOMINO TILES",
                             "DOMINOTILES");

        /**
         * Constant for the "Playing Cards" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock PLAYING_CARDS =
            new UnicodeBlock("PLAYING_CARDS",
                             "PLAYING CARDS",
                             "PLAYINGCARDS");

        /**
         * Constant for the "Enclosed Alphanumeric Supplement" Unicode character
         * block.
         * @since 1.7
         */
        public static final UnicodeBlock ENCLOSED_ALPHANUMERIC_SUPPLEMENT =
            new UnicodeBlock("ENCLOSED_ALPHANUMERIC_SUPPLEMENT",
                             "ENCLOSED ALPHANUMERIC SUPPLEMENT",
                             "ENCLOSEDALPHANUMERICSUPPLEMENT");

        /**
         * Constant for the "Enclosed Ideographic Supplement" Unicode character
         * block.
         * @since 1.7
         */
        public static final UnicodeBlock ENCLOSED_IDEOGRAPHIC_SUPPLEMENT =
            new UnicodeBlock("ENCLOSED_IDEOGRAPHIC_SUPPLEMENT",
                             "ENCLOSED IDEOGRAPHIC SUPPLEMENT",
                             "ENCLOSEDIDEOGRAPHICSUPPLEMENT");

        /**
         * Constant for the "Miscellaneous Symbols And Pictographs" Unicode
         * character block.
         * @since 1.7
         */
        public static final UnicodeBlock MISCELLANEOUS_SYMBOLS_AND_PICTOGRAPHS =
            new UnicodeBlock("MISCELLANEOUS_SYMBOLS_AND_PICTOGRAPHS",
                             "MISCELLANEOUS SYMBOLS AND PICTOGRAPHS",
                             "MISCELLANEOUSSYMBOLSANDPICTOGRAPHS");

        /**
         * Constant for the "Emoticons" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock EMOTICONS =
            new UnicodeBlock("EMOTICONS");

        /**
         * Constant for the "Transport And Map Symbols" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock TRANSPORT_AND_MAP_SYMBOLS =
            new UnicodeBlock("TRANSPORT_AND_MAP_SYMBOLS",
                             "TRANSPORT AND MAP SYMBOLS",
                             "TRANSPORTANDMAPSYMBOLS");

        /**
         * Constant for the "Alchemical Symbols" Unicode character block.
         * @since 1.7
         */
        public static final UnicodeBlock ALCHEMICAL_SYMBOLS =
            new UnicodeBlock("ALCHEMICAL_SYMBOLS",
                             "ALCHEMICAL SYMBOLS",
                             "ALCHEMICALSYMBOLS");

        /**
         * Constant for the "CJK Unified Ideographs Extension C" Unicode
         * character block.
         * @since 1.7
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION C",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIONC");

        /**
         * Constant for the "CJK Unified Ideographs Extension D" Unicode
         * character block.
         * @since 1.7
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION D",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIOND");

        /**
         * Constant for the "Arabic Extended-A" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock ARABIC_EXTENDED_A =
            new UnicodeBlock("ARABIC_EXTENDED_A",
                             "ARABIC EXTENDED-A",
                             "ARABICEXTENDED-A");

        /**
         * Constant for the "Sundanese Supplement" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock SUNDANESE_SUPPLEMENT =
            new UnicodeBlock("SUNDANESE_SUPPLEMENT",
                             "SUNDANESE SUPPLEMENT",
                             "SUNDANESESUPPLEMENT");

        /**
         * Constant for the "Meetei Mayek Extensions" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock MEETEI_MAYEK_EXTENSIONS =
            new UnicodeBlock("MEETEI_MAYEK_EXTENSIONS",
                             "MEETEI MAYEK EXTENSIONS",
                             "MEETEIMAYEKEXTENSIONS");

        /**
         * Constant for the "Meroitic Hieroglyphs" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock MEROITIC_HIEROGLYPHS =
            new UnicodeBlock("MEROITIC_HIEROGLYPHS",
                             "MEROITIC HIEROGLYPHS",
                             "MEROITICHIEROGLYPHS");

        /**
         * Constant for the "Meroitic Cursive" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock MEROITIC_CURSIVE =
            new UnicodeBlock("MEROITIC_CURSIVE",
                             "MEROITIC CURSIVE",
                             "MEROITICCURSIVE");

        /**
         * Constant for the "Sora Sompeng" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock SORA_SOMPENG =
            new UnicodeBlock("SORA_SOMPENG",
                             "SORA SOMPENG",
                             "SORASOMPENG");

        /**
         * Constant for the "Chakma" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock CHAKMA =
            new UnicodeBlock("CHAKMA");

        /**
         * Constant for the "Sharada" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock SHARADA =
            new UnicodeBlock("SHARADA");

        /**
         * Constant for the "Takri" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock TAKRI =
            new UnicodeBlock("TAKRI");

        /**
         * Constant for the "Miao" Unicode character block.
         * @since 1.8
         */
        public static final UnicodeBlock MIAO =
            new UnicodeBlock("MIAO");

        /**
         * Constant for the "Arabic Mathematical Alphabetic Symbols" Unicode
         * character block.
         * @since 1.8
         */
        public static final UnicodeBlock ARABIC_MATHEMATICAL_ALPHABETIC_SYMBOLS =
            new UnicodeBlock("ARABIC_MATHEMATICAL_ALPHABETIC_SYMBOLS",
                             "ARABIC MATHEMATICAL ALPHABETIC SYMBOLS",
                             "ARABICMATHEMATICALALPHABETICSYMBOLS");

        /**
         * Constant for the "Combining Diacritical Marks Extended" Unicode
         * character block.
         * @since 9
         */
        public static final UnicodeBlock COMBINING_DIACRITICAL_MARKS_EXTENDED =
            new UnicodeBlock("COMBINING_DIACRITICAL_MARKS_EXTENDED",
                             "COMBINING DIACRITICAL MARKS EXTENDED",
                             "COMBININGDIACRITICALMARKSEXTENDED");

        /**
         * Constant for the "Myanmar Extended-B" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MYANMAR_EXTENDED_B =
            new UnicodeBlock("MYANMAR_EXTENDED_B",
                             "MYANMAR EXTENDED-B",
                             "MYANMAREXTENDED-B");

        /**
         * Constant for the "Latin Extended-E" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock LATIN_EXTENDED_E =
            new UnicodeBlock("LATIN_EXTENDED_E",
                             "LATIN EXTENDED-E",
                             "LATINEXTENDED-E");

        /**
         * Constant for the "Coptic Epact Numbers" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock COPTIC_EPACT_NUMBERS =
            new UnicodeBlock("COPTIC_EPACT_NUMBERS",
                             "COPTIC EPACT NUMBERS",
                             "COPTICEPACTNUMBERS");

        /**
         * Constant for the "Old Permic" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock OLD_PERMIC =
            new UnicodeBlock("OLD_PERMIC",
                             "OLD PERMIC",
                             "OLDPERMIC");

        /**
         * Constant for the "Elbasan" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock ELBASAN =
            new UnicodeBlock("ELBASAN");

        /**
         * Constant for the "Caucasian Albanian" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock CAUCASIAN_ALBANIAN =
            new UnicodeBlock("CAUCASIAN_ALBANIAN",
                             "CAUCASIAN ALBANIAN",
                             "CAUCASIANALBANIAN");

        /**
         * Constant for the "Linear A" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock LINEAR_A =
            new UnicodeBlock("LINEAR_A",
                             "LINEAR A",
                             "LINEARA");

        /**
         * Constant for the "Palmyrene" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock PALMYRENE =
            new UnicodeBlock("PALMYRENE");

        /**
         * Constant for the "Nabataean" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock NABATAEAN =
            new UnicodeBlock("NABATAEAN");

        /**
         * Constant for the "Old North Arabian" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock OLD_NORTH_ARABIAN =
            new UnicodeBlock("OLD_NORTH_ARABIAN",
                             "OLD NORTH ARABIAN",
                             "OLDNORTHARABIAN");

        /**
         * Constant for the "Manichaean" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MANICHAEAN =
            new UnicodeBlock("MANICHAEAN");

        /**
         * Constant for the "Psalter Pahlavi" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock PSALTER_PAHLAVI =
            new UnicodeBlock("PSALTER_PAHLAVI",
                             "PSALTER PAHLAVI",
                             "PSALTERPAHLAVI");

        /**
         * Constant for the "Mahajani" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MAHAJANI =
            new UnicodeBlock("MAHAJANI");

        /**
         * Constant for the "Sinhala Archaic Numbers" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock SINHALA_ARCHAIC_NUMBERS =
            new UnicodeBlock("SINHALA_ARCHAIC_NUMBERS",
                             "SINHALA ARCHAIC NUMBERS",
                             "SINHALAARCHAICNUMBERS");

        /**
         * Constant for the "Khojki" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock KHOJKI =
            new UnicodeBlock("KHOJKI");

        /**
         * Constant for the "Khudawadi" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock KHUDAWADI =
            new UnicodeBlock("KHUDAWADI");

        /**
         * Constant for the "Grantha" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock GRANTHA =
            new UnicodeBlock("GRANTHA");

        /**
         * Constant for the "Tirhuta" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock TIRHUTA =
            new UnicodeBlock("TIRHUTA");

        /**
         * Constant for the "Siddham" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock SIDDHAM =
            new UnicodeBlock("SIDDHAM");

        /**
         * Constant for the "Modi" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MODI =
            new UnicodeBlock("MODI");

        /**
         * Constant for the "Warang Citi" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock WARANG_CITI =
            new UnicodeBlock("WARANG_CITI",
                             "WARANG CITI",
                             "WARANGCITI");

        /**
         * Constant for the "Pau Cin Hau" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock PAU_CIN_HAU =
            new UnicodeBlock("PAU_CIN_HAU",
                             "PAU CIN HAU",
                             "PAUCINHAU");

        /**
         * Constant for the "Mro" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MRO =
            new UnicodeBlock("MRO");

        /**
         * Constant for the "Bassa Vah" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock BASSA_VAH =
            new UnicodeBlock("BASSA_VAH",
                             "BASSA VAH",
                             "BASSAVAH");

        /**
         * Constant for the "Pahawh Hmong" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock PAHAWH_HMONG =
            new UnicodeBlock("PAHAWH_HMONG",
                             "PAHAWH HMONG",
                             "PAHAWHHMONG");

        /**
         * Constant for the "Duployan" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock DUPLOYAN =
            new UnicodeBlock("DUPLOYAN");

        /**
         * Constant for the "Shorthand Format Controls" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock SHORTHAND_FORMAT_CONTROLS =
            new UnicodeBlock("SHORTHAND_FORMAT_CONTROLS",
                             "SHORTHAND FORMAT CONTROLS",
                             "SHORTHANDFORMATCONTROLS");

        /**
         * Constant for the "Mende Kikakui" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MENDE_KIKAKUI =
            new UnicodeBlock("MENDE_KIKAKUI",
                             "MENDE KIKAKUI",
                             "MENDEKIKAKUI");

        /**
         * Constant for the "Ornamental Dingbats" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock ORNAMENTAL_DINGBATS =
            new UnicodeBlock("ORNAMENTAL_DINGBATS",
                             "ORNAMENTAL DINGBATS",
                             "ORNAMENTALDINGBATS");

        /**
         * Constant for the "Geometric Shapes Extended" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock GEOMETRIC_SHAPES_EXTENDED =
            new UnicodeBlock("GEOMETRIC_SHAPES_EXTENDED",
                             "GEOMETRIC SHAPES EXTENDED",
                             "GEOMETRICSHAPESEXTENDED");

        /**
         * Constant for the "Supplemental Arrows-C" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock SUPPLEMENTAL_ARROWS_C =
            new UnicodeBlock("SUPPLEMENTAL_ARROWS_C",
                             "SUPPLEMENTAL ARROWS-C",
                             "SUPPLEMENTALARROWS-C");

        /**
         * Constant for the "Cherokee Supplement" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock CHEROKEE_SUPPLEMENT =
            new UnicodeBlock("CHEROKEE_SUPPLEMENT",
                             "CHEROKEE SUPPLEMENT",
                             "CHEROKEESUPPLEMENT");

        /**
         * Constant for the "Hatran" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock HATRAN =
            new UnicodeBlock("HATRAN");

        /**
         * Constant for the "Old Hungarian" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock OLD_HUNGARIAN =
            new UnicodeBlock("OLD_HUNGARIAN",
                             "OLD HUNGARIAN",
                             "OLDHUNGARIAN");

        /**
         * Constant for the "Multani" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock MULTANI =
            new UnicodeBlock("MULTANI");

        /**
         * Constant for the "Ahom" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock AHOM =
            new UnicodeBlock("AHOM");

        /**
         * Constant for the "Early Dynastic Cuneiform" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock EARLY_DYNASTIC_CUNEIFORM =
            new UnicodeBlock("EARLY_DYNASTIC_CUNEIFORM",
                             "EARLY DYNASTIC CUNEIFORM",
                             "EARLYDYNASTICCUNEIFORM");

        /**
         * Constant for the "Anatolian Hieroglyphs" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock ANATOLIAN_HIEROGLYPHS =
            new UnicodeBlock("ANATOLIAN_HIEROGLYPHS",
                             "ANATOLIAN HIEROGLYPHS",
                             "ANATOLIANHIEROGLYPHS");

        /**
         * Constant for the "Sutton SignWriting" Unicode character block.
         * @since 9
         */
        public static final UnicodeBlock SUTTON_SIGNWRITING =
            new UnicodeBlock("SUTTON_SIGNWRITING",
                             "SUTTON SIGNWRITING",
                             "SUTTONSIGNWRITING");

        /**
         * Constant for the "Supplemental Symbols and Pictographs" Unicode
         * character block.
         * @since 9
         */
        public static final UnicodeBlock SUPPLEMENTAL_SYMBOLS_AND_PICTOGRAPHS =
            new UnicodeBlock("SUPPLEMENTAL_SYMBOLS_AND_PICTOGRAPHS",
                             "SUPPLEMENTAL SYMBOLS AND PICTOGRAPHS",
                             "SUPPLEMENTALSYMBOLSANDPICTOGRAPHS");

        /**
         * Constant for the "CJK Unified Ideographs Extension E" Unicode
         * character block.
         * @since 9
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_E =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_E",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION E",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIONE");

        /**
         * Constant for the "Syriac Supplement" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock SYRIAC_SUPPLEMENT =
            new UnicodeBlock("SYRIAC_SUPPLEMENT",
                             "SYRIAC SUPPLEMENT",
                             "SYRIACSUPPLEMENT");

        /**
         * Constant for the "Cyrillic Extended-C" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock CYRILLIC_EXTENDED_C =
            new UnicodeBlock("CYRILLIC_EXTENDED_C",
                             "CYRILLIC EXTENDED-C",
                             "CYRILLICEXTENDED-C");

        /**
         * Constant for the "Osage" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock OSAGE =
            new UnicodeBlock("OSAGE");

        /**
         * Constant for the "Newa" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock NEWA =
            new UnicodeBlock("NEWA");

        /**
         * Constant for the "Mongolian Supplement" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock MONGOLIAN_SUPPLEMENT =
            new UnicodeBlock("MONGOLIAN_SUPPLEMENT",
                             "MONGOLIAN SUPPLEMENT",
                             "MONGOLIANSUPPLEMENT");

        /**
         * Constant for the "Marchen" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock MARCHEN =
            new UnicodeBlock("MARCHEN");

        /**
         * Constant for the "Ideographic Symbols and Punctuation" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock IDEOGRAPHIC_SYMBOLS_AND_PUNCTUATION =
            new UnicodeBlock("IDEOGRAPHIC_SYMBOLS_AND_PUNCTUATION",
                             "IDEOGRAPHIC SYMBOLS AND PUNCTUATION",
                             "IDEOGRAPHICSYMBOLSANDPUNCTUATION");

        /**
         * Constant for the "Tangut" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock TANGUT =
            new UnicodeBlock("TANGUT");

        /**
         * Constant for the "Tangut Components" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock TANGUT_COMPONENTS =
            new UnicodeBlock("TANGUT_COMPONENTS",
                             "TANGUT COMPONENTS",
                             "TANGUTCOMPONENTS");

        /**
         * Constant for the "Kana Extended-A" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock KANA_EXTENDED_A =
            new UnicodeBlock("KANA_EXTENDED_A",
                             "KANA EXTENDED-A",
                             "KANAEXTENDED-A");

        /**
         * Constant for the "Glagolitic Supplement" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock GLAGOLITIC_SUPPLEMENT =
            new UnicodeBlock("GLAGOLITIC_SUPPLEMENT",
                             "GLAGOLITIC SUPPLEMENT",
                             "GLAGOLITICSUPPLEMENT");

        /**
         * Constant for the "Adlam" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock ADLAM =
            new UnicodeBlock("ADLAM");

        /**
         * Constant for the "Masaram Gondi" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock MASARAM_GONDI =
            new UnicodeBlock("MASARAM_GONDI",
                             "MASARAM GONDI",
                             "MASARAMGONDI");

        /**
         * Constant for the "Zanabazar Square" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock ZANABAZAR_SQUARE =
            new UnicodeBlock("ZANABAZAR_SQUARE",
                             "ZANABAZAR SQUARE",
                             "ZANABAZARSQUARE");

        /**
         * Constant for the "Nushu" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock NUSHU =
            new UnicodeBlock("NUSHU");

        /**
         * Constant for the "Soyombo" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock SOYOMBO =
            new UnicodeBlock("SOYOMBO");

        /**
         * Constant for the "Bhaiksuki" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock BHAIKSUKI =
            new UnicodeBlock("BHAIKSUKI");

        /**
         * Constant for the "CJK Unified Ideographs Extension F" Unicode
         * character block.
         * @since 11
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_F =
            new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_F",
                             "CJK UNIFIED IDEOGRAPHS EXTENSION F",
                             "CJKUNIFIEDIDEOGRAPHSEXTENSIONF");

        /**
         * Constant for the "Georgian Extended" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock GEORGIAN_EXTENDED =
            new UnicodeBlock("GEORGIAN_EXTENDED",
                             "GEORGIAN EXTENDED",
                             "GEORGIANEXTENDED");

        /**
         * Constant for the "Hanifi Rohingya" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock HANIFI_ROHINGYA =
            new UnicodeBlock("HANIFI_ROHINGYA",
                             "HANIFI ROHINGYA",
                             "HANIFIROHINGYA");

        /**
         * Constant for the "Old Sogdian" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock OLD_SOGDIAN =
            new UnicodeBlock("OLD_SOGDIAN",
                             "OLD SOGDIAN",
                             "OLDSOGDIAN");

        /**
         * Constant for the "Sogdian" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock SOGDIAN =
            new UnicodeBlock("SOGDIAN");

        /**
         * Constant for the "Dogra" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock DOGRA =
            new UnicodeBlock("DOGRA");

        /**
         * Constant for the "Gunjala Gondi" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock GUNJALA_GONDI =
            new UnicodeBlock("GUNJALA_GONDI",
                             "GUNJALA GONDI",
                             "GUNJALAGONDI");

        /**
         * Constant for the "Makasar" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock MAKASAR =
            new UnicodeBlock("MAKASAR");

        /**
         * Constant for the "Medefaidrin" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock MEDEFAIDRIN =
            new UnicodeBlock("MEDEFAIDRIN");

        /**
         * Constant for the "Mayan Numerals" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock MAYAN_NUMERALS =
            new UnicodeBlock("MAYAN_NUMERALS",
                             "MAYAN NUMERALS",
                             "MAYANNUMERALS");

        /**
         * Constant for the "Indic Siyaq Numbers" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock INDIC_SIYAQ_NUMBERS =
            new UnicodeBlock("INDIC_SIYAQ_NUMBERS",
                             "INDIC SIYAQ NUMBERS",
                             "INDICSIYAQNUMBERS");

        /**
         * Constant for the "Chess Symbols" Unicode
         * character block.
         * @since 12
         */
        public static final UnicodeBlock CHESS_SYMBOLS =
            new UnicodeBlock("CHESS_SYMBOLS",
                             "CHESS SYMBOLS",
                             "CHESSSYMBOLS");


        private static final int blockStarts[] = {
            0x0000,   // 0000..007F; Basic Latin
            0x0080,   // 0080..00FF; Latin-1 Supplement
            0x0100,   // 0100..017F; Latin Extended-A
            0x0180,   // 0180..024F; Latin Extended-B
            0x0250,   // 0250..02AF; IPA Extensions
            0x02B0,   // 02B0..02FF; Spacing Modifier Letters
            0x0300,   // 0300..036F; Combining Diacritical Marks
            0x0370,   // 0370..03FF; Greek and Coptic
            0x0400,   // 0400..04FF; Cyrillic
            0x0500,   // 0500..052F; Cyrillic Supplement
            0x0530,   // 0530..058F; Armenian
            0x0590,   // 0590..05FF; Hebrew
            0x0600,   // 0600..06FF; Arabic
            0x0700,   // 0700..074F; Syriac
            0x0750,   // 0750..077F; Arabic Supplement
            0x0780,   // 0780..07BF; Thaana
            0x07C0,   // 07C0..07FF; NKo
            0x0800,   // 0800..083F; Samaritan
            0x0840,   // 0840..085F; Mandaic
            0x0860,   // 0860..086F; Syriac Supplement
            0x0870,   //             unassigned
            0x08A0,   // 08A0..08FF; Arabic Extended-A
            0x0900,   // 0900..097F; Devanagari
            0x0980,   // 0980..09FF; Bengali
            0x0A00,   // 0A00..0A7F; Gurmukhi
            0x0A80,   // 0A80..0AFF; Gujarati
            0x0B00,   // 0B00..0B7F; Oriya
            0x0B80,   // 0B80..0BFF; Tamil
            0x0C00,   // 0C00..0C7F; Telugu
            0x0C80,   // 0C80..0CFF; Kannada
            0x0D00,   // 0D00..0D7F; Malayalam
            0x0D80,   // 0D80..0DFF; Sinhala
            0x0E00,   // 0E00..0E7F; Thai
            0x0E80,   // 0E80..0EFF; Lao
            0x0F00,   // 0F00..0FFF; Tibetan
            0x1000,   // 1000..109F; Myanmar
            0x10A0,   // 10A0..10FF; Georgian
            0x1100,   // 1100..11FF; Hangul Jamo
            0x1200,   // 1200..137F; Ethiopic
            0x1380,   // 1380..139F; Ethiopic Supplement
            0x13A0,   // 13A0..13FF; Cherokee
            0x1400,   // 1400..167F; Unified Canadian Aboriginal Syllabics
            0x1680,   // 1680..169F; Ogham
            0x16A0,   // 16A0..16FF; Runic
            0x1700,   // 1700..171F; Tagalog
            0x1720,   // 1720..173F; Hanunoo
            0x1740,   // 1740..175F; Buhid
            0x1760,   // 1760..177F; Tagbanwa
            0x1780,   // 1780..17FF; Khmer
            0x1800,   // 1800..18AF; Mongolian
            0x18B0,   // 18B0..18FF; Unified Canadian Aboriginal Syllabics Extended
            0x1900,   // 1900..194F; Limbu
            0x1950,   // 1950..197F; Tai Le
            0x1980,   // 1980..19DF; New Tai Lue
            0x19E0,   // 19E0..19FF; Khmer Symbols
            0x1A00,   // 1A00..1A1F; Buginese
            0x1A20,   // 1A20..1AAF; Tai Tham
            0x1AB0,   // 1AB0..1AFF; Combining Diacritical Marks Extended
            0x1B00,   // 1B00..1B7F; Balinese
            0x1B80,   // 1B80..1BBF; Sundanese
            0x1BC0,   // 1BC0..1BFF; Batak
            0x1C00,   // 1C00..1C4F; Lepcha
            0x1C50,   // 1C50..1C7F; Ol Chiki
            0x1C80,   // 1C80..1C8F; Cyrillic Extended-C
            0x1C90,   // 1C90..1CBF; Georgian Extended
            0x1CC0,   // 1CC0..1CCF; Sundanese Supplement
            0x1CD0,   // 1CD0..1CFF; Vedic Extensions
            0x1D00,   // 1D00..1D7F; Phonetic Extensions
            0x1D80,   // 1D80..1DBF; Phonetic Extensions Supplement
            0x1DC0,   // 1DC0..1DFF; Combining Diacritical Marks Supplement
            0x1E00,   // 1E00..1EFF; Latin Extended Additional
            0x1F00,   // 1F00..1FFF; Greek Extended
            0x2000,   // 2000..206F; General Punctuation
            0x2070,   // 2070..209F; Superscripts and Subscripts
            0x20A0,   // 20A0..20CF; Currency Symbols
            0x20D0,   // 20D0..20FF; Combining Diacritical Marks for Symbols
            0x2100,   // 2100..214F; Letterlike Symbols
            0x2150,   // 2150..218F; Number Forms
            0x2190,   // 2190..21FF; Arrows
            0x2200,   // 2200..22FF; Mathematical Operators
            0x2300,   // 2300..23FF; Miscellaneous Technical
            0x2400,   // 2400..243F; Control Pictures
            0x2440,   // 2440..245F; Optical Character Recognition
            0x2460,   // 2460..24FF; Enclosed Alphanumerics
            0x2500,   // 2500..257F; Box Drawing
            0x2580,   // 2580..259F; Block Elements
            0x25A0,   // 25A0..25FF; Geometric Shapes
            0x2600,   // 2600..26FF; Miscellaneous Symbols
            0x2700,   // 2700..27BF; Dingbats
0x27C0, // 27C0..27EF; Miscellaneous Mathematical Symbols-A 3305 0x27F0, // 27F0..27FF; Supplemental Arrows-A 3306 0x2800, // 2800..28FF; Braille Patterns 3307 0x2900, // 2900..297F; Supplemental Arrows-B 3308 0x2980, // 2980..29FF; Miscellaneous Mathematical Symbols-B 3309 0x2A00, // 2A00..2AFF; Supplemental Mathematical Operators 3310 0x2B00, // 2B00..2BFF; Miscellaneous Symbols and Arrows 3311 0x2C00, // 2C00..2C5F; Glagolitic 3312 0x2C60, // 2C60..2C7F; Latin Extended-C 3313 0x2C80, // 2C80..2CFF; Coptic 3314 0x2D00, // 2D00..2D2F; Georgian Supplement 3315 0x2D30, // 2D30..2D7F; Tifinagh 3316 0x2D80, // 2D80..2DDF; Ethiopic Extended 3317 0x2DE0, // 2DE0..2DFF; Cyrillic Extended-A 3318 0x2E00, // 2E00..2E7F; Supplemental Punctuation 3319 0x2E80, // 2E80..2EFF; CJK Radicals Supplement 3320 0x2F00, // 2F00..2FDF; Kangxi Radicals 3321 0x2FE0, // unassigned 3322 0x2FF0, // 2FF0..2FFF; Ideographic Description Characters 3323 0x3000, // 3000..303F; CJK Symbols and Punctuation 3324 0x3040, // 3040..309F; Hiragana 3325 0x30A0, // 30A0..30FF; Katakana 3326 0x3100, // 3100..312F; Bopomofo 3327 0x3130, // 3130..318F; Hangul Compatibility Jamo 3328 0x3190, // 3190..319F; Kanbun 3329 0x31A0, // 31A0..31BF; Bopomofo Extended 3330 0x31C0, // 31C0..31EF; CJK Strokes 3331 0x31F0, // 31F0..31FF; Katakana Phonetic Extensions 3332 0x3200, // 3200..32FF; Enclosed CJK Letters and Months 3333 0x3300, // 3300..33FF; CJK Compatibility 3334 0x3400, // 3400..4DBF; CJK Unified Ideographs Extension A 3335 0x4DC0, // 4DC0..4DFF; Yijing Hexagram Symbols 3336 0x4E00, // 4E00..9FFF; CJK Unified Ideographs 3337 0xA000, // A000..A48F; Yi Syllables 3338 0xA490, // A490..A4CF; Yi Radicals 3339 0xA4D0, // A4D0..A4FF; Lisu 3340 0xA500, // A500..A63F; Vai 3341 0xA640, // A640..A69F; Cyrillic Extended-B 3342 0xA6A0, // A6A0..A6FF; Bamum 3343 0xA700, // A700..A71F; Modifier Tone Letters 3344 0xA720, // A720..A7FF; Latin Extended-D 3345 0xA800, // A800..A82F; Syloti Nagri 3346 0xA830, // A830..A83F; 
Common Indic Number Forms 3347 0xA840, // A840..A87F; Phags-pa 3348 0xA880, // A880..A8DF; Saurashtra 3349 0xA8E0, // A8E0..A8FF; Devanagari Extended 3350 0xA900, // A900..A92F; Kayah Li 3351 0xA930, // A930..A95F; Rejang 3352 0xA960, // A960..A97F; Hangul Jamo Extended-A 3353 0xA980, // A980..A9DF; Javanese 3354 0xA9E0, // A9E0..A9FF; Myanmar Extended-B 3355 0xAA00, // AA00..AA5F; Cham 3356 0xAA60, // AA60..AA7F; Myanmar Extended-A 3357 0xAA80, // AA80..AADF; Tai Viet 3358 0xAAE0, // AAE0..AAFF; Meetei Mayek Extensions 3359 0xAB00, // AB00..AB2F; Ethiopic Extended-A 3360 0xAB30, // AB30..AB6F; Latin Extended-E 3361 0xAB70, // AB70..ABBF; Cherokee Supplement 3362 0xABC0, // ABC0..ABFF; Meetei Mayek 3363 0xAC00, // AC00..D7AF; Hangul Syllables 3364 0xD7B0, // D7B0..D7FF; Hangul Jamo Extended-B 3365 0xD800, // D800..DB7F; High Surrogates 3366 0xDB80, // DB80..DBFF; High Private Use Surrogates 3367 0xDC00, // DC00..DFFF; Low Surrogates 3368 0xE000, // E000..F8FF; Private Use Area 3369 0xF900, // F900..FAFF; CJK Compatibility Ideographs 3370 0xFB00, // FB00..FB4F; Alphabetic Presentation Forms 3371 0xFB50, // FB50..FDFF; Arabic Presentation Forms-A 3372 0xFE00, // FE00..FE0F; Variation Selectors 3373 0xFE10, // FE10..FE1F; Vertical Forms 3374 0xFE20, // FE20..FE2F; Combining Half Marks 3375 0xFE30, // FE30..FE4F; CJK Compatibility Forms 3376 0xFE50, // FE50..FE6F; Small Form Variants 3377 0xFE70, // FE70..FEFF; Arabic Presentation Forms-B 3378 0xFF00, // FF00..FFEF; Halfwidth and Fullwidth Forms 3379 0xFFF0, // FFF0..FFFF; Specials 3380 0x10000, // 10000..1007F; Linear B Syllabary 3381 0x10080, // 10080..100FF; Linear B Ideograms 3382 0x10100, // 10100..1013F; Aegean Numbers 3383 0x10140, // 10140..1018F; Ancient Greek Numbers 3384 0x10190, // 10190..101CF; Ancient Symbols 3385 0x101D0, // 101D0..101FF; Phaistos Disc 3386 0x10200, // unassigned 3387 0x10280, // 10280..1029F; Lycian 3388 0x102A0, // 102A0..102DF; Carian 3389 0x102E0, // 102E0..102FF; Coptic Epact 
Numbers 3390 0x10300, // 10300..1032F; Old Italic 3391 0x10330, // 10330..1034F; Gothic 3392 0x10350, // 10350..1037F; Old Permic 3393 0x10380, // 10380..1039F; Ugaritic 3394 0x103A0, // 103A0..103DF; Old Persian 3395 0x103E0, // unassigned 3396 0x10400, // 10400..1044F; Deseret 3397 0x10450, // 10450..1047F; Shavian 3398 0x10480, // 10480..104AF; Osmanya 3399 0x104B0, // 104B0..104FF; Osage 3400 0x10500, // 10500..1052F; Elbasan 3401 0x10530, // 10530..1056F; Caucasian Albanian 3402 0x10570, // unassigned 3403 0x10600, // 10600..1077F; Linear A 3404 0x10780, // unassigned 3405 0x10800, // 10800..1083F; Cypriot Syllabary 3406 0x10840, // 10840..1085F; Imperial Aramaic 3407 0x10860, // 10860..1087F; Palmyrene 3408 0x10880, // 10880..108AF; Nabataean 3409 0x108B0, // unassigned 3410 0x108E0, // 108E0..108FF; Hatran 3411 0x10900, // 10900..1091F; Phoenician 3412 0x10920, // 10920..1093F; Lydian 3413 0x10940, // unassigned 3414 0x10980, // 10980..1099F; Meroitic Hieroglyphs 3415 0x109A0, // 109A0..109FF; Meroitic Cursive 3416 0x10A00, // 10A00..10A5F; Kharoshthi 3417 0x10A60, // 10A60..10A7F; Old South Arabian 3418 0x10A80, // 10A80..10A9F; Old North Arabian 3419 0x10AA0, // unassigned 3420 0x10AC0, // 10AC0..10AFF; Manichaean 3421 0x10B00, // 10B00..10B3F; Avestan 3422 0x10B40, // 10B40..10B5F; Inscriptional Parthian 3423 0x10B60, // 10B60..10B7F; Inscriptional Pahlavi 3424 0x10B80, // 10B80..10BAF; Psalter Pahlavi 3425 0x10BB0, // unassigned 3426 0x10C00, // 10C00..10C4F; Old Turkic 3427 0x10C50, // unassigned 3428 0x10C80, // 10C80..10CFF; Old Hungarian 3429 0x10D00, // 10D00..10D3F; Hanifi Rohingya 3430 0x10D40, // unassigned 3431 0x10E60, // 10E60..10E7F; Rumi Numeral Symbols 3432 0x10E80, // unassigned 3433 0x10F00, // 10F00..10F2F; Old Sogdian 3434 0x10F30, // 10F30..10F6F; Sogdian 3435 0x10F70, // unassigned 3436 0x11000, // 11000..1107F; Brahmi 3437 0x11080, // 11080..110CF; Kaithi 3438 0x110D0, // 110D0..110FF; Sora Sompeng 3439 0x11100, // 11100..1114F; 
Chakma 3440 0x11150, // 11150..1117F; Mahajani 3441 0x11180, // 11180..111DF; Sharada 3442 0x111E0, // 111E0..111FF; Sinhala Archaic Numbers 3443 0x11200, // 11200..1124F; Khojki 3444 0x11250, // unassigned 3445 0x11280, // 11280..112AF; Multani 3446 0x112B0, // 112B0..112FF; Khudawadi 3447 0x11300, // 11300..1137F; Grantha 3448 0x11380, // unassigned 3449 0x11400, // 11400..1147F; Newa 3450 0x11480, // 11480..114DF; Tirhuta 3451 0x114E0, // unassigned 3452 0x11580, // 11580..115FF; Siddham 3453 0x11600, // 11600..1165F; Modi 3454 0x11660, // 11660..1167F; Mongolian Supplement 3455 0x11680, // 11680..116CF; Takri 3456 0x116D0, // unassigned 3457 0x11700, // 11700..1173F; Ahom 3458 0x11740, // unassigned 3459 0x11800, // 11800..1184F; Dogra 3460 0x11850, // unassigned 3461 0x118A0, // 118A0..118FF; Warang Citi 3462 0x11900, // unassigned 3463 0x11A00, // 11A00..11A4F; Zanabazar Square 3464 0x11A50, // 11A50..11AAF; Soyombo 3465 0x11AB0, // unassigned 3466 0x11AC0, // 11AC0..11AFF; Pau Cin Hau 3467 0x11B00, // unassigned 3468 0x11C00, // 11C00..11C6F; Bhaiksuki 3469 0x11C70, // 11C70..11CBF; Marchen 3470 0x11CC0, // unassigned 3471 0x11D00, // 11D00..11D5F; Masaram Gondi 3472 0x11D60, // 11D60..11DAF; Gunjala Gondi 3473 0x11DB0, // unassigned 3474 0x11EE0, // 11EE0..11EFF; Makasar 3475 0x11F00, // unassigned 3476 0x12000, // 12000..123FF; Cuneiform 3477 0x12400, // 12400..1247F; Cuneiform Numbers and Punctuation 3478 0x12480, // 12480..1254F; Early Dynastic Cuneiform 3479 0x12550, // unassigned 3480 0x13000, // 13000..1342F; Egyptian Hieroglyphs 3481 0x13430, // unassigned 3482 0x14400, // 14400..1467F; Anatolian Hieroglyphs 3483 0x14680, // unassigned 3484 0x16800, // 16800..16A3F; Bamum Supplement 3485 0x16A40, // 16A40..16A6F; Mro 3486 0x16A70, // unassigned 3487 0x16AD0, // 16AD0..16AFF; Bassa Vah 3488 0x16B00, // 16B00..16B8F; Pahawh Hmong 3489 0x16B90, // unassigned 3490 0x16E40, // 16E40..16E9F; Medefaidrin 3491 0x16EA0, // unassigned 3492 0x16F00, // 
16F00..16F9F; Miao 3493 0x16FA0, // unassigned 3494 0x16FE0, // 16FE0..16FFF; Ideographic Symbols and Punctuation 3495 0x17000, // 17000..187FF; Tangut 3496 0x18800, // 18800..18AFF; Tangut Components 3497 0x18B00, // unassigned 3498 0x1B000, // 1B000..1B0FF; Kana Supplement 3499 0x1B100, // 1B100..1B12F; Kana Extended-A 3500 0x1B130, // unassigned 3501 0x1B170, // 1B170..1B2FF; Nushu 3502 0x1B300, // unassigned 3503 0x1BC00, // 1BC00..1BC9F; Duployan 3504 0x1BCA0, // 1BCA0..1BCAF; Shorthand Format Controls 3505 0x1BCB0, // unassigned 3506 0x1D000, // 1D000..1D0FF; Byzantine Musical Symbols 3507 0x1D100, // 1D100..1D1FF; Musical Symbols 3508 0x1D200, // 1D200..1D24F; Ancient Greek Musical Notation 3509 0x1D250, // unassigned 3510 0x1D2E0, // 1D2E0..1D2FF; Mayan Numerals 3511 0x1D300, // 1D300..1D35F; Tai Xuan Jing Symbols 3512 0x1D360, // 1D360..1D37F; Counting Rod Numerals 3513 0x1D380, // unassigned 3514 0x1D400, // 1D400..1D7FF; Mathematical Alphanumeric Symbols 3515 0x1D800, // 1D800..1DAAF; Sutton SignWriting 3516 0x1DAB0, // unassigned 3517 0x1E000, // 1E000..1E02F; Glagolitic Supplement 3518 0x1E030, // unassigned 3519 0x1E800, // 1E800..1E8DF; Mende Kikakui 3520 0x1E8E0, // unassigned 3521 0x1E900, // 1E900..1E95F; Adlam 3522 0x1E960, // unassigned 3523 0x1EC70, // 1EC70..1ECBF; Indic Siyaq Numbers 3524 0x1ECC0, // unassigned 3525 0x1EE00, // 1EE00..1EEFF; Arabic Mathematical Alphabetic Symbols 3526 0x1EF00, // unassigned 3527 0x1F000, // 1F000..1F02F; Mahjong Tiles 3528 0x1F030, // 1F030..1F09F; Domino Tiles 3529 0x1F0A0, // 1F0A0..1F0FF; Playing Cards 3530 0x1F100, // 1F100..1F1FF; Enclosed Alphanumeric Supplement 3531 0x1F200, // 1F200..1F2FF; Enclosed Ideographic Supplement 3532 0x1F300, // 1F300..1F5FF; Miscellaneous Symbols and Pictographs 3533 0x1F600, // 1F600..1F64F; Emoticons 3534 0x1F650, // 1F650..1F67F; Ornamental Dingbats 3535 0x1F680, // 1F680..1F6FF; Transport and Map Symbols 3536 0x1F700, // 1F700..1F77F; Alchemical Symbols 3537 0x1F780, // 
1F780..1F7FF; Geometric Shapes Extended 3538 0x1F800, // 1F800..1F8FF; Supplemental Arrows-C 3539 0x1F900, // 1F900..1F9FF; Supplemental Symbols and Pictographs 3540 0x1FA00, // 1FA00..1FA6F; Chess Symbols 3541 0x1FA70, // unassigned 3542 0x20000, // 20000..2A6DF; CJK Unified Ideographs Extension B 3543 0x2A6E0, // unassigned 3544 0x2A700, // 2A700..2B73F; CJK Unified Ideographs Extension C 3545 0x2B740, // 2B740..2B81F; CJK Unified Ideographs Extension D 3546 0x2B820, // 2B820..2CEAF; CJK Unified Ideographs Extension E 3547 0x2CEB0, // 2CEB0..2EBEF; CJK Unified Ideographs Extension F 3548 0x2EBF0, // unassigned 3549 0x2F800, // 2F800..2FA1F; CJK Compatibility Ideographs Supplement 3550 0x2FA20, // unassigned 3551 0xE0000, // E0000..E007F; Tags 3552 0xE0080, // unassigned 3553 0xE0100, // E0100..E01EF; Variation Selectors Supplement 3554 0xE01F0, // unassigned 3555 0xF0000, // F0000..FFFFF; Supplementary Private Use Area-A 3556 0x100000 // 100000..10FFFF; Supplementary Private Use Area-B 3557 }; 3558 3559 private static final UnicodeBlock[] blocks = { 3560 BASIC_LATIN, 3561 LATIN_1_SUPPLEMENT, 3562 LATIN_EXTENDED_A, 3563 LATIN_EXTENDED_B, 3564 IPA_EXTENSIONS, 3565 SPACING_MODIFIER_LETTERS, 3566 COMBINING_DIACRITICAL_MARKS, 3567 GREEK, 3568 CYRILLIC, 3569 CYRILLIC_SUPPLEMENTARY, 3570 ARMENIAN, 3571 HEBREW, 3572 ARABIC, 3573 SYRIAC, 3574 ARABIC_SUPPLEMENT, 3575 THAANA, 3576 NKO, 3577 SAMARITAN, 3578 MANDAIC, 3579 SYRIAC_SUPPLEMENT, 3580 null, 3581 ARABIC_EXTENDED_A, 3582 DEVANAGARI, 3583 BENGALI, 3584 GURMUKHI, 3585 GUJARATI, 3586 ORIYA, 3587 TAMIL, 3588 TELUGU, 3589 KANNADA, 3590 MALAYALAM, 3591 SINHALA, 3592 THAI, 3593 LAO, 3594 TIBETAN, 3595 MYANMAR, 3596 GEORGIAN, 3597 HANGUL_JAMO, 3598 ETHIOPIC, 3599 ETHIOPIC_SUPPLEMENT, 3600 CHEROKEE, 3601 UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS, 3602 OGHAM, 3603 RUNIC, 3604 TAGALOG, 3605 HANUNOO, 3606 BUHID, 3607 TAGBANWA, 3608 KHMER, 3609 MONGOLIAN, 3610 UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS_EXTENDED, 3611 LIMBU, 3612 TAI_LE, 
3613 NEW_TAI_LUE, 3614 KHMER_SYMBOLS, 3615 BUGINESE, 3616 TAI_THAM, 3617 COMBINING_DIACRITICAL_MARKS_EXTENDED, 3618 BALINESE, 3619 SUNDANESE, 3620 BATAK, 3621 LEPCHA, 3622 OL_CHIKI, 3623 CYRILLIC_EXTENDED_C, 3624 GEORGIAN_EXTENDED, 3625 SUNDANESE_SUPPLEMENT, 3626 VEDIC_EXTENSIONS, 3627 PHONETIC_EXTENSIONS, 3628 PHONETIC_EXTENSIONS_SUPPLEMENT, 3629 COMBINING_DIACRITICAL_MARKS_SUPPLEMENT, 3630 LATIN_EXTENDED_ADDITIONAL, 3631 GREEK_EXTENDED, 3632 GENERAL_PUNCTUATION, 3633 SUPERSCRIPTS_AND_SUBSCRIPTS, 3634 CURRENCY_SYMBOLS, 3635 COMBINING_MARKS_FOR_SYMBOLS, 3636 LETTERLIKE_SYMBOLS, 3637 NUMBER_FORMS, 3638 ARROWS, 3639 MATHEMATICAL_OPERATORS, 3640 MISCELLANEOUS_TECHNICAL, 3641 CONTROL_PICTURES, 3642 OPTICAL_CHARACTER_RECOGNITION, 3643 ENCLOSED_ALPHANUMERICS, 3644 BOX_DRAWING, 3645 BLOCK_ELEMENTS, 3646 GEOMETRIC_SHAPES, 3647 MISCELLANEOUS_SYMBOLS, 3648 DINGBATS, 3649 MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A, 3650 SUPPLEMENTAL_ARROWS_A, 3651 BRAILLE_PATTERNS, 3652 SUPPLEMENTAL_ARROWS_B, 3653 MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B, 3654 SUPPLEMENTAL_MATHEMATICAL_OPERATORS, 3655 MISCELLANEOUS_SYMBOLS_AND_ARROWS, 3656 GLAGOLITIC, 3657 LATIN_EXTENDED_C, 3658 COPTIC, 3659 GEORGIAN_SUPPLEMENT, 3660 TIFINAGH, 3661 ETHIOPIC_EXTENDED, 3662 CYRILLIC_EXTENDED_A, 3663 SUPPLEMENTAL_PUNCTUATION, 3664 CJK_RADICALS_SUPPLEMENT, 3665 KANGXI_RADICALS, 3666 null, 3667 IDEOGRAPHIC_DESCRIPTION_CHARACTERS, 3668 CJK_SYMBOLS_AND_PUNCTUATION, 3669 HIRAGANA, 3670 KATAKANA, 3671 BOPOMOFO, 3672 HANGUL_COMPATIBILITY_JAMO, 3673 KANBUN, 3674 BOPOMOFO_EXTENDED, 3675 CJK_STROKES, 3676 KATAKANA_PHONETIC_EXTENSIONS, 3677 ENCLOSED_CJK_LETTERS_AND_MONTHS, 3678 CJK_COMPATIBILITY, 3679 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A, 3680 YIJING_HEXAGRAM_SYMBOLS, 3681 CJK_UNIFIED_IDEOGRAPHS, 3682 YI_SYLLABLES, 3683 YI_RADICALS, 3684 LISU, 3685 VAI, 3686 CYRILLIC_EXTENDED_B, 3687 BAMUM, 3688 MODIFIER_TONE_LETTERS, 3689 LATIN_EXTENDED_D, 3690 SYLOTI_NAGRI, 3691 COMMON_INDIC_NUMBER_FORMS, 3692 PHAGS_PA, 3693 SAURASHTRA, 3694 
DEVANAGARI_EXTENDED, 3695 KAYAH_LI, 3696 REJANG, 3697 HANGUL_JAMO_EXTENDED_A, 3698 JAVANESE, 3699 MYANMAR_EXTENDED_B, 3700 CHAM, 3701 MYANMAR_EXTENDED_A, 3702 TAI_VIET, 3703 MEETEI_MAYEK_EXTENSIONS, 3704 ETHIOPIC_EXTENDED_A, 3705 LATIN_EXTENDED_E, 3706 CHEROKEE_SUPPLEMENT, 3707 MEETEI_MAYEK, 3708 HANGUL_SYLLABLES, 3709 HANGUL_JAMO_EXTENDED_B, 3710 HIGH_SURROGATES, 3711 HIGH_PRIVATE_USE_SURROGATES, 3712 LOW_SURROGATES, 3713 PRIVATE_USE_AREA, 3714 CJK_COMPATIBILITY_IDEOGRAPHS, 3715 ALPHABETIC_PRESENTATION_FORMS, 3716 ARABIC_PRESENTATION_FORMS_A, 3717 VARIATION_SELECTORS, 3718 VERTICAL_FORMS, 3719 COMBINING_HALF_MARKS, 3720 CJK_COMPATIBILITY_FORMS, 3721 SMALL_FORM_VARIANTS, 3722 ARABIC_PRESENTATION_FORMS_B, 3723 HALFWIDTH_AND_FULLWIDTH_FORMS, 3724 SPECIALS, 3725 LINEAR_B_SYLLABARY, 3726 LINEAR_B_IDEOGRAMS, 3727 AEGEAN_NUMBERS, 3728 ANCIENT_GREEK_NUMBERS, 3729 ANCIENT_SYMBOLS, 3730 PHAISTOS_DISC, 3731 null, 3732 LYCIAN, 3733 CARIAN, 3734 COPTIC_EPACT_NUMBERS, 3735 OLD_ITALIC, 3736 GOTHIC, 3737 OLD_PERMIC, 3738 UGARITIC, 3739 OLD_PERSIAN, 3740 null, 3741 DESERET, 3742 SHAVIAN, 3743 OSMANYA, 3744 OSAGE, 3745 ELBASAN, 3746 CAUCASIAN_ALBANIAN, 3747 null, 3748 LINEAR_A, 3749 null, 3750 CYPRIOT_SYLLABARY, 3751 IMPERIAL_ARAMAIC, 3752 PALMYRENE, 3753 NABATAEAN, 3754 null, 3755 HATRAN, 3756 PHOENICIAN, 3757 LYDIAN, 3758 null, 3759 MEROITIC_HIEROGLYPHS, 3760 MEROITIC_CURSIVE, 3761 KHAROSHTHI, 3762 OLD_SOUTH_ARABIAN, 3763 OLD_NORTH_ARABIAN, 3764 null, 3765 MANICHAEAN, 3766 AVESTAN, 3767 INSCRIPTIONAL_PARTHIAN, 3768 INSCRIPTIONAL_PAHLAVI, 3769 PSALTER_PAHLAVI, 3770 null, 3771 OLD_TURKIC, 3772 null, 3773 OLD_HUNGARIAN, 3774 HANIFI_ROHINGYA, 3775 null, 3776 RUMI_NUMERAL_SYMBOLS, 3777 null, 3778 OLD_SOGDIAN, 3779 SOGDIAN, 3780 null, 3781 BRAHMI, 3782 KAITHI, 3783 SORA_SOMPENG, 3784 CHAKMA, 3785 MAHAJANI, 3786 SHARADA, 3787 SINHALA_ARCHAIC_NUMBERS, 3788 KHOJKI, 3789 null, 3790 MULTANI, 3791 KHUDAWADI, 3792 GRANTHA, 3793 null, 3794 NEWA, 3795 TIRHUTA, 3796 null, 3797 SIDDHAM, 3798 
MODI, 3799 MONGOLIAN_SUPPLEMENT, 3800 TAKRI, 3801 null, 3802 AHOM, 3803 null, 3804 DOGRA, 3805 null, 3806 WARANG_CITI, 3807 null, 3808 ZANABAZAR_SQUARE, 3809 SOYOMBO, 3810 null, 3811 PAU_CIN_HAU, 3812 null, 3813 BHAIKSUKI, 3814 MARCHEN, 3815 null, 3816 MASARAM_GONDI, 3817 GUNJALA_GONDI, 3818 null, 3819 MAKASAR, 3820 null, 3821 CUNEIFORM, 3822 CUNEIFORM_NUMBERS_AND_PUNCTUATION, 3823 EARLY_DYNASTIC_CUNEIFORM, 3824 null, 3825 EGYPTIAN_HIEROGLYPHS, 3826 null, 3827 ANATOLIAN_HIEROGLYPHS, 3828 null, 3829 BAMUM_SUPPLEMENT, 3830 MRO, 3831 null, 3832 BASSA_VAH, 3833 PAHAWH_HMONG, 3834 null, 3835 MEDEFAIDRIN, 3836 null, 3837 MIAO, 3838 null, 3839 IDEOGRAPHIC_SYMBOLS_AND_PUNCTUATION, 3840 TANGUT, 3841 TANGUT_COMPONENTS, 3842 null, 3843 KANA_SUPPLEMENT, 3844 KANA_EXTENDED_A, 3845 null, 3846 NUSHU, 3847 null, 3848 DUPLOYAN, 3849 SHORTHAND_FORMAT_CONTROLS, 3850 null, 3851 BYZANTINE_MUSICAL_SYMBOLS, 3852 MUSICAL_SYMBOLS, 3853 ANCIENT_GREEK_MUSICAL_NOTATION, 3854 null, 3855 MAYAN_NUMERALS, 3856 TAI_XUAN_JING_SYMBOLS, 3857 COUNTING_ROD_NUMERALS, 3858 null, 3859 MATHEMATICAL_ALPHANUMERIC_SYMBOLS, 3860 SUTTON_SIGNWRITING, 3861 null, 3862 GLAGOLITIC_SUPPLEMENT, 3863 null, 3864 MENDE_KIKAKUI, 3865 null, 3866 ADLAM, 3867 null, 3868 INDIC_SIYAQ_NUMBERS, 3869 null, 3870 ARABIC_MATHEMATICAL_ALPHABETIC_SYMBOLS, 3871 null, 3872 MAHJONG_TILES, 3873 DOMINO_TILES, 3874 PLAYING_CARDS, 3875 ENCLOSED_ALPHANUMERIC_SUPPLEMENT, 3876 ENCLOSED_IDEOGRAPHIC_SUPPLEMENT, 3877 MISCELLANEOUS_SYMBOLS_AND_PICTOGRAPHS, 3878 EMOTICONS, 3879 ORNAMENTAL_DINGBATS, 3880 TRANSPORT_AND_MAP_SYMBOLS, 3881 ALCHEMICAL_SYMBOLS, 3882 GEOMETRIC_SHAPES_EXTENDED, 3883 SUPPLEMENTAL_ARROWS_C, 3884 SUPPLEMENTAL_SYMBOLS_AND_PICTOGRAPHS, 3885 CHESS_SYMBOLS, 3886 null, 3887 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B, 3888 null, 3889 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C, 3890 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D, 3891 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_E, 3892 CJK_UNIFIED_IDEOGRAPHS_EXTENSION_F, 3893 null, 3894 
            CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT,
            null,
            TAGS,
            null,
            VARIATION_SELECTORS_SUPPLEMENT,
            null,
            SUPPLEMENTARY_PRIVATE_USE_AREA_A,
            SUPPLEMENTARY_PRIVATE_USE_AREA_B
        };


        /**
         * Returns the object representing the Unicode block containing the
         * given character, or {@code null} if the character is not a
         * member of a defined block.
         *
         * <p><b>Note:</b> This method cannot handle
         * <a href="Character.html#supplementary"> supplementary
         * characters</a>. To support all Unicode characters, including
         * supplementary characters, use the {@link #of(int)} method.
         *
         * @param c The character in question
         * @return The {@code UnicodeBlock} instance representing the
         *         Unicode block of which this character is a member, or
         *         {@code null} if the character is not a member of any
         *         Unicode block
         */
        public static UnicodeBlock of(char c) {
            return of((int)c);
        }

        /**
         * Returns the object representing the Unicode block
         * containing the given character (Unicode code point), or
         * {@code null} if the character is not a member of a
         * defined block.
         *
         * @param codePoint the character (Unicode code point) in question.
         * @return The {@code UnicodeBlock} instance representing the
         *         Unicode block of which this character is a member, or
         *         {@code null} if the character is not a member of any
         *         Unicode block
         * @throws IllegalArgumentException if the specified
         * {@code codePoint} is an invalid Unicode code point.
         * @see Character#isValidCodePoint(int)
         * @since 1.5
         */
        public static UnicodeBlock of(int codePoint) {
            if (!isValidCodePoint(codePoint)) {
                throw new IllegalArgumentException(
                    String.format("Not a valid Unicode code point: 0x%X", codePoint));
            }

            int top, bottom, current;
            bottom = 0;
            top = blockStarts.length;
            current = top/2;

            // Binary search for the last entry of blockStarts that is <= codePoint.
            // invariant: top > current >= bottom && codePoint >= blockStarts[bottom]
            while (top - bottom > 1) {
                if (codePoint >= blockStarts[current]) {
                    bottom = current;
                } else {
                    top = current;
                }
                current = (top + bottom) / 2;
            }
            // blocks[] parallels blockStarts[]; unassigned ranges map to null.
            return blocks[current];
        }

        /**
         * Returns the UnicodeBlock with the given name. Block
         * names are determined by The Unicode Standard. The file
         * {@code Blocks-<version>.txt} defines blocks for a particular
         * version of the standard. The {@link Character} class specifies
         * the version of the standard that it supports.
         * <p>
         * This method accepts block names in the following forms:
         * <ol>
         * <li> Canonical block names as defined by the Unicode Standard.
         * For example, the standard defines a "Basic Latin" block. Therefore, this
         * method accepts "Basic Latin" as a valid block name. The documentation of
         * each UnicodeBlock provides the canonical name.
         * <li>Canonical block names with all spaces removed. For example, "BasicLatin"
         * is a valid block name for the "Basic Latin" block.
         * <li>The text representation of each constant UnicodeBlock identifier.
         * For example, this method will return the {@link #BASIC_LATIN} block if
         * provided with the "BASIC_LATIN" name. This form replaces all spaces and
         * hyphens in the canonical name with underscores.
         * </ol>
         * Finally, character case is ignored for all of the valid block name forms.
         * For example, "BASIC_LATIN" and "basic_latin" are both valid block names.
         * The en_US locale's case mapping rules are used to provide case-insensitive
         * string comparisons for block name validation.
         * <p>
         * If the Unicode Standard changes block names, both the previous and
         * current names will be accepted.
         *
         * @param blockName A {@code UnicodeBlock} name.
         * @return The {@code UnicodeBlock} instance identified
         *         by {@code blockName}
         * @throws IllegalArgumentException if {@code blockName} is an
         *         invalid name
         * @throws NullPointerException if {@code blockName} is null
         * @since 1.5
         */
        public static final UnicodeBlock forName(String blockName) {
            UnicodeBlock block = map.get(blockName.toUpperCase(Locale.US));
            if (block == null) {
                throw new IllegalArgumentException("Not a valid block name: "
                            + blockName);
            }
            return block;
        }
    }


    /**
     * A family of character subsets representing the character scripts
     * defined in the <a href="http://www.unicode.org/reports/tr24/">
     * <i>Unicode Standard Annex #24: Script Names</i></a>. Every Unicode
     * character is assigned to a single Unicode script, either a specific
     * script, such as {@link Character.UnicodeScript#LATIN Latin}, or
     * one of the following three special values,
     * {@link Character.UnicodeScript#INHERITED Inherited},
     * {@link Character.UnicodeScript#COMMON Common} or
     * {@link Character.UnicodeScript#UNKNOWN Unknown}.
     *
     * @since 1.7
     */
    public static enum UnicodeScript {
        /**
         * Unicode script "Common".
         */
        COMMON,

        /**
         * Unicode script "Latin".
         */
        LATIN,

        /**
         * Unicode script "Greek".
         */
        GREEK,

        /**
         * Unicode script "Cyrillic".
         */
        CYRILLIC,

        /**
         * Unicode script "Armenian".
4047 */ 4048 ARMENIAN, 4049 4050 /** 4051 * Unicode script "Hebrew". 4052 */ 4053 HEBREW, 4054 4055 /** 4056 * Unicode script "Arabic". 4057 */ 4058 ARABIC, 4059 4060 /** 4061 * Unicode script "Syriac". 4062 */ 4063 SYRIAC, 4064 4065 /** 4066 * Unicode script "Thaana". 4067 */ 4068 THAANA, 4069 4070 /** 4071 * Unicode script "Devanagari". 4072 */ 4073 DEVANAGARI, 4074 4075 /** 4076 * Unicode script "Bengali". 4077 */ 4078 BENGALI, 4079 4080 /** 4081 * Unicode script "Gurmukhi". 4082 */ 4083 GURMUKHI, 4084 4085 /** 4086 * Unicode script "Gujarati". 4087 */ 4088 GUJARATI, 4089 4090 /** 4091 * Unicode script "Oriya". 4092 */ 4093 ORIYA, 4094 4095 /** 4096 * Unicode script "Tamil". 4097 */ 4098 TAMIL, 4099 4100 /** 4101 * Unicode script "Telugu". 4102 */ 4103 TELUGU, 4104 4105 /** 4106 * Unicode script "Kannada". 4107 */ 4108 KANNADA, 4109 4110 /** 4111 * Unicode script "Malayalam". 4112 */ 4113 MALAYALAM, 4114 4115 /** 4116 * Unicode script "Sinhala". 4117 */ 4118 SINHALA, 4119 4120 /** 4121 * Unicode script "Thai". 4122 */ 4123 THAI, 4124 4125 /** 4126 * Unicode script "Lao". 4127 */ 4128 LAO, 4129 4130 /** 4131 * Unicode script "Tibetan". 4132 */ 4133 TIBETAN, 4134 4135 /** 4136 * Unicode script "Myanmar". 4137 */ 4138 MYANMAR, 4139 4140 /** 4141 * Unicode script "Georgian". 4142 */ 4143 GEORGIAN, 4144 4145 /** 4146 * Unicode script "Hangul". 4147 */ 4148 HANGUL, 4149 4150 /** 4151 * Unicode script "Ethiopic". 4152 */ 4153 ETHIOPIC, 4154 4155 /** 4156 * Unicode script "Cherokee". 4157 */ 4158 CHEROKEE, 4159 4160 /** 4161 * Unicode script "Canadian_Aboriginal". 4162 */ 4163 CANADIAN_ABORIGINAL, 4164 4165 /** 4166 * Unicode script "Ogham". 4167 */ 4168 OGHAM, 4169 4170 /** 4171 * Unicode script "Runic". 4172 */ 4173 RUNIC, 4174 4175 /** 4176 * Unicode script "Khmer". 4177 */ 4178 KHMER, 4179 4180 /** 4181 * Unicode script "Mongolian". 4182 */ 4183 MONGOLIAN, 4184 4185 /** 4186 * Unicode script "Hiragana". 
4187 */ 4188 HIRAGANA, 4189 4190 /** 4191 * Unicode script "Katakana". 4192 */ 4193 KATAKANA, 4194 4195 /** 4196 * Unicode script "Bopomofo". 4197 */ 4198 BOPOMOFO, 4199 4200 /** 4201 * Unicode script "Han". 4202 */ 4203 HAN, 4204 4205 /** 4206 * Unicode script "Yi". 4207 */ 4208 YI, 4209 4210 /** 4211 * Unicode script "Old_Italic". 4212 */ 4213 OLD_ITALIC, 4214 4215 /** 4216 * Unicode script "Gothic". 4217 */ 4218 GOTHIC, 4219 4220 /** 4221 * Unicode script "Deseret". 4222 */ 4223 DESERET, 4224 4225 /** 4226 * Unicode script "Inherited". 4227 */ 4228 INHERITED, 4229 4230 /** 4231 * Unicode script "Tagalog". 4232 */ 4233 TAGALOG, 4234 4235 /** 4236 * Unicode script "Hanunoo". 4237 */ 4238 HANUNOO, 4239 4240 /** 4241 * Unicode script "Buhid". 4242 */ 4243 BUHID, 4244 4245 /** 4246 * Unicode script "Tagbanwa". 4247 */ 4248 TAGBANWA, 4249 4250 /** 4251 * Unicode script "Limbu". 4252 */ 4253 LIMBU, 4254 4255 /** 4256 * Unicode script "Tai_Le". 4257 */ 4258 TAI_LE, 4259 4260 /** 4261 * Unicode script "Linear_B". 4262 */ 4263 LINEAR_B, 4264 4265 /** 4266 * Unicode script "Ugaritic". 4267 */ 4268 UGARITIC, 4269 4270 /** 4271 * Unicode script "Shavian". 4272 */ 4273 SHAVIAN, 4274 4275 /** 4276 * Unicode script "Osmanya". 4277 */ 4278 OSMANYA, 4279 4280 /** 4281 * Unicode script "Cypriot". 4282 */ 4283 CYPRIOT, 4284 4285 /** 4286 * Unicode script "Braille". 4287 */ 4288 BRAILLE, 4289 4290 /** 4291 * Unicode script "Buginese". 4292 */ 4293 BUGINESE, 4294 4295 /** 4296 * Unicode script "Coptic". 4297 */ 4298 COPTIC, 4299 4300 /** 4301 * Unicode script "New_Tai_Lue". 4302 */ 4303 NEW_TAI_LUE, 4304 4305 /** 4306 * Unicode script "Glagolitic". 4307 */ 4308 GLAGOLITIC, 4309 4310 /** 4311 * Unicode script "Tifinagh". 4312 */ 4313 TIFINAGH, 4314 4315 /** 4316 * Unicode script "Syloti_Nagri". 4317 */ 4318 SYLOTI_NAGRI, 4319 4320 /** 4321 * Unicode script "Old_Persian". 4322 */ 4323 OLD_PERSIAN, 4324 4325 /** 4326 * Unicode script "Kharoshthi". 
4327 */ 4328 KHAROSHTHI, 4329 4330 /** 4331 * Unicode script "Balinese". 4332 */ 4333 BALINESE, 4334 4335 /** 4336 * Unicode script "Cuneiform". 4337 */ 4338 CUNEIFORM, 4339 4340 /** 4341 * Unicode script "Phoenician". 4342 */ 4343 PHOENICIAN, 4344 4345 /** 4346 * Unicode script "Phags_Pa". 4347 */ 4348 PHAGS_PA, 4349 4350 /** 4351 * Unicode script "Nko". 4352 */ 4353 NKO, 4354 4355 /** 4356 * Unicode script "Sundanese". 4357 */ 4358 SUNDANESE, 4359 4360 /** 4361 * Unicode script "Batak". 4362 */ 4363 BATAK, 4364 4365 /** 4366 * Unicode script "Lepcha". 4367 */ 4368 LEPCHA, 4369 4370 /** 4371 * Unicode script "Ol_Chiki". 4372 */ 4373 OL_CHIKI, 4374 4375 /** 4376 * Unicode script "Vai". 4377 */ 4378 VAI, 4379 4380 /** 4381 * Unicode script "Saurashtra". 4382 */ 4383 SAURASHTRA, 4384 4385 /** 4386 * Unicode script "Kayah_Li". 4387 */ 4388 KAYAH_LI, 4389 4390 /** 4391 * Unicode script "Rejang". 4392 */ 4393 REJANG, 4394 4395 /** 4396 * Unicode script "Lycian". 4397 */ 4398 LYCIAN, 4399 4400 /** 4401 * Unicode script "Carian". 4402 */ 4403 CARIAN, 4404 4405 /** 4406 * Unicode script "Lydian". 4407 */ 4408 LYDIAN, 4409 4410 /** 4411 * Unicode script "Cham". 4412 */ 4413 CHAM, 4414 4415 /** 4416 * Unicode script "Tai_Tham". 4417 */ 4418 TAI_THAM, 4419 4420 /** 4421 * Unicode script "Tai_Viet". 4422 */ 4423 TAI_VIET, 4424 4425 /** 4426 * Unicode script "Avestan". 4427 */ 4428 AVESTAN, 4429 4430 /** 4431 * Unicode script "Egyptian_Hieroglyphs". 4432 */ 4433 EGYPTIAN_HIEROGLYPHS, 4434 4435 /** 4436 * Unicode script "Samaritan". 4437 */ 4438 SAMARITAN, 4439 4440 /** 4441 * Unicode script "Mandaic". 4442 */ 4443 MANDAIC, 4444 4445 /** 4446 * Unicode script "Lisu". 4447 */ 4448 LISU, 4449 4450 /** 4451 * Unicode script "Bamum". 4452 */ 4453 BAMUM, 4454 4455 /** 4456 * Unicode script "Javanese". 4457 */ 4458 JAVANESE, 4459 4460 /** 4461 * Unicode script "Meetei_Mayek". 4462 */ 4463 MEETEI_MAYEK, 4464 4465 /** 4466 * Unicode script "Imperial_Aramaic". 
4467 */ 4468 IMPERIAL_ARAMAIC, 4469 4470 /** 4471 * Unicode script "Old_South_Arabian". 4472 */ 4473 OLD_SOUTH_ARABIAN, 4474 4475 /** 4476 * Unicode script "Inscriptional_Parthian". 4477 */ 4478 INSCRIPTIONAL_PARTHIAN, 4479 4480 /** 4481 * Unicode script "Inscriptional_Pahlavi". 4482 */ 4483 INSCRIPTIONAL_PAHLAVI, 4484 4485 /** 4486 * Unicode script "Old_Turkic". 4487 */ 4488 OLD_TURKIC, 4489 4490 /** 4491 * Unicode script "Brahmi". 4492 */ 4493 BRAHMI, 4494 4495 /** 4496 * Unicode script "Kaithi". 4497 */ 4498 KAITHI, 4499 4500 /** 4501 * Unicode script "Meroitic Hieroglyphs". 4502 * @since 1.8 4503 */ 4504 MEROITIC_HIEROGLYPHS, 4505 4506 /** 4507 * Unicode script "Meroitic Cursive". 4508 * @since 1.8 4509 */ 4510 MEROITIC_CURSIVE, 4511 4512 /** 4513 * Unicode script "Sora Sompeng". 4514 * @since 1.8 4515 */ 4516 SORA_SOMPENG, 4517 4518 /** 4519 * Unicode script "Chakma". 4520 * @since 1.8 4521 */ 4522 CHAKMA, 4523 4524 /** 4525 * Unicode script "Sharada". 4526 * @since 1.8 4527 */ 4528 SHARADA, 4529 4530 /** 4531 * Unicode script "Takri". 4532 * @since 1.8 4533 */ 4534 TAKRI, 4535 4536 /** 4537 * Unicode script "Miao". 4538 * @since 1.8 4539 */ 4540 MIAO, 4541 4542 /** 4543 * Unicode script "Caucasian Albanian". 4544 * @since 9 4545 */ 4546 CAUCASIAN_ALBANIAN, 4547 4548 /** 4549 * Unicode script "Bassa Vah". 4550 * @since 9 4551 */ 4552 BASSA_VAH, 4553 4554 /** 4555 * Unicode script "Duployan". 4556 * @since 9 4557 */ 4558 DUPLOYAN, 4559 4560 /** 4561 * Unicode script "Elbasan". 4562 * @since 9 4563 */ 4564 ELBASAN, 4565 4566 /** 4567 * Unicode script "Grantha". 4568 * @since 9 4569 */ 4570 GRANTHA, 4571 4572 /** 4573 * Unicode script "Pahawh Hmong". 4574 * @since 9 4575 */ 4576 PAHAWH_HMONG, 4577 4578 /** 4579 * Unicode script "Khojki". 4580 * @since 9 4581 */ 4582 KHOJKI, 4583 4584 /** 4585 * Unicode script "Linear A". 4586 * @since 9 4587 */ 4588 LINEAR_A, 4589 4590 /** 4591 * Unicode script "Mahajani". 
4592 * @since 9 4593 */ 4594 MAHAJANI, 4595 4596 /** 4597 * Unicode script "Manichaean". 4598 * @since 9 4599 */ 4600 MANICHAEAN, 4601 4602 /** 4603 * Unicode script "Mende Kikakui". 4604 * @since 9 4605 */ 4606 MENDE_KIKAKUI, 4607 4608 /** 4609 * Unicode script "Modi". 4610 * @since 9 4611 */ 4612 MODI, 4613 4614 /** 4615 * Unicode script "Mro". 4616 * @since 9 4617 */ 4618 MRO, 4619 4620 /** 4621 * Unicode script "Old North Arabian". 4622 * @since 9 4623 */ 4624 OLD_NORTH_ARABIAN, 4625 4626 /** 4627 * Unicode script "Nabataean". 4628 * @since 9 4629 */ 4630 NABATAEAN, 4631 4632 /** 4633 * Unicode script "Palmyrene". 4634 * @since 9 4635 */ 4636 PALMYRENE, 4637 4638 /** 4639 * Unicode script "Pau Cin Hau". 4640 * @since 9 4641 */ 4642 PAU_CIN_HAU, 4643 4644 /** 4645 * Unicode script "Old Permic". 4646 * @since 9 4647 */ 4648 OLD_PERMIC, 4649 4650 /** 4651 * Unicode script "Psalter Pahlavi". 4652 * @since 9 4653 */ 4654 PSALTER_PAHLAVI, 4655 4656 /** 4657 * Unicode script "Siddham". 4658 * @since 9 4659 */ 4660 SIDDHAM, 4661 4662 /** 4663 * Unicode script "Khudawadi". 4664 * @since 9 4665 */ 4666 KHUDAWADI, 4667 4668 /** 4669 * Unicode script "Tirhuta". 4670 * @since 9 4671 */ 4672 TIRHUTA, 4673 4674 /** 4675 * Unicode script "Warang Citi". 4676 * @since 9 4677 */ 4678 WARANG_CITI, 4679 4680 /** 4681 * Unicode script "Ahom". 4682 * @since 9 4683 */ 4684 AHOM, 4685 4686 /** 4687 * Unicode script "Anatolian Hieroglyphs". 4688 * @since 9 4689 */ 4690 ANATOLIAN_HIEROGLYPHS, 4691 4692 /** 4693 * Unicode script "Hatran". 4694 * @since 9 4695 */ 4696 HATRAN, 4697 4698 /** 4699 * Unicode script "Multani". 4700 * @since 9 4701 */ 4702 MULTANI, 4703 4704 /** 4705 * Unicode script "Old Hungarian". 4706 * @since 9 4707 */ 4708 OLD_HUNGARIAN, 4709 4710 /** 4711 * Unicode script "SignWriting". 4712 * @since 9 4713 */ 4714 SIGNWRITING, 4715 4716 /** 4717 * Unicode script "Adlam". 4718 * @since 11 4719 */ 4720 ADLAM, 4721 4722 /** 4723 * Unicode script "Bhaiksuki". 
         * @since 11
         */
        BHAIKSUKI,

        /**
         * Unicode script "Marchen".
         * @since 11
         */
        MARCHEN,

        /**
         * Unicode script "Newa".
         * @since 11
         */
        NEWA,

        /**
         * Unicode script "Osage".
         * @since 11
         */
        OSAGE,

        /**
         * Unicode script "Tangut".
         * @since 11
         */
        TANGUT,

        /**
         * Unicode script "Masaram Gondi".
         * @since 11
         */
        MASARAM_GONDI,

        /**
         * Unicode script "Nushu".
         * @since 11
         */
        NUSHU,

        /**
         * Unicode script "Soyombo".
         * @since 11
         */
        SOYOMBO,

        /**
         * Unicode script "Zanabazar Square".
         * @since 11
         */
        ZANABAZAR_SQUARE,

        /**
         * Unicode script "Hanifi Rohingya".
         * @since 12
         */
        HANIFI_ROHINGYA,

        /**
         * Unicode script "Old Sogdian".
         * @since 12
         */
        OLD_SOGDIAN,

        /**
         * Unicode script "Sogdian".
         * @since 12
         */
        SOGDIAN,

        /**
         * Unicode script "Dogra".
         * @since 12
         */
        DOGRA,

        /**
         * Unicode script "Gunjala Gondi".
         * @since 12
         */
        GUNJALA_GONDI,

        /**
         * Unicode script "Makasar".
         * @since 12
         */
        MAKASAR,

        /**
         * Unicode script "Medefaidrin".
         * @since 12
         */
        MEDEFAIDRIN,

        /**
         * Unicode script "Unknown".
         */
        UNKNOWN;

        private static final int[] scriptStarts = {
            0x0000,   // 0000..0040; COMMON
            0x0041,   // 0041..005A; LATIN
            0x005B,   // 005B..0060; COMMON
            0x0061,   // 0061..007A; LATIN
            0x007B,   // 007B..00A9; COMMON
            0x00AA,   // 00AA      ; LATIN
            0x00AB,   // 00AB..00B9; COMMON
            0x00BA,   // 00BA      ; LATIN
            0x00BB,   // 00BB..00BF; COMMON
            0x00C0,   // 00C0..00D6; LATIN
            0x00D7,   // 00D7      ; COMMON
            0x00D8,   // 00D8..00F6; LATIN
            0x00F7,   // 00F7      ; COMMON
            0x00F8,   // 00F8..02B8; LATIN
            0x02B9,   // 02B9..02DF; COMMON
            0x02E0,   // 02E0..02E4; LATIN
            0x02E5,   // 02E5..02E9; COMMON
            0x02EA,   // 02EA..02EB; BOPOMOFO
            0x02EC,   // 02EC..02FF; COMMON
            0x0300,   // 0300..036F; INHERITED
            0x0370,   // 0370..0373; GREEK
            0x0374,   // 0374      ; COMMON
            0x0375,   // 0375..0377; GREEK
            0x0378,   // 0378..0379; UNKNOWN
            0x037A,   // 037A..037D; GREEK
            0x037E,   // 037E      ; COMMON
            0x037F,   // 037F      ; GREEK
            0x0380,   // 0380..0383; UNKNOWN
            0x0384,   // 0384      ; GREEK
            0x0385,   // 0385      ; COMMON
            0x0386,   // 0386      ; GREEK
            0x0387,   // 0387      ; COMMON
            0x0388,   // 0388..038A; GREEK
            0x038B,   // 038B      ; UNKNOWN
            0x038C,   // 038C      ; GREEK
            0x038D,   // 038D      ; UNKNOWN
            0x038E,   // 038E..03A1; GREEK
            0x03A2,   // 03A2      ; UNKNOWN
            0x03A3,   // 03A3..03E1; GREEK
            0x03E2,   // 03E2..03EF; COPTIC
            0x03F0,   // 03F0..03FF; GREEK
            0x0400,   // 0400..0484; CYRILLIC
            0x0485,   // 0485..0486; INHERITED
            0x0487,   // 0487..052F; CYRILLIC
            0x0530,   // 0530      ; UNKNOWN
            0x0531,   // 0531..0556; ARMENIAN
            0x0557,   // 0557..0558; UNKNOWN
            0x0559,   // 0559..0588; ARMENIAN
            0x0589,   // 0589      ; COMMON
            0x058A,   // 058A      ; ARMENIAN
            0x058B,   // 058B..058C; UNKNOWN
            0x058D,   // 058D..058F; ARMENIAN
            0x0590,   // 0590      ; UNKNOWN
            0x0591,   // 0591..05C7; HEBREW
            0x05C8,   // 05C8..05CF; UNKNOWN
            0x05D0,   // 05D0..05EA; HEBREW
            0x05EB,   // 05EB..05EE; UNKNOWN
            0x05EF,   // 05EF..05F4; HEBREW
            0x05F5,   // 05F5..05FF; UNKNOWN
            0x0600,   // 0600..0604; ARABIC
            0x0605,   // 0605      ; COMMON
            0x0606,   // 0606..060B; ARABIC
            0x060C,   // 060C      ; COMMON
            0x060D,   // 060D..061A; ARABIC
            0x061B,   // 061B      ; COMMON
            0x061C,   // 061C      ; ARABIC
            0x061D,   // 061D      ; UNKNOWN
            0x061E,   // 061E      ; ARABIC
            0x061F,   // 061F      ; COMMON
            0x0620,   // 0620..063F; ARABIC
            0x0640,   // 0640      ; COMMON
            0x0641,   // 0641..064A; ARABIC
            0x064B,   // 064B..0655; INHERITED
            0x0656,   // 0656..066F; ARABIC
            0x0670,   // 0670      ; INHERITED
            0x0671,   // 0671..06DC; ARABIC
            0x06DD,   // 06DD      ; COMMON
            0x06DE,   // 06DE..06FF; ARABIC
            0x0700,   // 0700..070D; SYRIAC
            0x070E,   // 070E      ; UNKNOWN
            0x070F,   // 070F..074A; SYRIAC
            0x074B,   // 074B..074C; UNKNOWN
            0x074D,   // 074D..074F; SYRIAC
            0x0750,   // 0750..077F; ARABIC
            0x0780,   // 0780..07B1; THAANA
            0x07B2,   // 07B2..07BF; UNKNOWN
            0x07C0,   // 07C0..07FA; NKO
            0x07FB,   // 07FB..07FC; UNKNOWN
            0x07FD,   // 07FD..07FF; NKO
            0x0800,   // 0800..082D; SAMARITAN
            0x082E,   // 082E..082F; UNKNOWN
            0x0830,   // 0830..083E; SAMARITAN
            0x083F,   // 083F      ; UNKNOWN
            0x0840,   // 0840..085B; MANDAIC
            0x085C,   // 085C..085D; UNKNOWN
            0x085E,   // 085E      ; MANDAIC
            0x085F,   // 085F      ; UNKNOWN
            0x0860,   // 0860..086A; SYRIAC
            0x086B,   // 086B..089F; UNKNOWN
            0x08A0,   // 08A0..08B4; ARABIC
            0x08B5,   // 08B5      ; UNKNOWN
            0x08B6,   // 08B6..08BD; ARABIC
            0x08BE,   // 08BE..08D2; UNKNOWN
            0x08D3,   // 08D3..08E1; ARABIC
            0x08E2,   // 08E2      ; COMMON
            0x08E3,   // 08E3..08FF; ARABIC
            0x0900,   // 0900..0950; DEVANAGARI
            0x0951,   // 0951..0952; INHERITED
            0x0953,   // 0953..0963; DEVANAGARI
            0x0964,   // 0964..0965; COMMON
            0x0966,   // 0966..097F; DEVANAGARI
            0x0980,   // 0980..0983; BENGALI
            0x0984,   // 0984      ; UNKNOWN
            0x0985,   // 0985..098C; BENGALI
            0x098D,   // 098D..098E; UNKNOWN
            0x098F,   // 098F..0990; BENGALI
            0x0991,   // 0991..0992; UNKNOWN
            0x0993,   // 0993..09A8; BENGALI
            0x09A9,   // 09A9      ; UNKNOWN
            0x09AA,   // 09AA..09B0; BENGALI
            0x09B1,   // 09B1      ; UNKNOWN
            0x09B2,   // 09B2      ; BENGALI
            0x09B3,   // 09B3..09B5; UNKNOWN
            0x09B6,   // 09B6..09B9; BENGALI
            0x09BA,   // 09BA..09BB; UNKNOWN
            0x09BC,   // 09BC..09C4; BENGALI
            0x09C5,   // 09C5..09C6; UNKNOWN
            0x09C7,   // 09C7..09C8; BENGALI
            0x09C9,   // 09C9..09CA; UNKNOWN
            0x09CB,   // 09CB..09CE; BENGALI
            0x09CF,   // 09CF..09D6; UNKNOWN
            0x09D7,   // 09D7      ; BENGALI
            0x09D8,   // 09D8..09DB; UNKNOWN
            0x09DC,   // 09DC..09DD; BENGALI
            0x09DE,   // 09DE      ; UNKNOWN
            0x09DF,   // 09DF..09E3; BENGALI
            0x09E4,   // 09E4..09E5; UNKNOWN
            0x09E6,   // 09E6..09FE; BENGALI
            0x09FF,   // 09FF..0A00; UNKNOWN
            0x0A01,   // 0A01..0A03; GURMUKHI
            0x0A04,   // 0A04      ; UNKNOWN
            0x0A05,   // 0A05..0A0A; GURMUKHI
            0x0A0B,   // 0A0B..0A0E; UNKNOWN
            0x0A0F,   // 0A0F..0A10; GURMUKHI
            0x0A11,   // 0A11..0A12; UNKNOWN
            0x0A13,   // 0A13..0A28; GURMUKHI
            0x0A29,   // 0A29      ; UNKNOWN
            0x0A2A,   // 0A2A..0A30; GURMUKHI
            0x0A31,   // 0A31      ; UNKNOWN
            0x0A32,   // 0A32..0A33; GURMUKHI
            0x0A34,   // 0A34      ; UNKNOWN
            0x0A35,   // 0A35..0A36; GURMUKHI
            0x0A37,   // 0A37      ; UNKNOWN
            0x0A38,   // 0A38..0A39; GURMUKHI
            0x0A3A,   // 0A3A..0A3B; UNKNOWN
            0x0A3C,   // 0A3C      ; GURMUKHI
            0x0A3D,   // 0A3D      ; UNKNOWN
            0x0A3E,   // 0A3E..0A42; GURMUKHI
            0x0A43,   // 0A43..0A46; UNKNOWN
            0x0A47,   // 0A47..0A48; GURMUKHI
            0x0A49,   // 0A49..0A4A; UNKNOWN
            0x0A4B,   // 0A4B..0A4D; GURMUKHI
            0x0A4E,   // 0A4E..0A50; UNKNOWN
            0x0A51,   // 0A51      ; GURMUKHI
            0x0A52,   // 0A52..0A58; UNKNOWN
            0x0A59,   // 0A59..0A5C; GURMUKHI
            0x0A5D,   // 0A5D      ; UNKNOWN
            0x0A5E,   // 0A5E      ; GURMUKHI
            0x0A5F,   // 0A5F..0A65; UNKNOWN
            0x0A66,   // 0A66..0A76; GURMUKHI
            0x0A77,   // 0A77..0A80; UNKNOWN
            0x0A81,   // 0A81..0A83; GUJARATI
            0x0A84,   // 0A84      ; UNKNOWN
            0x0A85,   // 0A85..0A8D; GUJARATI
            0x0A8E,   // 0A8E      ; UNKNOWN
            0x0A8F,   // 0A8F..0A91; GUJARATI
            0x0A92,   // 0A92      ; UNKNOWN
            0x0A93,   // 0A93..0AA8; GUJARATI
            0x0AA9,   // 0AA9      ; UNKNOWN
            0x0AAA,   // 0AAA..0AB0; GUJARATI
            0x0AB1,   // 0AB1      ; UNKNOWN
            0x0AB2,   // 0AB2..0AB3; GUJARATI
            0x0AB4,   // 0AB4      ; UNKNOWN
            0x0AB5,   // 0AB5..0AB9; GUJARATI
            0x0ABA,   // 0ABA..0ABB; UNKNOWN
            0x0ABC,   // 0ABC..0AC5; GUJARATI
            0x0AC6,   // 0AC6      ; UNKNOWN
            0x0AC7,   // 0AC7..0AC9; GUJARATI
            0x0ACA,   // 0ACA      ; UNKNOWN
            0x0ACB,   // 0ACB..0ACD; GUJARATI
            0x0ACE,   // 0ACE..0ACF; UNKNOWN
            0x0AD0,   // 0AD0      ; GUJARATI
            0x0AD1,   // 0AD1..0ADF; UNKNOWN
            0x0AE0,   // 0AE0..0AE3; GUJARATI
            0x0AE4,   // 0AE4..0AE5; UNKNOWN
            0x0AE6,   // 0AE6..0AF1; GUJARATI
            0x0AF2,   // 0AF2..0AF8; UNKNOWN
            0x0AF9,   // 0AF9..0AFF; GUJARATI
            0x0B00,   // 0B00      ; UNKNOWN
            0x0B01,   // 0B01..0B03; ORIYA
            0x0B04,   // 0B04      ; UNKNOWN
            0x0B05,   // 0B05..0B0C; ORIYA
            0x0B0D,   // 0B0D..0B0E; UNKNOWN
            0x0B0F,   // 0B0F..0B10; ORIYA
            0x0B11,   // 0B11..0B12; UNKNOWN
            0x0B13,   // 0B13..0B28; ORIYA
            0x0B29,   // 0B29      ; UNKNOWN
            0x0B2A,   // 0B2A..0B30; ORIYA
            0x0B31,   // 0B31      ; UNKNOWN
            0x0B32,   // 0B32..0B33; ORIYA
            0x0B34,   // 0B34      ; UNKNOWN
            0x0B35,   // 0B35..0B39; ORIYA
            0x0B3A,   // 0B3A..0B3B; UNKNOWN
            0x0B3C,   // 0B3C..0B44; ORIYA
            0x0B45,   // 0B45..0B46; UNKNOWN
            0x0B47,   // 0B47..0B48; ORIYA
            0x0B49,   // 0B49..0B4A; UNKNOWN
            0x0B4B,   // 0B4B..0B4D; ORIYA
            0x0B4E,   // 0B4E..0B55; UNKNOWN
            0x0B56,   // 0B56..0B57; ORIYA
            0x0B58,   // 0B58..0B5B; UNKNOWN
            0x0B5C,   // 0B5C..0B5D; ORIYA
            0x0B5E,   // 0B5E      ; UNKNOWN
            0x0B5F,   // 0B5F..0B63; ORIYA
            0x0B64,   // 0B64..0B65; UNKNOWN
            0x0B66,   // 0B66..0B77; ORIYA
            0x0B78,   // 0B78..0B81; UNKNOWN
            0x0B82,   // 0B82..0B83; TAMIL
            0x0B84,   // 0B84      ; UNKNOWN
            0x0B85,   // 0B85..0B8A; TAMIL
            0x0B8B,   // 0B8B..0B8D; UNKNOWN
            0x0B8E,   // 0B8E..0B90; TAMIL
            0x0B91,   // 0B91      ; UNKNOWN
            0x0B92,   // 0B92..0B95; TAMIL
            0x0B96,   // 0B96..0B98; UNKNOWN
            0x0B99,   // 0B99..0B9A; TAMIL
            0x0B9B,   // 0B9B      ; UNKNOWN
            0x0B9C,   // 0B9C      ; TAMIL
            0x0B9D,   // 0B9D      ; UNKNOWN
            0x0B9E,   // 0B9E..0B9F; TAMIL
            0x0BA0,   // 0BA0..0BA2; UNKNOWN
            0x0BA3,   // 0BA3..0BA4; TAMIL
            0x0BA5,   // 0BA5..0BA7; UNKNOWN
            0x0BA8,   // 0BA8..0BAA; TAMIL
            0x0BAB,   // 0BAB..0BAD; UNKNOWN
            0x0BAE,   // 0BAE..0BB9; TAMIL
            0x0BBA,   // 0BBA..0BBD; UNKNOWN
            0x0BBE,   // 0BBE..0BC2; TAMIL
            0x0BC3,   // 0BC3..0BC5; UNKNOWN
            0x0BC6,   // 0BC6..0BC8; TAMIL
            0x0BC9,   // 0BC9      ; UNKNOWN
            0x0BCA,   // 0BCA..0BCD; TAMIL
            0x0BCE,   // 0BCE..0BCF; UNKNOWN
            0x0BD0,   // 0BD0      ; TAMIL
            0x0BD1,   // 0BD1..0BD6; UNKNOWN
            0x0BD7,   // 0BD7      ; TAMIL
            0x0BD8,   // 0BD8..0BE5; UNKNOWN
            0x0BE6,   // 0BE6..0BFA; TAMIL
            0x0BFB,   // 0BFB..0BFF; UNKNOWN
            0x0C00,   // 0C00..0C0C; TELUGU
            0x0C0D,   // 0C0D      ; UNKNOWN
            0x0C0E,   // 0C0E..0C10; TELUGU
            0x0C11,   // 0C11      ; UNKNOWN
            0x0C12,   // 0C12..0C28; TELUGU
            0x0C29,   // 0C29      ; UNKNOWN
            0x0C2A,   // 0C2A..0C39; TELUGU
            0x0C3A,   // 0C3A..0C3C; UNKNOWN
            0x0C3D,   // 0C3D..0C44; TELUGU
            0x0C45,   // 0C45      ; UNKNOWN
            0x0C46,   // 0C46..0C48; TELUGU
            0x0C49,   // 0C49      ; UNKNOWN
            0x0C4A,   // 0C4A..0C4D; TELUGU
            0x0C4E,   // 0C4E..0C54; UNKNOWN
            0x0C55,   // 0C55..0C56; TELUGU
            0x0C57,   // 0C57      ; UNKNOWN
            0x0C58,   // 0C58..0C5A; TELUGU
            0x0C5B,   // 0C5B..0C5F; UNKNOWN
            0x0C60,   // 0C60..0C63; TELUGU
            0x0C64,   // 0C64..0C65; UNKNOWN
            0x0C66,   // 0C66..0C6F; TELUGU
            0x0C70,   // 0C70..0C77; UNKNOWN
            0x0C78,   // 0C78..0C7F; TELUGU
            0x0C80,   // 0C80..0C8C; KANNADA
            0x0C8D,   // 0C8D      ; UNKNOWN
            0x0C8E,   // 0C8E..0C90; KANNADA
            0x0C91,   // 0C91      ; UNKNOWN
            0x0C92,   // 0C92..0CA8; KANNADA
            0x0CA9,   // 0CA9      ; UNKNOWN
            0x0CAA,   // 0CAA..0CB3; KANNADA
            0x0CB4,   // 0CB4      ; UNKNOWN
            0x0CB5,   // 0CB5..0CB9; KANNADA
            0x0CBA,   // 0CBA..0CBB; UNKNOWN
            0x0CBC,   // 0CBC..0CC4; KANNADA
            0x0CC5,   // 0CC5      ; UNKNOWN
            0x0CC6,   // 0CC6..0CC8; KANNADA
            0x0CC9,   // 0CC9      ; UNKNOWN
            0x0CCA,   // 0CCA..0CCD; KANNADA
            0x0CCE,   // 0CCE..0CD4; UNKNOWN
            0x0CD5,   // 0CD5..0CD6; KANNADA
            0x0CD7,   // 0CD7..0CDD; UNKNOWN
            0x0CDE,   // 0CDE      ; KANNADA
            0x0CDF,   // 0CDF      ; UNKNOWN
            0x0CE0,   // 0CE0..0CE3; KANNADA
            0x0CE4,   // 0CE4..0CE5; UNKNOWN
            0x0CE6,   // 0CE6..0CEF; KANNADA
            0x0CF0,   // 0CF0      ; UNKNOWN
            0x0CF1,   // 0CF1..0CF2; KANNADA
            0x0CF3,   // 0CF3..0CFF; UNKNOWN
            0x0D00,   // 0D00..0D03; MALAYALAM
            0x0D04,   // 0D04      ; UNKNOWN
            0x0D05,   // 0D05..0D0C; MALAYALAM
            0x0D0D,   // 0D0D      ; UNKNOWN
            0x0D0E,   // 0D0E..0D10; MALAYALAM
            0x0D11,   // 0D11      ; UNKNOWN
            0x0D12,   // 0D12..0D44; MALAYALAM
            0x0D45,   // 0D45      ; UNKNOWN
            0x0D46,   // 0D46..0D48; MALAYALAM
            0x0D49,   // 0D49      ; UNKNOWN
            0x0D4A,   // 0D4A..0D4F; MALAYALAM
            0x0D50,   // 0D50..0D53; UNKNOWN
            0x0D54,   // 0D54..0D63; MALAYALAM
            0x0D64,   // 0D64..0D65; UNKNOWN
            0x0D66,   // 0D66..0D7F; MALAYALAM
            0x0D80,   // 0D80..0D81; UNKNOWN
            0x0D82,   // 0D82..0D83; SINHALA
            0x0D84,   // 0D84      ; UNKNOWN
            0x0D85,   // 0D85..0D96; SINHALA
            0x0D97,   // 0D97..0D99; UNKNOWN
            0x0D9A,   // 0D9A..0DB1; SINHALA
            0x0DB2,   // 0DB2      ; UNKNOWN
            0x0DB3,   // 0DB3..0DBB; SINHALA
            0x0DBC,   // 0DBC      ; UNKNOWN
            0x0DBD,   // 0DBD      ; SINHALA
            0x0DBE,   // 0DBE..0DBF; UNKNOWN
            0x0DC0,   // 0DC0..0DC6; SINHALA
            0x0DC7,   // 0DC7..0DC9; UNKNOWN
            0x0DCA,   // 0DCA      ; SINHALA
            0x0DCB,   // 0DCB..0DCE; UNKNOWN
            0x0DCF,   // 0DCF..0DD4; SINHALA
            0x0DD5,   // 0DD5      ; UNKNOWN
            0x0DD6,   // 0DD6      ; SINHALA
            0x0DD7,   // 0DD7      ; UNKNOWN
            0x0DD8,   // 0DD8..0DDF; SINHALA
            0x0DE0,   // 0DE0..0DE5; UNKNOWN
            0x0DE6,   // 0DE6..0DEF; SINHALA
            0x0DF0,   // 0DF0..0DF1; UNKNOWN
            0x0DF2,   // 0DF2..0DF4; SINHALA
            0x0DF5,   // 0DF5..0E00; UNKNOWN
            0x0E01,   // 0E01..0E3A; THAI
            0x0E3B,   // 0E3B..0E3E; UNKNOWN
            0x0E3F,   // 0E3F      ; COMMON
            0x0E40,   // 0E40..0E5B; THAI
            0x0E5C,   // 0E5C..0E80; UNKNOWN
            0x0E81,   // 0E81..0E82; LAO
            0x0E83,   // 0E83      ; UNKNOWN
            0x0E84,   // 0E84      ; LAO
            0x0E85,   // 0E85..0E86; UNKNOWN
            0x0E87,   // 0E87..0E88; LAO
            0x0E89,   // 0E89      ; UNKNOWN
            0x0E8A,   // 0E8A      ; LAO
            0x0E8B,   // 0E8B..0E8C; UNKNOWN
            0x0E8D,   // 0E8D      ; LAO
            0x0E8E,   // 0E8E..0E93; UNKNOWN
            0x0E94,   // 0E94..0E97; LAO
            0x0E98,   // 0E98      ; UNKNOWN
            0x0E99,   // 0E99..0E9F; LAO
            0x0EA0,   // 0EA0      ; UNKNOWN
            0x0EA1,   // 0EA1..0EA3; LAO
            0x0EA4,   // 0EA4      ; UNKNOWN
            0x0EA5,   // 0EA5      ; LAO
            0x0EA6,   // 0EA6      ; UNKNOWN
            0x0EA7,   // 0EA7      ; LAO
            0x0EA8,   // 0EA8..0EA9; UNKNOWN
            0x0EAA,   // 0EAA..0EAB; LAO
            0x0EAC,   // 0EAC      ; UNKNOWN
            0x0EAD,   // 0EAD..0EB9; LAO
            0x0EBA,   // 0EBA      ; UNKNOWN
            0x0EBB,   // 0EBB..0EBD; LAO
            0x0EBE,   // 0EBE..0EBF; UNKNOWN
            0x0EC0,   // 0EC0..0EC4; LAO
            0x0EC5,   // 0EC5      ; UNKNOWN
            0x0EC6,   // 0EC6      ; LAO
            0x0EC7,   // 0EC7      ; UNKNOWN
            0x0EC8,   // 0EC8..0ECD; LAO
            0x0ECE,   // 0ECE..0ECF; UNKNOWN
            0x0ED0,   // 0ED0..0ED9; LAO
            0x0EDA,   // 0EDA..0EDB; UNKNOWN
            0x0EDC,   // 0EDC..0EDF; LAO
            0x0EE0,   // 0EE0..0EFF; UNKNOWN
            0x0F00,   // 0F00..0F47; TIBETAN
            0x0F48,   // 0F48      ; UNKNOWN
            0x0F49,   // 0F49..0F6C; TIBETAN
            0x0F6D,   // 0F6D..0F70; UNKNOWN
            0x0F71,   // 0F71..0F97; TIBETAN
            0x0F98,   // 0F98      ; UNKNOWN
            0x0F99,   // 0F99..0FBC; TIBETAN
            0x0FBD,   // 0FBD      ; UNKNOWN
            0x0FBE,   // 0FBE..0FCC; TIBETAN
            0x0FCD,   // 0FCD      ; UNKNOWN
            0x0FCE,   // 0FCE..0FD4; TIBETAN
            0x0FD5,   // 0FD5..0FD8; COMMON
            0x0FD9,   // 0FD9..0FDA; TIBETAN
            0x0FDB,   // 0FDB..0FFF; UNKNOWN
            0x1000,   // 1000..109F; MYANMAR
            0x10A0,   // 10A0..10C5; GEORGIAN
            0x10C6,   // 10C6      ; UNKNOWN
            0x10C7,   // 10C7      ; GEORGIAN
            0x10C8,   // 10C8..10CC; UNKNOWN
            0x10CD,   // 10CD      ; GEORGIAN
            0x10CE,   // 10CE..10CF; UNKNOWN
            0x10D0,   // 10D0..10FA; GEORGIAN
            0x10FB,   // 10FB      ; COMMON
            0x10FC,   // 10FC..10FF; GEORGIAN
            0x1100,   // 1100..11FF; HANGUL
            0x1200,   // 1200..1248; ETHIOPIC
            0x1249,   // 1249      ; UNKNOWN
            0x124A,   // 124A..124D; ETHIOPIC
            0x124E,   // 124E..124F; UNKNOWN
            0x1250,   // 1250..1256; ETHIOPIC
            0x1257,   // 1257      ; UNKNOWN
            0x1258,   // 1258      ; ETHIOPIC
            0x1259,   // 1259      ; UNKNOWN
            0x125A,   // 125A..125D; ETHIOPIC
            0x125E,   // 125E..125F; UNKNOWN
            0x1260,   // 1260..1288; ETHIOPIC
            0x1289,   // 1289      ; UNKNOWN
            0x128A,   // 128A..128D; ETHIOPIC
            0x128E,   // 128E..128F; UNKNOWN
            0x1290,   // 1290..12B0; ETHIOPIC
            0x12B1,   // 12B1      ; UNKNOWN
            0x12B2,   // 12B2..12B5; ETHIOPIC
            0x12B6,   // 12B6..12B7; UNKNOWN
            0x12B8,   // 12B8..12BE; ETHIOPIC
            0x12BF,   // 12BF      ; UNKNOWN
            0x12C0,   // 12C0      ; ETHIOPIC
            0x12C1,   // 12C1      ; UNKNOWN
            0x12C2,   // 12C2..12C5; ETHIOPIC
            0x12C6,   // 12C6..12C7; UNKNOWN
            0x12C8,   // 12C8..12D6; ETHIOPIC
            0x12D7,   // 12D7      ; UNKNOWN
            0x12D8,   // 12D8..1310; ETHIOPIC
            0x1311,   // 1311      ; UNKNOWN
            0x1312,   // 1312..1315; ETHIOPIC
            0x1316,   // 1316..1317; UNKNOWN
            0x1318,   // 1318..135A; ETHIOPIC
            0x135B,   // 135B..135C; UNKNOWN
            0x135D,   // 135D..137C; ETHIOPIC
            0x137D,   // 137D..137F; UNKNOWN
            0x1380,   // 1380..1399; ETHIOPIC
            0x139A,   // 139A..139F; UNKNOWN
            0x13A0,   // 13A0..13F5; CHEROKEE
            0x13F6,   // 13F6..13F7; UNKNOWN
            0x13F8,   // 13F8..13FD; CHEROKEE
            0x13FE,   // 13FE..13FF; UNKNOWN
            0x1400,   // 1400..167F; CANADIAN_ABORIGINAL
            0x1680,   // 1680..169C; OGHAM
            0x169D,   // 169D..169F; UNKNOWN
            0x16A0,   // 16A0..16EA; RUNIC
            0x16EB,   // 16EB..16ED; COMMON
            0x16EE,   // 16EE..16F8; RUNIC
            0x16F9,   // 16F9..16FF; UNKNOWN
            0x1700,   // 1700..170C; TAGALOG
            0x170D,   // 170D      ; UNKNOWN
            0x170E,   // 170E..1714; TAGALOG
            0x1715,   // 1715..171F; UNKNOWN
            0x1720,   // 1720..1734; HANUNOO
            0x1735,   // 1735..1736; COMMON
            0x1737,   // 1737..173F; UNKNOWN
            0x1740,   // 1740..1753; BUHID
            0x1754,   // 1754..175F; UNKNOWN
            0x1760,   // 1760..176C; TAGBANWA
            0x176D,   // 176D      ; UNKNOWN
            0x176E,   // 176E..1770; TAGBANWA
            0x1771,   // 1771      ; UNKNOWN
            0x1772,   // 1772..1773; TAGBANWA
            0x1774,   // 1774..177F; UNKNOWN
            0x1780,   // 1780..17DD; KHMER
            0x17DE,   // 17DE..17DF; UNKNOWN
            0x17E0,   // 17E0..17E9; KHMER
            0x17EA,   // 17EA..17EF; UNKNOWN
            0x17F0,   // 17F0..17F9; KHMER
            0x17FA,   // 17FA..17FF; UNKNOWN
            0x1800,   // 1800..1801; MONGOLIAN
            0x1802,   // 1802..1803; COMMON
            0x1804,   // 1804      ; MONGOLIAN
            0x1805,   // 1805      ; COMMON
            0x1806,   // 1806..180E; MONGOLIAN
            0x180F,   // 180F      ; UNKNOWN
            0x1810,   // 1810..1819; MONGOLIAN
            0x181A,   // 181A..181F; UNKNOWN
            0x1820,   // 1820..1878; MONGOLIAN
            0x1879,   // 1879..187F; UNKNOWN
            0x1880,   // 1880..18AA; MONGOLIAN
            0x18AB,   // 18AB..18AF; UNKNOWN
            0x18B0,   // 18B0..18F5; CANADIAN_ABORIGINAL
            0x18F6,   // 18F6..18FF; UNKNOWN
            0x1900,   // 1900..191E; LIMBU
            0x191F,   // 191F      ; UNKNOWN
            0x1920,   // 1920..192B; LIMBU
            0x192C,   // 192C..192F; UNKNOWN
            0x1930,   // 1930..193B; LIMBU
            0x193C,   // 193C..193F; UNKNOWN
            0x1940,   // 1940      ; LIMBU
            0x1941,   // 1941..1943; UNKNOWN
            0x1944,   // 1944..194F; LIMBU
            0x1950,   // 1950..196D; TAI_LE
            0x196E,   // 196E..196F; UNKNOWN
            0x1970,   // 1970..1974; TAI_LE
            0x1975,   // 1975..197F; UNKNOWN
            0x1980,   // 1980..19AB; NEW_TAI_LUE
            0x19AC,   // 19AC..19AF; UNKNOWN
            0x19B0,   // 19B0..19C9; NEW_TAI_LUE
            0x19CA,   // 19CA..19CF; UNKNOWN
            0x19D0,   // 19D0..19DA; NEW_TAI_LUE
            0x19DB,   // 19DB..19DD; UNKNOWN
            0x19DE,   // 19DE..19DF; NEW_TAI_LUE
            0x19E0,   // 19E0..19FF; KHMER
            0x1A00,   // 1A00..1A1B; BUGINESE
            0x1A1C,   // 1A1C..1A1D; UNKNOWN
            0x1A1E,   // 1A1E..1A1F; BUGINESE
            0x1A20,   // 1A20..1A5E; TAI_THAM
            0x1A5F,   // 1A5F      ; UNKNOWN
            0x1A60,   // 1A60..1A7C; TAI_THAM
            0x1A7D,   // 1A7D..1A7E; UNKNOWN
            0x1A7F,   // 1A7F..1A89; TAI_THAM
            0x1A8A,   // 1A8A..1A8F; UNKNOWN
            0x1A90,   // 1A90..1A99; TAI_THAM
            0x1A9A,   // 1A9A..1A9F; UNKNOWN
            0x1AA0,   // 1AA0..1AAD; TAI_THAM
            0x1AAE,   // 1AAE..1AAF; UNKNOWN
            0x1AB0,   // 1AB0..1ABE; INHERITED
            0x1ABF,   // 1ABF..1AFF; UNKNOWN
            0x1B00,   // 1B00..1B4B; BALINESE
            0x1B4C,   // 1B4C..1B4F; UNKNOWN
            0x1B50,   // 1B50..1B7C; BALINESE
            0x1B7D,   // 1B7D..1B7F; UNKNOWN
            0x1B80,   // 1B80..1BBF; SUNDANESE
            0x1BC0,   // 1BC0..1BF3; BATAK
            0x1BF4,   // 1BF4..1BFB; UNKNOWN
            0x1BFC,   // 1BFC..1BFF; BATAK
            0x1C00,   // 1C00..1C37; LEPCHA
            0x1C38,   // 1C38..1C3A; UNKNOWN
            0x1C3B,   // 1C3B..1C49; LEPCHA
            0x1C4A,   // 1C4A..1C4C; UNKNOWN
            0x1C4D,   // 1C4D..1C4F; LEPCHA
            0x1C50,   // 1C50..1C7F; OL_CHIKI
            0x1C80,   // 1C80..1C88; CYRILLIC
            0x1C89,   // 1C89..1C8F; UNKNOWN
            0x1C90,   // 1C90..1CBA; GEORGIAN
            0x1CBB,   // 1CBB..1CBC; UNKNOWN
            0x1CBD,   // 1CBD..1CBF; GEORGIAN
            0x1CC0,   // 1CC0..1CC7; SUNDANESE
            0x1CC8,   // 1CC8..1CCF; UNKNOWN
            0x1CD0,   // 1CD0..1CD2; INHERITED
            0x1CD3,   // 1CD3      ; COMMON
            0x1CD4,   // 1CD4..1CE0; INHERITED
            0x1CE1,   // 1CE1      ; COMMON
            0x1CE2,   // 1CE2..1CE8; INHERITED
            0x1CE9,   // 1CE9..1CEC; COMMON
            0x1CED,   // 1CED      ; INHERITED
            0x1CEE,   // 1CEE..1CF3; COMMON
            0x1CF4,   // 1CF4      ; INHERITED
            0x1CF5,   // 1CF5..1CF7; COMMON
            0x1CF8,   // 1CF8..1CF9; INHERITED
            0x1CFA,   // 1CFA..1CFF; UNKNOWN
            0x1D00,   // 1D00..1D25; LATIN
            0x1D26,   // 1D26..1D2A; GREEK
            0x1D2B,   // 1D2B      ; CYRILLIC
            0x1D2C,   // 1D2C..1D5C; LATIN
            0x1D5D,   // 1D5D..1D61; GREEK
            0x1D62,   // 1D62..1D65; LATIN
            0x1D66,   // 1D66..1D6A; GREEK
            0x1D6B,   // 1D6B..1D77; LATIN
            0x1D78,   // 1D78      ; CYRILLIC
            0x1D79,   // 1D79..1DBE; LATIN
            0x1DBF,   // 1DBF      ; GREEK
            0x1DC0,   // 1DC0..1DF9; INHERITED
            0x1DFA,   // 1DFA      ; UNKNOWN
            0x1DFB,   // 1DFB..1DFF; INHERITED
            0x1E00,   // 1E00..1EFF; LATIN
            0x1F00,   // 1F00..1F15; GREEK
            0x1F16,   // 1F16..1F17; UNKNOWN
            0x1F18,   // 1F18..1F1D; GREEK
            0x1F1E,   // 1F1E..1F1F; UNKNOWN
            0x1F20,   // 1F20..1F45; GREEK
            0x1F46,   // 1F46..1F47; UNKNOWN
            0x1F48,   // 1F48..1F4D; GREEK
            0x1F4E,   // 1F4E..1F4F; UNKNOWN
            0x1F50,   // 1F50..1F57; GREEK
            0x1F58,   // 1F58      ; UNKNOWN
            0x1F59,   // 1F59      ; GREEK
            0x1F5A,   // 1F5A      ; UNKNOWN
            0x1F5B,   // 1F5B      ; GREEK
            0x1F5C,   // 1F5C      ; UNKNOWN
            0x1F5D,   // 1F5D      ; GREEK
            0x1F5E,   // 1F5E      ; UNKNOWN
            0x1F5F,   // 1F5F..1F7D; GREEK
            0x1F7E,   // 1F7E..1F7F; UNKNOWN
            0x1F80,   // 1F80..1FB4; GREEK
            0x1FB5,   // 1FB5      ; UNKNOWN
            0x1FB6,   // 1FB6..1FC4; GREEK
            0x1FC5,   // 1FC5      ; UNKNOWN
            0x1FC6,   // 1FC6..1FD3; GREEK
            0x1FD4,   // 1FD4..1FD5; UNKNOWN
            0x1FD6,   // 1FD6..1FDB; GREEK
            0x1FDC,   // 1FDC      ; UNKNOWN
            0x1FDD,   // 1FDD..1FEF; GREEK
            0x1FF0,   // 1FF0..1FF1; UNKNOWN
            0x1FF2,   // 1FF2..1FF4; GREEK
            0x1FF5,   // 1FF5      ; UNKNOWN
            0x1FF6,   // 1FF6..1FFE; GREEK
            0x1FFF,   // 1FFF      ; UNKNOWN
            0x2000,   // 2000..200B; COMMON
            0x200C,   // 200C..200D; INHERITED
            0x200E,   // 200E..2064; COMMON
            0x2065,   // 2065      ; UNKNOWN
            0x2066,   // 2066..2070; COMMON
            0x2071,   // 2071      ; LATIN
            0x2072,   // 2072..2073; UNKNOWN
            0x2074,   // 2074..207E; COMMON
            0x207F,   // 207F      ; LATIN
            0x2080,   // 2080..208E; COMMON
            0x208F,   // 208F      ; UNKNOWN
            0x2090,   // 2090..209C; LATIN
            0x209D,   // 209D..209F; UNKNOWN
            0x20A0,   // 20A0..20BF; COMMON
            0x20C0,   // 20C0..20CF; UNKNOWN
            0x20D0,   // 20D0..20F0; INHERITED
            0x20F1,   // 20F1..20FF; UNKNOWN
            0x2100,   // 2100..2125; COMMON
            0x2126,   // 2126      ; GREEK
            0x2127,   // 2127..2129; COMMON
            0x212A,   // 212A..212B; LATIN
            0x212C,   // 212C..2131; COMMON
            0x2132,   // 2132      ; LATIN
            0x2133,   // 2133..214D; COMMON
            0x214E,   // 214E      ; LATIN
            0x214F,   // 214F..215F; COMMON
            0x2160,   // 2160..2188; LATIN
            0x2189,   // 2189..218B; COMMON
            0x218C,   // 218C..218F; UNKNOWN
            0x2190,   // 2190..2426; COMMON
            0x2427,   // 2427..243F; UNKNOWN
            0x2440,   // 2440..244A; COMMON
            0x244B,   // 244B..245F; UNKNOWN
            0x2460,   // 2460..27FF; COMMON
            0x2800,   // 2800..28FF; BRAILLE
            0x2900,   // 2900..2B73; COMMON
            0x2B74,   // 2B74..2B75; UNKNOWN
            0x2B76,   // 2B76..2B95; COMMON
            0x2B96,   // 2B96..2B97; UNKNOWN
            0x2B98,   // 2B98..2BC8; COMMON
            0x2BC9,   // 2BC9      ; UNKNOWN
            0x2BCA,   // 2BCA..2BFE; COMMON
            0x2BFF,   // 2BFF      ; UNKNOWN
            0x2C00,   // 2C00..2C2E; GLAGOLITIC
            0x2C2F,   // 2C2F      ; UNKNOWN
            0x2C30,   // 2C30..2C5E; GLAGOLITIC
            0x2C5F,   // 2C5F      ; UNKNOWN
            0x2C60,   // 2C60..2C7F; LATIN
            0x2C80,   // 2C80..2CF3; COPTIC
            0x2CF4,   // 2CF4..2CF8; UNKNOWN
            0x2CF9,   // 2CF9..2CFF; COPTIC
            0x2D00,   // 2D00..2D25; GEORGIAN
            0x2D26,   // 2D26      ; UNKNOWN
            0x2D27,   // 2D27      ; GEORGIAN
            0x2D28,   // 2D28..2D2C; UNKNOWN
            0x2D2D,   // 2D2D      ; GEORGIAN
            0x2D2E,   // 2D2E..2D2F; UNKNOWN
            0x2D30,   // 2D30..2D67; TIFINAGH
            0x2D68,   // 2D68..2D6E; UNKNOWN
            0x2D6F,   // 2D6F..2D70; TIFINAGH
            0x2D71,   // 2D71..2D7E; UNKNOWN
            0x2D7F,   // 2D7F      ; TIFINAGH
            0x2D80,   // 2D80..2D96; ETHIOPIC
            0x2D97,   // 2D97..2D9F; UNKNOWN
            0x2DA0,   // 2DA0..2DA6; ETHIOPIC
            0x2DA7,   // 2DA7      ; UNKNOWN
            0x2DA8,   // 2DA8..2DAE; ETHIOPIC
            0x2DAF,   // 2DAF      ; UNKNOWN
            0x2DB0,   // 2DB0..2DB6; ETHIOPIC
            0x2DB7,   // 2DB7      ; UNKNOWN
            0x2DB8,   // 2DB8..2DBE; ETHIOPIC
            0x2DBF,   // 2DBF      ; UNKNOWN
            0x2DC0,   // 2DC0..2DC6; ETHIOPIC
            0x2DC7,   // 2DC7      ; UNKNOWN
            0x2DC8,   // 2DC8..2DCE; ETHIOPIC
            0x2DCF,   // 2DCF      ; UNKNOWN
            0x2DD0,   // 2DD0..2DD6; ETHIOPIC
            0x2DD7,   // 2DD7      ; UNKNOWN
            0x2DD8,   // 2DD8..2DDE; ETHIOPIC
            0x2DDF,   // 2DDF      ; UNKNOWN
            0x2DE0,   // 2DE0..2DFF; CYRILLIC
            0x2E00,   // 2E00..2E4E; COMMON
            0x2E4F,   // 2E4F..2E7F; UNKNOWN
            0x2E80,   // 2E80..2E99; HAN
            0x2E9A,   // 2E9A      ; UNKNOWN
            0x2E9B,   // 2E9B..2EF3; HAN
            0x2EF4,   // 2EF4..2EFF; UNKNOWN
            0x2F00,   // 2F00..2FD5; HAN
            0x2FD6,   // 2FD6..2FEF; UNKNOWN
            0x2FF0,   // 2FF0..2FFB; COMMON
            0x2FFC,   // 2FFC..2FFF; UNKNOWN
            0x3000,   // 3000..3004; COMMON
            0x3005,   // 3005      ; HAN
            0x3006,   // 3006      ; COMMON
            0x3007,   // 3007      ; HAN
            0x3008,   // 3008..3020; COMMON
            0x3021,   // 3021..3029; HAN
            0x302A,   // 302A..302D; INHERITED
            0x302E,   // 302E..302F; HANGUL
            0x3030,   // 3030..3037; COMMON
            0x3038,   // 3038..303B; HAN
            0x303C,   // 303C..303F; COMMON
            0x3040,   // 3040      ; UNKNOWN
            0x3041,   // 3041..3096; HIRAGANA
            0x3097,   // 3097..3098; UNKNOWN
            0x3099,   // 3099..309A; INHERITED
            0x309B,   // 309B..309C; COMMON
            0x309D,   // 309D..309F; HIRAGANA
            0x30A0,   // 30A0      ; COMMON
            0x30A1,   // 30A1..30FA; KATAKANA
            0x30FB,   // 30FB..30FC; COMMON
            0x30FD,   // 30FD..30FF; KATAKANA
            0x3100,   // 3100..3104; UNKNOWN
            0x3105,   // 3105..312F; BOPOMOFO
            0x3130,   // 3130      ; UNKNOWN
            0x3131,   // 3131..318E; HANGUL
            0x318F,   // 318F      ; UNKNOWN
            0x3190,   // 3190..319F; COMMON
            0x31A0,   // 31A0..31BA; BOPOMOFO
            0x31BB,   // 31BB..31BF; UNKNOWN
            0x31C0,   // 31C0..31E3; COMMON
            0x31E4,   // 31E4..31EF; UNKNOWN
            0x31F0,   // 31F0..31FF; KATAKANA
            0x3200,   // 3200..321E; HANGUL
            0x321F,   // 321F      ; UNKNOWN
            0x3220,   // 3220..325F; COMMON
            0x3260,   // 3260..327E; HANGUL
            0x327F,   // 327F..32CF; COMMON
            0x32D0,   // 32D0..32FE; KATAKANA
            0x32FF,   // 32FF      ; COMMON
            0x3300,   // 3300..3357; KATAKANA
            0x3358,   // 3358..33FF; COMMON
            0x3400,   // 3400..4DB5; HAN
            0x4DB6,   // 4DB6..4DBF; UNKNOWN
            0x4DC0,   // 4DC0..4DFF; COMMON
            0x4E00,   // 4E00..9FEF; HAN
            0x9FF0,   // 9FF0..9FFF; UNKNOWN
            0xA000,   // A000..A48C; YI
            0xA48D,   // A48D..A48F; UNKNOWN
            0xA490,   // A490..A4C6; YI
            0xA4C7,   // A4C7..A4CF; UNKNOWN
            0xA4D0,   // A4D0..A4FF; LISU
            0xA500,   // A500..A62B; VAI
            0xA62C,   // A62C..A63F; UNKNOWN
            0xA640,   // A640..A69F; CYRILLIC
            0xA6A0,   // A6A0..A6F7; BAMUM
            0xA6F8,   // A6F8..A6FF; UNKNOWN
            0xA700,   // A700..A721; COMMON
            0xA722,   // A722..A787; LATIN
            0xA788,   // A788..A78A; COMMON
            0xA78B,   // A78B..A7B9; LATIN
            0xA7BA,   // A7BA..A7F6; UNKNOWN
            0xA7F7,   // A7F7..A7FF; LATIN
            0xA800,   // A800..A82B; SYLOTI_NAGRI
            0xA82C,   // A82C..A82F; UNKNOWN
            0xA830,   // A830..A839; COMMON
            0xA83A,   // A83A..A83F; UNKNOWN
            0xA840,   // A840..A877; PHAGS_PA
            0xA878,   // A878..A87F; UNKNOWN
            0xA880,   // A880..A8C5; SAURASHTRA
            0xA8C6,   // A8C6..A8CD; UNKNOWN
            0xA8CE,   // A8CE..A8D9; SAURASHTRA
            0xA8DA,   // A8DA..A8DF; UNKNOWN
            0xA8E0,   // A8E0..A8FF; DEVANAGARI
            0xA900,   // A900..A92D; KAYAH_LI
            0xA92E,   // A92E      ; COMMON
            0xA92F,   // A92F      ; KAYAH_LI
            0xA930,   // A930..A953; REJANG
            0xA954,   // A954..A95E; UNKNOWN
            0xA95F,   // A95F      ; REJANG
            0xA960,   // A960..A97C; HANGUL
            0xA97D,   // A97D..A97F; UNKNOWN
            0xA980,   // A980..A9CD; JAVANESE
            0xA9CE,   // A9CE      ; UNKNOWN
            0xA9CF,   // A9CF      ; COMMON
            0xA9D0,   // A9D0..A9D9; JAVANESE
            0xA9DA,   // A9DA..A9DD; UNKNOWN
            0xA9DE,   // A9DE..A9DF; JAVANESE
            0xA9E0,   // A9E0..A9FE; MYANMAR
            0xA9FF,   // A9FF      ; UNKNOWN
            0xAA00,   // AA00..AA36; CHAM
            0xAA37,   // AA37..AA3F; UNKNOWN
            0xAA40,   // AA40..AA4D; CHAM
            0xAA4E,   // AA4E..AA4F; UNKNOWN
            0xAA50,   // AA50..AA59; CHAM
            0xAA5A,   // AA5A..AA5B; UNKNOWN
            0xAA5C,   // AA5C..AA5F; CHAM
            0xAA60,   // AA60..AA7F; MYANMAR
            0xAA80,   // AA80..AAC2; TAI_VIET
            0xAAC3,   // AAC3..AADA; UNKNOWN
            0xAADB,   // AADB..AADF; TAI_VIET
            0xAAE0,   // AAE0..AAF6; MEETEI_MAYEK
            0xAAF7,   // AAF7..AB00; UNKNOWN
            0xAB01,   // AB01..AB06; ETHIOPIC
            0xAB07,   // AB07..AB08; UNKNOWN
            0xAB09,   // AB09..AB0E; ETHIOPIC
            0xAB0F,   // AB0F..AB10; UNKNOWN
            0xAB11,   // AB11..AB16; ETHIOPIC
            0xAB17,   // AB17..AB1F; UNKNOWN
            0xAB20,   // AB20..AB26; ETHIOPIC
            0xAB27,   // AB27      ; UNKNOWN
            0xAB28,   // AB28..AB2E; ETHIOPIC
            0xAB2F,   // AB2F      ; UNKNOWN
            0xAB30,   // AB30..AB5A; LATIN
            0xAB5B,   // AB5B      ; COMMON
            0xAB5C,   // AB5C..AB64; LATIN
            0xAB65,   // AB65      ; GREEK
            0xAB66,   // AB66..AB6F; UNKNOWN
            0xAB70,   // AB70..ABBF; CHEROKEE
            0xABC0,   // ABC0..ABED; MEETEI_MAYEK
            0xABEE,   // ABEE..ABEF; UNKNOWN
            0xABF0,   // ABF0..ABF9; MEETEI_MAYEK
            0xABFA,   // ABFA..ABFF; UNKNOWN
            0xAC00,   // AC00..D7A3; HANGUL
            0xD7A4,   // D7A4..D7AF; UNKNOWN
            0xD7B0,   // D7B0..D7C6; HANGUL
            0xD7C7,   // D7C7..D7CA; UNKNOWN
            0xD7CB,   // D7CB..D7FB; HANGUL
            0xD7FC,   // D7FC..F8FF; UNKNOWN
            0xF900,   // F900..FA6D; HAN
            0xFA6E,   // FA6E..FA6F; UNKNOWN
            0xFA70,   // FA70..FAD9; HAN
            0xFADA,   // FADA..FAFF; UNKNOWN
            0xFB00,   // FB00..FB06; LATIN
            0xFB07,   // FB07..FB12; UNKNOWN
            0xFB13,   // FB13..FB17; ARMENIAN
            0xFB18,   // FB18..FB1C; UNKNOWN
            0xFB1D,   // FB1D..FB36; HEBREW
            0xFB37,   // FB37      ; UNKNOWN
            0xFB38,   // FB38..FB3C; HEBREW
            0xFB3D,   // FB3D      ; UNKNOWN
            0xFB3E,   // FB3E      ; HEBREW
            0xFB3F,   // FB3F      ; UNKNOWN
            0xFB40,   // FB40..FB41; HEBREW
            0xFB42,   // FB42      ; UNKNOWN
            0xFB43,   // FB43..FB44; HEBREW
            0xFB45,   // FB45      ; UNKNOWN
            0xFB46,   // FB46..FB4F; HEBREW
            0xFB50,   // FB50..FBC1; ARABIC
            0xFBC2,   // FBC2..FBD2; UNKNOWN
            0xFBD3,   // FBD3..FD3D; ARABIC
            0xFD3E,   // FD3E..FD3F; COMMON
            0xFD40,   // FD40..FD4F; UNKNOWN
            0xFD50,   // FD50..FD8F; ARABIC
            0xFD90,   // FD90..FD91; UNKNOWN
            0xFD92,   // FD92..FDC7; ARABIC
            0xFDC8,   // FDC8..FDEF; UNKNOWN
            0xFDF0,   // FDF0..FDFD; ARABIC
            0xFDFE,   // FDFE..FDFF; UNKNOWN
            0xFE00,   // FE00..FE0F; INHERITED
            0xFE10,   // FE10..FE19; COMMON
            0xFE1A,   // FE1A..FE1F; UNKNOWN
            0xFE20,   // FE20..FE2D; INHERITED
            0xFE2E,   // FE2E..FE2F; CYRILLIC
            0xFE30,   // FE30..FE52; COMMON
            0xFE53,   // FE53      ; UNKNOWN
            0xFE54,   // FE54..FE66; COMMON
            0xFE67,   // FE67      ; UNKNOWN
            0xFE68,   // FE68..FE6B; COMMON
            0xFE6C,   // FE6C..FE6F; UNKNOWN
            0xFE70,   // FE70..FE74; ARABIC
            0xFE75,   // FE75      ; UNKNOWN
            0xFE76,   // FE76..FEFC; ARABIC
            0xFEFD,   // FEFD..FEFE; UNKNOWN
            0xFEFF,   // FEFF      ; COMMON
            0xFF00,   // FF00      ; UNKNOWN
            0xFF01,   // FF01..FF20; COMMON
            0xFF21,   // FF21..FF3A; LATIN
            0xFF3B,   // FF3B..FF40; COMMON
            0xFF41,   // FF41..FF5A; LATIN
            0xFF5B,   // FF5B..FF65; COMMON
            0xFF66,   // FF66..FF6F; KATAKANA
            0xFF70,   // FF70      ; COMMON
            0xFF71,   // FF71..FF9D; KATAKANA
            0xFF9E,   // FF9E..FF9F; COMMON
            0xFFA0,   // FFA0..FFBE; HANGUL
            0xFFBF,   // FFBF..FFC1; UNKNOWN
            0xFFC2,   // FFC2..FFC7; HANGUL
            0xFFC8,   // FFC8..FFC9; UNKNOWN
            0xFFCA,   // FFCA..FFCF; HANGUL
            0xFFD0,   // FFD0..FFD1; UNKNOWN
            0xFFD2,   // FFD2..FFD7; HANGUL
            0xFFD8,   // FFD8..FFD9; UNKNOWN
            0xFFDA,   // FFDA..FFDC; HANGUL
            0xFFDD,   // FFDD..FFDF; UNKNOWN
            0xFFE0,   // FFE0..FFE6; COMMON
            0xFFE7,   // FFE7      ; UNKNOWN
            0xFFE8,   // FFE8..FFEE; COMMON
            0xFFEF,   // FFEF..FFF8; UNKNOWN
            0xFFF9,   // FFF9..FFFD; COMMON
            0xFFFE,   // FFFE..FFFF; UNKNOWN
            0x10000,  // 10000..1000B; LINEAR_B
            0x1000C,  // 1000C       ; UNKNOWN
            0x1000D,  // 1000D..10026; LINEAR_B
            0x10027,  // 10027       ; UNKNOWN
            0x10028,  // 10028..1003A; LINEAR_B
            0x1003B,  // 1003B       ; UNKNOWN
            0x1003C,  // 1003C..1003D; LINEAR_B
            0x1003E,  // 1003E       ; UNKNOWN
            0x1003F,  // 1003F..1004D; LINEAR_B
            0x1004E,  // 1004E..1004F; UNKNOWN
            0x10050,  // 10050..1005D; LINEAR_B
            0x1005E,  // 1005E..1007F; UNKNOWN
            0x10080,  // 10080..100FA; LINEAR_B
            0x100FB,  // 100FB..100FF; UNKNOWN
            0x10100,  // 10100..10102; COMMON
            0x10103,  // 10103..10106; UNKNOWN
            0x10107,  // 10107..10133; COMMON
            0x10134,  // 10134..10136; UNKNOWN
            0x10137,  // 10137..1013F; COMMON
            0x10140,  // 10140..1018E; GREEK
            0x1018F,  // 1018F       ; UNKNOWN
            0x10190,  // 10190..1019B; COMMON
            0x1019C,  // 1019C..1019F; UNKNOWN
            0x101A0,  // 101A0       ; GREEK
            0x101A1,  // 101A1..101CF; UNKNOWN
            0x101D0,  // 101D0..101FC; COMMON
            0x101FD,  // 101FD       ; INHERITED
            0x101FE,  // 101FE..1027F; UNKNOWN
            0x10280,  // 10280..1029C; LYCIAN
            0x1029D,  // 1029D..1029F; UNKNOWN
            0x102A0,  // 102A0..102D0; CARIAN
            0x102D1,  // 102D1..102DF; UNKNOWN
            0x102E0,  // 102E0       ; INHERITED
            0x102E1,  // 102E1..102FB; COMMON
            0x102FC,  // 102FC..102FF; UNKNOWN
            0x10300,  // 10300..10323; OLD_ITALIC
            0x10324,  // 10324..1032C; UNKNOWN
            0x1032D,  // 1032D..1032F; OLD_ITALIC
            0x10330,  // 10330..1034A; GOTHIC
            0x1034B,  // 1034B..1034F; UNKNOWN
            0x10350,  // 10350..1037A; OLD_PERMIC
            0x1037B,  // 1037B..1037F; UNKNOWN
            0x10380,  // 10380..1039D; UGARITIC
            0x1039E,  // 1039E       ; UNKNOWN
            0x1039F,  // 1039F       ; UGARITIC
            0x103A0,  // 103A0..103C3; OLD_PERSIAN
            0x103C4,  // 103C4..103C7; UNKNOWN
            0x103C8,  // 103C8..103D5; OLD_PERSIAN
            0x103D6,  // 103D6..103FF; UNKNOWN
            0x10400,  // 10400..1044F; DESERET
            0x10450,  // 10450..1047F; SHAVIAN
            0x10480,  // 10480..1049D; OSMANYA
            0x1049E,  // 1049E..1049F; UNKNOWN
            0x104A0,  // 104A0..104A9; OSMANYA
            0x104AA,  // 104AA..104AF; UNKNOWN
            0x104B0,  // 104B0..104D3; OSAGE
            0x104D4,  // 104D4..104D7; UNKNOWN
            0x104D8,  // 104D8..104FB; OSAGE
            0x104FC,  // 104FC..104FF; UNKNOWN
            0x10500,  // 10500..10527; ELBASAN
            0x10528,  // 10528..1052F; UNKNOWN
            0x10530,  // 10530..10563; CAUCASIAN_ALBANIAN
            0x10564,  // 10564..1056E; UNKNOWN
            0x1056F,  // 1056F       ; CAUCASIAN_ALBANIAN
            0x10570,  // 10570..105FF; UNKNOWN
            0x10600,  // 10600..10736; LINEAR_A
            0x10737,  // 10737..1073F; UNKNOWN
            0x10740,  // 10740..10755; LINEAR_A
            0x10756,  // 10756..1075F; UNKNOWN
            0x10760,  // 10760..10767; LINEAR_A
            0x10768,  // 10768..107FF; UNKNOWN
            0x10800,  // 10800..10805; CYPRIOT
            0x10806,  // 10806..10807; UNKNOWN
            0x10808,  // 10808       ; CYPRIOT
            0x10809,  // 10809       ; UNKNOWN
            0x1080A,  // 1080A..10835; CYPRIOT
            0x10836,  // 10836       ; UNKNOWN
            0x10837,  // 10837..10838; CYPRIOT
            0x10839,  // 10839..1083B; UNKNOWN
            0x1083C,  // 1083C       ; CYPRIOT
            0x1083D,  // 1083D..1083E; UNKNOWN
            0x1083F,  // 1083F       ; CYPRIOT
            0x10840,  // 10840..10855; IMPERIAL_ARAMAIC
            0x10856,  // 10856       ; UNKNOWN
            0x10857,  // 10857..1085F; IMPERIAL_ARAMAIC
            0x10860,  // 10860..1087F; PALMYRENE
            0x10880,  // 10880..1089E; NABATAEAN
            0x1089F,  // 1089F..108A6; UNKNOWN
            0x108A7,  // 108A7..108AF; NABATAEAN
            0x108B0,  // 108B0..108DF; UNKNOWN
            0x108E0,  // 108E0..108F2; HATRAN
            0x108F3,  // 108F3       ; UNKNOWN
            0x108F4,  // 108F4..108F5; HATRAN
            0x108F6,  // 108F6..108FA; UNKNOWN
            0x108FB,  // 108FB..108FF; HATRAN
            0x10900,  // 10900..1091B; PHOENICIAN
            0x1091C,  // 1091C..1091E; UNKNOWN
            0x1091F,  // 1091F       ; PHOENICIAN
            0x10920,  // 10920..10939; LYDIAN
            0x1093A,  // 1093A..1093E; UNKNOWN
            0x1093F,  // 1093F       ; LYDIAN
            0x10940,  // 10940..1097F; UNKNOWN
            0x10980,  // 10980..1099F; MEROITIC_HIEROGLYPHS
            0x109A0,  // 109A0..109B7; MEROITIC_CURSIVE
            0x109B8,  // 109B8..109BB; UNKNOWN
            0x109BC,  // 109BC..109CF; MEROITIC_CURSIVE
            0x109D0,  // 109D0..109D1; UNKNOWN
            0x109D2,  // 109D2..109FF; MEROITIC_CURSIVE
            0x10A00,  // 10A00..10A03; KHAROSHTHI
            0x10A04,  // 10A04       ; UNKNOWN
            0x10A05,  // 10A05..10A06; KHAROSHTHI
            0x10A07,  // 10A07..10A0B; UNKNOWN
            0x10A0C,  // 10A0C..10A13; KHAROSHTHI
            0x10A14,  // 10A14       ; UNKNOWN
            0x10A15,  // 10A15..10A17; KHAROSHTHI
            0x10A18,  // 10A18       ; UNKNOWN
            0x10A19,  // 10A19..10A35; KHAROSHTHI
            0x10A36,  // 10A36..10A37; UNKNOWN
            0x10A38,  // 10A38..10A3A; KHAROSHTHI
            0x10A3B,  // 10A3B..10A3E; UNKNOWN
            0x10A3F,  // 10A3F..10A48; KHAROSHTHI
            0x10A49,  //
10A49..10A4F; UNKNOWN
            0x10A50, // 10A50..10A58; KHAROSHTHI
            0x10A59, // 10A59..10A5F; UNKNOWN
            0x10A60, // 10A60..10A7F; OLD_SOUTH_ARABIAN
            0x10A80, // 10A80..10A9F; OLD_NORTH_ARABIAN
            0x10AA0, // 10AA0..10ABF; UNKNOWN
            0x10AC0, // 10AC0..10AE6; MANICHAEAN
            0x10AE7, // 10AE7..10AEA; UNKNOWN
            0x10AEB, // 10AEB..10AF6; MANICHAEAN
            0x10AF7, // 10AF7..10AFF; UNKNOWN
            0x10B00, // 10B00..10B35; AVESTAN
            0x10B36, // 10B36..10B38; UNKNOWN
            0x10B39, // 10B39..10B3F; AVESTAN
            0x10B40, // 10B40..10B55; INSCRIPTIONAL_PARTHIAN
            0x10B56, // 10B56..10B57; UNKNOWN
            0x10B58, // 10B58..10B5F; INSCRIPTIONAL_PARTHIAN
            0x10B60, // 10B60..10B72; INSCRIPTIONAL_PAHLAVI
            0x10B73, // 10B73..10B77; UNKNOWN
            0x10B78, // 10B78..10B7F; INSCRIPTIONAL_PAHLAVI
            0x10B80, // 10B80..10B91; PSALTER_PAHLAVI
            0x10B92, // 10B92..10B98; UNKNOWN
            0x10B99, // 10B99..10B9C; PSALTER_PAHLAVI
            0x10B9D, // 10B9D..10BA8; UNKNOWN
            0x10BA9, // 10BA9..10BAF; PSALTER_PAHLAVI
            0x10BB0, // 10BB0..10BFF; UNKNOWN
            0x10C00, // 10C00..10C48; OLD_TURKIC
            0x10C49, // 10C49..10C7F; UNKNOWN
            0x10C80, // 10C80..10CB2; OLD_HUNGARIAN
            0x10CB3, // 10CB3..10CBF; UNKNOWN
            0x10CC0, // 10CC0..10CF2; OLD_HUNGARIAN
            0x10CF3, // 10CF3..10CF9; UNKNOWN
            0x10CFA, // 10CFA..10CFF; OLD_HUNGARIAN
            0x10D00, // 10D00..10D27; HANIFI_ROHINGYA
            0x10D28, // 10D28..10D2F; UNKNOWN
            0x10D30, // 10D30..10D39; HANIFI_ROHINGYA
            0x10D3A, // 10D3A..10E5F; UNKNOWN
            0x10E60, // 10E60..10E7E; ARABIC
            0x10E7F, // 10E7F..10EFF; UNKNOWN
            0x10F00, // 10F00..10F27; OLD_SOGDIAN
            0x10F28, // 10F28..10F2F; UNKNOWN
            0x10F30, // 10F30..10F59; SOGDIAN
            0x10F5A, // 10F5A..10FFF; UNKNOWN
            0x11000, // 11000..1104D; BRAHMI
            0x1104E, // 1104E..11051; UNKNOWN
            0x11052, // 11052..1106F; BRAHMI
            0x11070, // 11070..1107E; UNKNOWN
            0x1107F, // 1107F       ; BRAHMI
            0x11080, // 11080..110C1; KAITHI
5896 0x110C2, // 110C2..110CC; UNKNOWN 5897 0x110CD, // 110CD ; KAITHI 5898 0x110CE, // 110CE..110CF; UNKNOWN 5899 0x110D0, // 110D0..110E8; SORA_SOMPENG 5900 0x110E9, // 110E9..110EF; UNKNOWN 5901 0x110F0, // 110F0..110F9; SORA_SOMPENG 5902 0x110FA, // 110FA..110FF; UNKNOWN 5903 0x11100, // 11100..11134; CHAKMA 5904 0x11135, // 11135 ; UNKNOWN 5905 0x11136, // 11136..11146; CHAKMA 5906 0x11147, // 11147..1114F; UNKNOWN 5907 0x11150, // 11150..11176; MAHAJANI 5908 0x11177, // 11177..1117F; UNKNOWN 5909 0x11180, // 11180..111CD; SHARADA 5910 0x111CE, // 111CE..111CF; UNKNOWN 5911 0x111D0, // 111D0..111DF; SHARADA 5912 0x111E0, // 111E0 ; UNKNOWN 5913 0x111E1, // 111E1..111F4; SINHALA 5914 0x111F5, // 111F5..111FF; UNKNOWN 5915 0x11200, // 11200..11211; KHOJKI 5916 0x11212, // 11212 ; UNKNOWN 5917 0x11213, // 11213..1123E; KHOJKI 5918 0x1123F, // 1123F..1127F; UNKNOWN 5919 0x11280, // 11280..11286; MULTANI 5920 0x11287, // 11287 ; UNKNOWN 5921 0x11288, // 11288 ; MULTANI 5922 0x11289, // 11289 ; UNKNOWN 5923 0x1128A, // 1128A..1128D; MULTANI 5924 0x1128E, // 1128E ; UNKNOWN 5925 0x1128F, // 1128F..1129D; MULTANI 5926 0x1129E, // 1129E ; UNKNOWN 5927 0x1129F, // 1129F..112A9; MULTANI 5928 0x112AA, // 112AA..112AF; UNKNOWN 5929 0x112B0, // 112B0..112EA; KHUDAWADI 5930 0x112EB, // 112EB..112EF; UNKNOWN 5931 0x112F0, // 112F0..112F9; KHUDAWADI 5932 0x112FA, // 112FA..112FF; UNKNOWN 5933 0x11300, // 11300..11303; GRANTHA 5934 0x11304, // 11304 ; UNKNOWN 5935 0x11305, // 11305..1130C; GRANTHA 5936 0x1130D, // 1130D..1130E; UNKNOWN 5937 0x1130F, // 1130F..11310; GRANTHA 5938 0x11311, // 11311..11312; UNKNOWN 5939 0x11313, // 11313..11328; GRANTHA 5940 0x11329, // 11329 ; UNKNOWN 5941 0x1132A, // 1132A..11330; GRANTHA 5942 0x11331, // 11331 ; UNKNOWN 5943 0x11332, // 11332..11333; GRANTHA 5944 0x11334, // 11334 ; UNKNOWN 5945 0x11335, // 11335..11339; GRANTHA 5946 0x1133A, // 1133A ; UNKNOWN 5947 0x1133B, // 1133B ; INHERITED 5948 0x1133C, // 1133C..11344; GRANTHA 5949 
            0x11345, // 11345..11346; UNKNOWN
            0x11347, // 11347..11348; GRANTHA
            0x11349, // 11349..1134A; UNKNOWN
            0x1134B, // 1134B..1134D; GRANTHA
            0x1134E, // 1134E..1134F; UNKNOWN
            0x11350, // 11350       ; GRANTHA
            0x11351, // 11351..11356; UNKNOWN
            0x11357, // 11357       ; GRANTHA
            0x11358, // 11358..1135C; UNKNOWN
            0x1135D, // 1135D..11363; GRANTHA
            0x11364, // 11364..11365; UNKNOWN
            0x11366, // 11366..1136C; GRANTHA
            0x1136D, // 1136D..1136F; UNKNOWN
            0x11370, // 11370..11374; GRANTHA
            0x11375, // 11375..113FF; UNKNOWN
            0x11400, // 11400..11459; NEWA
            0x1145A, // 1145A       ; UNKNOWN
            0x1145B, // 1145B       ; NEWA
            0x1145C, // 1145C       ; UNKNOWN
            0x1145D, // 1145D..1145E; NEWA
            0x1145F, // 1145F..1147F; UNKNOWN
            0x11480, // 11480..114C7; TIRHUTA
            0x114C8, // 114C8..114CF; UNKNOWN
            0x114D0, // 114D0..114D9; TIRHUTA
            0x114DA, // 114DA..1157F; UNKNOWN
            0x11580, // 11580..115B5; SIDDHAM
            0x115B6, // 115B6..115B7; UNKNOWN
            0x115B8, // 115B8..115DD; SIDDHAM
            0x115DE, // 115DE..115FF; UNKNOWN
            0x11600, // 11600..11644; MODI
            0x11645, // 11645..1164F; UNKNOWN
            0x11650, // 11650..11659; MODI
            0x1165A, // 1165A..1165F; UNKNOWN
            0x11660, // 11660..1166C; MONGOLIAN
            0x1166D, // 1166D..1167F; UNKNOWN
            0x11680, // 11680..116B7; TAKRI
            0x116B8, // 116B8..116BF; UNKNOWN
            0x116C0, // 116C0..116C9; TAKRI
            0x116CA, // 116CA..116FF; UNKNOWN
            0x11700, // 11700..1171A; AHOM
            0x1171B, // 1171B..1171C; UNKNOWN
            0x1171D, // 1171D..1172B; AHOM
            0x1172C, // 1172C..1172F; UNKNOWN
            0x11730, // 11730..1173F; AHOM
            0x11740, // 11740..117FF; UNKNOWN
            0x11800, // 11800..1183B; DOGRA
            0x1183C, // 1183C..1189F; UNKNOWN
            0x118A0, // 118A0..118F2; WARANG_CITI
            0x118F3, // 118F3..118FE; UNKNOWN
            0x118FF, // 118FF       ; WARANG_CITI
            0x11900, // 11900..119FF; UNKNOWN
            0x11A00, // 11A00..11A47; ZANABAZAR_SQUARE
            0x11A48, // 11A48..11A4F; UNKNOWN
            0x11A50, // 11A50..11A83; SOYOMBO
            0x11A84, // 11A84..11A85; UNKNOWN
            0x11A86, // 11A86..11AA2; SOYOMBO
            0x11AA3, // 11AA3..11ABF; UNKNOWN
            0x11AC0, // 11AC0..11AF8; PAU_CIN_HAU
            0x11AF9, // 11AF9..11BFF; UNKNOWN
            0x11C00, // 11C00..11C08; BHAIKSUKI
            0x11C09, // 11C09       ; UNKNOWN
            0x11C0A, // 11C0A..11C36; BHAIKSUKI
            0x11C37, // 11C37       ; UNKNOWN
            0x11C38, // 11C38..11C45; BHAIKSUKI
            0x11C46, // 11C46..11C4F; UNKNOWN
            0x11C50, // 11C50..11C6C; BHAIKSUKI
            0x11C6D, // 11C6D..11C6F; UNKNOWN
            0x11C70, // 11C70..11C8F; MARCHEN
            0x11C90, // 11C90..11C91; UNKNOWN
            0x11C92, // 11C92..11CA7; MARCHEN
            0x11CA8, // 11CA8       ; UNKNOWN
            0x11CA9, // 11CA9..11CB6; MARCHEN
            0x11CB7, // 11CB7..11CFF; UNKNOWN
            0x11D00, // 11D00..11D06; MASARAM_GONDI
            0x11D07, // 11D07       ; UNKNOWN
            0x11D08, // 11D08..11D09; MASARAM_GONDI
            0x11D0A, // 11D0A       ; UNKNOWN
            0x11D0B, // 11D0B..11D36; MASARAM_GONDI
            0x11D37, // 11D37..11D39; UNKNOWN
            0x11D3A, // 11D3A       ; MASARAM_GONDI
            0x11D3B, // 11D3B       ; UNKNOWN
            0x11D3C, // 11D3C..11D3D; MASARAM_GONDI
            0x11D3E, // 11D3E       ; UNKNOWN
            0x11D3F, // 11D3F..11D47; MASARAM_GONDI
            0x11D48, // 11D48..11D4F; UNKNOWN
            0x11D50, // 11D50..11D59; MASARAM_GONDI
            0x11D5A, // 11D5A..11D5F; UNKNOWN
            0x11D60, // 11D60..11D68; GUNJALA_GONDI
            0x11D69, // 11D69       ; UNKNOWN
            0x11D6A, // 11D6A..11D8E; GUNJALA_GONDI
            0x11D8F, // 11D8F       ; UNKNOWN
            0x11D90, // 11D90..11D91; GUNJALA_GONDI
            0x11D92, // 11D92       ; UNKNOWN
            0x11D93, // 11D93..11D98; GUNJALA_GONDI
            0x11D99, // 11D99..11D9F; UNKNOWN
            0x11DA0, // 11DA0..11DA9; GUNJALA_GONDI
            0x11DAA, // 11DAA..11EDF; UNKNOWN
            0x11EE0, // 11EE0..11EF8; MAKASAR
            0x11EF9, // 11EF9..11FFF; UNKNOWN
            0x12000, // 12000..12399; CUNEIFORM
            0x1239A, // 1239A..123FF; UNKNOWN
            0x12400, // 12400..1246E; CUNEIFORM
            0x1246F, // 1246F       ; UNKNOWN
            0x12470, // 12470..12474;
CUNEIFORM 6053 0x12475, // 12475..1247F; UNKNOWN 6054 0x12480, // 12480..12543; CUNEIFORM 6055 0x12544, // 12544..12FFF; UNKNOWN 6056 0x13000, // 13000..1342E; EGYPTIAN_HIEROGLYPHS 6057 0x1342F, // 1342F..143FF; UNKNOWN 6058 0x14400, // 14400..14646; ANATOLIAN_HIEROGLYPHS 6059 0x14647, // 14647..167FF; UNKNOWN 6060 0x16800, // 16800..16A38; BAMUM 6061 0x16A39, // 16A39..16A3F; UNKNOWN 6062 0x16A40, // 16A40..16A5E; MRO 6063 0x16A5F, // 16A5F ; UNKNOWN 6064 0x16A60, // 16A60..16A69; MRO 6065 0x16A6A, // 16A6A..16A6D; UNKNOWN 6066 0x16A6E, // 16A6E..16A6F; MRO 6067 0x16A70, // 16A70..16ACF; UNKNOWN 6068 0x16AD0, // 16AD0..16AED; BASSA_VAH 6069 0x16AEE, // 16AEE..16AEF; UNKNOWN 6070 0x16AF0, // 16AF0..16AF5; BASSA_VAH 6071 0x16AF6, // 16AF6..16AFF; UNKNOWN 6072 0x16B00, // 16B00..16B45; PAHAWH_HMONG 6073 0x16B46, // 16B46..16B4F; UNKNOWN 6074 0x16B50, // 16B50..16B59; PAHAWH_HMONG 6075 0x16B5A, // 16B5A ; UNKNOWN 6076 0x16B5B, // 16B5B..16B61; PAHAWH_HMONG 6077 0x16B62, // 16B62 ; UNKNOWN 6078 0x16B63, // 16B63..16B77; PAHAWH_HMONG 6079 0x16B78, // 16B78..16B7C; UNKNOWN 6080 0x16B7D, // 16B7D..16B8F; PAHAWH_HMONG 6081 0x16B90, // 16B90..16E3F; UNKNOWN 6082 0x16E40, // 16E40..16E9A; MEDEFAIDRIN 6083 0x16E9B, // 16E9B..16EFF; UNKNOWN 6084 0x16F00, // 16F00..16F44; MIAO 6085 0x16F45, // 16F45..16F4F; UNKNOWN 6086 0x16F50, // 16F50..16F7E; MIAO 6087 0x16F7F, // 16F7F..16F8E; UNKNOWN 6088 0x16F8F, // 16F8F..16F9F; MIAO 6089 0x16FA0, // 16FA0..16FDF; UNKNOWN 6090 0x16FE0, // 16FE0 ; TANGUT 6091 0x16FE1, // 16FE1 ; NUSHU 6092 0x16FE2, // 16FE2..16FFF; UNKNOWN 6093 0x17000, // 17000..187F1; TANGUT 6094 0x187F2, // 187F2..187FF; UNKNOWN 6095 0x18800, // 18800..18AF2; TANGUT 6096 0x18AF3, // 18AF3..1AFFF; UNKNOWN 6097 0x1B000, // 1B000 ; KATAKANA 6098 0x1B001, // 1B001..1B11E; HIRAGANA 6099 0x1B11F, // 1B11F..1B16F; UNKNOWN 6100 0x1B170, // 1B170..1B2FB; NUSHU 6101 0x1B2FC, // 1B2FC..1BBFF; UNKNOWN 6102 0x1BC00, // 1BC00..1BC6A; DUPLOYAN 6103 0x1BC6B, // 1BC6B..1BC6F; UNKNOWN 
6104 0x1BC70, // 1BC70..1BC7C; DUPLOYAN 6105 0x1BC7D, // 1BC7D..1BC7F; UNKNOWN 6106 0x1BC80, // 1BC80..1BC88; DUPLOYAN 6107 0x1BC89, // 1BC89..1BC8F; UNKNOWN 6108 0x1BC90, // 1BC90..1BC99; DUPLOYAN 6109 0x1BC9A, // 1BC9A..1BC9B; UNKNOWN 6110 0x1BC9C, // 1BC9C..1BC9F; DUPLOYAN 6111 0x1BCA0, // 1BCA0..1BCA3; COMMON 6112 0x1BCA4, // 1BCA4..1CFFF; UNKNOWN 6113 0x1D000, // 1D000..1D0F5; COMMON 6114 0x1D0F6, // 1D0F6..1D0FF; UNKNOWN 6115 0x1D100, // 1D100..1D126; COMMON 6116 0x1D127, // 1D127..1D128; UNKNOWN 6117 0x1D129, // 1D129..1D166; COMMON 6118 0x1D167, // 1D167..1D169; INHERITED 6119 0x1D16A, // 1D16A..1D17A; COMMON 6120 0x1D17B, // 1D17B..1D182; INHERITED 6121 0x1D183, // 1D183..1D184; COMMON 6122 0x1D185, // 1D185..1D18B; INHERITED 6123 0x1D18C, // 1D18C..1D1A9; COMMON 6124 0x1D1AA, // 1D1AA..1D1AD; INHERITED 6125 0x1D1AE, // 1D1AE..1D1E8; COMMON 6126 0x1D1E9, // 1D1E9..1D1FF; UNKNOWN 6127 0x1D200, // 1D200..1D245; GREEK 6128 0x1D246, // 1D246..1D2DF; UNKNOWN 6129 0x1D2E0, // 1D2E0..1D2F3; COMMON 6130 0x1D2F4, // 1D2F4..1D2FF; UNKNOWN 6131 0x1D300, // 1D300..1D356; COMMON 6132 0x1D357, // 1D357..1D35F; UNKNOWN 6133 0x1D360, // 1D360..1D378; COMMON 6134 0x1D379, // 1D379..1D3FF; UNKNOWN 6135 0x1D400, // 1D400..1D454; COMMON 6136 0x1D455, // 1D455 ; UNKNOWN 6137 0x1D456, // 1D456..1D49C; COMMON 6138 0x1D49D, // 1D49D ; UNKNOWN 6139 0x1D49E, // 1D49E..1D49F; COMMON 6140 0x1D4A0, // 1D4A0..1D4A1; UNKNOWN 6141 0x1D4A2, // 1D4A2 ; COMMON 6142 0x1D4A3, // 1D4A3..1D4A4; UNKNOWN 6143 0x1D4A5, // 1D4A5..1D4A6; COMMON 6144 0x1D4A7, // 1D4A7..1D4A8; UNKNOWN 6145 0x1D4A9, // 1D4A9..1D4AC; COMMON 6146 0x1D4AD, // 1D4AD ; UNKNOWN 6147 0x1D4AE, // 1D4AE..1D4B9; COMMON 6148 0x1D4BA, // 1D4BA ; UNKNOWN 6149 0x1D4BB, // 1D4BB ; COMMON 6150 0x1D4BC, // 1D4BC ; UNKNOWN 6151 0x1D4BD, // 1D4BD..1D4C3; COMMON 6152 0x1D4C4, // 1D4C4 ; UNKNOWN 6153 0x1D4C5, // 1D4C5..1D505; COMMON 6154 0x1D506, // 1D506 ; UNKNOWN 6155 0x1D507, // 1D507..1D50A; COMMON 6156 0x1D50B, // 1D50B..1D50C; 
UNKNOWN 6157 0x1D50D, // 1D50D..1D514; COMMON 6158 0x1D515, // 1D515 ; UNKNOWN 6159 0x1D516, // 1D516..1D51C; COMMON 6160 0x1D51D, // 1D51D ; UNKNOWN 6161 0x1D51E, // 1D51E..1D539; COMMON 6162 0x1D53A, // 1D53A ; UNKNOWN 6163 0x1D53B, // 1D53B..1D53E; COMMON 6164 0x1D53F, // 1D53F ; UNKNOWN 6165 0x1D540, // 1D540..1D544; COMMON 6166 0x1D545, // 1D545 ; UNKNOWN 6167 0x1D546, // 1D546 ; COMMON 6168 0x1D547, // 1D547..1D549; UNKNOWN 6169 0x1D54A, // 1D54A..1D550; COMMON 6170 0x1D551, // 1D551 ; UNKNOWN 6171 0x1D552, // 1D552..1D6A5; COMMON 6172 0x1D6A6, // 1D6A6..1D6A7; UNKNOWN 6173 0x1D6A8, // 1D6A8..1D7CB; COMMON 6174 0x1D7CC, // 1D7CC..1D7CD; UNKNOWN 6175 0x1D7CE, // 1D7CE..1D7FF; COMMON 6176 0x1D800, // 1D800..1DA8B; SIGNWRITING 6177 0x1DA8C, // 1DA8C..1DA9A; UNKNOWN 6178 0x1DA9B, // 1DA9B..1DA9F; SIGNWRITING 6179 0x1DAA0, // 1DAA0 ; UNKNOWN 6180 0x1DAA1, // 1DAA1..1DAAF; SIGNWRITING 6181 0x1DAB0, // 1DAB0..1DFFF; UNKNOWN 6182 0x1E000, // 1E000..1E006; GLAGOLITIC 6183 0x1E007, // 1E007 ; UNKNOWN 6184 0x1E008, // 1E008..1E018; GLAGOLITIC 6185 0x1E019, // 1E019..1E01A; UNKNOWN 6186 0x1E01B, // 1E01B..1E021; GLAGOLITIC 6187 0x1E022, // 1E022 ; UNKNOWN 6188 0x1E023, // 1E023..1E024; GLAGOLITIC 6189 0x1E025, // 1E025 ; UNKNOWN 6190 0x1E026, // 1E026..1E02A; GLAGOLITIC 6191 0x1E02B, // 1E02B..1E7FF; UNKNOWN 6192 0x1E800, // 1E800..1E8C4; MENDE_KIKAKUI 6193 0x1E8C5, // 1E8C5..1E8C6; UNKNOWN 6194 0x1E8C7, // 1E8C7..1E8D6; MENDE_KIKAKUI 6195 0x1E8D7, // 1E8D7..1E8FF; UNKNOWN 6196 0x1E900, // 1E900..1E94A; ADLAM 6197 0x1E94B, // 1E94B..1E94F; UNKNOWN 6198 0x1E950, // 1E950..1E959; ADLAM 6199 0x1E95A, // 1E95A..1E95D; UNKNOWN 6200 0x1E95E, // 1E95E..1E95F; ADLAM 6201 0x1E960, // 1E960..1EC70; UNKNOWN 6202 0x1EC71, // 1EC71..1ECB4; COMMON 6203 0x1ECB5, // 1ECB5..1EDFF; UNKNOWN 6204 0x1EE00, // 1EE00..1EE03; ARABIC 6205 0x1EE04, // 1EE04 ; UNKNOWN 6206 0x1EE05, // 1EE05..1EE1F; ARABIC 6207 0x1EE20, // 1EE20 ; UNKNOWN 6208 0x1EE21, // 1EE21..1EE22; ARABIC 6209 0x1EE23, // 1EE23 
; UNKNOWN 6210 0x1EE24, // 1EE24 ; ARABIC 6211 0x1EE25, // 1EE25..1EE26; UNKNOWN 6212 0x1EE27, // 1EE27 ; ARABIC 6213 0x1EE28, // 1EE28 ; UNKNOWN 6214 0x1EE29, // 1EE29..1EE32; ARABIC 6215 0x1EE33, // 1EE33 ; UNKNOWN 6216 0x1EE34, // 1EE34..1EE37; ARABIC 6217 0x1EE38, // 1EE38 ; UNKNOWN 6218 0x1EE39, // 1EE39 ; ARABIC 6219 0x1EE3A, // 1EE3A ; UNKNOWN 6220 0x1EE3B, // 1EE3B ; ARABIC 6221 0x1EE3C, // 1EE3C..1EE41; UNKNOWN 6222 0x1EE42, // 1EE42 ; ARABIC 6223 0x1EE43, // 1EE43..1EE46; UNKNOWN 6224 0x1EE47, // 1EE47 ; ARABIC 6225 0x1EE48, // 1EE48 ; UNKNOWN 6226 0x1EE49, // 1EE49 ; ARABIC 6227 0x1EE4A, // 1EE4A ; UNKNOWN 6228 0x1EE4B, // 1EE4B ; ARABIC 6229 0x1EE4C, // 1EE4C ; UNKNOWN 6230 0x1EE4D, // 1EE4D..1EE4F; ARABIC 6231 0x1EE50, // 1EE50 ; UNKNOWN 6232 0x1EE51, // 1EE51..1EE52; ARABIC 6233 0x1EE53, // 1EE53 ; UNKNOWN 6234 0x1EE54, // 1EE54 ; ARABIC 6235 0x1EE55, // 1EE55..1EE56; UNKNOWN 6236 0x1EE57, // 1EE57 ; ARABIC 6237 0x1EE58, // 1EE58 ; UNKNOWN 6238 0x1EE59, // 1EE59 ; ARABIC 6239 0x1EE5A, // 1EE5A ; UNKNOWN 6240 0x1EE5B, // 1EE5B ; ARABIC 6241 0x1EE5C, // 1EE5C ; UNKNOWN 6242 0x1EE5D, // 1EE5D ; ARABIC 6243 0x1EE5E, // 1EE5E ; UNKNOWN 6244 0x1EE5F, // 1EE5F ; ARABIC 6245 0x1EE60, // 1EE60 ; UNKNOWN 6246 0x1EE61, // 1EE61..1EE62; ARABIC 6247 0x1EE63, // 1EE63 ; UNKNOWN 6248 0x1EE64, // 1EE64 ; ARABIC 6249 0x1EE65, // 1EE65..1EE66; UNKNOWN 6250 0x1EE67, // 1EE67..1EE6A; ARABIC 6251 0x1EE6B, // 1EE6B ; UNKNOWN 6252 0x1EE6C, // 1EE6C..1EE72; ARABIC 6253 0x1EE73, // 1EE73 ; UNKNOWN 6254 0x1EE74, // 1EE74..1EE77; ARABIC 6255 0x1EE78, // 1EE78 ; UNKNOWN 6256 0x1EE79, // 1EE79..1EE7C; ARABIC 6257 0x1EE7D, // 1EE7D ; UNKNOWN 6258 0x1EE7E, // 1EE7E ; ARABIC 6259 0x1EE7F, // 1EE7F ; UNKNOWN 6260 0x1EE80, // 1EE80..1EE89; ARABIC 6261 0x1EE8A, // 1EE8A ; UNKNOWN 6262 0x1EE8B, // 1EE8B..1EE9B; ARABIC 6263 0x1EE9C, // 1EE9C..1EEA0; UNKNOWN 6264 0x1EEA1, // 1EEA1..1EEA3; ARABIC 6265 0x1EEA4, // 1EEA4 ; UNKNOWN 6266 0x1EEA5, // 1EEA5..1EEA9; ARABIC 6267 0x1EEAA, // 1EEAA 
; UNKNOWN 6268 0x1EEAB, // 1EEAB..1EEBB; ARABIC 6269 0x1EEBC, // 1EEBC..1EEEF; UNKNOWN 6270 0x1EEF0, // 1EEF0..1EEF1; ARABIC 6271 0x1EEF2, // 1EEF2..1EFFF; UNKNOWN 6272 0x1F000, // 1F000..1F02B; COMMON 6273 0x1F02C, // 1F02C..1F02F; UNKNOWN 6274 0x1F030, // 1F030..1F093; COMMON 6275 0x1F094, // 1F094..1F09F; UNKNOWN 6276 0x1F0A0, // 1F0A0..1F0AE; COMMON 6277 0x1F0AF, // 1F0AF..1F0B0; UNKNOWN 6278 0x1F0B1, // 1F0B1..1F0BF; COMMON 6279 0x1F0C0, // 1F0C0 ; UNKNOWN 6280 0x1F0C1, // 1F0C1..1F0CF; COMMON 6281 0x1F0D0, // 1F0D0 ; UNKNOWN 6282 0x1F0D1, // 1F0D1..1F0F5; COMMON 6283 0x1F0F6, // 1F0F6..1F0FF; UNKNOWN 6284 0x1F100, // 1F100..1F10C; COMMON 6285 0x1F10D, // 1F10D..1F10F; UNKNOWN 6286 0x1F110, // 1F110..1F16B; COMMON 6287 0x1F16C, // 1F16C..1F16F; UNKNOWN 6288 0x1F170, // 1F170..1F1AC; COMMON 6289 0x1F1AD, // 1F1AD..1F1E5; UNKNOWN 6290 0x1F1E6, // 1F1E6..1F1FF; COMMON 6291 0x1F200, // 1F200 ; HIRAGANA 6292 0x1F201, // 1F201..1F202; COMMON 6293 0x1F203, // 1F203..1F20F; UNKNOWN 6294 0x1F210, // 1F210..1F23B; COMMON 6295 0x1F23C, // 1F23C..1F23F; UNKNOWN 6296 0x1F240, // 1F240..1F248; COMMON 6297 0x1F249, // 1F249..1F24F; UNKNOWN 6298 0x1F250, // 1F250..1F251; COMMON 6299 0x1F252, // 1F252..1F25F; UNKNOWN 6300 0x1F260, // 1F260..1F265; COMMON 6301 0x1F266, // 1F266..1F2FF; UNKNOWN 6302 0x1F300, // 1F300..1F6D4; COMMON 6303 0x1F6D5, // 1F6D5..1F6DF; UNKNOWN 6304 0x1F6E0, // 1F6E0..1F6EC; COMMON 6305 0x1F6ED, // 1F6ED..1F6EF; UNKNOWN 6306 0x1F6F0, // 1F6F0..1F6F9; COMMON 6307 0x1F6FA, // 1F6FA..1F6FF; UNKNOWN 6308 0x1F700, // 1F700..1F773; COMMON 6309 0x1F774, // 1F774..1F77F; UNKNOWN 6310 0x1F780, // 1F780..1F7D8; COMMON 6311 0x1F7D9, // 1F7D9..1F7FF; UNKNOWN 6312 0x1F800, // 1F800..1F80B; COMMON 6313 0x1F80C, // 1F80C..1F80F; UNKNOWN 6314 0x1F810, // 1F810..1F847; COMMON 6315 0x1F848, // 1F848..1F84F; UNKNOWN 6316 0x1F850, // 1F850..1F859; COMMON 6317 0x1F85A, // 1F85A..1F85F; UNKNOWN 6318 0x1F860, // 1F860..1F887; COMMON 6319 0x1F888, // 1F888..1F88F; UNKNOWN 6320 
            0x1F890, // 1F890..1F8AD; COMMON
            0x1F8AE, // 1F8AE..1F8FF; UNKNOWN
            0x1F900, // 1F900..1F90B; COMMON
            0x1F90C, // 1F90C..1F90F; UNKNOWN
            0x1F910, // 1F910..1F93E; COMMON
            0x1F93F, // 1F93F       ; UNKNOWN
            0x1F940, // 1F940..1F970; COMMON
            0x1F971, // 1F971..1F972; UNKNOWN
            0x1F973, // 1F973..1F976; COMMON
            0x1F977, // 1F977..1F979; UNKNOWN
            0x1F97A, // 1F97A       ; COMMON
            0x1F97B, // 1F97B       ; UNKNOWN
            0x1F97C, // 1F97C..1F9A2; COMMON
            0x1F9A3, // 1F9A3..1F9AF; UNKNOWN
            0x1F9B0, // 1F9B0..1F9B9; COMMON
            0x1F9BA, // 1F9BA..1F9BF; UNKNOWN
            0x1F9C0, // 1F9C0..1F9C2; COMMON
            0x1F9C3, // 1F9C3..1F9CF; UNKNOWN
            0x1F9D0, // 1F9D0..1F9FF; COMMON
            0x1FA00, // 1FA00..1FA5F; UNKNOWN
            0x1FA60, // 1FA60..1FA6D; COMMON
            0x1FA6E, // 1FA6E..1FFFF; UNKNOWN
            0x20000, // 20000..2A6D6; HAN
            0x2A6D7, // 2A6D7..2A6FF; UNKNOWN
            0x2A700, // 2A700..2B734; HAN
            0x2B735, // 2B735..2B73F; UNKNOWN
            0x2B740, // 2B740..2B81D; HAN
            0x2B81E, // 2B81E..2B81F; UNKNOWN
            0x2B820, // 2B820..2CEA1; HAN
            0x2CEA2, // 2CEA2..2CEAF; UNKNOWN
            0x2CEB0, // 2CEB0..2EBE0; HAN
            0x2EBE1, // 2EBE1..2F7FF; UNKNOWN
            0x2F800, // 2F800..2FA1D; HAN
            0x2FA1E, // 2FA1E..E0000; UNKNOWN
            0xE0001, // E0001       ; COMMON
            0xE0002, // E0002..E001F; UNKNOWN
            0xE0020, // E0020..E007F; COMMON
            0xE0080, // E0080..E00FF; UNKNOWN
            0xE0100, // E0100..E01EF; INHERITED
            0xE01F0  // E01F0..10FFFF; UNKNOWN
        };

        private static final UnicodeScript[] scripts = {
            COMMON, // 0000..0040
            LATIN, // 0041..005A
            COMMON, // 005B..0060
            LATIN, // 0061..007A
            COMMON, // 007B..00A9
            LATIN, // 00AA
            COMMON, // 00AB..00B9
            LATIN, // 00BA
            COMMON, // 00BB..00BF
            LATIN, // 00C0..00D6
            COMMON, // 00D7
            LATIN, // 00D8..00F6
            COMMON, // 00F7
            LATIN, // 00F8..02B8
            COMMON, // 02B9..02DF
            LATIN, // 02E0..02E4
            COMMON, // 02E5..02E9
BOPOMOFO, // 02EA..02EB 6381 COMMON, // 02EC..02FF 6382 INHERITED, // 0300..036F 6383 GREEK, // 0370..0373 6384 COMMON, // 0374 6385 GREEK, // 0375..0377 6386 UNKNOWN, // 0378..0379 6387 GREEK, // 037A..037D 6388 COMMON, // 037E 6389 GREEK, // 037F 6390 UNKNOWN, // 0380..0383 6391 GREEK, // 0384 6392 COMMON, // 0385 6393 GREEK, // 0386 6394 COMMON, // 0387 6395 GREEK, // 0388..038A 6396 UNKNOWN, // 038B 6397 GREEK, // 038C 6398 UNKNOWN, // 038D 6399 GREEK, // 038E..03A1 6400 UNKNOWN, // 03A2 6401 GREEK, // 03A3..03E1 6402 COPTIC, // 03E2..03EF 6403 GREEK, // 03F0..03FF 6404 CYRILLIC, // 0400..0484 6405 INHERITED, // 0485..0486 6406 CYRILLIC, // 0487..052F 6407 UNKNOWN, // 0530 6408 ARMENIAN, // 0531..0556 6409 UNKNOWN, // 0557..0558 6410 ARMENIAN, // 0559..0588 6411 COMMON, // 0589 6412 ARMENIAN, // 058A 6413 UNKNOWN, // 058B..058C 6414 ARMENIAN, // 058D..058F 6415 UNKNOWN, // 0590 6416 HEBREW, // 0591..05C7 6417 UNKNOWN, // 05C8..05CF 6418 HEBREW, // 05D0..05EA 6419 UNKNOWN, // 05EB..05EE 6420 HEBREW, // 05EF..05F4 6421 UNKNOWN, // 05F5..05FF 6422 ARABIC, // 0600..0604 6423 COMMON, // 0605 6424 ARABIC, // 0606..060B 6425 COMMON, // 060C 6426 ARABIC, // 060D..061A 6427 COMMON, // 061B 6428 ARABIC, // 061C 6429 UNKNOWN, // 061D 6430 ARABIC, // 061E 6431 COMMON, // 061F 6432 ARABIC, // 0620..063F 6433 COMMON, // 0640 6434 ARABIC, // 0641..064A 6435 INHERITED, // 064B..0655 6436 ARABIC, // 0656..066F 6437 INHERITED, // 0670 6438 ARABIC, // 0671..06DC 6439 COMMON, // 06DD 6440 ARABIC, // 06DE..06FF 6441 SYRIAC, // 0700..070D 6442 UNKNOWN, // 070E 6443 SYRIAC, // 070F..074A 6444 UNKNOWN, // 074B..074C 6445 SYRIAC, // 074D..074F 6446 ARABIC, // 0750..077F 6447 THAANA, // 0780..07B1 6448 UNKNOWN, // 07B2..07BF 6449 NKO, // 07C0..07FA 6450 UNKNOWN, // 07FB..07FC 6451 NKO, // 07FD..07FF 6452 SAMARITAN, // 0800..082D 6453 UNKNOWN, // 082E..082F 6454 SAMARITAN, // 0830..083E 6455 UNKNOWN, // 083F 6456 MANDAIC, // 0840..085B 6457 UNKNOWN, // 085C..085D 6458 MANDAIC, // 085E 
6459 UNKNOWN, // 085F 6460 SYRIAC, // 0860..086A 6461 UNKNOWN, // 086B..089F 6462 ARABIC, // 08A0..08B4 6463 UNKNOWN, // 08B5 6464 ARABIC, // 08B6..08BD 6465 UNKNOWN, // 08BE..08D2 6466 ARABIC, // 08D3..08E1 6467 COMMON, // 08E2 6468 ARABIC, // 08E3..08FF 6469 DEVANAGARI, // 0900..0950 6470 INHERITED, // 0951..0952 6471 DEVANAGARI, // 0953..0963 6472 COMMON, // 0964..0965 6473 DEVANAGARI, // 0966..097F 6474 BENGALI, // 0980..0983 6475 UNKNOWN, // 0984 6476 BENGALI, // 0985..098C 6477 UNKNOWN, // 098D..098E 6478 BENGALI, // 098F..0990 6479 UNKNOWN, // 0991..0992 6480 BENGALI, // 0993..09A8 6481 UNKNOWN, // 09A9 6482 BENGALI, // 09AA..09B0 6483 UNKNOWN, // 09B1 6484 BENGALI, // 09B2 6485 UNKNOWN, // 09B3..09B5 6486 BENGALI, // 09B6..09B9 6487 UNKNOWN, // 09BA..09BB 6488 BENGALI, // 09BC..09C4 6489 UNKNOWN, // 09C5..09C6 6490 BENGALI, // 09C7..09C8 6491 UNKNOWN, // 09C9..09CA 6492 BENGALI, // 09CB..09CE 6493 UNKNOWN, // 09CF..09D6 6494 BENGALI, // 09D7 6495 UNKNOWN, // 09D8..09DB 6496 BENGALI, // 09DC..09DD 6497 UNKNOWN, // 09DE 6498 BENGALI, // 09DF..09E3 6499 UNKNOWN, // 09E4..09E5 6500 BENGALI, // 09E6..09FE 6501 UNKNOWN, // 09FF..0A00 6502 GURMUKHI, // 0A01..0A03 6503 UNKNOWN, // 0A04 6504 GURMUKHI, // 0A05..0A0A 6505 UNKNOWN, // 0A0B..0A0E 6506 GURMUKHI, // 0A0F..0A10 6507 UNKNOWN, // 0A11..0A12 6508 GURMUKHI, // 0A13..0A28 6509 UNKNOWN, // 0A29 6510 GURMUKHI, // 0A2A..0A30 6511 UNKNOWN, // 0A31 6512 GURMUKHI, // 0A32..0A33 6513 UNKNOWN, // 0A34 6514 GURMUKHI, // 0A35..0A36 6515 UNKNOWN, // 0A37 6516 GURMUKHI, // 0A38..0A39 6517 UNKNOWN, // 0A3A..0A3B 6518 GURMUKHI, // 0A3C 6519 UNKNOWN, // 0A3D 6520 GURMUKHI, // 0A3E..0A42 6521 UNKNOWN, // 0A43..0A46 6522 GURMUKHI, // 0A47..0A48 6523 UNKNOWN, // 0A49..0A4A 6524 GURMUKHI, // 0A4B..0A4D 6525 UNKNOWN, // 0A4E..0A50 6526 GURMUKHI, // 0A51 6527 UNKNOWN, // 0A52..0A58 6528 GURMUKHI, // 0A59..0A5C 6529 UNKNOWN, // 0A5D 6530 GURMUKHI, // 0A5E 6531 UNKNOWN, // 0A5F..0A65 6532 GURMUKHI, // 0A66..0A76 6533 UNKNOWN, // 
0A77..0A80 6534 GUJARATI, // 0A81..0A83 6535 UNKNOWN, // 0A84 6536 GUJARATI, // 0A85..0A8D 6537 UNKNOWN, // 0A8E 6538 GUJARATI, // 0A8F..0A91 6539 UNKNOWN, // 0A92 6540 GUJARATI, // 0A93..0AA8 6541 UNKNOWN, // 0AA9 6542 GUJARATI, // 0AAA..0AB0 6543 UNKNOWN, // 0AB1 6544 GUJARATI, // 0AB2..0AB3 6545 UNKNOWN, // 0AB4 6546 GUJARATI, // 0AB5..0AB9 6547 UNKNOWN, // 0ABA..0ABB 6548 GUJARATI, // 0ABC..0AC5 6549 UNKNOWN, // 0AC6 6550 GUJARATI, // 0AC7..0AC9 6551 UNKNOWN, // 0ACA 6552 GUJARATI, // 0ACB..0ACD 6553 UNKNOWN, // 0ACE..0ACF 6554 GUJARATI, // 0AD0 6555 UNKNOWN, // 0AD1..0ADF 6556 GUJARATI, // 0AE0..0AE3 6557 UNKNOWN, // 0AE4..0AE5 6558 GUJARATI, // 0AE6..0AF1 6559 UNKNOWN, // 0AF2..0AF8 6560 GUJARATI, // 0AF9..0AFF 6561 UNKNOWN, // 0B00 6562 ORIYA, // 0B01..0B03 6563 UNKNOWN, // 0B04 6564 ORIYA, // 0B05..0B0C 6565 UNKNOWN, // 0B0D..0B0E 6566 ORIYA, // 0B0F..0B10 6567 UNKNOWN, // 0B11..0B12 6568 ORIYA, // 0B13..0B28 6569 UNKNOWN, // 0B29 6570 ORIYA, // 0B2A..0B30 6571 UNKNOWN, // 0B31 6572 ORIYA, // 0B32..0B33 6573 UNKNOWN, // 0B34 6574 ORIYA, // 0B35..0B39 6575 UNKNOWN, // 0B3A..0B3B 6576 ORIYA, // 0B3C..0B44 6577 UNKNOWN, // 0B45..0B46 6578 ORIYA, // 0B47..0B48 6579 UNKNOWN, // 0B49..0B4A 6580 ORIYA, // 0B4B..0B4D 6581 UNKNOWN, // 0B4E..0B55 6582 ORIYA, // 0B56..0B57 6583 UNKNOWN, // 0B58..0B5B 6584 ORIYA, // 0B5C..0B5D 6585 UNKNOWN, // 0B5E 6586 ORIYA, // 0B5F..0B63 6587 UNKNOWN, // 0B64..0B65 6588 ORIYA, // 0B66..0B77 6589 UNKNOWN, // 0B78..0B81 6590 TAMIL, // 0B82..0B83 6591 UNKNOWN, // 0B84 6592 TAMIL, // 0B85..0B8A 6593 UNKNOWN, // 0B8B..0B8D 6594 TAMIL, // 0B8E..0B90 6595 UNKNOWN, // 0B91 6596 TAMIL, // 0B92..0B95 6597 UNKNOWN, // 0B96..0B98 6598 TAMIL, // 0B99..0B9A 6599 UNKNOWN, // 0B9B 6600 TAMIL, // 0B9C 6601 UNKNOWN, // 0B9D 6602 TAMIL, // 0B9E..0B9F 6603 UNKNOWN, // 0BA0..0BA2 6604 TAMIL, // 0BA3..0BA4 6605 UNKNOWN, // 0BA5..0BA7 6606 TAMIL, // 0BA8..0BAA 6607 UNKNOWN, // 0BAB..0BAD 6608 TAMIL, // 0BAE..0BB9 6609 UNKNOWN, // 0BBA..0BBD 6610 TAMIL, // 
0BBE..0BC2 6611 UNKNOWN, // 0BC3..0BC5 6612 TAMIL, // 0BC6..0BC8 6613 UNKNOWN, // 0BC9 6614 TAMIL, // 0BCA..0BCD 6615 UNKNOWN, // 0BCE..0BCF 6616 TAMIL, // 0BD0 6617 UNKNOWN, // 0BD1..0BD6 6618 TAMIL, // 0BD7 6619 UNKNOWN, // 0BD8..0BE5 6620 TAMIL, // 0BE6..0BFA 6621 UNKNOWN, // 0BFB..0BFF 6622 TELUGU, // 0C00..0C0C 6623 UNKNOWN, // 0C0D 6624 TELUGU, // 0C0E..0C10 6625 UNKNOWN, // 0C11 6626 TELUGU, // 0C12..0C28 6627 UNKNOWN, // 0C29 6628 TELUGU, // 0C2A..0C39 6629 UNKNOWN, // 0C3A..0C3C 6630 TELUGU, // 0C3D..0C44 6631 UNKNOWN, // 0C45 6632 TELUGU, // 0C46..0C48 6633 UNKNOWN, // 0C49 6634 TELUGU, // 0C4A..0C4D 6635 UNKNOWN, // 0C4E..0C54 6636 TELUGU, // 0C55..0C56 6637 UNKNOWN, // 0C57 6638 TELUGU, // 0C58..0C5A 6639 UNKNOWN, // 0C5B..0C5F 6640 TELUGU, // 0C60..0C63 6641 UNKNOWN, // 0C64..0C65 6642 TELUGU, // 0C66..0C6F 6643 UNKNOWN, // 0C70..0C77 6644 TELUGU, // 0C78..0C7F 6645 KANNADA, // 0C80..0C8C 6646 UNKNOWN, // 0C8D 6647 KANNADA, // 0C8E..0C90 6648 UNKNOWN, // 0C91 6649 KANNADA, // 0C92..0CA8 6650 UNKNOWN, // 0CA9 6651 KANNADA, // 0CAA..0CB3 6652 UNKNOWN, // 0CB4 6653 KANNADA, // 0CB5..0CB9 6654 UNKNOWN, // 0CBA..0CBB 6655 KANNADA, // 0CBC..0CC4 6656 UNKNOWN, // 0CC5 6657 KANNADA, // 0CC6..0CC8 6658 UNKNOWN, // 0CC9 6659 KANNADA, // 0CCA..0CCD 6660 UNKNOWN, // 0CCE..0CD4 6661 KANNADA, // 0CD5..0CD6 6662 UNKNOWN, // 0CD7..0CDD 6663 KANNADA, // 0CDE 6664 UNKNOWN, // 0CDF 6665 KANNADA, // 0CE0..0CE3 6666 UNKNOWN, // 0CE4..0CE5 6667 KANNADA, // 0CE6..0CEF 6668 UNKNOWN, // 0CF0 6669 KANNADA, // 0CF1..0CF2 6670 UNKNOWN, // 0CF3..0CFF 6671 MALAYALAM, // 0D00..0D03 6672 UNKNOWN, // 0D04 6673 MALAYALAM, // 0D05..0D0C 6674 UNKNOWN, // 0D0D 6675 MALAYALAM, // 0D0E..0D10 6676 UNKNOWN, // 0D11 6677 MALAYALAM, // 0D12..0D44 6678 UNKNOWN, // 0D45 6679 MALAYALAM, // 0D46..0D48 6680 UNKNOWN, // 0D49 6681 MALAYALAM, // 0D4A..0D4F 6682 UNKNOWN, // 0D50..0D53 6683 MALAYALAM, // 0D54..0D63 6684 UNKNOWN, // 0D64..0D65 6685 MALAYALAM, // 0D66..0D7F 6686 UNKNOWN, // 0D80..0D81 6687 
            SINHALA, // 0D82..0D83
            UNKNOWN, // 0D84
            SINHALA, // 0D85..0D96
            UNKNOWN, // 0D97..0D99
            SINHALA, // 0D9A..0DB1
            UNKNOWN, // 0DB2
            SINHALA, // 0DB3..0DBB
            UNKNOWN, // 0DBC
            SINHALA, // 0DBD
            UNKNOWN, // 0DBE..0DBF
            SINHALA, // 0DC0..0DC6
            UNKNOWN, // 0DC7..0DC9
            SINHALA, // 0DCA
            UNKNOWN, // 0DCB..0DCE
            SINHALA, // 0DCF..0DD4
            UNKNOWN, // 0DD5
            SINHALA, // 0DD6
            UNKNOWN, // 0DD7
            SINHALA, // 0DD8..0DDF
            UNKNOWN, // 0DE0..0DE5
            SINHALA, // 0DE6..0DEF
            UNKNOWN, // 0DF0..0DF1
            SINHALA, // 0DF2..0DF4
            UNKNOWN, // 0DF5..0E00
            THAI, // 0E01..0E3A
            UNKNOWN, // 0E3B..0E3E
            COMMON, // 0E3F
            THAI, // 0E40..0E5B
            UNKNOWN, // 0E5C..0E80
            LAO, // 0E81..0E82
            UNKNOWN, // 0E83
            LAO, // 0E84
            UNKNOWN, // 0E85..0E86
            LAO, // 0E87..0E88
            UNKNOWN, // 0E89
            LAO, // 0E8A
            UNKNOWN, // 0E8B..0E8C
            LAO, // 0E8D
            UNKNOWN, // 0E8E..0E93
            LAO, // 0E94..0E97
            UNKNOWN, // 0E98
            LAO, // 0E99..0E9F
            UNKNOWN, // 0EA0
            LAO, // 0EA1..0EA3
            UNKNOWN, // 0EA4
            LAO, // 0EA5
            UNKNOWN, // 0EA6
            LAO, // 0EA7
            UNKNOWN, // 0EA8..0EA9
            LAO, // 0EAA..0EAB
            UNKNOWN, // 0EAC
            LAO, // 0EAD..0EB9
            UNKNOWN, // 0EBA
            LAO, // 0EBB..0EBD
            UNKNOWN, // 0EBE..0EBF
            LAO, // 0EC0..0EC4
            UNKNOWN, // 0EC5
            LAO, // 0EC6
            UNKNOWN, // 0EC7
            LAO, // 0EC8..0ECD
            UNKNOWN, // 0ECE..0ECF
            LAO, // 0ED0..0ED9
            UNKNOWN, // 0EDA..0EDB
            LAO, // 0EDC..0EDF
            UNKNOWN, // 0EE0..0EFF
            TIBETAN, // 0F00..0F47
            UNKNOWN, // 0F48
            TIBETAN, // 0F49..0F6C
            UNKNOWN, // 0F6D..0F70
            TIBETAN, // 0F71..0F97
            UNKNOWN, // 0F98
            TIBETAN, // 0F99..0FBC
            UNKNOWN, // 0FBD
            TIBETAN, // 0FBE..0FCC
            UNKNOWN, // 0FCD
            TIBETAN, // 0FCE..0FD4
            COMMON, // 0FD5..0FD8
            TIBETAN, // 0FD9..0FDA
            UNKNOWN, // 0FDB..0FFF
            MYANMAR, // 1000..109F
            GEORGIAN, //
10A0..10C5 6768 UNKNOWN, // 10C6 6769 GEORGIAN, // 10C7 6770 UNKNOWN, // 10C8..10CC 6771 GEORGIAN, // 10CD 6772 UNKNOWN, // 10CE..10CF 6773 GEORGIAN, // 10D0..10FA 6774 COMMON, // 10FB 6775 GEORGIAN, // 10FC..10FF 6776 HANGUL, // 1100..11FF 6777 ETHIOPIC, // 1200..1248 6778 UNKNOWN, // 1249 6779 ETHIOPIC, // 124A..124D 6780 UNKNOWN, // 124E..124F 6781 ETHIOPIC, // 1250..1256 6782 UNKNOWN, // 1257 6783 ETHIOPIC, // 1258 6784 UNKNOWN, // 1259 6785 ETHIOPIC, // 125A..125D 6786 UNKNOWN, // 125E..125F 6787 ETHIOPIC, // 1260..1288 6788 UNKNOWN, // 1289 6789 ETHIOPIC, // 128A..128D 6790 UNKNOWN, // 128E..128F 6791 ETHIOPIC, // 1290..12B0 6792 UNKNOWN, // 12B1 6793 ETHIOPIC, // 12B2..12B5 6794 UNKNOWN, // 12B6..12B7 6795 ETHIOPIC, // 12B8..12BE 6796 UNKNOWN, // 12BF 6797 ETHIOPIC, // 12C0 6798 UNKNOWN, // 12C1 6799 ETHIOPIC, // 12C2..12C5 6800 UNKNOWN, // 12C6..12C7 6801 ETHIOPIC, // 12C8..12D6 6802 UNKNOWN, // 12D7 6803 ETHIOPIC, // 12D8..1310 6804 UNKNOWN, // 1311 6805 ETHIOPIC, // 1312..1315 6806 UNKNOWN, // 1316..1317 6807 ETHIOPIC, // 1318..135A 6808 UNKNOWN, // 135B..135C 6809 ETHIOPIC, // 135D..137C 6810 UNKNOWN, // 137D..137F 6811 ETHIOPIC, // 1380..1399 6812 UNKNOWN, // 139A..139F 6813 CHEROKEE, // 13A0..13F5 6814 UNKNOWN, // 13F6..13F7 6815 CHEROKEE, // 13F8..13FD 6816 UNKNOWN, // 13FE..13FF 6817 CANADIAN_ABORIGINAL, // 1400..167F 6818 OGHAM, // 1680..169C 6819 UNKNOWN, // 169D..169F 6820 RUNIC, // 16A0..16EA 6821 COMMON, // 16EB..16ED 6822 RUNIC, // 16EE..16F8 6823 UNKNOWN, // 16F9..16FF 6824 TAGALOG, // 1700..170C 6825 UNKNOWN, // 170D 6826 TAGALOG, // 170E..1714 6827 UNKNOWN, // 1715..171F 6828 HANUNOO, // 1720..1734 6829 COMMON, // 1735..1736 6830 UNKNOWN, // 1737..173F 6831 BUHID, // 1740..1753 6832 UNKNOWN, // 1754..175F 6833 TAGBANWA, // 1760..176C 6834 UNKNOWN, // 176D 6835 TAGBANWA, // 176E..1770 6836 UNKNOWN, // 1771 6837 TAGBANWA, // 1772..1773 6838 UNKNOWN, // 1774..177F 6839 KHMER, // 1780..17DD 6840 UNKNOWN, // 17DE..17DF 6841 KHMER, // 17E0..17E9 
            UNKNOWN, // 17EA..17EF
            KHMER, // 17F0..17F9
            UNKNOWN, // 17FA..17FF
            MONGOLIAN, // 1800..1801
            COMMON, // 1802..1803
            MONGOLIAN, // 1804
            COMMON, // 1805
            MONGOLIAN, // 1806..180E
            UNKNOWN, // 180F
            MONGOLIAN, // 1810..1819
            UNKNOWN, // 181A..181F
            MONGOLIAN, // 1820..1878
            UNKNOWN, // 1879..187F
            MONGOLIAN, // 1880..18AA
            UNKNOWN, // 18AB..18AF
            CANADIAN_ABORIGINAL, // 18B0..18F5
            UNKNOWN, // 18F6..18FF
            LIMBU, // 1900..191E
            UNKNOWN, // 191F
            LIMBU, // 1920..192B
            UNKNOWN, // 192C..192F
            LIMBU, // 1930..193B
            UNKNOWN, // 193C..193F
            LIMBU, // 1940
            UNKNOWN, // 1941..1943
            LIMBU, // 1944..194F
            TAI_LE, // 1950..196D
            UNKNOWN, // 196E..196F
            TAI_LE, // 1970..1974
            UNKNOWN, // 1975..197F
            NEW_TAI_LUE, // 1980..19AB
            UNKNOWN, // 19AC..19AF
            NEW_TAI_LUE, // 19B0..19C9
            UNKNOWN, // 19CA..19CF
            NEW_TAI_LUE, // 19D0..19DA
            UNKNOWN, // 19DB..19DD
            NEW_TAI_LUE, // 19DE..19DF
            KHMER, // 19E0..19FF
            BUGINESE, // 1A00..1A1B
            UNKNOWN, // 1A1C..1A1D
            BUGINESE, // 1A1E..1A1F
            TAI_THAM, // 1A20..1A5E
            UNKNOWN, // 1A5F
            TAI_THAM, // 1A60..1A7C
            UNKNOWN, // 1A7D..1A7E
            TAI_THAM, // 1A7F..1A89
            UNKNOWN, // 1A8A..1A8F
            TAI_THAM, // 1A90..1A99
            UNKNOWN, // 1A9A..1A9F
            TAI_THAM, // 1AA0..1AAD
            UNKNOWN, // 1AAE..1AAF
            INHERITED, // 1AB0..1ABE
            UNKNOWN, // 1ABF..1AFF
            BALINESE, // 1B00..1B4B
            UNKNOWN, // 1B4C..1B4F
            BALINESE, // 1B50..1B7C
            UNKNOWN, // 1B7D..1B7F
            SUNDANESE, // 1B80..1BBF
            BATAK, // 1BC0..1BF3
            UNKNOWN, // 1BF4..1BFB
            BATAK, // 1BFC..1BFF
            LEPCHA, // 1C00..1C37
            UNKNOWN, // 1C38..1C3A
            LEPCHA, // 1C3B..1C49
            UNKNOWN, // 1C4A..1C4C
            LEPCHA, // 1C4D..1C4F
            OL_CHIKI, // 1C50..1C7F
            CYRILLIC, // 1C80..1C88
            UNKNOWN, // 1C89..1C8F
            GEORGIAN, // 1C90..1CBA
            UNKNOWN, // 1CBB..1CBC
            GEORGIAN, // 1CBD..1CBF
            SUNDANESE, // 1CC0..1CC7
            UNKNOWN, // 1CC8..1CCF
            INHERITED, // 1CD0..1CD2
            COMMON, // 1CD3
            INHERITED, // 1CD4..1CE0
            COMMON, // 1CE1
            INHERITED, // 1CE2..1CE8
            COMMON, // 1CE9..1CEC
            INHERITED, // 1CED
            COMMON, // 1CEE..1CF3
            INHERITED, // 1CF4
            COMMON, // 1CF5..1CF7
            INHERITED, // 1CF8..1CF9
            UNKNOWN, // 1CFA..1CFF
            LATIN, // 1D00..1D25
            GREEK, // 1D26..1D2A
            CYRILLIC, // 1D2B
            LATIN, // 1D2C..1D5C
            GREEK, // 1D5D..1D61
            LATIN, // 1D62..1D65
            GREEK, // 1D66..1D6A
            LATIN, // 1D6B..1D77
            CYRILLIC, // 1D78
            LATIN, // 1D79..1DBE
            GREEK, // 1DBF
            INHERITED, // 1DC0..1DF9
            UNKNOWN, // 1DFA
            INHERITED, // 1DFB..1DFF
            LATIN, // 1E00..1EFF
            GREEK, // 1F00..1F15
            UNKNOWN, // 1F16..1F17
            GREEK, // 1F18..1F1D
            UNKNOWN, // 1F1E..1F1F
            GREEK, // 1F20..1F45
            UNKNOWN, // 1F46..1F47
            GREEK, // 1F48..1F4D
            UNKNOWN, // 1F4E..1F4F
            GREEK, // 1F50..1F57
            UNKNOWN, // 1F58
            GREEK, // 1F59
            UNKNOWN, // 1F5A
            GREEK, // 1F5B
            UNKNOWN, // 1F5C
            GREEK, // 1F5D
            UNKNOWN, // 1F5E
            GREEK, // 1F5F..1F7D
            UNKNOWN, // 1F7E..1F7F
            GREEK, // 1F80..1FB4
            UNKNOWN, // 1FB5
            GREEK, // 1FB6..1FC4
            UNKNOWN, // 1FC5
            GREEK, // 1FC6..1FD3
            UNKNOWN, // 1FD4..1FD5
            GREEK, // 1FD6..1FDB
            UNKNOWN, // 1FDC
            GREEK, // 1FDD..1FEF
            UNKNOWN, // 1FF0..1FF1
            GREEK, // 1FF2..1FF4
            UNKNOWN, // 1FF5
            GREEK, // 1FF6..1FFE
            UNKNOWN, // 1FFF
            COMMON, // 2000..200B
            INHERITED, // 200C..200D
            COMMON, // 200E..2064
            UNKNOWN, // 2065
            COMMON, // 2066..2070
            LATIN, // 2071
            UNKNOWN, // 2072..2073
            COMMON, // 2074..207E
            LATIN, // 207F
            COMMON, // 2080..208E
            UNKNOWN, // 208F
            LATIN, // 2090..209C
            UNKNOWN, // 209D..209F
            COMMON, // 20A0..20BF
            UNKNOWN, // 20C0..20CF
            INHERITED, // 20D0..20F0
            UNKNOWN, // 20F1..20FF
            COMMON, // 2100..2125
            GREEK, // 2126
            COMMON, // 2127..2129
            LATIN, // 212A..212B
            COMMON, // 212C..2131
            LATIN, // 2132
            COMMON, // 2133..214D
            LATIN, // 214E
            COMMON, // 214F..215F
            LATIN, // 2160..2188
            COMMON, // 2189..218B
            UNKNOWN, // 218C..218F
            COMMON, // 2190..2426
            UNKNOWN, // 2427..243F
            COMMON, // 2440..244A
            UNKNOWN, // 244B..245F
            COMMON, // 2460..27FF
            BRAILLE, // 2800..28FF
            COMMON, // 2900..2B73
            UNKNOWN, // 2B74..2B75
            COMMON, // 2B76..2B95
            UNKNOWN, // 2B96..2B97
            COMMON, // 2B98..2BC8
            UNKNOWN, // 2BC9
            COMMON, // 2BCA..2BFE
            UNKNOWN, // 2BFF
            GLAGOLITIC, // 2C00..2C2E
            UNKNOWN, // 2C2F
            GLAGOLITIC, // 2C30..2C5E
            UNKNOWN, // 2C5F
            LATIN, // 2C60..2C7F
            COPTIC, // 2C80..2CF3
            UNKNOWN, // 2CF4..2CF8
            COPTIC, // 2CF9..2CFF
            GEORGIAN, // 2D00..2D25
            UNKNOWN, // 2D26
            GEORGIAN, // 2D27
            UNKNOWN, // 2D28..2D2C
            GEORGIAN, // 2D2D
            UNKNOWN, // 2D2E..2D2F
            TIFINAGH, // 2D30..2D67
            UNKNOWN, // 2D68..2D6E
            TIFINAGH, // 2D6F..2D70
            UNKNOWN, // 2D71..2D7E
            TIFINAGH, // 2D7F
            ETHIOPIC, // 2D80..2D96
            UNKNOWN, // 2D97..2D9F
            ETHIOPIC, // 2DA0..2DA6
            UNKNOWN, // 2DA7
            ETHIOPIC, // 2DA8..2DAE
            UNKNOWN, // 2DAF
            ETHIOPIC, // 2DB0..2DB6
            UNKNOWN, // 2DB7
            ETHIOPIC, // 2DB8..2DBE
            UNKNOWN, // 2DBF
            ETHIOPIC, // 2DC0..2DC6
            UNKNOWN, // 2DC7
            ETHIOPIC, // 2DC8..2DCE
            UNKNOWN, // 2DCF
            ETHIOPIC, // 2DD0..2DD6
            UNKNOWN, // 2DD7
            ETHIOPIC, // 2DD8..2DDE
            UNKNOWN, // 2DDF
            CYRILLIC, // 2DE0..2DFF
            COMMON, // 2E00..2E4E
            UNKNOWN, // 2E4F..2E7F
            HAN, // 2E80..2E99
            UNKNOWN, // 2E9A
            HAN, // 2E9B..2EF3
            UNKNOWN, // 2EF4..2EFF
            HAN, // 2F00..2FD5
            UNKNOWN, // 2FD6..2FEF
            COMMON, // 2FF0..2FFB
            UNKNOWN, // 2FFC..2FFF
            COMMON, // 3000..3004
            HAN, // 3005
            COMMON, // 3006
            HAN, // 3007
            COMMON, // 3008..3020
            HAN, // 3021..3029
            INHERITED, // 302A..302D
            HANGUL, // 302E..302F
            COMMON, // 3030..3037
            HAN, // 3038..303B
            COMMON, // 303C..303F
            UNKNOWN, // 3040
            HIRAGANA, // 3041..3096
            UNKNOWN, // 3097..3098
            INHERITED, // 3099..309A
            COMMON, // 309B..309C
            HIRAGANA, // 309D..309F
            COMMON, // 30A0
            KATAKANA, // 30A1..30FA
            COMMON, // 30FB..30FC
            KATAKANA, // 30FD..30FF
            UNKNOWN, // 3100..3104
            BOPOMOFO, // 3105..312F
            UNKNOWN, // 3130
            HANGUL, // 3131..318E
            UNKNOWN, // 318F
            COMMON, // 3190..319F
            BOPOMOFO, // 31A0..31BA
            UNKNOWN, // 31BB..31BF
            COMMON, // 31C0..31E3
            UNKNOWN, // 31E4..31EF
            KATAKANA, // 31F0..31FF
            HANGUL, // 3200..321E
            UNKNOWN, // 321F
            COMMON, // 3220..325F
            HANGUL, // 3260..327E
            COMMON, // 327F..32CF
            KATAKANA, // 32D0..32FE
            COMMON, // 32FF
            KATAKANA, // 3300..3357
            COMMON, // 3358..33FF
            HAN, // 3400..4DB5
            UNKNOWN, // 4DB6..4DBF
            COMMON, // 4DC0..4DFF
            HAN, // 4E00..9FEF
            UNKNOWN, // 9FF0..9FFF
            YI, // A000..A48C
            UNKNOWN, // A48D..A48F
            YI, // A490..A4C6
            UNKNOWN, // A4C7..A4CF
            LISU, // A4D0..A4FF
            VAI, // A500..A62B
            UNKNOWN, // A62C..A63F
            CYRILLIC, // A640..A69F
            BAMUM, // A6A0..A6F7
            UNKNOWN, // A6F8..A6FF
            COMMON, // A700..A721
            LATIN, // A722..A787
            COMMON, // A788..A78A
            LATIN, // A78B..A7B9
            UNKNOWN, // A7BA..A7F6
            LATIN, // A7F7..A7FF
            SYLOTI_NAGRI, // A800..A82B
            UNKNOWN, // A82C..A82F
            COMMON, // A830..A839
            UNKNOWN, // A83A..A83F
            PHAGS_PA, // A840..A877
            UNKNOWN, // A878..A87F
            SAURASHTRA, // A880..A8C5
            UNKNOWN, // A8C6..A8CD
            SAURASHTRA, // A8CE..A8D9
            UNKNOWN, // A8DA..A8DF
            DEVANAGARI, // A8E0..A8FF
            KAYAH_LI, // A900..A92D
            COMMON, // A92E
            KAYAH_LI, // A92F
            REJANG, // A930..A953
            UNKNOWN, // A954..A95E
            REJANG, // A95F
            HANGUL, // A960..A97C
            UNKNOWN, // A97D..A97F
            JAVANESE, // A980..A9CD
            UNKNOWN, // A9CE
            COMMON, // A9CF
            JAVANESE, // A9D0..A9D9
            UNKNOWN, // A9DA..A9DD
            JAVANESE, // A9DE..A9DF
            MYANMAR, // A9E0..A9FE
            UNKNOWN, // A9FF
            CHAM, // AA00..AA36
            UNKNOWN, // AA37..AA3F
            CHAM, // AA40..AA4D
            UNKNOWN, // AA4E..AA4F
            CHAM, // AA50..AA59
            UNKNOWN, // AA5A..AA5B
            CHAM, // AA5C..AA5F
            MYANMAR, // AA60..AA7F
            TAI_VIET, // AA80..AAC2
            UNKNOWN, // AAC3..AADA
            TAI_VIET, // AADB..AADF
            MEETEI_MAYEK, // AAE0..AAF6
            UNKNOWN, // AAF7..AB00
            ETHIOPIC, // AB01..AB06
            UNKNOWN, // AB07..AB08
            ETHIOPIC, // AB09..AB0E
            UNKNOWN, // AB0F..AB10
            ETHIOPIC, // AB11..AB16
            UNKNOWN, // AB17..AB1F
            ETHIOPIC, // AB20..AB26
            UNKNOWN, // AB27
            ETHIOPIC, // AB28..AB2E
            UNKNOWN, // AB2F
            LATIN, // AB30..AB5A
            COMMON, // AB5B
            LATIN, // AB5C..AB64
            GREEK, // AB65
            UNKNOWN, // AB66..AB6F
            CHEROKEE, // AB70..ABBF
            MEETEI_MAYEK, // ABC0..ABED
            UNKNOWN, // ABEE..ABEF
            MEETEI_MAYEK, // ABF0..ABF9
            UNKNOWN, // ABFA..ABFF
            HANGUL, // AC00..D7A3
            UNKNOWN, // D7A4..D7AF
            HANGUL, // D7B0..D7C6
            UNKNOWN, // D7C7..D7CA
            HANGUL, // D7CB..D7FB
            UNKNOWN, // D7FC..F8FF
            HAN, // F900..FA6D
            UNKNOWN, // FA6E..FA6F
            HAN, // FA70..FAD9
            UNKNOWN, // FADA..FAFF
            LATIN, // FB00..FB06
            UNKNOWN, // FB07..FB12
            ARMENIAN, // FB13..FB17
            UNKNOWN, // FB18..FB1C
            HEBREW, // FB1D..FB36
            UNKNOWN, // FB37
            HEBREW, // FB38..FB3C
            UNKNOWN, // FB3D
            HEBREW, // FB3E
            UNKNOWN, // FB3F
            HEBREW, // FB40..FB41
            UNKNOWN, // FB42
            HEBREW, // FB43..FB44
            UNKNOWN, // FB45
            HEBREW, // FB46..FB4F
            ARABIC, // FB50..FBC1
            UNKNOWN, // FBC2..FBD2
            ARABIC, // FBD3..FD3D
            COMMON, // FD3E..FD3F
            UNKNOWN, // FD40..FD4F
            ARABIC, // FD50..FD8F
            UNKNOWN, // FD90..FD91
            ARABIC, // FD92..FDC7
            UNKNOWN, // FDC8..FDEF
            ARABIC, // FDF0..FDFD
            UNKNOWN, // FDFE..FDFF
            INHERITED, // FE00..FE0F
            COMMON, // FE10..FE19
            UNKNOWN, // FE1A..FE1F
            INHERITED, // FE20..FE2D
            CYRILLIC, // FE2E..FE2F
            COMMON, // FE30..FE52
            UNKNOWN, // FE53
            COMMON, // FE54..FE66
            UNKNOWN, // FE67
            COMMON, // FE68..FE6B
            UNKNOWN, // FE6C..FE6F
            ARABIC, // FE70..FE74
            UNKNOWN, // FE75
            ARABIC, // FE76..FEFC
            UNKNOWN, // FEFD..FEFE
            COMMON, // FEFF
            UNKNOWN, // FF00
            COMMON, // FF01..FF20
            LATIN, // FF21..FF3A
            COMMON, // FF3B..FF40
            LATIN, // FF41..FF5A
            COMMON, // FF5B..FF65
            KATAKANA, // FF66..FF6F
            COMMON, // FF70
            KATAKANA, // FF71..FF9D
            COMMON, // FF9E..FF9F
            HANGUL, // FFA0..FFBE
            UNKNOWN, // FFBF..FFC1
            HANGUL, // FFC2..FFC7
            UNKNOWN, // FFC8..FFC9
            HANGUL, // FFCA..FFCF
            UNKNOWN, // FFD0..FFD1
            HANGUL, // FFD2..FFD7
            UNKNOWN, // FFD8..FFD9
            HANGUL, // FFDA..FFDC
            UNKNOWN, // FFDD..FFDF
            COMMON, // FFE0..FFE6
            UNKNOWN, // FFE7
            COMMON, // FFE8..FFEE
            UNKNOWN, // FFEF..FFF8
            COMMON, // FFF9..FFFD
            UNKNOWN, // FFFE..FFFF
            LINEAR_B, // 10000..1000B
            UNKNOWN, // 1000C
            LINEAR_B, // 1000D..10026
            UNKNOWN, // 10027
            LINEAR_B, // 10028..1003A
            UNKNOWN, // 1003B
            LINEAR_B, // 1003C..1003D
            UNKNOWN, // 1003E
            LINEAR_B, // 1003F..1004D
            UNKNOWN, // 1004E..1004F
            LINEAR_B, // 10050..1005D
            UNKNOWN, // 1005E..1007F
            LINEAR_B, // 10080..100FA
            UNKNOWN, // 100FB..100FF
            COMMON, // 10100..10102
            UNKNOWN, // 10103..10106
            COMMON, // 10107..10133
            UNKNOWN, // 10134..10136
            COMMON, // 10137..1013F
            GREEK, // 10140..1018E
            UNKNOWN, // 1018F
            COMMON, // 10190..1019B
            UNKNOWN, // 1019C..1019F
            GREEK, // 101A0
            UNKNOWN, // 101A1..101CF
            COMMON, // 101D0..101FC
            INHERITED, // 101FD
            UNKNOWN, // 101FE..1027F
            LYCIAN, // 10280..1029C
            UNKNOWN, // 1029D..1029F
            CARIAN, // 102A0..102D0
            UNKNOWN, // 102D1..102DF
            INHERITED, // 102E0
            COMMON, // 102E1..102FB
            UNKNOWN, // 102FC..102FF
            OLD_ITALIC, // 10300..10323
            UNKNOWN, // 10324..1032C
            OLD_ITALIC, // 1032D..1032F
            GOTHIC, // 10330..1034A
            UNKNOWN, // 1034B..1034F
            OLD_PERMIC, // 10350..1037A
            UNKNOWN, // 1037B..1037F
            UGARITIC, // 10380..1039D
            UNKNOWN, // 1039E
            UGARITIC, // 1039F
            OLD_PERSIAN, // 103A0..103C3
            UNKNOWN, // 103C4..103C7
            OLD_PERSIAN, // 103C8..103D5
            UNKNOWN, // 103D6..103FF
            DESERET, // 10400..1044F
            SHAVIAN, // 10450..1047F
            OSMANYA, // 10480..1049D
            UNKNOWN, // 1049E..1049F
            OSMANYA, // 104A0..104A9
            UNKNOWN, // 104AA..104AF
            OSAGE, // 104B0..104D3
            UNKNOWN, // 104D4..104D7
            OSAGE, // 104D8..104FB
            UNKNOWN, // 104FC..104FF
            ELBASAN, // 10500..10527
            UNKNOWN, // 10528..1052F
            CAUCASIAN_ALBANIAN, // 10530..10563
            UNKNOWN, // 10564..1056E
            CAUCASIAN_ALBANIAN, // 1056F
            UNKNOWN, // 10570..105FF
            LINEAR_A, // 10600..10736
            UNKNOWN, // 10737..1073F
            LINEAR_A, // 10740..10755
            UNKNOWN, // 10756..1075F
            LINEAR_A, // 10760..10767
            UNKNOWN, // 10768..107FF
            CYPRIOT, // 10800..10805
            UNKNOWN, // 10806..10807
            CYPRIOT, // 10808
            UNKNOWN, // 10809
            CYPRIOT, // 1080A..10835
            UNKNOWN, // 10836
            CYPRIOT, // 10837..10838
            UNKNOWN, // 10839..1083B
            CYPRIOT, // 1083C
            UNKNOWN, // 1083D..1083E
            CYPRIOT, // 1083F
            IMPERIAL_ARAMAIC, // 10840..10855
            UNKNOWN, // 10856
            IMPERIAL_ARAMAIC, // 10857..1085F
            PALMYRENE, // 10860..1087F
            NABATAEAN, // 10880..1089E
            UNKNOWN, // 1089F..108A6
            NABATAEAN, // 108A7..108AF
            UNKNOWN, // 108B0..108DF
            HATRAN, // 108E0..108F2
            UNKNOWN, // 108F3
            HATRAN, // 108F4..108F5
            UNKNOWN, // 108F6..108FA
            HATRAN, // 108FB..108FF
            PHOENICIAN, // 10900..1091B
            UNKNOWN, // 1091C..1091E
            PHOENICIAN, // 1091F
            LYDIAN, // 10920..10939
            UNKNOWN, // 1093A..1093E
            LYDIAN, // 1093F
            UNKNOWN, // 10940..1097F
            MEROITIC_HIEROGLYPHS, // 10980..1099F
            MEROITIC_CURSIVE, // 109A0..109B7
            UNKNOWN, // 109B8..109BB
            MEROITIC_CURSIVE, // 109BC..109CF
            UNKNOWN, // 109D0..109D1
            MEROITIC_CURSIVE, // 109D2..109FF
            KHAROSHTHI, // 10A00..10A03
            UNKNOWN, // 10A04
            KHAROSHTHI, // 10A05..10A06
            UNKNOWN, // 10A07..10A0B
            KHAROSHTHI, // 10A0C..10A13
            UNKNOWN, // 10A14
            KHAROSHTHI, // 10A15..10A17
            UNKNOWN, // 10A18
            KHAROSHTHI, // 10A19..10A35
            UNKNOWN, // 10A36..10A37
            KHAROSHTHI, // 10A38..10A3A
            UNKNOWN, // 10A3B..10A3E
            KHAROSHTHI, // 10A3F..10A48
            UNKNOWN, // 10A49..10A4F
            KHAROSHTHI, // 10A50..10A58
            UNKNOWN, // 10A59..10A5F
            OLD_SOUTH_ARABIAN, // 10A60..10A7F
            OLD_NORTH_ARABIAN, // 10A80..10A9F
            UNKNOWN, // 10AA0..10ABF
            MANICHAEAN, // 10AC0..10AE6
            UNKNOWN, // 10AE7..10AEA
            MANICHAEAN, // 10AEB..10AF6
            UNKNOWN, // 10AF7..10AFF
            AVESTAN, // 10B00..10B35
            UNKNOWN, // 10B36..10B38
            AVESTAN, // 10B39..10B3F
            INSCRIPTIONAL_PARTHIAN, // 10B40..10B55
            UNKNOWN, // 10B56..10B57
            INSCRIPTIONAL_PARTHIAN, // 10B58..10B5F
            INSCRIPTIONAL_PAHLAVI, // 10B60..10B72
            UNKNOWN, // 10B73..10B77
            INSCRIPTIONAL_PAHLAVI, // 10B78..10B7F
            PSALTER_PAHLAVI, // 10B80..10B91
            UNKNOWN, // 10B92..10B98
            PSALTER_PAHLAVI, // 10B99..10B9C
            UNKNOWN, // 10B9D..10BA8
            PSALTER_PAHLAVI, // 10BA9..10BAF
            UNKNOWN, // 10BB0..10BFF
            OLD_TURKIC, // 10C00..10C48
            UNKNOWN, // 10C49..10C7F
            OLD_HUNGARIAN, // 10C80..10CB2
            UNKNOWN, // 10CB3..10CBF
            OLD_HUNGARIAN, // 10CC0..10CF2
            UNKNOWN, // 10CF3..10CF9
            OLD_HUNGARIAN, // 10CFA..10CFF
            HANIFI_ROHINGYA, // 10D00..10D27
            UNKNOWN, // 10D28..10D2F
            HANIFI_ROHINGYA, // 10D30..10D39
            UNKNOWN, // 10D3A..10E5F
            ARABIC, // 10E60..10E7E
            UNKNOWN, // 10E7F..10EFF
            OLD_SOGDIAN, // 10F00..10F27
            UNKNOWN, // 10F28..10F2F
            SOGDIAN, // 10F30..10F59
            UNKNOWN, // 10F5A..10FFF
            BRAHMI, // 11000..1104D
            UNKNOWN, // 1104E..11051
            BRAHMI, // 11052..1106F
            UNKNOWN, // 11070..1107E
            BRAHMI, // 1107F
            KAITHI, // 11080..110C1
            UNKNOWN, // 110C2..110CC
            KAITHI, // 110CD
            UNKNOWN, // 110CE..110CF
            SORA_SOMPENG, // 110D0..110E8
            UNKNOWN, // 110E9..110EF
            SORA_SOMPENG, // 110F0..110F9
            UNKNOWN, // 110FA..110FF
            CHAKMA, // 11100..11134
            UNKNOWN, // 11135
            CHAKMA, // 11136..11146
            UNKNOWN, // 11147..1114F
            MAHAJANI, // 11150..11176
            UNKNOWN, // 11177..1117F
            SHARADA, // 11180..111CD
            UNKNOWN, // 111CE..111CF
            SHARADA, // 111D0..111DF
            UNKNOWN, // 111E0
            SINHALA, // 111E1..111F4
            UNKNOWN, // 111F5..111FF
            KHOJKI, // 11200..11211
            UNKNOWN, // 11212
            KHOJKI, // 11213..1123E
            UNKNOWN, // 1123F..1127F
            MULTANI, // 11280..11286
            UNKNOWN, // 11287
            MULTANI, // 11288
            UNKNOWN, // 11289
            MULTANI, // 1128A..1128D
            UNKNOWN, // 1128E
            MULTANI, // 1128F..1129D
            UNKNOWN, // 1129E
            MULTANI, // 1129F..112A9
            UNKNOWN, // 112AA..112AF
            KHUDAWADI, // 112B0..112EA
            UNKNOWN, // 112EB..112EF
            KHUDAWADI, // 112F0..112F9
            UNKNOWN, // 112FA..112FF
            GRANTHA, // 11300..11303
            UNKNOWN, // 11304
            GRANTHA, // 11305..1130C
            UNKNOWN, // 1130D..1130E
            GRANTHA, // 1130F..11310
            UNKNOWN, // 11311..11312
            GRANTHA, // 11313..11328
            UNKNOWN, // 11329
            GRANTHA, // 1132A..11330
            UNKNOWN, // 11331
            GRANTHA, // 11332..11333
            UNKNOWN, // 11334
            GRANTHA, // 11335..11339
            UNKNOWN, // 1133A
            INHERITED, // 1133B
            GRANTHA, // 1133C..11344
            UNKNOWN, // 11345..11346
            GRANTHA, // 11347..11348
            UNKNOWN, // 11349..1134A
            GRANTHA, // 1134B..1134D
            UNKNOWN, // 1134E..1134F
            GRANTHA, // 11350
            UNKNOWN, // 11351..11356
            GRANTHA, // 11357
            UNKNOWN, // 11358..1135C
            GRANTHA, // 1135D..11363
            UNKNOWN, // 11364..11365
            GRANTHA, // 11366..1136C
            UNKNOWN, // 1136D..1136F
            GRANTHA, // 11370..11374
            UNKNOWN, // 11375..113FF
            NEWA, // 11400..11459
            UNKNOWN, // 1145A
            NEWA, // 1145B
            UNKNOWN, // 1145C
            NEWA, // 1145D..1145E
            UNKNOWN, // 1145F..1147F
            TIRHUTA, // 11480..114C7
            UNKNOWN, // 114C8..114CF
            TIRHUTA, // 114D0..114D9
            UNKNOWN, // 114DA..1157F
            SIDDHAM, // 11580..115B5
            UNKNOWN, // 115B6..115B7
            SIDDHAM, // 115B8..115DD
            UNKNOWN, // 115DE..115FF
            MODI, // 11600..11644
            UNKNOWN, // 11645..1164F
            MODI, // 11650..11659
            UNKNOWN, // 1165A..1165F
            MONGOLIAN, // 11660..1166C
            UNKNOWN, // 1166D..1167F
            TAKRI, // 11680..116B7
            UNKNOWN, // 116B8..116BF
            TAKRI, // 116C0..116C9
            UNKNOWN, // 116CA..116FF
            AHOM, // 11700..1171A
            UNKNOWN, // 1171B..1171C
            AHOM, // 1171D..1172B
            UNKNOWN, // 1172C..1172F
            AHOM, // 11730..1173F
            UNKNOWN, // 11740..117FF
            DOGRA, // 11800..1183B
            UNKNOWN, // 1183C..1189F
            WARANG_CITI, // 118A0..118F2
            UNKNOWN, // 118F3..118FE
            WARANG_CITI, // 118FF
            UNKNOWN, // 11900..119FF
            ZANABAZAR_SQUARE, // 11A00..11A47
            UNKNOWN, // 11A48..11A4F
            SOYOMBO, // 11A50..11A83
            UNKNOWN, // 11A84..11A85
            SOYOMBO, // 11A86..11AA2
            UNKNOWN, // 11AA3..11ABF
            PAU_CIN_HAU, // 11AC0..11AF8
            UNKNOWN, // 11AF9..11BFF
            BHAIKSUKI, // 11C00..11C08
            UNKNOWN, // 11C09
            BHAIKSUKI, // 11C0A..11C36
            UNKNOWN, // 11C37
            BHAIKSUKI, // 11C38..11C45
            UNKNOWN, // 11C46..11C4F
            BHAIKSUKI, // 11C50..11C6C
            UNKNOWN, // 11C6D..11C6F
            MARCHEN, // 11C70..11C8F
            UNKNOWN, // 11C90..11C91
            MARCHEN, // 11C92..11CA7
            UNKNOWN, // 11CA8
            MARCHEN, // 11CA9..11CB6
            UNKNOWN, // 11CB7..11CFF
            MASARAM_GONDI, // 11D00..11D06
            UNKNOWN, // 11D07
            MASARAM_GONDI, // 11D08..11D09
            UNKNOWN, // 11D0A
            MASARAM_GONDI, // 11D0B..11D36
            UNKNOWN, // 11D37..11D39
            MASARAM_GONDI, // 11D3A
            UNKNOWN, // 11D3B
            MASARAM_GONDI, // 11D3C..11D3D
            UNKNOWN, // 11D3E
            MASARAM_GONDI, // 11D3F..11D47
            UNKNOWN, // 11D48..11D4F
            MASARAM_GONDI, // 11D50..11D59
            UNKNOWN, // 11D5A..11D5F
            GUNJALA_GONDI, // 11D60..11D68
            UNKNOWN, // 11D69
            GUNJALA_GONDI, // 11D6A..11D8E
            UNKNOWN, // 11D8F
            GUNJALA_GONDI, // 11D90..11D91
            UNKNOWN, // 11D92
            GUNJALA_GONDI, // 11D93..11D98
            UNKNOWN, // 11D99..11D9F
            GUNJALA_GONDI, // 11DA0..11DA9
            UNKNOWN, // 11DAA..11EDF
            MAKASAR, // 11EE0..11EF8
            UNKNOWN, // 11EF9..11FFF
            CUNEIFORM, // 12000..12399
            UNKNOWN, // 1239A..123FF
            CUNEIFORM, // 12400..1246E
            UNKNOWN, // 1246F
            CUNEIFORM, // 12470..12474
            UNKNOWN, // 12475..1247F
            CUNEIFORM, // 12480..12543
            UNKNOWN, // 12544..12FFF
            EGYPTIAN_HIEROGLYPHS, // 13000..1342E
            UNKNOWN, // 1342F..143FF
            ANATOLIAN_HIEROGLYPHS, // 14400..14646
            UNKNOWN, // 14647..167FF
            BAMUM, // 16800..16A38
            UNKNOWN, // 16A39..16A3F
            MRO, // 16A40..16A5E
            UNKNOWN, // 16A5F
            MRO, // 16A60..16A69
            UNKNOWN, // 16A6A..16A6D
            MRO, // 16A6E..16A6F
            UNKNOWN, // 16A70..16ACF
            BASSA_VAH, // 16AD0..16AED
            UNKNOWN, // 16AEE..16AEF
            BASSA_VAH, // 16AF0..16AF5
            UNKNOWN, // 16AF6..16AFF
            PAHAWH_HMONG, // 16B00..16B45
            UNKNOWN, // 16B46..16B4F
            PAHAWH_HMONG, // 16B50..16B59
            UNKNOWN, // 16B5A
            PAHAWH_HMONG, // 16B5B..16B61
            UNKNOWN, // 16B62
            PAHAWH_HMONG, // 16B63..16B77
            UNKNOWN, // 16B78..16B7C
            PAHAWH_HMONG, // 16B7D..16B8F
            UNKNOWN, // 16B90..16E3F
            MEDEFAIDRIN, // 16E40..16E9A
            UNKNOWN, // 16E9B..16EFF
            MIAO, // 16F00..16F44
            UNKNOWN, // 16F45..16F4F
            MIAO, // 16F50..16F7E
            UNKNOWN, // 16F7F..16F8E
            MIAO, // 16F8F..16F9F
            UNKNOWN, // 16FA0..16FDF
            TANGUT, // 16FE0
            NUSHU, // 16FE1
            UNKNOWN, // 16FE2..16FFF
            TANGUT, // 17000..187F1
            UNKNOWN, // 187F2..187FF
            TANGUT, // 18800..18AF2
            UNKNOWN, // 18AF3..1AFFF
            KATAKANA, // 1B000
            HIRAGANA, // 1B001..1B11E
            UNKNOWN, // 1B11F..1B16F
            NUSHU, // 1B170..1B2FB
            UNKNOWN, // 1B2FC..1BBFF
            DUPLOYAN, // 1BC00..1BC6A
            UNKNOWN, // 1BC6B..1BC6F
            DUPLOYAN, // 1BC70..1BC7C
            UNKNOWN, // 1BC7D..1BC7F
            DUPLOYAN, // 1BC80..1BC88
            UNKNOWN, // 1BC89..1BC8F
            DUPLOYAN, // 1BC90..1BC99
            UNKNOWN, // 1BC9A..1BC9B
            DUPLOYAN, // 1BC9C..1BC9F
            COMMON, // 1BCA0..1BCA3
            UNKNOWN, // 1BCA4..1CFFF
            COMMON, // 1D000..1D0F5
            UNKNOWN, // 1D0F6..1D0FF
            COMMON, // 1D100..1D126
            UNKNOWN, // 1D127..1D128
            COMMON, // 1D129..1D166
            INHERITED, // 1D167..1D169
            COMMON, // 1D16A..1D17A
            INHERITED, // 1D17B..1D182
            COMMON, // 1D183..1D184
            INHERITED, // 1D185..1D18B
            COMMON, // 1D18C..1D1A9
            INHERITED, // 1D1AA..1D1AD
            COMMON, // 1D1AE..1D1E8
            UNKNOWN, // 1D1E9..1D1FF
            GREEK, // 1D200..1D245
            UNKNOWN, // 1D246..1D2DF
            COMMON, // 1D2E0..1D2F3
            UNKNOWN, // 1D2F4..1D2FF
            COMMON, // 1D300..1D356
            UNKNOWN, // 1D357..1D35F
            COMMON, // 1D360..1D378
            UNKNOWN, // 1D379..1D3FF
            COMMON, // 1D400..1D454
            UNKNOWN, // 1D455
            COMMON, // 1D456..1D49C
            UNKNOWN, // 1D49D
            COMMON, // 1D49E..1D49F
            UNKNOWN, // 1D4A0..1D4A1
            COMMON, // 1D4A2
            UNKNOWN, // 1D4A3..1D4A4
            COMMON, // 1D4A5..1D4A6
            UNKNOWN, // 1D4A7..1D4A8
            COMMON, // 1D4A9..1D4AC
            UNKNOWN, // 1D4AD
            COMMON, // 1D4AE..1D4B9
            UNKNOWN, // 1D4BA
            COMMON, // 1D4BB
            UNKNOWN, // 1D4BC
            COMMON, // 1D4BD..1D4C3
            UNKNOWN, // 1D4C4
            COMMON, // 1D4C5..1D505
            UNKNOWN, // 1D506
            COMMON, // 1D507..1D50A
            UNKNOWN, // 1D50B..1D50C
            COMMON, // 1D50D..1D514
            UNKNOWN, // 1D515
            COMMON, // 1D516..1D51C
            UNKNOWN, // 1D51D
            COMMON, // 1D51E..1D539
            UNKNOWN, // 1D53A
            COMMON, // 1D53B..1D53E
            UNKNOWN, // 1D53F
            COMMON, // 1D540..1D544
            UNKNOWN, // 1D545
            COMMON, // 1D546
            UNKNOWN, // 1D547..1D549
            COMMON, // 1D54A..1D550
            UNKNOWN, // 1D551
            COMMON, // 1D552..1D6A5
            UNKNOWN, // 1D6A6..1D6A7
            COMMON, // 1D6A8..1D7CB
            UNKNOWN, // 1D7CC..1D7CD
            COMMON, // 1D7CE..1D7FF
            SIGNWRITING, // 1D800..1DA8B
            UNKNOWN, // 1DA8C..1DA9A
            SIGNWRITING, // 1DA9B..1DA9F
            UNKNOWN, // 1DAA0
            SIGNWRITING, // 1DAA1..1DAAF
            UNKNOWN, // 1DAB0..1DFFF
            GLAGOLITIC, // 1E000..1E006
            UNKNOWN, // 1E007
            GLAGOLITIC, // 1E008..1E018
            UNKNOWN, // 1E019..1E01A
            GLAGOLITIC, // 1E01B..1E021
            UNKNOWN, // 1E022
            GLAGOLITIC, // 1E023..1E024
            UNKNOWN, // 1E025
            GLAGOLITIC, // 1E026..1E02A
            UNKNOWN, // 1E02B..1E7FF
            MENDE_KIKAKUI, // 1E800..1E8C4
            UNKNOWN, // 1E8C5..1E8C6
            MENDE_KIKAKUI, // 1E8C7..1E8D6
            UNKNOWN, // 1E8D7..1E8FF
            ADLAM, // 1E900..1E94A
            UNKNOWN, // 1E94B..1E94F
            ADLAM, // 1E950..1E959
            UNKNOWN, // 1E95A..1E95D
            ADLAM, // 1E95E..1E95F
            UNKNOWN, // 1E960..1EC70
            COMMON, // 1EC71..1ECB4
            UNKNOWN, // 1ECB5..1EDFF
            ARABIC, // 1EE00..1EE03
            UNKNOWN, // 1EE04
            ARABIC, // 1EE05..1EE1F
            UNKNOWN, // 1EE20
            ARABIC, // 1EE21..1EE22
            UNKNOWN, // 1EE23
            ARABIC, // 1EE24
            UNKNOWN, // 1EE25..1EE26
            ARABIC, // 1EE27
            UNKNOWN, // 1EE28
            ARABIC, // 1EE29..1EE32
            UNKNOWN, // 1EE33
            ARABIC, // 1EE34..1EE37
            UNKNOWN, // 1EE38
            ARABIC, // 1EE39
            UNKNOWN, // 1EE3A
            ARABIC, // 1EE3B
            UNKNOWN, // 1EE3C..1EE41
            ARABIC, // 1EE42
            UNKNOWN, // 1EE43..1EE46
            ARABIC, // 1EE47
            UNKNOWN, // 1EE48
            ARABIC, // 1EE49
            UNKNOWN, // 1EE4A
            ARABIC, // 1EE4B
            UNKNOWN, // 1EE4C
            ARABIC, // 1EE4D..1EE4F
            UNKNOWN, // 1EE50
            ARABIC, // 1EE51..1EE52
            UNKNOWN, // 1EE53
            ARABIC, // 1EE54
            UNKNOWN, // 1EE55..1EE56
            ARABIC, // 1EE57
            UNKNOWN, // 1EE58
            ARABIC, // 1EE59
            UNKNOWN, // 1EE5A
            ARABIC, // 1EE5B
            UNKNOWN, // 1EE5C
            ARABIC, // 1EE5D
            UNKNOWN, // 1EE5E
            ARABIC, // 1EE5F
            UNKNOWN, // 1EE60
            ARABIC, // 1EE61..1EE62
            UNKNOWN, // 1EE63
            ARABIC, // 1EE64
            UNKNOWN, // 1EE65..1EE66
            ARABIC, // 1EE67..1EE6A
            UNKNOWN, // 1EE6B
            ARABIC, // 1EE6C..1EE72
            UNKNOWN, // 1EE73
            ARABIC, // 1EE74..1EE77
            UNKNOWN, // 1EE78
            ARABIC, // 1EE79..1EE7C
            UNKNOWN, // 1EE7D
            ARABIC, // 1EE7E
            UNKNOWN, // 1EE7F
            ARABIC, // 1EE80..1EE89
            UNKNOWN, // 1EE8A
            ARABIC, // 1EE8B..1EE9B
            UNKNOWN, // 1EE9C..1EEA0
            ARABIC, // 1EEA1..1EEA3
            UNKNOWN, // 1EEA4
            ARABIC, // 1EEA5..1EEA9
            UNKNOWN, // 1EEAA
            ARABIC, // 1EEAB..1EEBB
            UNKNOWN, // 1EEBC..1EEEF
            ARABIC, // 1EEF0..1EEF1
            UNKNOWN, // 1EEF2..1EFFF
            COMMON, // 1F000..1F02B
            UNKNOWN, // 1F02C..1F02F
            COMMON, // 1F030..1F093
            UNKNOWN, // 1F094..1F09F
            COMMON, // 1F0A0..1F0AE
            UNKNOWN, // 1F0AF..1F0B0
            COMMON, // 1F0B1..1F0BF
            UNKNOWN, // 1F0C0
            COMMON, // 1F0C1..1F0CF
            UNKNOWN, // 1F0D0
            COMMON, // 1F0D1..1F0F5
            UNKNOWN, // 1F0F6..1F0FF
            COMMON, // 1F100..1F10C
            UNKNOWN, // 1F10D..1F10F
            COMMON, // 1F110..1F16B
            UNKNOWN, // 1F16C..1F16F
            COMMON, // 1F170..1F1AC
            UNKNOWN, // 1F1AD..1F1E5
            COMMON, // 1F1E6..1F1FF
            HIRAGANA, // 1F200
            COMMON, // 1F201..1F202
            UNKNOWN, // 1F203..1F20F
            COMMON, // 1F210..1F23B
            UNKNOWN, // 1F23C..1F23F
            COMMON, // 1F240..1F248
            UNKNOWN, // 1F249..1F24F
            COMMON, // 1F250..1F251
            UNKNOWN, // 1F252..1F25F
            COMMON, // 1F260..1F265
            UNKNOWN, // 1F266..1F2FF
            COMMON, // 1F300..1F6D4
            UNKNOWN, // 1F6D5..1F6DF
            COMMON, // 1F6E0..1F6EC
            UNKNOWN, // 1F6ED..1F6EF
            COMMON, // 1F6F0..1F6F9
            UNKNOWN, // 1F6FA..1F6FF
            COMMON, // 1F700..1F773
            UNKNOWN, // 1F774..1F77F
            COMMON, // 1F780..1F7D8
            UNKNOWN, // 1F7D9..1F7FF
            COMMON, // 1F800..1F80B
            UNKNOWN, // 1F80C..1F80F
            COMMON, // 1F810..1F847
            UNKNOWN, // 1F848..1F84F
            COMMON, // 1F850..1F859
            UNKNOWN, // 1F85A..1F85F
            COMMON, // 1F860..1F887
            UNKNOWN, // 1F888..1F88F
            COMMON, // 1F890..1F8AD
            UNKNOWN, // 1F8AE..1F8FF
            COMMON, // 1F900..1F90B
            UNKNOWN, // 1F90C..1F90F
            COMMON, // 1F910..1F93E
            UNKNOWN, // 1F93F
            COMMON, // 1F940..1F970
            UNKNOWN, // 1F971..1F972
            COMMON, // 1F973..1F976
            UNKNOWN, // 1F977..1F979
            COMMON, // 1F97A
            UNKNOWN, // 1F97B
            COMMON, // 1F97C..1F9A2
            UNKNOWN, // 1F9A3..1F9AF
            COMMON, // 1F9B0..1F9B9
            UNKNOWN, // 1F9BA..1F9BF
            COMMON, // 1F9C0..1F9C2
            UNKNOWN, // 1F9C3..1F9CF
            COMMON, // 1F9D0..1F9FF
            UNKNOWN, // 1FA00..1FA5F
            COMMON, // 1FA60..1FA6D
            UNKNOWN, // 1FA6E..1FFFF
            HAN, // 20000..2A6D6
            UNKNOWN, // 2A6D7..2A6FF
            HAN, // 2A700..2B734
            UNKNOWN, // 2B735..2B73F
            HAN, // 2B740..2B81D
            UNKNOWN, // 2B81E..2B81F
            HAN, // 2B820..2CEA1
            UNKNOWN, // 2CEA2..2CEAF
            HAN, // 2CEB0..2EBE0
            UNKNOWN, // 2EBE1..2F7FF
            HAN, // 2F800..2FA1D
            UNKNOWN, // 2FA1E..E0000
            COMMON, // E0001
            UNKNOWN, // E0002..E001F
            COMMON, // E0020..E007F
            UNKNOWN, // E0080..E00FF
            INHERITED, // E0100..E01EF
            UNKNOWN // E01F0..10FFFF
        };

        private static HashMap<String, Character.UnicodeScript> aliases;
        static {
            aliases = new HashMap<>((int)(149 / 0.75f + 1.0f));
            aliases.put("ADLM", ADLAM);
            aliases.put("AGHB", CAUCASIAN_ALBANIAN);
            aliases.put("AHOM", AHOM);
            aliases.put("ARAB", ARABIC);
            aliases.put("ARMI", IMPERIAL_ARAMAIC);
            aliases.put("ARMN", ARMENIAN);
            aliases.put("AVST", AVESTAN);
            aliases.put("BALI", BALINESE);
            aliases.put("BAMU", BAMUM);
            aliases.put("BASS", BASSA_VAH);
            aliases.put("BATK", BATAK);
            aliases.put("BENG", BENGALI);
            aliases.put("BHKS", BHAIKSUKI);
            aliases.put("BOPO", BOPOMOFO);
            aliases.put("BRAH", BRAHMI);
            aliases.put("BRAI", BRAILLE);
            aliases.put("BUGI", BUGINESE);
            aliases.put("BUHD", BUHID);
            aliases.put("CAKM", CHAKMA);
            aliases.put("CANS", CANADIAN_ABORIGINAL);
            aliases.put("CARI", CARIAN);
            aliases.put("CHAM", CHAM);
            aliases.put("CHER", CHEROKEE);
            aliases.put("COPT", COPTIC);
            aliases.put("CPRT", CYPRIOT);
            aliases.put("CYRL", CYRILLIC);
            aliases.put("DEVA", DEVANAGARI);
            aliases.put("DOGR", DOGRA);
            aliases.put("DSRT", DESERET);
            aliases.put("DUPL", DUPLOYAN);
            aliases.put("EGYP", EGYPTIAN_HIEROGLYPHS);
            aliases.put("ELBA", ELBASAN);
            aliases.put("ETHI", ETHIOPIC);
            aliases.put("GEOR", GEORGIAN);
            aliases.put("GLAG", GLAGOLITIC);
            aliases.put("GONM", MASARAM_GONDI);
            aliases.put("GOTH", GOTHIC);
            aliases.put("GONG", GUNJALA_GONDI);
            aliases.put("GRAN", GRANTHA);
            aliases.put("GREK", GREEK);
            aliases.put("GUJR", GUJARATI);
            aliases.put("GURU", GURMUKHI);
            aliases.put("HANG", HANGUL);
            aliases.put("HANI", HAN);
            aliases.put("HANO", HANUNOO);
            aliases.put("HATR", HATRAN);
            aliases.put("HEBR", HEBREW);
            aliases.put("HIRA", HIRAGANA);
            aliases.put("HLUW", ANATOLIAN_HIEROGLYPHS);
            aliases.put("HMNG", PAHAWH_HMONG);
            // it appears we don't have the KATAKANA_OR_HIRAGANA
            //aliases.put("HRKT", KATAKANA_OR_HIRAGANA);
            aliases.put("HUNG", OLD_HUNGARIAN);
            aliases.put("ITAL", OLD_ITALIC);
            aliases.put("JAVA", JAVANESE);
            aliases.put("KALI", KAYAH_LI);
            aliases.put("KANA", KATAKANA);
            aliases.put("KHAR", KHAROSHTHI);
            aliases.put("KHMR", KHMER);
            aliases.put("KHOJ", KHOJKI);
            aliases.put("KNDA", KANNADA);
            aliases.put("KTHI", KAITHI);
            aliases.put("LANA", TAI_THAM);
            aliases.put("LAOO", LAO);
            aliases.put("LATN", LATIN);
            aliases.put("LEPC", LEPCHA);
            aliases.put("LIMB", LIMBU);
            aliases.put("LINA", LINEAR_A);
            aliases.put("LINB", LINEAR_B);
            aliases.put("LISU", LISU);
            aliases.put("LYCI", LYCIAN);
            aliases.put("LYDI", LYDIAN);
            aliases.put("MAHJ", MAHAJANI);
            aliases.put("MAKA", MAKASAR);
            aliases.put("MARC", MARCHEN);
            aliases.put("MAND", MANDAIC);
            aliases.put("MANI", MANICHAEAN);
            aliases.put("MEDF", MEDEFAIDRIN);
            aliases.put("MEND", MENDE_KIKAKUI);
            aliases.put("MERC", MEROITIC_CURSIVE);
            aliases.put("MERO", MEROITIC_HIEROGLYPHS);
            aliases.put("MLYM", MALAYALAM);
            aliases.put("MODI", MODI);
            aliases.put("MONG", MONGOLIAN);
            aliases.put("MROO", MRO);
            aliases.put("MTEI", MEETEI_MAYEK);
            aliases.put("MULT", MULTANI);
            aliases.put("MYMR", MYANMAR);
            aliases.put("NARB", OLD_NORTH_ARABIAN);
            aliases.put("NBAT", NABATAEAN);
            aliases.put("NEWA", NEWA);
            aliases.put("NKOO", NKO);
            aliases.put("NSHU", NUSHU);
            aliases.put("OGAM", OGHAM);
            aliases.put("OLCK", OL_CHIKI);
            aliases.put("ORKH", OLD_TURKIC);
            aliases.put("ORYA", ORIYA);
            aliases.put("OSGE", OSAGE);
            aliases.put("OSMA", OSMANYA);
            aliases.put("PALM", PALMYRENE);
            aliases.put("PAUC", PAU_CIN_HAU);
            aliases.put("PERM", OLD_PERMIC);
            aliases.put("PHAG", PHAGS_PA);
            aliases.put("PHLI", INSCRIPTIONAL_PAHLAVI);
            aliases.put("PHLP", PSALTER_PAHLAVI);
            aliases.put("PHNX", PHOENICIAN);
            aliases.put("PLRD", MIAO);
            aliases.put("PRTI", INSCRIPTIONAL_PARTHIAN);
            aliases.put("RJNG", REJANG);
            aliases.put("ROHG", HANIFI_ROHINGYA);
            aliases.put("RUNR", RUNIC);
            aliases.put("SAMR", SAMARITAN);
            aliases.put("SARB", OLD_SOUTH_ARABIAN);
            aliases.put("SAUR", SAURASHTRA);
            aliases.put("SGNW", SIGNWRITING);
            aliases.put("SHAW", SHAVIAN);
            aliases.put("SHRD", SHARADA);
            aliases.put("SIDD", SIDDHAM);
            aliases.put("SIND", KHUDAWADI);
            aliases.put("SINH", SINHALA);
            aliases.put("SOGD", SOGDIAN);
            aliases.put("SOGO", OLD_SOGDIAN);
            aliases.put("SORA", SORA_SOMPENG);
            aliases.put("SOYO", SOYOMBO);
            aliases.put("SUND", SUNDANESE);
            aliases.put("SYLO", SYLOTI_NAGRI);
            aliases.put("SYRC", SYRIAC);
            aliases.put("TAGB", TAGBANWA);
            aliases.put("TAKR", TAKRI);
            aliases.put("TALE", TAI_LE);
            aliases.put("TALU", NEW_TAI_LUE);
            aliases.put("TAML", TAMIL);
            aliases.put("TANG", TANGUT);
            aliases.put("TAVT", TAI_VIET);
            aliases.put("TELU", TELUGU);
            aliases.put("TFNG", TIFINAGH);
            aliases.put("TGLG", TAGALOG);
            aliases.put("THAA", THAANA);
            aliases.put("THAI", THAI);
            aliases.put("TIBT", TIBETAN);
            aliases.put("TIRH", TIRHUTA);
            aliases.put("UGAR", UGARITIC);
            aliases.put("VAII", VAI);
            aliases.put("WARA", WARANG_CITI);
            aliases.put("XPEO", OLD_PERSIAN);
            aliases.put("XSUX", CUNEIFORM);
            aliases.put("YIII", YI);
            aliases.put("ZANB", ZANABAZAR_SQUARE);
            aliases.put("ZINH", INHERITED);
            aliases.put("ZYYY", COMMON);
            aliases.put("ZZZZ", UNKNOWN);
        }

        /**
         * Returns the enum constant representing the Unicode script to which
         * the given character (Unicode code point) is assigned.
         *
         * @param   codePoint the character (Unicode code point) in question.
         * @return  The {@code UnicodeScript} constant representing the
         *          Unicode script to which this character is assigned.
         *
         * @throws  IllegalArgumentException if the specified
         * {@code codePoint} is an invalid Unicode code point.
         * @see Character#isValidCodePoint(int)
         *
         */
        public static UnicodeScript of(int codePoint) {
            if (!isValidCodePoint(codePoint))
                throw new IllegalArgumentException(
                    String.format("Not a valid Unicode code point: 0x%X", codePoint));
            int type = getType(codePoint);
            // leave SURROGATE and PRIVATE_USE for table lookup
            if (type == UNASSIGNED)
                return UNKNOWN;
            int index = Arrays.binarySearch(scriptStarts, codePoint);
            if (index < 0)
                index = -index - 2;
            return scripts[index];
        }

        /**
         * Returns the UnicodeScript constant with the given Unicode script
         * name or the script name alias. Script names and their aliases are
         * determined by The Unicode Standard. The files {@code Scripts<version>.txt}
         * and {@code PropertyValueAliases<version>.txt} define script names
         * and the script name aliases for a particular version of the
         * standard. The {@link Character} class specifies the version of
         * the standard that it supports.
         * <p>
         * Character case is ignored for all of the valid script names.
         * The en_US locale's case mapping rules are used to provide
         * case-insensitive string comparisons for script name validation.
         *
         * @param scriptName A {@code UnicodeScript} name.
         * @return The {@code UnicodeScript} constant identified
         *         by {@code scriptName}
         * @throws IllegalArgumentException if {@code scriptName} is an
         *         invalid name
         * @throws NullPointerException if {@code scriptName} is null
         */
        public static final UnicodeScript forName(String scriptName) {
            scriptName = scriptName.toUpperCase(Locale.ENGLISH);
                                 //.replace(' ', '_'));
            UnicodeScript sc = aliases.get(scriptName);
            if (sc != null)
                return sc;
            return valueOf(scriptName);
        }
    }

    /**
     * The value of the {@code Character}.
     *
     * @serial
     */
    private final char value;

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = 3786198910865385080L;

    /**
     * Constructs a newly allocated {@code Character} object that
     * represents the specified {@code char} value.
     *
     * @param  value   the value to be represented by the
     *                  {@code Character} object.
     *
     * @deprecated
     * It is rarely appropriate to use this constructor. The static factory
     * {@link #valueOf(char)} is generally a better choice, as it is
     * likely to yield significantly better space and time performance.
     */
    @Deprecated(since="9")
    public Character(char value) {
        this.value = value;
    }

    private static class CharacterCache {
        private CharacterCache(){}

        static final Character[] cache;
        static Character[] archivedCache;

        static {
            int size = 127 + 1;

            // Load and use the archived cache if it exists
            VM.initializeFromArchive(CharacterCache.class);
            if (archivedCache == null || archivedCache.length != size) {
                Character[] c = new Character[size];
                for (int i = 0; i < size; i++) {
                    c[i] = new Character((char) i);
                }
                archivedCache = c;
            }
            cache = archivedCache;
        }
    }

    /**
     * Returns a {@code Character} instance representing the specified
     * {@code char} value.
     * If a new {@code Character} instance is not required, this method
     * should generally be used in preference to the constructor
     * {@link #Character(char)}, as this method is likely to yield
     * significantly better space and time performance by caching
     * frequently requested values.
     *
     * This method will always cache values in the range {@code
     * '\u005Cu0000'} to {@code '\u005Cu007F'}, inclusive, and may
     * cache other values outside of this range.
     *
     * @param  c a char value.
     * @return a {@code Character} instance representing {@code c}.
     * @since  1.5
     */
    @HotSpotIntrinsicCandidate
    public static Character valueOf(char c) {
        if (c <= 127) { // must cache
            return CharacterCache.cache[(int)c];
        }
        return new Character(c);
    }

    /**
     * Returns the value of this {@code Character} object.
     * @return  the primitive {@code char} value represented by
     *          this object.
     */
    @HotSpotIntrinsicCandidate
    public char charValue() {
        return value;
    }

    /**
     * Returns a hash code for this {@code Character}; equal to the result
     * of invoking {@code charValue()}.
     *
     * @return a hash code value for this {@code Character}
     */
    @Override
    public int hashCode() {
        return Character.hashCode(value);
    }

    /**
     * Returns a hash code for a {@code char} value; compatible with
     * {@code Character.hashCode()}.
     *
     * @since 1.8
     *
     * @param value The {@code char} for which to return a hash code.
     * @return a hash code value for a {@code char} value.
     */
    public static int hashCode(char value) {
        return (int)value;
    }

    /**
     * Compares this object against the specified object.
     * The result is {@code true} if and only if the argument is not
     * {@code null} and is a {@code Character} object that
     * represents the same {@code char} value as this object.
     *
     * @param   obj   the object to compare with.
     * @return  {@code true} if the objects are the same;
     *          {@code false} otherwise.
     */
    public boolean equals(Object obj) {
        if (obj instanceof Character) {
            return value == ((Character)obj).charValue();
        }
        return false;
    }

    /**
     * Returns a {@code String} object representing this
     * {@code Character}'s value.  The result is a string of
     * length 1 whose sole component is the primitive
     * {@code char} value represented by this
     * {@code Character} object.
     *
     * @return  a string representation of this object.
     */
    public String toString() {
        char[] buf = {value};
        return String.valueOf(buf);
    }

    /**
     * Returns a {@code String} object representing the
     * specified {@code char}.  The result is a string of length
     * 1 consisting solely of the specified {@code char}.
     *
     * @apiNote This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #toString(int)} method.
     *
     * @param c the {@code char} to be converted
     * @return the string representation of the specified {@code char}
     * @since 1.4
     */
    public static String toString(char c) {
        return String.valueOf(c);
    }

    /**
     * Returns a {@code String} object representing the
     * specified character (Unicode code point).  The result is a string of
     * length 1 or 2, consisting solely of the specified {@code codePoint}.
     *
     * @param codePoint the {@code codePoint} to be converted
     * @return the string representation of the specified {@code codePoint}
     * @throws IllegalArgumentException if the specified
     *      {@code codePoint} is not a {@linkplain #isValidCodePoint
     *      valid Unicode code point}.
     * @since 11
     */
    public static String toString(int codePoint) {
        return String.valueOfCodePoint(codePoint);
    }

    /**
     * Determines whether the specified code point is a valid
     * <a href="http://www.unicode.org/glossary/#code_point">
     * Unicode code point value</a>.
     *
     * @param  codePoint the Unicode code point to be tested
     * @return {@code true} if the specified code point value is between
     *         {@link #MIN_CODE_POINT} and
     *         {@link #MAX_CODE_POINT} inclusive;
     *         {@code false} otherwise.
     * @since  1.5
     */
    public static boolean isValidCodePoint(int codePoint) {
        // Optimized form of:
        //     codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT
        int plane = codePoint >>> 16;
        return plane < ((MAX_CODE_POINT + 1) >>> 16);
    }

    /**
     * Determines whether the specified character (Unicode code point)
     * is in the <a href="#BMP">Basic Multilingual Plane (BMP)</a>.
     * Such code points can be represented using a single {@code char}.
     *
     * @param  codePoint the character (Unicode code point) to be tested
     * @return {@code true} if the specified code point is between
     *         {@link #MIN_VALUE} and {@link #MAX_VALUE} inclusive;
     *         {@code false} otherwise.
     * @since  1.7
     */
    public static boolean isBmpCodePoint(int codePoint) {
        // Optimized form of:
        //     codePoint >= MIN_VALUE && codePoint <= MAX_VALUE
        // We consistently use logical shift (>>>) to facilitate
        // additional runtime optimizations.
        return codePoint >>> 16 == 0;
    }

    /**
     * Determines whether the specified character (Unicode code point)
     * is in the <a href="#supplementary">supplementary character</a> range.
     *
     * @param  codePoint the character (Unicode code point) to be tested
     * @return {@code true} if the specified code point is between
     *         {@link #MIN_SUPPLEMENTARY_CODE_POINT} and
     *         {@link #MAX_CODE_POINT} inclusive;
     *         {@code false} otherwise.
     * @since  1.5
     */
    public static boolean isSupplementaryCodePoint(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <  MAX_CODE_POINT + 1;
    }

    /**
     * Determines if the given {@code char} value is a
     * <a href="http://www.unicode.org/glossary/#high_surrogate_code_unit">
     * Unicode high-surrogate code unit</a>
     * (also known as <i>leading-surrogate code unit</i>).
     *
     * <p>Such values do not represent characters by themselves,
     * but are used in the representation of
     * <a href="#supplementary">supplementary characters</a>
     * in the UTF-16 encoding.
     *
     * @param  ch the {@code char} value to be tested.
     * @return {@code true} if the {@code char} value is between
     *         {@link #MIN_HIGH_SURROGATE} and
     *         {@link #MAX_HIGH_SURROGATE} inclusive;
     *         {@code false} otherwise.
     * @see    Character#isLowSurrogate(char)
     * @see    Character.UnicodeBlock#of(int)
     * @since  1.5
     */
    public static boolean isHighSurrogate(char ch) {
        // Help VM constant-fold; MAX_HIGH_SURROGATE + 1 == MIN_LOW_SURROGATE
        return ch >= MIN_HIGH_SURROGATE && ch < (MAX_HIGH_SURROGATE + 1);
    }

    /**
     * Determines if the given {@code char} value is a
     * <a href="http://www.unicode.org/glossary/#low_surrogate_code_unit">
     * Unicode low-surrogate code unit</a>
     * (also known as <i>trailing-surrogate code unit</i>).
     *
     * <p>Such values do not represent characters by themselves,
     * but are used in the representation of
     * <a href="#supplementary">supplementary characters</a>
     * in the UTF-16 encoding.
     *
     * @param  ch the {@code char} value to be tested.
     * @return {@code true} if the {@code char} value is between
     *         {@link #MIN_LOW_SURROGATE} and
     *         {@link #MAX_LOW_SURROGATE} inclusive;
     *         {@code false} otherwise.
     * @see    Character#isHighSurrogate(char)
     * @since  1.5
     */
    public static boolean isLowSurrogate(char ch) {
        return ch >= MIN_LOW_SURROGATE && ch < (MAX_LOW_SURROGATE + 1);
    }

    /**
     * Determines if the given {@code char} value is a Unicode
     * <i>surrogate code unit</i>.
     *
     * <p>Such values do not represent characters by themselves,
     * but are used in the representation of
     * <a href="#supplementary">supplementary characters</a>
     * in the UTF-16 encoding.
     *
     * <p>A char value is a surrogate code unit if and only if it is either
     * a {@linkplain #isLowSurrogate(char) low-surrogate code unit} or
     * a {@linkplain #isHighSurrogate(char) high-surrogate code unit}.
     *
     * @param  ch the {@code char} value to be tested.
     * @return {@code true} if the {@code char} value is between
     *         {@link #MIN_SURROGATE} and
     *         {@link #MAX_SURROGATE} inclusive;
     *         {@code false} otherwise.
     * @since  1.7
     */
    public static boolean isSurrogate(char ch) {
        return ch >= MIN_SURROGATE && ch < (MAX_SURROGATE + 1);
    }

    /**
     * Determines whether the specified pair of {@code char}
     * values is a valid
     * <a href="http://www.unicode.org/glossary/#surrogate_pair">
     * Unicode surrogate pair</a>.
     *
     * <p>This method is equivalent to the expression:
     * <blockquote><pre>{@code
     * isHighSurrogate(high) && isLowSurrogate(low)
     * }</pre></blockquote>
     *
     * @param  high the high-surrogate code value to be tested
     * @param  low the low-surrogate code value to be tested
     * @return {@code true} if the specified high and
     *         low-surrogate code values represent a valid surrogate pair;
     *         {@code false} otherwise.
     * @since  1.5
     */
    public static boolean isSurrogatePair(char high, char low) {
        return isHighSurrogate(high) && isLowSurrogate(low);
    }

    /**
     * Determines the number of {@code char} values needed to
     * represent the specified character (Unicode code point). If the
     * specified character is equal to or greater than 0x10000, then
     * the method returns 2. Otherwise, the method returns 1.
     *
     * <p>This method does not check that the specified character is a
     * valid Unicode code point. The caller must validate the
     * character value using {@link #isValidCodePoint(int) isValidCodePoint}
     * if necessary.
     *
     * @param   codePoint the character (Unicode code point) to be tested.
     * @return  2 if the character is a valid supplementary character; 1 otherwise.
     * @see     Character#isSupplementaryCodePoint(int)
     * @since   1.5
     */
    public static int charCount(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT ? 2 : 1;
    }

    /**
     * Converts the specified surrogate pair to its supplementary code
     * point value. This method does not validate the specified
     * surrogate pair. The caller must validate it using {@link
     * #isSurrogatePair(char, char) isSurrogatePair} if necessary.
     *
     * @param  high the high-surrogate code unit
     * @param  low the low-surrogate code unit
     * @return the supplementary code point composed from the
     *         specified surrogate pair.
     * @since  1.5
     */
    public static int toCodePoint(char high, char low) {
        // Optimized form of:
        // return ((high - MIN_HIGH_SURROGATE) << 10)
        //         + (low - MIN_LOW_SURROGATE)
        //         + MIN_SUPPLEMENTARY_CODE_POINT;
        return ((high << 10) + low) + (MIN_SUPPLEMENTARY_CODE_POINT
                                       - (MIN_HIGH_SURROGATE << 10)
                                       - MIN_LOW_SURROGATE);
    }

    /**
     * Returns the code point at the given index of the
     * {@code CharSequence}. If the {@code char} value at
     * the given index in the {@code CharSequence} is in the
     * high-surrogate range, the following index is less than the
     * length of the {@code CharSequence}, and the
     * {@code char} value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at the given index is returned.
     *
     * @param seq a sequence of {@code char} values (Unicode code
     * units)
     * @param index the index to the {@code char} values (Unicode
     * code units) in {@code seq} to be converted
     * @return the Unicode code point at the given index
     * @throws NullPointerException if {@code seq} is null.
     * @throws IndexOutOfBoundsException if the value
     * {@code index} is negative or not less than
     * {@link CharSequence#length() seq.length()}.
     * @since  1.5
     */
    public static int codePointAt(CharSequence seq, int index) {
        char c1 = seq.charAt(index);
        if (isHighSurrogate(c1) && ++index < seq.length()) {
            char c2 = seq.charAt(index);
            if (isLowSurrogate(c2)) {
                return toCodePoint(c1, c2);
            }
        }
        return c1;
    }

    /**
     * Returns the code point at the given index of the
     * {@code char} array. If the {@code char} value at
     * the given index in the {@code char} array is in the
     * high-surrogate range, the following index is less than the
     * length of the {@code char} array, and the
     * {@code char} value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at the given index is returned.
     *
     * @param a the {@code char} array
     * @param index the index to the {@code char} values (Unicode
     * code units) in the {@code char} array to be converted
     * @return the Unicode code point at the given index
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException if the value
     * {@code index} is negative or not less than
     * the length of the {@code char} array.
     * @since  1.5
     */
    public static int codePointAt(char[] a, int index) {
        return codePointAtImpl(a, index, a.length);
    }

    /**
     * Returns the code point at the given index of the
     * {@code char} array, where only array elements with
     * {@code index} less than {@code limit} can be used. If
     * the {@code char} value at the given index in the
     * {@code char} array is in the high-surrogate range, the
     * following index is less than the {@code limit}, and the
     * {@code char} value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at the given index is returned.
     *
     * @param a the {@code char} array
     * @param index the index to the {@code char} values (Unicode
     * code units) in the {@code char} array to be converted
     * @param limit the index after the last array element that
     * can be used in the {@code char} array
     * @return the Unicode code point at the given index
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException if the {@code index}
     * argument is negative or not less than the {@code limit}
     * argument, or if the {@code limit} argument is negative or
     * greater than the length of the {@code char} array.
     * @since  1.5
     */
    public static int codePointAt(char[] a, int index, int limit) {
        if (index >= limit || limit < 0 || limit > a.length) {
            throw new IndexOutOfBoundsException();
        }
        return codePointAtImpl(a, index, limit);
    }

    // throws ArrayIndexOutOfBoundsException if index out of bounds
    static int codePointAtImpl(char[] a, int index, int limit) {
        char c1 = a[index];
        if (isHighSurrogate(c1) && ++index < limit) {
            char c2 = a[index];
            if (isLowSurrogate(c2)) {
                return toCodePoint(c1, c2);
            }
        }
        return c1;
    }

    /**
     * Returns the code point preceding the given index of the
     * {@code CharSequence}. If the {@code char} value at
     * {@code (index - 1)} in the {@code CharSequence} is in
     * the low-surrogate range, {@code (index - 2)} is not
     * negative, and the {@code char} value at {@code (index - 2)}
     * in the {@code CharSequence} is in the
     * high-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at {@code (index - 1)} is
     * returned.
     *
     * @param seq the {@code CharSequence} instance
     * @param index the index following the code point that should be returned
     * @return the Unicode code point value before the given index.
     * @throws NullPointerException if {@code seq} is null.
     * @throws IndexOutOfBoundsException if the {@code index}
     * argument is less than 1 or greater than {@link
     * CharSequence#length() seq.length()}.
     * @since  1.5
     */
    public static int codePointBefore(CharSequence seq, int index) {
        char c2 = seq.charAt(--index);
        if (isLowSurrogate(c2) && index > 0) {
            char c1 = seq.charAt(--index);
            if (isHighSurrogate(c1)) {
                return toCodePoint(c1, c2);
            }
        }
        return c2;
    }

    /**
     * Returns the code point preceding the given index of the
     * {@code char} array. If the {@code char} value at
     * {@code (index - 1)} in the {@code char} array is in
     * the low-surrogate range, {@code (index - 2)} is not
     * negative, and the {@code char} value at {@code (index - 2)}
     * in the {@code char} array is in the
     * high-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at {@code (index - 1)} is
     * returned.
     *
     * @param a the {@code char} array
     * @param index the index following the code point that should be returned
     * @return the Unicode code point value before the given index.
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException if the {@code index}
     * argument is less than 1 or greater than the length of the
     * {@code char} array
     * @since  1.5
     */
    public static int codePointBefore(char[] a, int index) {
        return codePointBeforeImpl(a, index, 0);
    }

    /**
     * Returns the code point preceding the given index of the
     * {@code char} array, where only array elements with
     * {@code index} greater than or equal to {@code start}
     * can be used. If the {@code char} value at {@code (index - 1)}
     * in the {@code char} array is in the
     * low-surrogate range, {@code (index - 2)} is not less than
     * {@code start}, and the {@code char} value at
     * {@code (index - 2)} in the {@code char} array is in
     * the high-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at {@code (index - 1)} is
     * returned.
     *
     * @param a the {@code char} array
     * @param index the index following the code point that should be returned
     * @param start the index of the first array element in the
     * {@code char} array
     * @return the Unicode code point value before the given index.
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException if the {@code index}
     * argument is not greater than the {@code start} argument or
     * is greater than the length of the {@code char} array, or
     * if the {@code start} argument is negative or not less than
     * the length of the {@code char} array.
     * @since  1.5
     */
    public static int codePointBefore(char[] a, int index, int start) {
        if (index <= start || start < 0 || start >= a.length) {
            throw new IndexOutOfBoundsException();
        }
        return codePointBeforeImpl(a, index, start);
    }

    // throws ArrayIndexOutOfBoundsException if index-1 out of bounds
    static int codePointBeforeImpl(char[] a, int index, int start) {
        char c2 = a[--index];
        if (isLowSurrogate(c2) && index > start) {
            char c1 = a[--index];
            if (isHighSurrogate(c1)) {
                return toCodePoint(c1, c2);
            }
        }
        return c2;
    }

    /**
     * Returns the leading surrogate (a
     * <a href="http://www.unicode.org/glossary/#high_surrogate_code_unit">
     * high surrogate code unit</a>) of the
     * <a href="http://www.unicode.org/glossary/#surrogate_pair">
     * surrogate pair</a>
     * representing the specified supplementary character (Unicode
     * code point) in the UTF-16 encoding.  If the specified character
     * is not a
     * <a href="Character.html#supplementary">supplementary character</a>,
     * an unspecified {@code char} is returned.
     *
     * <p>If
     * {@link #isSupplementaryCodePoint isSupplementaryCodePoint(x)}
     * is {@code true}, then
     * {@link #isHighSurrogate isHighSurrogate}{@code (highSurrogate(x))} and
     * {@link #toCodePoint toCodePoint}{@code (highSurrogate(x), }{@link #lowSurrogate lowSurrogate}{@code (x)) == x}
     * are also always {@code true}.
8700 * 8701 * @param codePoint a supplementary character (Unicode code point) 8702 * @return the leading surrogate code unit used to represent the 8703 * character in the UTF-16 encoding 8704 * @since 1.7 8705 */ 8706 public static char highSurrogate(int codePoint) { 8707 return (char) ((codePoint >>> 10) 8708 + (MIN_HIGH_SURROGATE - (MIN_SUPPLEMENTARY_CODE_POINT >>> 10))); 8709 } 8710 8711 /** 8712 * Returns the trailing surrogate (a 8713 * <a href="http://www.unicode.org/glossary/#low_surrogate_code_unit"> 8714 * low surrogate code unit</a>) of the 8715 * <a href="http://www.unicode.org/glossary/#surrogate_pair"> 8716 * surrogate pair</a> 8717 * representing the specified supplementary character (Unicode 8718 * code point) in the UTF-16 encoding. If the specified character 8719 * is not a 8720 * <a href="Character.html#supplementary">supplementary character</a>, 8721 * an unspecified {@code char} is returned. 8722 * 8723 * <p>If 8724 * {@link #isSupplementaryCodePoint isSupplementaryCodePoint(x)} 8725 * is {@code true}, then 8726 * {@link #isLowSurrogate isLowSurrogate}{@code (lowSurrogate(x))} and 8727 * {@link #toCodePoint toCodePoint}{@code (}{@link #highSurrogate highSurrogate}{@code (x), lowSurrogate(x)) == x} 8728 * are also always {@code true}. 8729 * 8730 * @param codePoint a supplementary character (Unicode code point) 8731 * @return the trailing surrogate code unit used to represent the 8732 * character in the UTF-16 encoding 8733 * @since 1.7 8734 */ 8735 public static char lowSurrogate(int codePoint) { 8736 return (char) ((codePoint & 0x3ff) + MIN_LOW_SURROGATE); 8737 } 8738 8739 /** 8740 * Converts the specified character (Unicode code point) to its 8741 * UTF-16 representation. If the specified code point is a BMP 8742 * (Basic Multilingual Plane or Plane 0) value, the same value is 8743 * stored in {@code dst[dstIndex]}, and 1 is returned. 
If the 8744 * specified code point is a supplementary character, its 8745 * surrogate values are stored in {@code dst[dstIndex]} 8746 * (high-surrogate) and {@code dst[dstIndex+1]} 8747 * (low-surrogate), and 2 is returned. 8748 * 8749 * @param codePoint the character (Unicode code point) to be converted. 8750 * @param dst an array of {@code char} in which the 8751 * {@code codePoint}'s UTF-16 value is stored. 8752 * @param dstIndex the start index into the {@code dst} 8753 * array where the converted value is stored. 8754 * @return 1 if the code point is a BMP code point, 2 if the 8755 * code point is a supplementary code point. 8756 * @throws IllegalArgumentException if the specified 8757 * {@code codePoint} is not a valid Unicode code point. 8758 * @throws NullPointerException if the specified {@code dst} is null. 8759 * @throws IndexOutOfBoundsException if {@code dstIndex} 8760 * is negative or not less than {@code dst.length}, or if 8761 * {@code dst} at {@code dstIndex} doesn't have enough 8762 * array element(s) to store the resulting {@code char} 8763 * value(s). (If {@code dstIndex} is equal to 8764 * {@code dst.length-1} and the specified 8765 * {@code codePoint} is a supplementary character, the 8766 * high-surrogate value is not stored in 8767 * {@code dst[dstIndex]}.) 8768 * @since 1.5 8769 */ 8770 public static int toChars(int codePoint, char[] dst, int dstIndex) { 8771 if (isBmpCodePoint(codePoint)) { 8772 dst[dstIndex] = (char) codePoint; 8773 return 1; 8774 } else if (isValidCodePoint(codePoint)) { 8775 toSurrogates(codePoint, dst, dstIndex); 8776 return 2; 8777 } else { 8778 throw new IllegalArgumentException( 8779 String.format("Not a valid Unicode code point: 0x%X", codePoint)); 8780 } 8781 } 8782 8783 /** 8784 * Converts the specified character (Unicode code point) to its 8785 * UTF-16 representation stored in a {@code char} array. 
     * If the specified code point is a BMP (Basic Multilingual Plane or
     * Plane 0) value, the resulting {@code char} array has
     * the same value as {@code codePoint}. If the specified code
     * point is a supplementary code point, the resulting
     * {@code char} array has the corresponding surrogate pair.
     *
     * @param codePoint a Unicode code point
     * @return a {@code char} array having
     *         {@code codePoint}'s UTF-16 representation.
     * @throws IllegalArgumentException if the specified
     *         {@code codePoint} is not a valid Unicode code point.
     * @since 1.5
     */
    public static char[] toChars(int codePoint) {
        if (isBmpCodePoint(codePoint)) {
            return new char[] { (char) codePoint };
        } else if (isValidCodePoint(codePoint)) {
            char[] result = new char[2];
            toSurrogates(codePoint, result, 0);
            return result;
        } else {
            throw new IllegalArgumentException(
                String.format("Not a valid Unicode code point: 0x%X", codePoint));
        }
    }

    static void toSurrogates(int codePoint, char[] dst, int index) {
        // We write elements "backwards" to guarantee all-or-nothing
        dst[index+1] = lowSurrogate(codePoint);
        dst[index] = highSurrogate(codePoint);
    }

    /**
     * Returns the number of Unicode code points in the text range of
     * the specified char sequence. The text range begins at the
     * specified {@code beginIndex} and extends to the
     * {@code char} at index {@code endIndex - 1}. Thus the
     * length (in {@code char}s) of the text range is
     * {@code endIndex-beginIndex}. Unpaired surrogates within
     * the text range count as one code point each.
     *
     * @param seq the char sequence
     * @param beginIndex the index to the first {@code char} of
     *        the text range.
     * @param endIndex the index after the last {@code char} of
     *        the text range.
     * @return the number of Unicode code points in the specified text
     *         range
     * @throws NullPointerException if {@code seq} is null.
     * @throws IndexOutOfBoundsException if the
     *         {@code beginIndex} is negative, or {@code endIndex}
     *         is larger than the length of the given sequence, or
     *         {@code beginIndex} is larger than {@code endIndex}.
     * @since 1.5
     */
    public static int codePointCount(CharSequence seq, int beginIndex, int endIndex) {
        int length = seq.length();
        if (beginIndex < 0 || endIndex > length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        int n = endIndex - beginIndex;
        for (int i = beginIndex; i < endIndex; ) {
            if (isHighSurrogate(seq.charAt(i++)) && i < endIndex &&
                isLowSurrogate(seq.charAt(i))) {
                n--;
                i++;
            }
        }
        return n;
    }

    /**
     * Returns the number of Unicode code points in a subarray of the
     * {@code char} array argument. The {@code offset}
     * argument is the index of the first {@code char} of the
     * subarray and the {@code count} argument specifies the
     * length of the subarray in {@code char}s. Unpaired
     * surrogates within the subarray count as one code point each.
     *
     * @param a the {@code char} array
     * @param offset the index of the first {@code char} in the
     *        given {@code char} array
     * @param count the length of the subarray in {@code char}s
     * @return the number of Unicode code points in the specified subarray
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException if {@code offset} or
     *         {@code count} is negative, or if {@code offset +
     *         count} is larger than the length of the given array.
     * @since 1.5
     */
    public static int codePointCount(char[] a, int offset, int count) {
        if (count > a.length - offset || offset < 0 || count < 0) {
            throw new IndexOutOfBoundsException();
        }
        return codePointCountImpl(a, offset, count);
    }

    static int codePointCountImpl(char[] a, int offset, int count) {
        int endIndex = offset + count;
        int n = count;
        for (int i = offset; i < endIndex; ) {
            if (isHighSurrogate(a[i++]) && i < endIndex &&
                isLowSurrogate(a[i])) {
                n--;
                i++;
            }
        }
        return n;
    }

    /**
     * Returns the index within the given char sequence that is offset
     * from the given {@code index} by {@code codePointOffset}
     * code points. Unpaired surrogates within the text range given by
     * {@code index} and {@code codePointOffset} count as
     * one code point each.
     *
     * @param seq the char sequence
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within the char sequence
     * @throws NullPointerException if {@code seq} is null.
     * @throws IndexOutOfBoundsException if {@code index}
     *         is negative or larger than the length of the char sequence,
     *         or if {@code codePointOffset} is positive and the
     *         subsequence starting with {@code index} has fewer than
     *         {@code codePointOffset} code points, or if
     *         {@code codePointOffset} is negative and the subsequence
     *         before {@code index} has fewer than the absolute value
     *         of {@code codePointOffset} code points.
     * @since 1.5
     */
    public static int offsetByCodePoints(CharSequence seq, int index,
                                         int codePointOffset) {
        int length = seq.length();
        if (index < 0 || index > length) {
            throw new IndexOutOfBoundsException();
        }

        int x = index;
        if (codePointOffset >= 0) {
            int i;
            for (i = 0; x < length && i < codePointOffset; i++) {
                if (isHighSurrogate(seq.charAt(x++)) && x < length &&
                    isLowSurrogate(seq.charAt(x))) {
                    x++;
                }
            }
            if (i < codePointOffset) {
                throw new IndexOutOfBoundsException();
            }
        } else {
            int i;
            for (i = codePointOffset; x > 0 && i < 0; i++) {
                if (isLowSurrogate(seq.charAt(--x)) && x > 0 &&
                    isHighSurrogate(seq.charAt(x-1))) {
                    x--;
                }
            }
            if (i < 0) {
                throw new IndexOutOfBoundsException();
            }
        }
        return x;
    }

    /**
     * Returns the index within the given {@code char} subarray
     * that is offset from the given {@code index} by
     * {@code codePointOffset} code points. The
     * {@code start} and {@code count} arguments specify a
     * subarray of the {@code char} array. Unpaired surrogates
     * within the text range given by {@code index} and
     * {@code codePointOffset} count as one code point each.
     *
     * @param a the {@code char} array
     * @param start the index of the first {@code char} of the
     *        subarray
     * @param count the length of the subarray in {@code char}s
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within the subarray
     * @throws NullPointerException if {@code a} is null.
     * @throws IndexOutOfBoundsException
     *         if {@code start} or {@code count} is negative,
     *         or if {@code start + count} is larger than the length of
     *         the given array,
     *         or if {@code index} is less than {@code start} or
     *         larger than {@code start + count},
     *         or if {@code codePointOffset} is positive and the text range
     *         starting with {@code index} and ending with {@code start + count - 1}
     *         has fewer than {@code codePointOffset} code
     *         points,
     *         or if {@code codePointOffset} is negative and the text range
     *         starting with {@code start} and ending with {@code index - 1}
     *         has fewer than the absolute value of
     *         {@code codePointOffset} code points.
     * @since 1.5
     */
    public static int offsetByCodePoints(char[] a, int start, int count,
                                         int index, int codePointOffset) {
        if (count > a.length-start || start < 0 || count < 0
            || index < start || index > start+count) {
            throw new IndexOutOfBoundsException();
        }
        return offsetByCodePointsImpl(a, start, count, index, codePointOffset);
    }

    static int offsetByCodePointsImpl(char[] a, int start, int count,
                                      int index, int codePointOffset) {
        int x = index;
        if (codePointOffset >= 0) {
            int limit = start + count;
            int i;
            for (i = 0; x < limit && i < codePointOffset; i++) {
                if (isHighSurrogate(a[x++]) && x < limit &&
                    isLowSurrogate(a[x])) {
                    x++;
                }
            }
            if (i < codePointOffset) {
                throw new IndexOutOfBoundsException();
            }
        } else {
            int i;
            for (i = codePointOffset; x > start && i < 0; i++) {
                if (isLowSurrogate(a[--x]) && x > start &&
                    isHighSurrogate(a[x-1])) {
                    x--;
                }
            }
            if (i < 0) {
                throw new IndexOutOfBoundsException();
            }
        }
        return x;
    }

    /**
     * Determines if the specified character is a lowercase character.
     * <p>
     * A character is lowercase if its general category type, provided
     * by {@code Character.getType(ch)}, is
     * {@code LOWERCASE_LETTER}, or it has contributory property
     * Other_Lowercase as defined by the Unicode Standard.
     * <p>
     * The following are examples of lowercase characters:
     * <blockquote><pre>
     * a b c d e f g h i j k l m n o p q r s t u v w x y z
     * '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6'
     * '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE'
     * '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6'
     * '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
     * </pre></blockquote>
     * <p> Many other Unicode characters are lowercase too.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isLowerCase(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is lowercase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(char)
     * @see Character#isTitleCase(char)
     * @see Character#toLowerCase(char)
     * @see Character#getType(char)
     */
    public static boolean isLowerCase(char ch) {
        return isLowerCase((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is a
     * lowercase character.
     * <p>
     * A character is lowercase if its general category type, provided
     * by {@link Character#getType getType(codePoint)}, is
     * {@code LOWERCASE_LETTER}, or it has contributory property
     * Other_Lowercase as defined by the Unicode Standard.
     * <p>
     * The following are examples of lowercase characters:
     * <blockquote><pre>
     * a b c d e f g h i j k l m n o p q r s t u v w x y z
     * '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6'
     * '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE'
     * '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6'
     * '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
     * </pre></blockquote>
     * <p> Many other Unicode characters are lowercase too.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is lowercase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(int)
     * @see Character#isTitleCase(int)
     * @see Character#toLowerCase(int)
     * @see Character#getType(int)
     * @since 1.5
     */
    public static boolean isLowerCase(int codePoint) {
        return CharacterData.of(codePoint).isLowerCase(codePoint) ||
               CharacterData.of(codePoint).isOtherLowercase(codePoint);
    }

    /**
     * Determines if the specified character is an uppercase character.
     * <p>
     * A character is uppercase if its general category type, provided by
     * {@code Character.getType(ch)}, is {@code UPPERCASE_LETTER},
     * or it has contributory property Other_Uppercase as defined by the Unicode Standard.
     * <p>
     * The following are examples of uppercase characters:
     * <blockquote><pre>
     * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     * '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7'
     * '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF'
     * '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8'
     * '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
     * </pre></blockquote>
     * <p> Many other Unicode characters are uppercase too.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isUpperCase(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is uppercase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(char)
     * @see Character#isTitleCase(char)
     * @see Character#toUpperCase(char)
     * @see Character#getType(char)
     * @since 1.0
     */
    public static boolean isUpperCase(char ch) {
        return isUpperCase((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is an uppercase character.
     * <p>
     * A character is uppercase if its general category type, provided by
     * {@link Character#getType(int) getType(codePoint)}, is {@code UPPERCASE_LETTER},
     * or it has contributory property Other_Uppercase as defined by the Unicode Standard.
     * <p>
     * The following are examples of uppercase characters:
     * <blockquote><pre>
     * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     * '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7'
     * '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF'
     * '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8'
     * '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
     * </pre></blockquote>
     * <p> Many other Unicode characters are uppercase too.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is uppercase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(int)
     * @see Character#isTitleCase(int)
     * @see Character#toUpperCase(int)
     * @see Character#getType(int)
     * @since 1.5
     */
    public static boolean isUpperCase(int codePoint) {
        return CharacterData.of(codePoint).isUpperCase(codePoint) ||
               CharacterData.of(codePoint).isOtherUppercase(codePoint);
    }

    /**
     * Determines if the specified character is a titlecase character.
     * <p>
     * A character is a titlecase character if its general
     * category type, provided by {@code Character.getType(ch)},
     * is {@code TITLECASE_LETTER}.
     * <p>
     * Some characters look like pairs of Latin letters. For example, there
     * is an uppercase letter that looks like "LJ" and has a corresponding
     * lowercase letter that looks like "lj". A third form, which looks like "Lj",
     * is the appropriate form to use when rendering a word in lowercase
     * with initial capitals, as for a book title.
     * <p>
     * These are some of the Unicode characters for which this method returns
     * {@code true}:
     * <ul>
     * <li>{@code LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON}
     * <li>{@code LATIN CAPITAL LETTER L WITH SMALL LETTER J}
     * <li>{@code LATIN CAPITAL LETTER N WITH SMALL LETTER J}
     * <li>{@code LATIN CAPITAL LETTER D WITH SMALL LETTER Z}
     * </ul>
     * <p> Many other Unicode characters are titlecase too.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isTitleCase(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is titlecase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(char)
     * @see Character#isUpperCase(char)
     * @see Character#toTitleCase(char)
     * @see Character#getType(char)
     * @since 1.0.2
     */
    public static boolean isTitleCase(char ch) {
        return isTitleCase((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is a titlecase character.
     * <p>
     * A character is a titlecase character if its general
     * category type, provided by {@link Character#getType(int) getType(codePoint)},
     * is {@code TITLECASE_LETTER}.
     * <p>
     * Some characters look like pairs of Latin letters. For example, there
     * is an uppercase letter that looks like "LJ" and has a corresponding
     * lowercase letter that looks like "lj". A third form, which looks like "Lj",
     * is the appropriate form to use when rendering a word in lowercase
     * with initial capitals, as for a book title.
     * <p>
     * These are some of the Unicode characters for which this method returns
     * {@code true}:
     * <ul>
     * <li>{@code LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON}
     * <li>{@code LATIN CAPITAL LETTER L WITH SMALL LETTER J}
     * <li>{@code LATIN CAPITAL LETTER N WITH SMALL LETTER J}
     * <li>{@code LATIN CAPITAL LETTER D WITH SMALL LETTER Z}
     * </ul>
     * <p> Many other Unicode characters are titlecase too.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is titlecase;
     *         {@code false} otherwise.
     * @see Character#isLowerCase(int)
     * @see Character#isUpperCase(int)
     * @see Character#toTitleCase(int)
     * @see Character#getType(int)
     * @since 1.5
     */
    public static boolean isTitleCase(int codePoint) {
        return getType(codePoint) == Character.TITLECASE_LETTER;
    }

    /**
     * Determines if the specified character is a digit.
     * <p>
     * A character is a digit if its general category type, provided
     * by {@code Character.getType(ch)}, is
     * {@code DECIMAL_DIGIT_NUMBER}.
     * <p>
     * Some Unicode character ranges that contain digits:
     * <ul>
     * <li>{@code '\u005Cu0030'} through {@code '\u005Cu0039'},
     *     ISO-LATIN-1 digits ({@code '0'} through {@code '9'})
     * <li>{@code '\u005Cu0660'} through {@code '\u005Cu0669'},
     *     Arabic-Indic digits
     * <li>{@code '\u005Cu06F0'} through {@code '\u005Cu06F9'},
     *     Extended Arabic-Indic digits
     * <li>{@code '\u005Cu0966'} through {@code '\u005Cu096F'},
     *     Devanagari digits
     * <li>{@code '\u005CuFF10'} through {@code '\u005CuFF19'},
     *     Fullwidth digits
     * </ul>
     *
     * Many other character ranges contain digits as well.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isDigit(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is a digit;
     *         {@code false} otherwise.
     * @see Character#digit(char, int)
     * @see Character#forDigit(int, int)
     * @see Character#getType(char)
     */
    public static boolean isDigit(char ch) {
        return isDigit((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is a digit.
     * <p>
     * A character is a digit if its general category type, provided
     * by {@link Character#getType(int) getType(codePoint)}, is
     * {@code DECIMAL_DIGIT_NUMBER}.
     * <p>
     * Some Unicode character ranges that contain digits:
     * <ul>
     * <li>{@code '\u005Cu0030'} through {@code '\u005Cu0039'},
     *     ISO-LATIN-1 digits ({@code '0'} through {@code '9'})
     * <li>{@code '\u005Cu0660'} through {@code '\u005Cu0669'},
     *     Arabic-Indic digits
     * <li>{@code '\u005Cu06F0'} through {@code '\u005Cu06F9'},
     *     Extended Arabic-Indic digits
     * <li>{@code '\u005Cu0966'} through {@code '\u005Cu096F'},
     *     Devanagari digits
     * <li>{@code '\u005CuFF10'} through {@code '\u005CuFF19'},
     *     Fullwidth digits
     * </ul>
     *
     * Many other character ranges contain digits as well.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a digit;
     *         {@code false} otherwise.
     * @see Character#forDigit(int, int)
     * @see Character#getType(int)
     * @since 1.5
     */
    public static boolean isDigit(int codePoint) {
        return CharacterData.of(codePoint).isDigit(codePoint);
    }

    /**
     * Determines if a character is defined in Unicode.
     * <p>
     * A character is defined if at least one of the following is true:
     * <ul>
     * <li>It has an entry in the UnicodeData file.
     * <li>It has a value in a range defined by the UnicodeData file.
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isDefined(int)} method.
     *
     * @param ch the character to be tested
     * @return {@code true} if the character has a defined meaning
     *         in Unicode; {@code false} otherwise.
     * @see Character#isDigit(char)
     * @see Character#isLetter(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isLowerCase(char)
     * @see Character#isTitleCase(char)
     * @see Character#isUpperCase(char)
     * @since 1.0.2
     */
    public static boolean isDefined(char ch) {
        return isDefined((int)ch);
    }

    /**
     * Determines if a character (Unicode code point) is defined in Unicode.
     * <p>
     * A character is defined if at least one of the following is true:
     * <ul>
     * <li>It has an entry in the UnicodeData file.
     * <li>It has a value in a range defined by the UnicodeData file.
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character has a defined meaning
     *         in Unicode; {@code false} otherwise.
     * @see Character#isDigit(int)
     * @see Character#isLetter(int)
     * @see Character#isLetterOrDigit(int)
     * @see Character#isLowerCase(int)
     * @see Character#isTitleCase(int)
     * @see Character#isUpperCase(int)
     * @since 1.5
     */
    public static boolean isDefined(int codePoint) {
        return getType(codePoint) != Character.UNASSIGNED;
    }

    /**
     * Determines if the specified character is a letter.
     * <p>
     * A character is considered to be a letter if its general
     * category type, provided by {@code Character.getType(ch)},
     * is any of the following:
     * <ul>
     * <li> {@code UPPERCASE_LETTER}
     * <li> {@code LOWERCASE_LETTER}
     * <li> {@code TITLECASE_LETTER}
     * <li> {@code MODIFIER_LETTER}
     * <li> {@code OTHER_LETTER}
     * </ul>
     *
     * Not all letters have case. Many characters are
     * letters but are neither uppercase nor lowercase nor titlecase.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>.
     * To support all Unicode characters, including supplementary
     * characters, use the {@link #isLetter(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is a letter;
     *         {@code false} otherwise.
     * @see Character#isDigit(char)
     * @see Character#isJavaIdentifierStart(char)
     * @see Character#isJavaLetter(char)
     * @see Character#isJavaLetterOrDigit(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isLowerCase(char)
     * @see Character#isTitleCase(char)
     * @see Character#isUnicodeIdentifierStart(char)
     * @see Character#isUpperCase(char)
     */
    public static boolean isLetter(char ch) {
        return isLetter((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is a letter.
     * <p>
     * A character is considered to be a letter if its general
     * category type, provided by {@link Character#getType(int) getType(codePoint)},
     * is any of the following:
     * <ul>
     * <li> {@code UPPERCASE_LETTER}
     * <li> {@code LOWERCASE_LETTER}
     * <li> {@code TITLECASE_LETTER}
     * <li> {@code MODIFIER_LETTER}
     * <li> {@code OTHER_LETTER}
     * </ul>
     *
     * Not all letters have case. Many characters are
     * letters but are neither uppercase nor lowercase nor titlecase.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a letter;
     *         {@code false} otherwise.
     * @see Character#isDigit(int)
     * @see Character#isJavaIdentifierStart(int)
     * @see Character#isLetterOrDigit(int)
     * @see Character#isLowerCase(int)
     * @see Character#isTitleCase(int)
     * @see Character#isUnicodeIdentifierStart(int)
     * @see Character#isUpperCase(int)
     * @since 1.5
     */
    public static boolean isLetter(int codePoint) {
        return ((((1 << Character.UPPERCASE_LETTER) |
            (1 << Character.LOWERCASE_LETTER) |
            (1 << Character.TITLECASE_LETTER) |
            (1 << Character.MODIFIER_LETTER) |
            (1 << Character.OTHER_LETTER)) >> getType(codePoint)) & 1)
            != 0;
    }

    /**
     * Determines if the specified character is a letter or digit.
     * <p>
     * A character is considered to be a letter or digit if either
     * {@code Character.isLetter(char ch)} or
     * {@code Character.isDigit(char ch)} returns
     * {@code true} for the character.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isLetterOrDigit(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is a letter or digit;
     *         {@code false} otherwise.
     * @see Character#isDigit(char)
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isJavaLetter(char)
     * @see Character#isJavaLetterOrDigit(char)
     * @see Character#isLetter(char)
     * @see Character#isUnicodeIdentifierPart(char)
     * @since 1.0.2
     */
    public static boolean isLetterOrDigit(char ch) {
        return isLetterOrDigit((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is a letter or digit.
     * <p>
     * A character is considered to be a letter or digit if either
     * {@link #isLetter(int) isLetter(codePoint)} or
     * {@link #isDigit(int) isDigit(codePoint)} returns
     * {@code true} for the character.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a letter or digit;
     *         {@code false} otherwise.
     * @see Character#isDigit(int)
     * @see Character#isJavaIdentifierPart(int)
     * @see Character#isLetter(int)
     * @see Character#isUnicodeIdentifierPart(int)
     * @since 1.5
     */
    public static boolean isLetterOrDigit(int codePoint) {
        return ((((1 << Character.UPPERCASE_LETTER) |
            (1 << Character.LOWERCASE_LETTER) |
            (1 << Character.TITLECASE_LETTER) |
            (1 << Character.MODIFIER_LETTER) |
            (1 << Character.OTHER_LETTER) |
            (1 << Character.DECIMAL_DIGIT_NUMBER)) >> getType(codePoint)) & 1)
            != 0;
    }

    /**
     * Determines if the specified character is permissible as the first
     * character in a Java identifier.
     * <p>
     * A character may start a Java identifier if and only if
     * one of the following is true:
     * <ul>
     * <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
     * <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
     * <li> {@code ch} is a currency symbol (such as {@code '$'})
     * <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
     * </ul>
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may start a Java
     *         identifier; {@code false} otherwise.
     * @see Character#isJavaLetterOrDigit(char)
     * @see Character#isJavaIdentifierStart(char)
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isLetter(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isUnicodeIdentifierStart(char)
     * @since 1.0.2
     * @deprecated Replaced by isJavaIdentifierStart(char).
     */
    @Deprecated(since="1.1")
    public static boolean isJavaLetter(char ch) {
        return isJavaIdentifierStart(ch);
    }

    /**
     * Determines if the specified character may be part of a Java
     * identifier as other than the first character.
     * <p>
     * A character may be part of a Java identifier if and only if any
     * of the following are true:
     * <ul>
     * <li> it is a letter
     * <li> it is a currency symbol (such as {@code '$'})
     * <li> it is a connecting punctuation character (such as {@code '_'})
     * <li> it is a digit
     * <li> it is a numeric letter (such as a Roman numeral character)
     * <li> it is a combining mark
     * <li> it is a non-spacing mark
     * <li> {@code isIdentifierIgnorable} returns
     * {@code true} for the character.
     * </ul>
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may be part of a
     *         Java identifier; {@code false} otherwise.
     * @see Character#isJavaLetter(char)
     * @see Character#isJavaIdentifierStart(char)
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isLetter(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isUnicodeIdentifierPart(char)
     * @see Character#isIdentifierIgnorable(char)
     * @since 1.0.2
     * @deprecated Replaced by isJavaIdentifierPart(char).
     */
    @Deprecated(since="1.1")
    public static boolean isJavaLetterOrDigit(char ch) {
        return isJavaIdentifierPart(ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is alphabetic.
     * <p>
     * A character is considered to be alphabetic if its general category type,
     * provided by {@link Character#getType(int) getType(codePoint)}, is any of
     * the following:
     * <ul>
     * <li> {@code UPPERCASE_LETTER}
     * <li> {@code LOWERCASE_LETTER}
     * <li> {@code TITLECASE_LETTER}
     * <li> {@code MODIFIER_LETTER}
     * <li> {@code OTHER_LETTER}
     * <li> {@code LETTER_NUMBER}
     * </ul>
     * or it has contributory property Other_Alphabetic as defined by the
     * Unicode Standard.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a Unicode alphabet
     *         character, {@code false} otherwise.
     * @since 1.7
     */
    public static boolean isAlphabetic(int codePoint) {
        return (((((1 << Character.UPPERCASE_LETTER) |
            (1 << Character.LOWERCASE_LETTER) |
            (1 << Character.TITLECASE_LETTER) |
            (1 << Character.MODIFIER_LETTER) |
            (1 << Character.OTHER_LETTER) |
            (1 << Character.LETTER_NUMBER)) >> getType(codePoint)) & 1) != 0) ||
            CharacterData.of(codePoint).isOtherAlphabetic(codePoint);
    }

    /**
     * Determines if the specified character (Unicode code point) is a CJKV
     * (Chinese, Japanese, Korean and Vietnamese) ideograph, as defined by
     * the Unicode Standard.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a Unicode ideograph
     *         character, {@code false} otherwise.
     * @since 1.7
     */
    public static boolean isIdeographic(int codePoint) {
        return CharacterData.of(codePoint).isIdeographic(codePoint);
    }

    /**
     * Determines if the specified character is
     * permissible as the first character in a Java identifier.
     * <p>
     * A character may start a Java identifier if and only if
     * one of the following conditions is true:
     * <ul>
     * <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
     * <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
     * <li> {@code ch} is a currency symbol (such as {@code '$'})
     * <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isJavaIdentifierStart(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may start a Java identifier;
     * {@code false} otherwise.
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isLetter(char)
     * @see Character#isUnicodeIdentifierStart(char)
     * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)
     * @since 1.1
     */
    public static boolean isJavaIdentifierStart(char ch) {
        return isJavaIdentifierStart((int)ch);
    }

    /**
     * Determines if the character (Unicode code point) is
     * permissible as the first character in a Java identifier.
     * <p>
     * A character may start a Java identifier if and only if
     * one of the following conditions is true:
     * <ul>
     * <li> {@link #isLetter(int) isLetter(codePoint)}
     * returns {@code true}
     * <li> {@link #getType(int) getType(codePoint)}
     * returns {@code LETTER_NUMBER}
     * <li> the referenced character is a currency symbol (such as {@code '$'})
     * <li> the referenced character is a connecting punctuation character
     * (such as {@code '_'}).
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character may start a Java identifier;
     * {@code false} otherwise.
     * @see Character#isJavaIdentifierPart(int)
     * @see Character#isLetter(int)
     * @see Character#isUnicodeIdentifierStart(int)
     * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)
     * @since 1.5
     */
    public static boolean isJavaIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierStart(codePoint);
    }

    /**
     * Determines if the specified character may be part of a Java
     * identifier as other than the first character.
     * <p>
     * A character may be part of a Java identifier if any of the following
     * are true:
     * <ul>
     * <li> it is a letter
     * <li> it is a currency symbol (such as {@code '$'})
     * <li> it is a connecting punctuation character (such as {@code '_'})
     * <li> it is a digit
     * <li> it is a numeric letter (such as a Roman numeral character)
     * <li> it is a combining mark
     * <li> it is a non-spacing mark
     * <li> {@code isIdentifierIgnorable} returns
     * {@code true} for the character
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isJavaIdentifierPart(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may be part of a
     * Java identifier; {@code false} otherwise.
     * @see Character#isIdentifierIgnorable(char)
     * @see Character#isJavaIdentifierStart(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isUnicodeIdentifierPart(char)
     * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)
     * @since 1.1
     */
    public static boolean isJavaIdentifierPart(char ch) {
        return isJavaIdentifierPart((int)ch);
    }

    /**
     * Determines if the character (Unicode code point) may be part of a Java
     * identifier as other than the first character.
     * <p>
     * A character may be part of a Java identifier if any of the following
     * are true:
     * <ul>
     * <li> it is a letter
     * <li> it is a currency symbol (such as {@code '$'})
     * <li> it is a connecting punctuation character (such as {@code '_'})
     * <li> it is a digit
     * <li> it is a numeric letter (such as a Roman numeral character)
     * <li> it is a combining mark
     * <li> it is a non-spacing mark
     * <li> {@link #isIdentifierIgnorable(int)
     * isIdentifierIgnorable(codePoint)} returns {@code true} for
     * the character
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character may be part of a
     * Java identifier; {@code false} otherwise.
     * @see Character#isIdentifierIgnorable(int)
     * @see Character#isJavaIdentifierStart(int)
     * @see Character#isLetterOrDigit(int)
     * @see Character#isUnicodeIdentifierPart(int)
     * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)
     * @since 1.5
     */
    public static boolean isJavaIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierPart(codePoint);
    }

    /**
     * Determines if the specified character is permissible as the
     * first character in a Unicode identifier.
     * <p>
     * A character may start a Unicode identifier if and only if
     * one of the following conditions is true:
     * <ul>
     * <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
     * <li> {@link #getType(char) getType(ch)} returns
     * {@code LETTER_NUMBER}.
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isUnicodeIdentifierStart(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may start a Unicode
     * identifier; {@code false} otherwise.
     * @see Character#isJavaIdentifierStart(char)
     * @see Character#isLetter(char)
     * @see Character#isUnicodeIdentifierPart(char)
     * @since 1.1
     */
    public static boolean isUnicodeIdentifierStart(char ch) {
        return isUnicodeIdentifierStart((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is permissible as the
     * first character in a Unicode identifier.
     * <p>
     * A character may start a Unicode identifier if and only if
     * one of the following conditions is true:
     * <ul>
     * <li> {@link #isLetter(int) isLetter(codePoint)}
     * returns {@code true}
     * <li> {@link #getType(int) getType(codePoint)}
     * returns {@code LETTER_NUMBER}.
     * </ul>
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character may start a Unicode
     * identifier; {@code false} otherwise.
     * @see Character#isJavaIdentifierStart(int)
     * @see Character#isLetter(int)
     * @see Character#isUnicodeIdentifierPart(int)
     * @since 1.5
     */
    public static boolean isUnicodeIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierStart(codePoint);
    }

    /**
     * Determines if the specified character may be part of a Unicode
     * identifier as other than the first character.
     * <p>
     * A character may be part of a Unicode identifier if and only if
     * one of the following statements is true:
     * <ul>
     * <li> it is a letter
     * <li> it is a connecting punctuation character (such as {@code '_'})
     * <li> it is a digit
     * <li> it is a numeric letter (such as a Roman numeral character)
     * <li> it is a combining mark
     * <li> it is a non-spacing mark
     * <li> {@code isIdentifierIgnorable} returns
     * {@code true} for this character.
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isUnicodeIdentifierPart(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character may be part of a
     * Unicode identifier; {@code false} otherwise.
     * @see Character#isIdentifierIgnorable(char)
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isLetterOrDigit(char)
     * @see Character#isUnicodeIdentifierStart(char)
     * @since 1.1
     */
    public static boolean isUnicodeIdentifierPart(char ch) {
        return isUnicodeIdentifierPart((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) may be part of a Unicode
     * identifier as other than the first character.
     * <p>
     * A character may be part of a Unicode identifier if and only if
     * one of the following statements is true:
     * <ul>
     * <li> it is a letter
     * <li> it is a connecting punctuation character (such as {@code '_'})
     * <li> it is a digit
     * <li> it is a numeric letter (such as a Roman numeral character)
     * <li> it is a combining mark
     * <li> it is a non-spacing mark
     * <li> {@code isIdentifierIgnorable} returns
     * {@code true} for this character.
     * </ul>
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character may be part of a
     * Unicode identifier; {@code false} otherwise.
     * @see Character#isIdentifierIgnorable(int)
     * @see Character#isJavaIdentifierPart(int)
     * @see Character#isLetterOrDigit(int)
     * @see Character#isUnicodeIdentifierStart(int)
     * @since 1.5
     */
    public static boolean isUnicodeIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierPart(codePoint);
    }

    /**
     * Determines if the specified character should be regarded as
     * an ignorable character in a Java identifier or a Unicode identifier.
     * <p>
     * The following Unicode characters are ignorable in a Java identifier
     * or a Unicode identifier:
     * <ul>
     * <li>ISO control characters that are not whitespace
     * <ul>
     * <li>{@code '\u005Cu0000'} through {@code '\u005Cu0008'}
     * <li>{@code '\u005Cu000E'} through {@code '\u005Cu001B'}
     * <li>{@code '\u005Cu007F'} through {@code '\u005Cu009F'}
     * </ul>
     *
     * <li>all characters that have the {@code FORMAT} general
     * category value
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isIdentifierIgnorable(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is an ignorable control
     * character that may be part of a Java or Unicode identifier;
     * {@code false} otherwise.
     * @see Character#isJavaIdentifierPart(char)
     * @see Character#isUnicodeIdentifierPart(char)
     * @since 1.1
     */
    public static boolean isIdentifierIgnorable(char ch) {
        return isIdentifierIgnorable((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) should be regarded as
     * an ignorable character in a Java identifier or a Unicode identifier.
     * <p>
     * The following Unicode characters are ignorable in a Java identifier
     * or a Unicode identifier:
     * <ul>
     * <li>ISO control characters that are not whitespace
     * <ul>
     * <li>{@code '\u005Cu0000'} through {@code '\u005Cu0008'}
     * <li>{@code '\u005Cu000E'} through {@code '\u005Cu001B'}
     * <li>{@code '\u005Cu007F'} through {@code '\u005Cu009F'}
     * </ul>
     *
     * <li>all characters that have the {@code FORMAT} general
     * category value
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is an ignorable control
     * character that may be part of a Java or Unicode identifier;
     * {@code false} otherwise.
     * @see Character#isJavaIdentifierPart(int)
     * @see Character#isUnicodeIdentifierPart(int)
     * @since 1.5
     */
    public static boolean isIdentifierIgnorable(int codePoint) {
        return CharacterData.of(codePoint).isIdentifierIgnorable(codePoint);
    }

    /**
     * Converts the character argument to lowercase using case
     * mapping information from the UnicodeData file.
     * <p>
     * Note that
     * {@code Character.isLowerCase(Character.toLowerCase(ch))}
     * does not always return {@code true} for some ranges of
     * characters, particularly those that are symbols or ideographs.
     *
     * <p>In general, {@link String#toLowerCase()} should be used to map
     * characters to lowercase. {@code String} case mapping methods
     * have several benefits over {@code Character} case mapping methods.
     * {@code String} case mapping methods can perform locale-sensitive
     * mappings, context-sensitive mappings, and 1:M character mappings, whereas
     * the {@code Character} case mapping methods cannot.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #toLowerCase(int)} method.
     *
     * @param ch the character to be converted.
     * @return the lowercase equivalent of the character, if any;
     * otherwise, the character itself.
     * @see Character#isLowerCase(char)
     * @see String#toLowerCase()
     */
    public static char toLowerCase(char ch) {
        return (char)toLowerCase((int)ch);
    }

    /**
     * Converts the character (Unicode code point) argument to
     * lowercase using case mapping information from the UnicodeData
     * file.
     *
     * <p> Note that
     * {@code Character.isLowerCase(Character.toLowerCase(codePoint))}
     * does not always return {@code true} for some ranges of
     * characters, particularly those that are symbols or ideographs.
     *
     * <p>In general, {@link String#toLowerCase()} should be used to map
     * characters to lowercase. {@code String} case mapping methods
     * have several benefits over {@code Character} case mapping methods.
     * {@code String} case mapping methods can perform locale-sensitive
     * mappings, context-sensitive mappings, and 1:M character mappings, whereas
     * the {@code Character} case mapping methods cannot.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return the lowercase equivalent of the character (Unicode code
     * point), if any; otherwise, the character itself.
     * @see Character#isLowerCase(int)
     * @see String#toLowerCase()
     *
     * @since 1.5
     */
    public static int toLowerCase(int codePoint) {
        return CharacterData.of(codePoint).toLowerCase(codePoint);
    }

    /**
     * Converts the character argument to uppercase using case mapping
     * information from the UnicodeData file.
     * <p>
     * Note that
     * {@code Character.isUpperCase(Character.toUpperCase(ch))}
     * does not always return {@code true} for some ranges of
     * characters, particularly those that are symbols or ideographs.
     *
     * <p>In general, {@link String#toUpperCase()} should be used to map
     * characters to uppercase. {@code String} case mapping methods
     * have several benefits over {@code Character} case mapping methods.
     * {@code String} case mapping methods can perform locale-sensitive
     * mappings, context-sensitive mappings, and 1:M character mappings, whereas
     * the {@code Character} case mapping methods cannot.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #toUpperCase(int)} method.
     *
     * @param ch the character to be converted.
     * @return the uppercase equivalent of the character, if any;
     * otherwise, the character itself.
     * @see Character#isUpperCase(char)
     * @see String#toUpperCase()
     */
    public static char toUpperCase(char ch) {
        return (char)toUpperCase((int)ch);
    }

    /**
     * Converts the character (Unicode code point) argument to
     * uppercase using case mapping information from the UnicodeData
     * file.
     *
     * <p>Note that
     * {@code Character.isUpperCase(Character.toUpperCase(codePoint))}
     * does not always return {@code true} for some ranges of
     * characters, particularly those that are symbols or ideographs.
     *
     * <p>In general, {@link String#toUpperCase()} should be used to map
     * characters to uppercase. {@code String} case mapping methods
     * have several benefits over {@code Character} case mapping methods.
     * {@code String} case mapping methods can perform locale-sensitive
     * mappings, context-sensitive mappings, and 1:M character mappings, whereas
     * the {@code Character} case mapping methods cannot.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return the uppercase equivalent of the character, if any;
     * otherwise, the character itself.
     * @see Character#isUpperCase(int)
     * @see String#toUpperCase()
     *
     * @since 1.5
     */
    public static int toUpperCase(int codePoint) {
        return CharacterData.of(codePoint).toUpperCase(codePoint);
    }

    /**
     * Converts the character argument to titlecase using case mapping
     * information from the UnicodeData file. If a character has no
     * explicit titlecase mapping and is not itself a titlecase char
     * according to UnicodeData, then the uppercase mapping is
     * returned as an equivalent titlecase mapping. If the
     * {@code char} argument is already a titlecase
     * {@code char}, the same {@code char} value will be
     * returned.
     * <p>
     * Note that
     * {@code Character.isTitleCase(Character.toTitleCase(ch))}
     * does not always return {@code true} for some ranges of
     * characters.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #toTitleCase(int)} method.
     *
     * @param ch the character to be converted.
     * @return the titlecase equivalent of the character, if any;
     * otherwise, the character itself.
     * @see Character#isTitleCase(char)
     * @see Character#toLowerCase(char)
     * @see Character#toUpperCase(char)
     * @since 1.0.2
     */
    public static char toTitleCase(char ch) {
        return (char)toTitleCase((int)ch);
    }

    /**
     * Converts the character (Unicode code point) argument to titlecase using case mapping
     * information from the UnicodeData file. If a character has no
     * explicit titlecase mapping and is not itself a titlecase char
     * according to UnicodeData, then the uppercase mapping is
     * returned as an equivalent titlecase mapping. If the
     * character argument is already a titlecase
     * character, the same character value will be
     * returned.
     *
     * <p>Note that
     * {@code Character.isTitleCase(Character.toTitleCase(codePoint))}
     * does not always return {@code true} for some ranges of
     * characters.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return the titlecase equivalent of the character, if any;
     * otherwise, the character itself.
     * @see Character#isTitleCase(int)
     * @see Character#toLowerCase(int)
     * @see Character#toUpperCase(int)
     * @since 1.5
     */
    public static int toTitleCase(int codePoint) {
        return CharacterData.of(codePoint).toTitleCase(codePoint);
    }

    /**
     * Returns the numeric value of the character {@code ch} in the
     * specified radix.
     * <p>
     * If the radix is not in the range {@code MIN_RADIX} ≤
     * {@code radix} ≤ {@code MAX_RADIX} or if the
     * value of {@code ch} is not a valid digit in the specified
     * radix, {@code -1} is returned. A character is a valid digit
     * if at least one of the following is true:
     * <ul>
     * <li>The method {@code isDigit} is {@code true} of the character
     * and the Unicode decimal digit value of the character (or its
     * single-character decomposition) is less than the specified radix.
     * In this case the decimal digit value is returned.
     * <li>The character is one of the uppercase Latin letters
     * {@code 'A'} through {@code 'Z'} and its code is less than
     * {@code radix + 'A' - 10}.
     * In this case, {@code ch - 'A' + 10}
     * is returned.
     * <li>The character is one of the lowercase Latin letters
     * {@code 'a'} through {@code 'z'} and its code is less than
     * {@code radix + 'a' - 10}.
     * In this case, {@code ch - 'a' + 10}
     * is returned.
     * <li>The character is one of the fullwidth uppercase Latin letters A
     * ({@code '\u005CuFF21'}) through Z ({@code '\u005CuFF3A'})
     * and its code is less than
     * {@code radix + '\u005CuFF21' - 10}.
     * In this case, {@code ch - '\u005CuFF21' + 10}
     * is returned.
     * <li>The character is one of the fullwidth lowercase Latin letters a
     * ({@code '\u005CuFF41'}) through z ({@code '\u005CuFF5A'})
     * and its code is less than
     * {@code radix + '\u005CuFF41' - 10}.
     * In this case, {@code ch - '\u005CuFF41' + 10}
     * is returned.
     * </ul>
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #digit(int, int)} method.
     *
     * @param ch the character to be converted.
     * @param radix the radix.
     * @return the numeric value represented by the character in the
     * specified radix.
     * @see Character#forDigit(int, int)
     * @see Character#isDigit(char)
     */
    public static int digit(char ch, int radix) {
        return digit((int)ch, radix);
    }

    /**
     * Returns the numeric value of the specified character (Unicode
     * code point) in the specified radix.
     *
     * <p>If the radix is not in the range {@code MIN_RADIX} ≤
     * {@code radix} ≤ {@code MAX_RADIX} or if the
     * character is not a valid digit in the specified
     * radix, {@code -1} is returned. A character is a valid digit
     * if at least one of the following is true:
     * <ul>
     * <li>The method {@link #isDigit(int) isDigit(codePoint)} is {@code true} of the character
     * and the Unicode decimal digit value of the character (or its
     * single-character decomposition) is less than the specified radix.
     * In this case the decimal digit value is returned.
     * <li>The character is one of the uppercase Latin letters
     * {@code 'A'} through {@code 'Z'} and its code is less than
     * {@code radix + 'A' - 10}.
     * In this case, {@code codePoint - 'A' + 10}
     * is returned.
     * <li>The character is one of the lowercase Latin letters
     * {@code 'a'} through {@code 'z'} and its code is less than
     * {@code radix + 'a' - 10}.
     * In this case, {@code codePoint - 'a' + 10}
     * is returned.
     * <li>The character is one of the fullwidth uppercase Latin letters A
     * ({@code '\u005CuFF21'}) through Z ({@code '\u005CuFF3A'})
     * and its code is less than
     * {@code radix + '\u005CuFF21' - 10}.
     * In this case,
     * {@code codePoint - '\u005CuFF21' + 10}
     * is returned.
     * <li>The character is one of the fullwidth lowercase Latin letters a
     * ({@code '\u005CuFF41'}) through z ({@code '\u005CuFF5A'})
     * and its code is less than
     * {@code radix + '\u005CuFF41' - 10}.
     * In this case,
     * {@code codePoint - '\u005CuFF41' + 10}
     * is returned.
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @param radix the radix.
     * @return the numeric value represented by the character in the
     * specified radix.
     * @see Character#forDigit(int, int)
     * @see Character#isDigit(int)
     * @since 1.5
     */
    public static int digit(int codePoint, int radix) {
        return CharacterData.of(codePoint).digit(codePoint, radix);
    }

    /**
     * Returns the {@code int} value that the specified Unicode
     * character represents. For example, the character
     * {@code '\u005Cu216C'} (the Roman numeral fifty) will return
     * an {@code int} with a value of 50.
     * <p>
     * The letters A-Z in their uppercase ({@code '\u005Cu0041'} through
     * {@code '\u005Cu005A'}), lowercase
     * ({@code '\u005Cu0061'} through {@code '\u005Cu007A'}), and
     * full width variant ({@code '\u005CuFF21'} through
     * {@code '\u005CuFF3A'} and {@code '\u005CuFF41'} through
     * {@code '\u005CuFF5A'}) forms have numeric values from 10
     * through 35. This is independent of the Unicode specification,
     * which does not assign numeric values to these {@code char}
     * values.
     * <p>
     * If the character does not have a numeric value, then -1 is returned.
     * If the character has a numeric value that cannot be represented as a
     * nonnegative integer (for example, a fractional value), then -2
     * is returned.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #getNumericValue(int)} method.
     *
     * @param ch the character to be converted.
     * @return the numeric value of the character, as a nonnegative {@code int}
     * value; -2 if the character has a numeric value but the value
     * cannot be represented as a nonnegative {@code int} value;
     * -1 if the character has no numeric value.
     * @see Character#forDigit(int, int)
     * @see Character#isDigit(char)
     * @since 1.1
     */
    public static int getNumericValue(char ch) {
        return getNumericValue((int)ch);
    }

    /**
     * Returns the {@code int} value that the specified
     * character (Unicode code point) represents. For example, the character
     * {@code '\u005Cu216C'} (the Roman numeral fifty) will return
     * an {@code int} with a value of 50.
     * <p>
     * The letters A-Z in their uppercase ({@code '\u005Cu0041'} through
     * {@code '\u005Cu005A'}), lowercase
     * ({@code '\u005Cu0061'} through {@code '\u005Cu007A'}), and
     * full width variant ({@code '\u005CuFF21'} through
     * {@code '\u005CuFF3A'} and {@code '\u005CuFF41'} through
     * {@code '\u005CuFF5A'}) forms have numeric values from 10
     * through 35. This is independent of the Unicode specification,
     * which does not assign numeric values to these {@code char}
     * values.
     * <p>
     * If the character does not have a numeric value, then -1 is returned.
     * If the character has a numeric value that cannot be represented as a
     * nonnegative integer (for example, a fractional value), then -2
     * is returned.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return the numeric value of the character, as a nonnegative {@code int}
     * value; -2 if the character has a numeric value but the value
     * cannot be represented as a nonnegative {@code int} value;
     * -1 if the character has no numeric value.
     * @see Character#forDigit(int, int)
     * @see Character#isDigit(int)
     * @since 1.5
     */
    public static int getNumericValue(int codePoint) {
        return CharacterData.of(codePoint).getNumericValue(codePoint);
    }

    /**
     * Determines if the specified character is ISO-LATIN-1 white space.
     * This method returns {@code true} for the following five
     * characters only:
     * <table class="striped">
     * <caption style="display:none">truechars</caption>
     * <thead>
     * <tr><th scope="col">Character
     * <th scope="col">Code
     * <th scope="col">Name
     * </thead>
     * <tbody>
     * <tr><th scope="row">{@code '\t'}</th> <td>{@code U+0009}</td>
     * <td>{@code HORIZONTAL TABULATION}</td></tr>
     * <tr><th scope="row">{@code '\n'}</th> <td>{@code U+000A}</td>
     * <td>{@code NEW LINE}</td></tr>
     * <tr><th scope="row">{@code '\f'}</th> <td>{@code U+000C}</td>
     * <td>{@code FORM FEED}</td></tr>
     * <tr><th scope="row">{@code '\r'}</th> <td>{@code U+000D}</td>
     * <td>{@code CARRIAGE RETURN}</td></tr>
     * <tr><th scope="row">{@code ' '}</th> <td>{@code U+0020}</td>
     * <td>{@code SPACE}</td></tr>
     * </tbody>
     * </table>
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is ISO-LATIN-1 white
     * space; {@code false} otherwise.
10303 * @see Character#isSpaceChar(char) 10304 * @see Character#isWhitespace(char) 10305 * @deprecated Replaced by isWhitespace(char). 10306 */ 10307 @Deprecated(since="1.1") 10308 public static boolean isSpace(char ch) { 10309 return (ch <= 0x0020) && 10310 (((((1L << 0x0009) | 10311 (1L << 0x000A) | 10312 (1L << 0x000C) | 10313 (1L << 0x000D) | 10314 (1L << 0x0020)) >> ch) & 1L) != 0); 10315 } 10316 10317 10318 /** 10319 * Determines if the specified character is a Unicode space character. 10320 * A character is considered to be a space character if and only if 10321 * it is specified to be a space character by the Unicode Standard. This 10322 * method returns true if the character's general category type is any of 10323 * the following: 10324 * <ul> 10325 * <li> {@code SPACE_SEPARATOR} 10326 * <li> {@code LINE_SEPARATOR} 10327 * <li> {@code PARAGRAPH_SEPARATOR} 10328 * </ul> 10329 * 10330 * <p><b>Note:</b> This method cannot handle <a 10331 * href="#supplementary"> supplementary characters</a>. To support 10332 * all Unicode characters, including supplementary characters, use 10333 * the {@link #isSpaceChar(int)} method. 10334 * 10335 * @param ch the character to be tested. 10336 * @return {@code true} if the character is a space character; 10337 * {@code false} otherwise. 10338 * @see Character#isWhitespace(char) 10339 * @since 1.1 10340 */ 10341 public static boolean isSpaceChar(char ch) { 10342 return isSpaceChar((int)ch); 10343 } 10344 10345 /** 10346 * Determines if the specified character (Unicode code point) is a 10347 * Unicode space character. A character is considered to be a 10348 * space character if and only if it is specified to be a space 10349 * character by the Unicode Standard. 
This method returns true if 10350 * the character's general category type is any of the following: 10351 * 10352 * <ul> 10353 * <li> {@link #SPACE_SEPARATOR} 10354 * <li> {@link #LINE_SEPARATOR} 10355 * <li> {@link #PARAGRAPH_SEPARATOR} 10356 * </ul> 10357 * 10358 * @param codePoint the character (Unicode code point) to be tested. 10359 * @return {@code true} if the character is a space character; 10360 * {@code false} otherwise. 10361 * @see Character#isWhitespace(int) 10362 * @since 1.5 10363 */ 10364 public static boolean isSpaceChar(int codePoint) { 10365 return ((((1 << Character.SPACE_SEPARATOR) | 10366 (1 << Character.LINE_SEPARATOR) | 10367 (1 << Character.PARAGRAPH_SEPARATOR)) >> getType(codePoint)) & 1) 10368 != 0; 10369 } 10370 10371 /** 10372 * Determines if the specified character is white space according to Java. 10373 * A character is a Java whitespace character if and only if it satisfies 10374 * one of the following criteria: 10375 * <ul> 10376 * <li> It is a Unicode space character ({@code SPACE_SEPARATOR}, 10377 * {@code LINE_SEPARATOR}, or {@code PARAGRAPH_SEPARATOR}) 10378 * but is not also a non-breaking space ({@code '\u005Cu00A0'}, 10379 * {@code '\u005Cu2007'}, {@code '\u005Cu202F'}). 10380 * <li> It is {@code '\u005Ct'}, U+0009 HORIZONTAL TABULATION. 10381 * <li> It is {@code '\u005Cn'}, U+000A LINE FEED. 10382 * <li> It is {@code '\u005Cu000B'}, U+000B VERTICAL TABULATION. 10383 * <li> It is {@code '\u005Cf'}, U+000C FORM FEED. 10384 * <li> It is {@code '\u005Cr'}, U+000D CARRIAGE RETURN. 10385 * <li> It is {@code '\u005Cu001C'}, U+001C FILE SEPARATOR. 10386 * <li> It is {@code '\u005Cu001D'}, U+001D GROUP SEPARATOR. 10387 * <li> It is {@code '\u005Cu001E'}, U+001E RECORD SEPARATOR. 10388 * <li> It is {@code '\u005Cu001F'}, U+001F UNIT SEPARATOR. 10389 * </ul> 10390 * 10391 * <p><b>Note:</b> This method cannot handle <a 10392 * href="#supplementary"> supplementary characters</a>. 
     * To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isWhitespace(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is a Java whitespace
     * character; {@code false} otherwise.
     * @see Character#isSpaceChar(char)
     * @since 1.1
     */
    public static boolean isWhitespace(char ch) {
        return isWhitespace((int)ch);
    }

    /**
     * Determines if the specified character (Unicode code point) is
     * white space according to Java. A character is a Java
     * whitespace character if and only if it satisfies one of the
     * following criteria:
     * <ul>
     * <li> It is a Unicode space character ({@link #SPACE_SEPARATOR},
     * {@link #LINE_SEPARATOR}, or {@link #PARAGRAPH_SEPARATOR})
     * but is not also a non-breaking space ({@code '\u005Cu00A0'},
     * {@code '\u005Cu2007'}, {@code '\u005Cu202F'}).
     * <li> It is {@code '\u005Ct'}, U+0009 HORIZONTAL TABULATION.
     * <li> It is {@code '\u005Cn'}, U+000A LINE FEED.
     * <li> It is {@code '\u005Cu000B'}, U+000B VERTICAL TABULATION.
     * <li> It is {@code '\u005Cf'}, U+000C FORM FEED.
     * <li> It is {@code '\u005Cr'}, U+000D CARRIAGE RETURN.
     * <li> It is {@code '\u005Cu001C'}, U+001C FILE SEPARATOR.
     * <li> It is {@code '\u005Cu001D'}, U+001D GROUP SEPARATOR.
     * <li> It is {@code '\u005Cu001E'}, U+001E RECORD SEPARATOR.
     * <li> It is {@code '\u005Cu001F'}, U+001F UNIT SEPARATOR.
     * </ul>
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is a Java whitespace
     * character; {@code false} otherwise.
     * @see Character#isSpaceChar(int)
     * @since 1.5
     */
    public static boolean isWhitespace(int codePoint) {
        return CharacterData.of(codePoint).isWhitespace(codePoint);
    }

    /**
     * Determines if the specified character is an ISO control
     * character. A character is considered to be an ISO control
     * character if its code is in the range {@code '\u005Cu0000'}
     * through {@code '\u005Cu001F'} or in the range
     * {@code '\u005Cu007F'} through {@code '\u005Cu009F'}.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isISOControl(int)} method.
     *
     * @param ch the character to be tested.
     * @return {@code true} if the character is an ISO control character;
     * {@code false} otherwise.
     *
     * @see Character#isSpaceChar(char)
     * @see Character#isWhitespace(char)
     * @since 1.1
     */
    public static boolean isISOControl(char ch) {
        return isISOControl((int)ch);
    }

    /**
     * Determines if the referenced character (Unicode code point) is an ISO control
     * character. A character is considered to be an ISO control
     * character if its code is in the range {@code '\u005Cu0000'}
     * through {@code '\u005Cu001F'} or in the range
     * {@code '\u005Cu007F'} through {@code '\u005Cu009F'}.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is an ISO control character;
     * {@code false} otherwise.
     * @see Character#isSpaceChar(int)
     * @see Character#isWhitespace(int)
     * @since 1.5
     */
    public static boolean isISOControl(int codePoint) {
        // Optimized form of:
        //     (codePoint >= 0x00 && codePoint <= 0x1F) ||
        //     (codePoint >= 0x7F && codePoint <= 0x9F);
        return codePoint <= 0x9F &&
            (codePoint >= 0x7F || (codePoint >>> 5 == 0));
    }

    /**
     * Returns a value indicating a character's general category.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #getType(int)} method.
     *
     * @param ch the character to be tested.
     * @return a value of type {@code int} representing the
     * character's general category.
     * @see Character#COMBINING_SPACING_MARK
     * @see Character#CONNECTOR_PUNCTUATION
     * @see Character#CONTROL
     * @see Character#CURRENCY_SYMBOL
     * @see Character#DASH_PUNCTUATION
     * @see Character#DECIMAL_DIGIT_NUMBER
     * @see Character#ENCLOSING_MARK
     * @see Character#END_PUNCTUATION
     * @see Character#FINAL_QUOTE_PUNCTUATION
     * @see Character#FORMAT
     * @see Character#INITIAL_QUOTE_PUNCTUATION
     * @see Character#LETTER_NUMBER
     * @see Character#LINE_SEPARATOR
     * @see Character#LOWERCASE_LETTER
     * @see Character#MATH_SYMBOL
     * @see Character#MODIFIER_LETTER
     * @see Character#MODIFIER_SYMBOL
     * @see Character#NON_SPACING_MARK
     * @see Character#OTHER_LETTER
     * @see Character#OTHER_NUMBER
     * @see Character#OTHER_PUNCTUATION
     * @see Character#OTHER_SYMBOL
     * @see Character#PARAGRAPH_SEPARATOR
     * @see Character#PRIVATE_USE
     * @see Character#SPACE_SEPARATOR
     * @see Character#START_PUNCTUATION
     * @see Character#SURROGATE
     * @see Character#TITLECASE_LETTER
     * @see
     *      Character#UNASSIGNED
     * @see Character#UPPERCASE_LETTER
     * @since 1.1
     */
    public static int getType(char ch) {
        return getType((int)ch);
    }

    /**
     * Returns a value indicating a character's general category.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return a value of type {@code int} representing the
     * character's general category.
     * @see Character#COMBINING_SPACING_MARK COMBINING_SPACING_MARK
     * @see Character#CONNECTOR_PUNCTUATION CONNECTOR_PUNCTUATION
     * @see Character#CONTROL CONTROL
     * @see Character#CURRENCY_SYMBOL CURRENCY_SYMBOL
     * @see Character#DASH_PUNCTUATION DASH_PUNCTUATION
     * @see Character#DECIMAL_DIGIT_NUMBER DECIMAL_DIGIT_NUMBER
     * @see Character#ENCLOSING_MARK ENCLOSING_MARK
     * @see Character#END_PUNCTUATION END_PUNCTUATION
     * @see Character#FINAL_QUOTE_PUNCTUATION FINAL_QUOTE_PUNCTUATION
     * @see Character#FORMAT FORMAT
     * @see Character#INITIAL_QUOTE_PUNCTUATION INITIAL_QUOTE_PUNCTUATION
     * @see Character#LETTER_NUMBER LETTER_NUMBER
     * @see Character#LINE_SEPARATOR LINE_SEPARATOR
     * @see Character#LOWERCASE_LETTER LOWERCASE_LETTER
     * @see Character#MATH_SYMBOL MATH_SYMBOL
     * @see Character#MODIFIER_LETTER MODIFIER_LETTER
     * @see Character#MODIFIER_SYMBOL MODIFIER_SYMBOL
     * @see Character#NON_SPACING_MARK NON_SPACING_MARK
     * @see Character#OTHER_LETTER OTHER_LETTER
     * @see Character#OTHER_NUMBER OTHER_NUMBER
     * @see Character#OTHER_PUNCTUATION OTHER_PUNCTUATION
     * @see Character#OTHER_SYMBOL OTHER_SYMBOL
     * @see Character#PARAGRAPH_SEPARATOR PARAGRAPH_SEPARATOR
     * @see Character#PRIVATE_USE PRIVATE_USE
     * @see Character#SPACE_SEPARATOR SPACE_SEPARATOR
     * @see Character#START_PUNCTUATION START_PUNCTUATION
     * @see Character#SURROGATE SURROGATE
     * @see Character#TITLECASE_LETTER TITLECASE_LETTER
     * @see
     *      Character#UNASSIGNED UNASSIGNED
     * @see Character#UPPERCASE_LETTER UPPERCASE_LETTER
     * @since 1.5
     */
    public static int getType(int codePoint) {
        return CharacterData.of(codePoint).getType(codePoint);
    }

    /**
     * Determines the character representation for a specific digit in
     * the specified radix. If the value of {@code radix} is not a
     * valid radix, or the value of {@code digit} is not a valid
     * digit in the specified radix, the null character
     * ({@code '\u005Cu0000'}) is returned.
     * <p>
     * The {@code radix} argument is valid if it is greater than or
     * equal to {@code MIN_RADIX} and less than or equal to
     * {@code MAX_RADIX}. The {@code digit} argument is valid if
     * {@code 0 <= digit < radix}.
     * <p>
     * If the digit is less than 10, then
     * {@code '0' + digit} is returned. Otherwise, the value
     * {@code 'a' + digit - 10} is returned.
     *
     * @param digit the number to convert to a character.
     * @param radix the radix.
     * @return the {@code char} representation of the specified digit
     * in the specified radix.
     * @see Character#MIN_RADIX
     * @see Character#MAX_RADIX
     * @see Character#digit(char, int)
     */
    public static char forDigit(int digit, int radix) {
        if ((digit >= radix) || (digit < 0)) {
            return '\0';
        }
        if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX)) {
            return '\0';
        }
        if (digit < 10) {
            return (char)('0' + digit);
        }
        return (char)('a' - 10 + digit);
    }

    /**
     * Returns the Unicode directionality property for the given
     * character. Character directionality is used to calculate the
     * visual ordering of text. The directionality value of undefined
     * {@code char} values is {@code DIRECTIONALITY_UNDEFINED}.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #getDirectionality(int)} method.
     *
     * @param ch {@code char} for which the directionality property
     * is requested.
     * @return the directionality property of the {@code char} value.
     *
     * @see Character#DIRECTIONALITY_UNDEFINED
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
     * @see Character#DIRECTIONALITY_ARABIC_NUMBER
     * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
     * @see Character#DIRECTIONALITY_NONSPACING_MARK
     * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL
     * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR
     * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR
     * @see Character#DIRECTIONALITY_WHITESPACE
     * @see Character#DIRECTIONALITY_OTHER_NEUTRALS
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
     * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_ISOLATE
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ISOLATE
     * @see Character#DIRECTIONALITY_FIRST_STRONG_ISOLATE
     * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_ISOLATE
     * @since 1.4
     */
    public static byte getDirectionality(char ch) {
        return getDirectionality((int)ch);
    }

    /**
     * Returns the Unicode
     * directionality property for the given
     * character (Unicode code point). Character directionality is
     * used to calculate the visual ordering of text. The
     * directionality value of an undefined character is {@link
     * #DIRECTIONALITY_UNDEFINED}.
     *
     * @param codePoint the character (Unicode code point) for which
     * the directionality property is requested.
     * @return the directionality property of the character.
     *
     * @see Character#DIRECTIONALITY_UNDEFINED DIRECTIONALITY_UNDEFINED
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT DIRECTIONALITY_LEFT_TO_RIGHT
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT DIRECTIONALITY_RIGHT_TO_LEFT
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER DIRECTIONALITY_EUROPEAN_NUMBER
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
     * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
     * @see Character#DIRECTIONALITY_ARABIC_NUMBER DIRECTIONALITY_ARABIC_NUMBER
     * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
     * @see Character#DIRECTIONALITY_NONSPACING_MARK DIRECTIONALITY_NONSPACING_MARK
     * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL DIRECTIONALITY_BOUNDARY_NEUTRAL
     * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR DIRECTIONALITY_PARAGRAPH_SEPARATOR
     * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR DIRECTIONALITY_SEGMENT_SEPARATOR
     * @see Character#DIRECTIONALITY_WHITESPACE DIRECTIONALITY_WHITESPACE
     * @see Character#DIRECTIONALITY_OTHER_NEUTRALS DIRECTIONALITY_OTHER_NEUTRALS
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
     * @see
     *      Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
     * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
     * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_ISOLATE DIRECTIONALITY_LEFT_TO_RIGHT_ISOLATE
     * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ISOLATE DIRECTIONALITY_RIGHT_TO_LEFT_ISOLATE
     * @see Character#DIRECTIONALITY_FIRST_STRONG_ISOLATE DIRECTIONALITY_FIRST_STRONG_ISOLATE
     * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_ISOLATE DIRECTIONALITY_POP_DIRECTIONAL_ISOLATE
     * @since 1.5
     */
    public static byte getDirectionality(int codePoint) {
        return CharacterData.of(codePoint).getDirectionality(codePoint);
    }

    /**
     * Determines whether the character is mirrored according to the
     * Unicode specification. Mirrored characters should have their
     * glyphs horizontally mirrored when displayed in text that is
     * right-to-left. For example, {@code '\u005Cu0028'} LEFT
     * PARENTHESIS is semantically defined to be an <i>opening
     * parenthesis</i>. This will appear as a "(" in text that is
     * left-to-right but as a ")" in text that is right-to-left.
     *
     * <p><b>Note:</b> This method cannot handle <a
     * href="#supplementary"> supplementary characters</a>. To support
     * all Unicode characters, including supplementary characters, use
     * the {@link #isMirrored(int)} method.
     *
     * @param ch {@code char} for which the mirrored property is requested
     * @return {@code true} if the char is mirrored, {@code false}
     * if the {@code char} is not mirrored or is not defined.
     * @since 1.4
     */
    public static boolean isMirrored(char ch) {
        return isMirrored((int)ch);
    }

    /**
     * Determines whether the specified character (Unicode code point)
     * is mirrored according to the Unicode specification. Mirrored
     * characters should have their glyphs horizontally mirrored when
     * displayed in text that is right-to-left. For example,
     * {@code '\u005Cu0028'} LEFT PARENTHESIS is semantically
     * defined to be an <i>opening parenthesis</i>. This will appear
     * as a "(" in text that is left-to-right but as a ")" in text
     * that is right-to-left.
     *
     * @param codePoint the character (Unicode code point) to be tested.
     * @return {@code true} if the character is mirrored, {@code false}
     * if the character is not mirrored or is not defined.
     * @since 1.5
     */
    public static boolean isMirrored(int codePoint) {
        return CharacterData.of(codePoint).isMirrored(codePoint);
    }

    /**
     * Compares two {@code Character} objects numerically.
     *
     * @param anotherCharacter the {@code Character} to be compared.
     * @return the value {@code 0} if the argument {@code Character}
     * is equal to this {@code Character}; a value less than
     * {@code 0} if this {@code Character} is numerically less
     * than the {@code Character} argument; and a value greater than
     * {@code 0} if this {@code Character} is numerically greater
     * than the {@code Character} argument (unsigned comparison).
     * Note that this is strictly a numerical comparison; it is not
     * locale-dependent.
     * @since 1.2
     */
    public int compareTo(Character anotherCharacter) {
        return compare(this.value, anotherCharacter.value);
    }

    /**
     * Compares two {@code char} values numerically.
     * The value returned is identical to what would be returned by:
     * <pre>
     *    Character.valueOf(x).compareTo(Character.valueOf(y))
     * </pre>
     *
     * @param x the first {@code char} to compare
     * @param y the second {@code char} to compare
     * @return the value {@code 0} if {@code x == y};
     * a value less than {@code 0} if {@code x < y}; and
     * a value greater than {@code 0} if {@code x > y}
     * @since 1.7
     */
    public static int compare(char x, char y) {
        return x - y;
    }

    /**
     * Converts the character (Unicode code point) argument to uppercase using
     * information from the UnicodeData file.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return either the uppercase equivalent of the character, if
     * any, or an error flag ({@code Character.ERROR})
     * that indicates that a 1:M {@code char} mapping exists.
     * @see Character#isLowerCase(char)
     * @see Character#isUpperCase(char)
     * @see Character#toLowerCase(char)
     * @see Character#toTitleCase(char)
     * @since 1.4
     */
    static int toUpperCaseEx(int codePoint) {
        assert isValidCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
    }

    /**
     * Converts the character (Unicode code point) argument to uppercase using case
     * mapping information from the SpecialCasing file in the Unicode
     * specification. If a character has no explicit uppercase
     * mapping, then the {@code char} itself is returned in the
     * {@code char[]}.
     *
     * @param codePoint the character (Unicode code point) to be converted.
     * @return a {@code char[]} with the uppercased character.
     * @since 1.4
     */
    static char[] toUpperCaseCharArray(int codePoint) {
        // As of Unicode 6.0, 1:M uppercasings only happen in the BMP.
        assert isBmpCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseCharArray(codePoint);
    }

    /**
     * The number of bits used to represent a {@code char} value in unsigned
     * binary form, constant {@code 16}.
     *
     * @since 1.5
     */
    public static final int SIZE = 16;

    /**
     * The number of bytes used to represent a {@code char} value in unsigned
     * binary form.
     *
     * @since 1.8
     */
    public static final int BYTES = SIZE / Byte.SIZE;

    /**
     * Returns the value obtained by reversing the order of the bytes in the
     * specified {@code char} value.
     *
     * @param ch The {@code char} of which to reverse the byte order.
     * @return the value obtained by reversing (or, equivalently, swapping)
     * the bytes in the specified {@code char} value.
     * @since 1.5
     */
    @HotSpotIntrinsicCandidate
    public static char reverseBytes(char ch) {
        return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
    }

    /**
     * Returns the Unicode name of the specified character
     * {@code codePoint}, or null if the code point is
     * {@link #UNASSIGNED unassigned}.
     * <p>
     * Note: if the specified character is not assigned a name by
     * the <i>UnicodeData</i> file (part of the Unicode Character
     * Database maintained by the Unicode Consortium), the returned
     * name is the same as the result of the expression:
     *
     * <blockquote>{@code
     *     Character.UnicodeBlock.of(codePoint).toString().replace('_', ' ')
     *     + " "
     *     + Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
     *
     * }</blockquote>
     *
     * @param codePoint the character (Unicode code point)
     *
     * @return the Unicode name of the specified character, or null if
     * the code point is unassigned.
     *
     * @throws IllegalArgumentException if the specified
     * {@code codePoint} is not a valid Unicode
     * code point.
     *
     * @since 1.7
     */
    public static String getName(int codePoint) {
        if (!isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException(
                String.format("Not a valid Unicode code point: 0x%X", codePoint));
        }
        String name = CharacterName.getInstance().getName(codePoint);
        if (name != null)
            return name;
        if (getType(codePoint) == UNASSIGNED)
            return null;
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        if (block != null)
            return block.toString().replace('_', ' ') + " "
                   + Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
        // should never come here
        return Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
    }

    /**
     * Returns the code point value of the Unicode character specified by
     * the given Unicode character name.
     * <p>
     * Note: if a character is not assigned a name by the <i>UnicodeData</i>
     * file (part of the Unicode Character Database maintained by the Unicode
     * Consortium), its name is defined as the result of the expression:
     *
     * <blockquote>{@code
     *     Character.UnicodeBlock.of(codePoint).toString().replace('_', ' ')
     *     + " "
     *     + Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
     *
     * }</blockquote>
     * <p>
     * The {@code name} matching is case insensitive, with any leading and
     * trailing whitespace characters removed.
     *
     * @param name the Unicode character name
     *
     * @return the code point value of the character specified by its name.
     *
     * @throws IllegalArgumentException if the specified {@code name}
     * is not a valid Unicode character name.
     * @throws NullPointerException if {@code name} is {@code null}
     *
     * @since 9
     */
    public static int codePointOf(String name) {
        name = name.trim().toUpperCase(Locale.ROOT);
        int cp = CharacterName.getInstance().getCodePoint(name);
        if (cp != -1)
            return cp;
        try {
            // Fall back to names of the form "<BLOCK NAME> <HEX CODE POINT>".
            int off = name.lastIndexOf(' ');
            if (off != -1) {
                cp = Integer.parseInt(name, off + 1, name.length(), 16);
                if (isValidCodePoint(cp) && name.equals(getName(cp)))
                    return cp;
            }
        } catch (Exception x) {
            // Ignored: the name is reported as unrecognized below.
        }
        throw new IllegalArgumentException("Unrecognized character name :" + name);
    }
}