/*
 * Copyright (c) 1999, 2010, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * of additional bytes to be read. These bytes are then converted
 * to characters by considering them in groups. The length of each
 * group is computed from the value of the first byte of the
 * group. The byte following a group, if any, is the first byte of
 * the next group.
 *
 * <p> If the first byte of a group matches the bit pattern
 * <code>0xxxxxxx</code> (where <code>x</code> means "may be
 * <code>0</code> or <code>1</code>"), then the group consists of
 * just that byte. The byte is zero-extended to form a character.
 *
 * <p> If the first byte of a group matches the bit pattern
 * <code>110xxxxx</code>, then the group consists of that byte
 * <code>a</code> and a second byte <code>b</code>. If there is no
 * byte <code>b</code> (because byte <code>a</code> was the last
 * of the bytes to be read), or if byte <code>b</code> does not
 * match the bit pattern <code>10xxxxxx</code>, then a
 * <code>UTFDataFormatException</code> is thrown. Otherwise, the
 * group is converted to the character:
 *
 * <p> <pre><code>
 * (char)(((a & 0x1F) << 6) | (b & 0x3F))
 * </code></pre>
 *
 * If the first byte of a group matches the bit pattern
 * <code>1110xxxx</code>, then the group consists of that byte
 * <code>a</code> and two more bytes <code>b</code> and
 * <code>c</code>. If there is no byte <code>c</code> (because
 * byte <code>a</code> was one of the last two of the bytes to be
 * read), or either byte <code>b</code> or byte <code>c</code>
 * does not match the bit pattern <code>10xxxxxx</code>, then a
 * <code>UTFDataFormatException</code> is thrown. Otherwise, the
 * group is converted to the character:
 *
 * <p> <pre><code>
 * (char)(((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F))
 * </code></pre>
 *
 * If the first byte of a group matches the bit pattern
 * <code>1111xxxx</code> or the bit pattern <code>10xxxxxx</code>,
 * then a <code>UTFDataFormatException</code> is thrown.
 *
 * <p> If end of file is encountered at any time during this
 * entire process, then a <code>java.io.EOFException</code> is thrown.
 *
 * <p> After every group has been converted to a character by this
 * process, the characters are gathered, in the same order in
 * which their corresponding groups were read from the input
 * stream, to form a <code>String</code>, which is returned.
 *
 * <p> The current byte order setting is ignored.
 *
 * <p> The bit offset within the stream is reset to zero before
 * the read occurs.
 *
/*
 * Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * of additional bytes to be read. These bytes are then converted
 * to characters by considering them in groups. The length of each
 * group is computed from the value of the first byte of the
 * group. The byte following a group, if any, is the first byte of
 * the next group.
 *
 * <p> If the first byte of a group matches the bit pattern
 * <code>0xxxxxxx</code> (where <code>x</code> means "may be
 * <code>0</code> or <code>1</code>"), then the group consists of
 * just that byte. The byte is zero-extended to form a character.
 *
 * <p> If the first byte of a group matches the bit pattern
 * <code>110xxxxx</code>, then the group consists of that byte
 * <code>a</code> and a second byte <code>b</code>. If there is no
 * byte <code>b</code> (because byte <code>a</code> was the last
 * of the bytes to be read), or if byte <code>b</code> does not
 * match the bit pattern <code>10xxxxxx</code>, then a
 * <code>UTFDataFormatException</code> is thrown. Otherwise, the
 * group is converted to the character:
 *
 * <pre><code>
 * (char)(((a & 0x1F) << 6) | (b & 0x3F))
 * </code></pre>
 *
 * If the first byte of a group matches the bit pattern
 * <code>1110xxxx</code>, then the group consists of that byte
 * <code>a</code> and two more bytes <code>b</code> and
 * <code>c</code>. If there is no byte <code>c</code> (because
 * byte <code>a</code> was one of the last two of the bytes to be
 * read), or either byte <code>b</code> or byte <code>c</code>
 * does not match the bit pattern <code>10xxxxxx</code>, then a
 * <code>UTFDataFormatException</code> is thrown. Otherwise, the
 * group is converted to the character:
 *
 * <pre><code>
 * (char)(((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F))
 * </code></pre>
 *
 * If the first byte of a group matches the bit pattern
 * <code>1111xxxx</code> or the bit pattern <code>10xxxxxx</code>,
 * then a <code>UTFDataFormatException</code> is thrown.
 *
 * <p> If end of file is encountered at any time during this
 * entire process, then a <code>java.io.EOFException</code> is thrown.
 *
 * <p> After every group has been converted to a character by this
 * process, the characters are gathered, in the same order in
 * which their corresponding groups were read from the input
 * stream, to form a <code>String</code>, which is returned.
 *
 * <p> The current byte order setting is ignored.
 *
 * <p> The bit offset within the stream is reset to zero before
 * the read occurs.
 *
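The group-by-group conversion described in the comment above can be illustrated with a minimal sketch. It is not the library's implementation: the class name `ModifiedUtf8Sketch` and the method `decode` are hypothetical, and the sketch assumes the bytes to be read have already been collected into an array, so the end-of-file case (`java.io.EOFException`) does not arise here.

```java
import java.io.UTFDataFormatException;

// Hypothetical illustration of the decoding rules in the comment above.
final class ModifiedUtf8Sketch {

    // Decodes bytes that are assumed to be already read into memory.
    static String decode(byte[] bytes) throws UTFDataFormatException {
        StringBuilder out = new StringBuilder(bytes.length);
        int i = 0;
        while (i < bytes.length) {
            int a = bytes[i] & 0xFF;
            if ((a & 0x80) == 0x00) {
                // 0xxxxxxx: the group is just this byte, zero-extended.
                out.append((char) a);
                i += 1;
            } else if ((a & 0xE0) == 0xC0) {
                // 110xxxxx: the group is a followed by b.
                if (i + 1 >= bytes.length) {
                    throw new UTFDataFormatException("no byte b after " + a);
                }
                int b = bytes[i + 1] & 0xFF;
                if ((b & 0xC0) != 0x80) {
                    throw new UTFDataFormatException("byte b is not 10xxxxxx");
                }
                out.append((char) (((a & 0x1F) << 6) | (b & 0x3F)));
                i += 2;
            } else if ((a & 0xF0) == 0xE0) {
                // 1110xxxx: the group is a followed by b and c.
                if (i + 2 >= bytes.length) {
                    throw new UTFDataFormatException("no byte c after " + a);
                }
                int b = bytes[i + 1] & 0xFF;
                int c = bytes[i + 2] & 0xFF;
                if ((b & 0xC0) != 0x80 || (c & 0xC0) != 0x80) {
                    throw new UTFDataFormatException("byte b or c is not 10xxxxxx");
                }
                out.append((char) (((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F)));
                i += 3;
            } else {
                // 1111xxxx or 10xxxxxx as the first byte of a group.
                throw new UTFDataFormatException("invalid first byte " + a);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws UTFDataFormatException {
        // One group of each length: U+0041, U+00E9, U+20AC.
        byte[] data = {0x41, (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        for (char ch : decode(data).toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

Masking each byte with `0xFF` before testing the bit patterns avoids sign extension, since Java's `byte` is signed.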