--- old/src/java.base/share/classes/java/io/DataInput.java 2017-08-09 16:52:58.532674513 -0700
+++ new/src/java.base/share/classes/java/io/DataInput.java 2017-08-09 16:52:58.328665544 -0700
@@ -54,62 +54,67 @@
  * Unicode strings in a format that is a slight modification of UTF-8.
  * (For information regarding the standard UTF-8 format, see section
  * 3.9 Unicode Encoding Forms of The Unicode Standard, Version
- * 4.0).
- * Note that in the following table, the most significant bit appears in the
- * far left-hand column.
+ * 4.0)
  *
- *
- *
- *
+ *
+ *
+ *
+ * - Characters in the range {@code '\u005Cu0001'} to
+ *   {@code '\u005Cu007F'} are represented by a single byte.
+ * - The null character {@code '\u005Cu0000'} and characters
+ *   in the range {@code '\u005Cu0080'} to {@code '\u005Cu07FF'} are
+ *   represented by a pair of bytes.
+ * - Characters in the range {@code '\u005Cu0800'}
+ *   to {@code '\u005CuFFFF'} are represented by three bytes.
+ *
+ *
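The three ranges in the added list can be illustrated with a minimal sketch. This is not code from the patch or from the JDK: the class name `ModifiedUtf8Sketch` and the methods `encodeChar` and `encode` are hypothetical, and the sketch covers only the per-character byte layout, not the two-byte length prefix that `DataOutput.writeUTF` writes before the encoded characters.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, for illustration only; not part of the patch or the JDK.
class ModifiedUtf8Sketch {

    // Encodes a single char according to the three ranges in the list above.
    static byte[] encodeChar(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            // Single byte: 0, then bits 6-0.
            return new byte[] { (byte) c };
        } else if (c <= 0x07FF) {
            // Reached for 0x0000 and for 0x0080..0x07FF.
            // Two bytes: 110 + bits 10-6, then 10 + bits 5-0.
            return new byte[] {
                (byte) (0xC0 | ((c >> 6) & 0x1F)),
                (byte) (0x80 | (c & 0x3F))
            };
        } else {
            // 0x0800..0xFFFF.
            // Three bytes: 1110 + bits 15-12, 10 + bits 11-6, 10 + bits 5-0.
            return new byte[] {
                (byte) (0xE0 | ((c >> 12) & 0x0F)),
                (byte) (0x80 | ((c >> 6) & 0x3F)),
                (byte) (0x80 | (c & 0x3F))
            };
        }
    }

    // Encodes every char of a string; no length prefix is emitted.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            byte[] b = encodeChar(s.charAt(i));
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 'A', the null character, and U+20AC exercise one range each.
        for (byte b : encode("A" + (char) 0 + (char) 0x20AC)) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Running `main` prints one byte for `'A'`, two bytes for the null character, and three bytes for U+20AC. The null character taking the two-byte form is the visible departure from standard UTF-8 in this part of the comment.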
- *
+ *
+ * Encoding of UTF-8 values
+ *
+ *
+ *
+ * Value
+ * Byte
+ * Bit Values
+ *
+ *
+ *
+ *
  *
+ * 7
+ * 6
+ * 5
+ * 4
+ * 3
+ * 2
+ * 1
+ * 0
+ *
- *
- *
- * All characters in the range {@code '\u005Cu0001'} to
- * {@code '\u005Cu007F'} are represented by a single byte:
- *
- *
- *
- * Bit Values
- *
- *
- * Byte 1
+ *
+ * {@code \u005Cu0001} to {@code \u005Cu007F}
+ * 1
  * 0
  * bits 6-0
  *
- *
- *
- * The null character {@code '\u005Cu0000'} and characters
- * in the range {@code '\u005Cu0080'} to {@code '\u005Cu07FF'} are
- * represented by a pair of bytes:
- *
- *
- *
- * Bit Values
- *
- *
- * Byte 1
+ *
+ * {@code \u005Cu0000},
+ *
+ * {@code \u005Cu0080} to {@code \u005Cu07FF}
+ * 1
  * 1
  * 1
  * 0
  * bits 10-6
  *
- *
- * Byte 2
+ *
+ * 2
  * 1
  * 0
  * bits 5-0
  *
- *
- *
- * {@code char} values in the range {@code '\u005Cu0800'}
- * to {@code '\u005CuFFFF'} are represented by three bytes:
- *
- *
- *
- * Bit Values
- *
- *
- * Byte 1
+ *
+ * {@code \u005Cu0800} to {@code \u005CuFFFF}
+ * 1
  * 1
  * 1
  * 1
@@ -117,20 +122,22 @@
  * bits 15-12
  *
- *
- * Byte 2
+ *
+ * 2
  * 1
  * 0
  * bits 11-6
  *
- *
  *
- * Byte 3
+ *
+ * 3
  * 1
  * 0
  * bits 5-0
  *
  * The differences between this format and the
  * standard UTF-8 format are the following: