/*
 * Copyright (c) 1995, 2016, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

package java.io;

/**
 * The {@code DataInput} interface provides
 * for reading bytes from a binary stream and
 * reconstructing from them data in any of
 * the Java primitive types. There is also
 * a facility for reconstructing a {@code String}
 * from data in
 * <a href="#modified-utf-8">modified UTF-8</a>
 * format.
 * <p>
 * It is generally true of all the reading
 * routines in this interface that if end of
 * file is reached before the desired number
 * of bytes has been read, an {@code EOFException}
 * (which is a kind of {@code IOException})
 * is thrown.
 * If any byte cannot be read for
 * any reason other than end of file, an {@code IOException}
 * other than {@code EOFException} is
 * thrown. In particular, an {@code IOException}
 * may be thrown if the input stream has been
 * closed.
 *
 * <h3><a name="modified-utf-8">Modified UTF-8</a></h3>
 * <p>
 * Implementations of the DataInput and DataOutput interfaces represent
 * Unicode strings in a format that is a slight modification of UTF-8.
 * (For information regarding the standard UTF-8 format, see section
 * <i>3.9 Unicode Encoding Forms</i> of <i>The Unicode Standard, Version
 * 4.0</i>).
 * Note that in the following table, the most significant bit appears in the
 * far left-hand column.
 *
 * <blockquote>
 *   <table border="1" cellspacing="0" cellpadding="8"
 *          summary="Bit values and bytes">
 *     <tr>
 *       <th colspan="9"><span style="font-weight:normal">
 *         All characters in the range {@code '\u005Cu0001'} to
 *         {@code '\u005Cu007F'} are represented by a single byte:</span></th>
 *     </tr>
 *     <tr>
 *       <td></td>
 *       <th colspan="8" id="bit_a">Bit Values</th>
 *     </tr>
 *     <tr>
 *       <th id="byte1_a">Byte 1</th>
 *       <td><center>0</center>
 *       <td colspan="7"><center>bits 6-0</center>
 *     </tr>
 *     <tr>
 *       <th colspan="9"><span style="font-weight:normal">
 *         The null character {@code '\u005Cu0000'} and characters
 *         in the range {@code '\u005Cu0080'} to {@code '\u005Cu07FF'} are
 *         represented by a pair of bytes:</span></th>
 *     </tr>
 *     <tr>
 *       <td></td>
 *       <th colspan="8" id="bit_b">Bit Values</th>
 *     </tr>
 *     <tr>
 *       <th id="byte1_b">Byte 1</th>
 *       <td><center>1</center>
 *       <td><center>1</center>
 *       <td><center>0</center>
 *       <td colspan="5"><center>bits 10-6</center>
 *     </tr>
 *     <tr>
 *       <th id="byte2_a">Byte 2</th>
 *       <td><center>1</center>
 *       <td><center>0</center>
 *       <td colspan="6"><center>bits 5-0</center>
 *     </tr>
 *     <tr>
 *       <th colspan="9"><span
 *       style="font-weight:normal">
 *         {@code char} values in the range {@code '\u005Cu0800'}
 *         to {@code '\u005CuFFFF'} are represented by three bytes:</span></th>
 *     </tr>
 *     <tr>
 *       <td></td>
 *       <th colspan="8" id="bit_c">Bit Values</th>
 *     </tr>
 *     <tr>
 *       <th id="byte1_c">Byte 1</th>
 *       <td><center>1</center>
 *       <td><center>1</center>
 *       <td><center>1</center>
 *       <td><center>0</center>
 *       <td colspan="4"><center>bits 15-12</center>
 *     </tr>
 *     <tr>
 *       <th id="byte2_b">Byte 2</th>
 *       <td><center>1</center>
 *       <td><center>0</center>
 *       <td colspan="6"><center>bits 11-6</center>
 *     </tr>
 *     <tr>
 *       <th id="byte3">Byte 3</th>
 *       <td><center>1</center>
 *       <td><center>0</center>
 *       <td colspan="6"><center>bits 5-0</center>
 *     </tr>
 *   </table>
 * </blockquote>
 * <p>
 * The differences between this format and the
 * standard UTF-8 format are the following:
 * <ul>
 * <li>The null byte {@code '\u005Cu0000'} is encoded in 2-byte format
 *     rather than 1-byte, so that the encoded strings never have
 *     embedded nulls.
 * <li>Only the 1-byte, 2-byte, and 3-byte formats are used.
 * <li><a href="../lang/Character.html#unicode">Supplementary characters</a>
 *     are represented in the form of surrogate pairs.
 * </ul>
 * @author  Frank Yellin
 * @see     java.io.DataInputStream
 * @see     java.io.DataOutput
 * @since   1.0
 */
public interface DataInput {
    /**
     * Reads some bytes from an input
     * stream and stores them into the buffer
     * array {@code b}. The number of bytes
     * read is equal to the length of {@code b}.
     * <p>
     * This method blocks until one of the
     * following conditions occurs:
     * <ul>
     * <li>{@code b.length}
     * bytes of input data are available, in which
     * case a normal return is made.
     *
     * <li>End of
     * file is detected, in which case an {@code EOFException}
     * is thrown.
     *
     * <li>An I/O error occurs, in
     * which case an {@code IOException} other
     * than {@code EOFException} is thrown.
     * </ul>
     * <p>
     * If {@code b} is {@code null},
     * a {@code NullPointerException} is thrown.
     * If {@code b.length} is zero, then
     * no bytes are read. Otherwise, the first
     * byte read is stored into element {@code b[0]},
     * the next one into {@code b[1]}, and
     * so on.
     * If an exception is thrown from
     * this method, then it may be that some but
     * not all bytes of {@code b} have been
     * updated with data from the input stream.
     *
     * @param   b   the buffer into which the data is read.
     * @throws  NullPointerException if {@code b} is {@code null}.
     * @throws  EOFException  if this stream reaches the end before reading
     *          all the bytes.
     * @throws  IOException   if an I/O error occurs.
     */
    void readFully(byte b[]) throws IOException;

    /**
     * Reads {@code len}
     * bytes from
     * an input stream.
     * <p>
     * This method
     * blocks until one of the following conditions
     * occurs:
     * <ul>
     * <li>{@code len} bytes
     * of input data are available, in which case
     * a normal return is made.
     *
     * <li>End of file
     * is detected, in which case an {@code EOFException}
     * is thrown.
     *
     * <li>An I/O error occurs, in
     * which case an {@code IOException} other
     * than {@code EOFException} is thrown.
     * </ul>
     * <p>
     * If {@code b} is {@code null},
     * a {@code NullPointerException} is thrown.
     * If {@code off} is negative, or {@code len}
     * is negative, or {@code off+len} is
     * greater than the length of the array {@code b},
     * then an {@code IndexOutOfBoundsException}
     * is thrown.
     * If {@code len} is zero,
     * then no bytes are read. Otherwise, the first
     * byte read is stored into element {@code b[off]},
     * the next one into {@code b[off+1]},
     * and so on.
     * The number of bytes read is,
     * at most, equal to {@code len}.
     *
     * @param   b    the buffer into which the data is read.
     * @param   off  an int specifying the offset in the data array {@code b}.
     * @param   len  an int specifying the number of bytes to read.
     * @throws  NullPointerException if {@code b} is {@code null}.
     * @throws  IndexOutOfBoundsException if {@code off} is negative,
     *          {@code len} is negative, or {@code len} is greater than
     *          {@code b.length - off}.
     * @throws  EOFException  if this stream reaches the end before reading
     *          all the bytes.
     * @throws  IOException   if an I/O error occurs.
     */
    void readFully(byte b[], int off, int len) throws IOException;

    /**
     * Makes an attempt to skip over
     * {@code n} bytes of data from the input
     * stream, discarding the skipped bytes. However,
     * it may skip over some smaller number of
     * bytes, possibly zero. This may result from
     * any of a number of conditions; reaching
     * end of file before {@code n} bytes
     * have been skipped is only one possibility.
     * This method never throws an {@code EOFException}.
     * The actual number of bytes skipped is returned.
     *
     * @param      n   the number of bytes to be skipped.
     * @return     the number of bytes actually skipped.
     * @exception  IOException   if an I/O error occurs.
     */
    int skipBytes(int n) throws IOException;

    /**
     * Reads one input byte and returns
     * {@code true} if that byte is nonzero,
     * {@code false} if that byte is zero.
     * This method is suitable for reading
     * the byte written by the {@code writeBoolean}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code boolean} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    boolean readBoolean() throws IOException;

    /**
     * Reads and returns one input byte.
     * The byte is treated as a signed value in
     * the range {@code -128} through {@code 127},
     * inclusive.
     * This method is suitable for
     * reading the byte written by the {@code writeByte}
     * method of interface {@code DataOutput}.
     *
     * @return     the 8-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    byte readByte() throws IOException;

    /**
     * Reads one input byte, zero-extends
     * it to type {@code int}, and returns
     * the result, which is therefore in the range
     * {@code 0} through {@code 255}.
     * This method is suitable for reading
     * the byte written by the {@code writeByte}
     * method of interface {@code DataOutput}
     * if the argument to {@code writeByte}
     * was intended to be a value in the range
     * {@code 0} through {@code 255}.
     *
     * @return     the unsigned 8-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readUnsignedByte() throws IOException;

    /**
     * Reads two input bytes and returns
     * a {@code short} value. Let {@code a}
     * be the first byte read and {@code b}
     * be the second byte. The value returned is:
     * <pre>{@code (short)((a << 8) | (b & 0xff))
     * }</pre>
     * This method
     * is suitable for reading the bytes written
     * by the {@code writeShort} method of
     * interface {@code DataOutput}.
     *
     * @return     the 16-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    short readShort() throws IOException;

    /**
     * Reads two input bytes and returns
     * an {@code int} value in the range {@code 0}
     * through {@code 65535}. Let {@code a}
     * be the first byte read and {@code b}
     * be the second byte. The value returned is:
     * <pre>{@code (((a & 0xff) << 8) | (b & 0xff))
     * }</pre>
     * This method is suitable for reading the bytes
     * written by the {@code writeShort} method
     * of interface {@code DataOutput} if
     * the argument to {@code writeShort}
     * was intended to be a value in the range
     * {@code 0} through {@code 65535}.
     *
     * @return     the unsigned 16-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readUnsignedShort() throws IOException;

    /**
     * Reads two input bytes and returns a {@code char} value.
     * Let {@code a}
     * be the first byte read and {@code b}
     * be the second byte. The value returned is:
     * <pre>{@code (char)((a << 8) | (b & 0xff))
     * }</pre>
     * This method
     * is suitable for reading bytes written by
     * the {@code writeChar} method of interface
     * {@code DataOutput}.
     *
     * @return     the {@code char} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    char readChar() throws IOException;

    /**
     * Reads four input bytes and returns an
     * {@code int} value. Let {@code a-d}
     * be the first through fourth bytes read. The value returned is:
     * <pre>{@code
     * (((a & 0xff) << 24) | ((b & 0xff) << 16) |
     *  ((c & 0xff) <<  8) | (d & 0xff))
     * }</pre>
     * This method is suitable
     * for reading bytes written by the {@code writeInt}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code int} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readInt() throws IOException;

    /**
     * Reads eight input bytes and returns
     * a {@code long} value. Let {@code a-h}
     * be the first through eighth bytes read.
     * The value returned is:
     * <pre>{@code
     * (((long)(a & 0xff) << 56) |
     *  ((long)(b & 0xff) << 48) |
     *  ((long)(c & 0xff) << 40) |
     *  ((long)(d & 0xff) << 32) |
     *  ((long)(e & 0xff) << 24) |
     *  ((long)(f & 0xff) << 16) |
     *  ((long)(g & 0xff) <<  8) |
     *  ((long)(h & 0xff)))
     * }</pre>
     * <p>
     * This method is suitable
     * for reading bytes written by the {@code writeLong}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code long} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    long readLong() throws IOException;

    /**
     * Reads four input bytes and returns
     * a {@code float} value. It does this
     * by first constructing an {@code int}
     * value in exactly the manner
     * of the {@code readInt}
     * method, then converting this {@code int}
     * value to a {@code float} in
     * exactly the manner of the method {@code Float.intBitsToFloat}.
     * This method is suitable for reading
     * bytes written by the {@code writeFloat}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code float} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    float readFloat() throws IOException;

    /**
     * Reads eight input bytes and returns
     * a {@code double} value.
     * It does this
     * by first constructing a {@code long}
     * value in exactly the manner
     * of the {@code readLong}
     * method, then converting this {@code long}
     * value to a {@code double} in exactly
     * the manner of the method {@code Double.longBitsToDouble}.
     * This method is suitable for reading
     * bytes written by the {@code writeDouble}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code double} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    double readDouble() throws IOException;

    /**
     * Reads the next line of text from the input stream.
     * It reads successive bytes, converting
     * each byte separately into a character,
     * until it encounters a line terminator or
     * end of file; the characters read are then
     * returned as a {@code String}. Note
     * that because this method processes bytes,
     * it does not support input of the full Unicode
     * character set.
     * <p>
     * If end of file is encountered
     * before even one byte can be read, then {@code null}
     * is returned. Otherwise, each byte that is
     * read is converted to type {@code char}
     * by zero-extension. If the character {@code '\n'}
     * is encountered, it is discarded and reading
     * ceases. If the character {@code '\r'}
     * is encountered, it is discarded and, if
     * the following byte converts to the
     * character {@code '\n'}, then that is
     * discarded also; reading then ceases. If
     * end of file is encountered before either
     * of the characters {@code '\n'} and
     * {@code '\r'} is encountered, reading
     * ceases. Once reading has ceased, a {@code String}
     * is returned that contains all the characters
     * read and not discarded, taken in order.
     * Note that every character in this string
     * will have a value less than {@code \u005Cu0100},
     * that is, {@code (char)256}.
     *
     * @return the next line of text from the input stream,
     *         or {@code null} if the end of file is
     *         encountered before a byte can be read.
     * @exception  IOException  if an I/O error occurs.
     */
    String readLine() throws IOException;

    /**
     * Reads in a string that has been encoded using a
     * <a href="#modified-utf-8">modified UTF-8</a>
     * format.
     * The general contract of {@code readUTF}
     * is that it reads a representation of a Unicode
     * character string encoded in modified
     * UTF-8 format; this string of characters
     * is then returned as a {@code String}.
     * <p>
     * First, two bytes are read and used to
     * construct an unsigned 16-bit integer in
     * exactly the manner of the {@code readUnsignedShort}
     * method. This integer value is called the
     * <i>UTF length</i> and specifies the number
     * of additional bytes to be read. These bytes
     * are then converted to characters by considering
     * them in groups. The length of each group
     * is computed from the value of the first
     * byte of the group. The byte following a
     * group, if any, is the first byte of the
     * next group.
     * <p>
     * If the first byte of a group
     * matches the bit pattern {@code 0xxxxxxx}
     * (where {@code x} means "may be {@code 0}
     * or {@code 1}"), then the group consists
     * of just that byte. The byte is zero-extended
     * to form a character.
     * <p>
     * If the first byte
     * of a group matches the bit pattern {@code 110xxxxx},
     * then the group consists of that byte {@code a}
     * and a second byte {@code b}.
     * If there
     * is no byte {@code b} (because byte
     * {@code a} was the last of the bytes
     * to be read), or if byte {@code b} does
     * not match the bit pattern {@code 10xxxxxx},
     * then a {@code UTFDataFormatException}
     * is thrown. Otherwise, the group is converted
     * to the character:
     * <pre>{@code (char)(((a & 0x1F) << 6) | (b & 0x3F))
     * }</pre>
     * If the first byte of a group
     * matches the bit pattern {@code 1110xxxx},
     * then the group consists of that byte {@code a}
     * and two more bytes {@code b} and {@code c}.
     * If there is no byte {@code c} (because
     * byte {@code a} was one of the last
     * two of the bytes to be read), or either
     * byte {@code b} or byte {@code c}
     * does not match the bit pattern {@code 10xxxxxx},
     * then a {@code UTFDataFormatException}
     * is thrown. Otherwise, the group is converted
     * to the character:
     * <pre>{@code
     * (char)(((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F))
     * }</pre>
     * If the first byte of a group matches the
     * pattern {@code 1111xxxx} or the pattern
     * {@code 10xxxxxx}, then a {@code UTFDataFormatException}
     * is thrown.
     * <p>
     * If end of file is encountered
     * at any time during this entire process,
     * then an {@code EOFException} is thrown.
     * <p>
     * After every group has been converted to
     * a character by this process, the characters
     * are gathered, in the same order in which
     * their corresponding groups were read from
     * the input stream, to form a {@code String},
     * which is returned.
     * <p>
     * The {@code writeUTF}
     * method of interface {@code DataOutput}
     * may be used to write data that is suitable
     * for reading by this method.
     * @return     a Unicode string.
     * @exception  EOFException            if this stream reaches the end
     *               before reading all the bytes.
     * @exception  IOException             if an I/O error occurs.
     * @exception  UTFDataFormatException  if the bytes do not represent a
     *               valid modified UTF-8 encoding of a string.
     */
    String readUTF() throws IOException;
}
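
// Usage sketch, not part of the original file. It illustrates the contracts
// documented above by writing a small record with DataOutputStream and
// reading it back through the DataInput interface. The class name
// DataInputSketch and its helpers encode() and decode() are hypothetical
// names chosen for this illustration; only standard java.io classes are
// used. Treat it as one possible round trip under those assumptions, not
// as a reference implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class; the name is an assumption made for this sketch.
final class DataInputSketch {

    /** Writes a small record and returns the raw bytes. */
    static byte[] encode() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0x01020304); // four bytes, high byte first
            out.writeByte(200);       // low eight bits only: 0xC8
            out.writeShort(0xFFFE);   // two bytes: 0xFF 0xFE
            out.writeUTF("a\0b");     // 2-byte UTF length, then modified UTF-8
        }
        return buffer.toByteArray();
    }

    /** Reads the record back using only DataInput methods. */
    static String decode(byte[] data) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(data));
        int word = in.readInt();
        int unsignedByte = in.readUnsignedByte();   // zero-extended to int
        int unsignedShort = in.readUnsignedShort(); // zero-extended to int
        String text = in.readUTF();
        String tail;
        try {
            in.readByte(); // nothing is left, so end of file is signalled
            tail = "more";
        } catch (EOFException e) {
            tail = "eof";
        }
        return String.format("%08x %d %d %d %s",
                word, unsignedByte, unsignedShort, text.length(), tail);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode();
        // The null character occupies two bytes (0xC0 0x80) in the encoding.
        System.out.printf("%02x %02x%n", data[10] & 0xff, data[11] & 0xff);
        System.out.println(decode(data));
    }
}
```

// The unsigned readers are used here because the values written (200 and
// 0xFFFE) were intended as unsigned quantities; readByte and readShort
// would return the same bits reinterpreted as signed values.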