src/share/classes/java/net/URI.java

 241  *
 242  * RFC 2396 specifies precisely which characters are permitted in the
 243  * various components of a URI reference.  The following categories, most of
 244  * which are taken from that specification, are used below to describe these
 245  * constraints:
 246  *
 247  * <blockquote><table cellspacing=2 summary="Describes categories alpha,digit,alphanum,unreserved,punct,reserved,escaped,and other">
 248  *   <tr><th valign=top><i>alpha</i></th>
 249  *       <td>The US-ASCII alphabetic characters,
 250  *        <tt>'A'</tt>&nbsp;through&nbsp;<tt>'Z'</tt>
 251  *        and <tt>'a'</tt>&nbsp;through&nbsp;<tt>'z'</tt></td></tr>
 252  *   <tr><th valign=top><i>digit</i></th>
 253  *       <td>The US-ASCII decimal digit characters,
 254  *       <tt>'0'</tt>&nbsp;through&nbsp;<tt>'9'</tt></td></tr>
 255  *   <tr><th valign=top><i>alphanum</i></th>
 256  *       <td>All <i>alpha</i> and <i>digit</i> characters</td></tr>
 257  *   <tr><th valign=top><i>unreserved</i>&nbsp;&nbsp;&nbsp;&nbsp;</th>
 258  *       <td>All <i>alphanum</i> characters together with those in the string
 259  *        <tt>"_-!.~'()*"</tt></td></tr>
 260  *   <tr><th valign=top><i>punct</i></th>
 261  *       <td>The characters in the string <tt>",;:$&+="</tt></td></tr>
 262  *   <tr><th valign=top><i>reserved</i></th>
 263  *       <td>All <i>punct</i> characters together with those in the string
 264  *        <tt>"?/[]@"</tt></td></tr>
 265  *   <tr><th valign=top><i>escaped</i></th>
 266  *       <td>Escaped octets, that is, triplets consisting of the percent
 267  *           character (<tt>'%'</tt>) followed by two hexadecimal digits
 268  *           (<tt>'0'</tt>-<tt>'9'</tt>, <tt>'A'</tt>-<tt>'F'</tt>, and
 269  *           <tt>'a'</tt>-<tt>'f'</tt>)</td></tr>
 270  *   <tr><th valign=top><i>other</i></th>
 271  *       <td>The Unicode characters that are not in the US-ASCII character set,
 272  *           are not control characters (according to the {@link
 273  *           java.lang.Character#isISOControl(char) Character.isISOControl}
 274  *           method), and are not space characters (according to the {@link
 275  *           java.lang.Character#isSpaceChar(char) Character.isSpaceChar}
 276  *           method)&nbsp;&nbsp;<i>(<b>Deviation from RFC 2396</b>, which is
 277  *           limited to US-ASCII)</i></td></tr>
 278  * </table></blockquote>
 279  *
 280  * <p><a name="legal-chars"></a> The set of all legal URI characters consists of
 281  * the <i>unreserved</i>, <i>reserved</i>, <i>escaped</i>, and <i>other</i>




 241  *
 242  * RFC&nbsp;2396 specifies precisely which characters are permitted in the
 243  * various components of a URI reference.  The following categories, most of
 244  * which are taken from that specification, are used below to describe these
 245  * constraints:
 246  *
 247  * <blockquote><table cellspacing=2 summary="Describes categories alpha,digit,alphanum,unreserved,punct,reserved,escaped,and other">
 248  *   <tr><th valign=top><i>alpha</i></th>
 249  *       <td>The US-ASCII alphabetic characters,
 250  *        <tt>'A'</tt>&nbsp;through&nbsp;<tt>'Z'</tt>
 251  *        and <tt>'a'</tt>&nbsp;through&nbsp;<tt>'z'</tt></td></tr>
 252  *   <tr><th valign=top><i>digit</i></th>
 253  *       <td>The US-ASCII decimal digit characters,
 254  *       <tt>'0'</tt>&nbsp;through&nbsp;<tt>'9'</tt></td></tr>
 255  *   <tr><th valign=top><i>alphanum</i></th>
 256  *       <td>All <i>alpha</i> and <i>digit</i> characters</td></tr>
 257  *   <tr><th valign=top><i>unreserved</i>&nbsp;&nbsp;&nbsp;&nbsp;</th>
 258  *       <td>All <i>alphanum</i> characters together with those in the string
 259  *        <tt>"_-!.~'()*"</tt></td></tr>
 260  *   <tr><th valign=top><i>punct</i></th>
 261  *       <td>The characters in the string <tt>",;:$&amp;+="</tt></td></tr>
 262  *   <tr><th valign=top><i>reserved</i></th>
 263  *       <td>All <i>punct</i> characters together with those in the string
 264  *        <tt>"?/[]@"</tt></td></tr>
 265  *   <tr><th valign=top><i>escaped</i></th>
 266  *       <td>Escaped octets, that is, triplets consisting of the percent
 267  *           character (<tt>'%'</tt>) followed by two hexadecimal digits
 268  *           (<tt>'0'</tt>-<tt>'9'</tt>, <tt>'A'</tt>-<tt>'F'</tt>, and
 269  *           <tt>'a'</tt>-<tt>'f'</tt>)</td></tr>
 270  *   <tr><th valign=top><i>other</i></th>
 271  *       <td>The Unicode characters that are not in the US-ASCII character set,
 272  *           are not control characters (according to the {@link
 273  *           java.lang.Character#isISOControl(char) Character.isISOControl}
 274  *           method), and are not space characters (according to the {@link
 275  *           java.lang.Character#isSpaceChar(char) Character.isSpaceChar}
 276  *           method)&nbsp;&nbsp;<i>(<b>Deviation from RFC 2396</b>, which is
 277  *           limited to US-ASCII)</i></td></tr>
 278  * </table></blockquote>
 279  *
 280  * <p><a name="legal-chars"></a> The set of all legal URI characters consists of
 281  * the <i>unreserved</i>, <i>reserved</i>, <i>escaped</i>, and <i>other</i>