jdk Cdiff src/share/classes/java/util/regex/Pattern.java

*** 43,54 ****
  /**
   * A compiled representation of a regular expression.
   *
   * <p> A regular expression, specified as a string, must first be compiled into
   * an instance of this class.  The resulting pattern can then be used to create
!  * a {@link Matcher} object that can match arbitrary {@link
!  * java.lang.CharSequence </code>character sequences<code>} against the regular
   * expression.  All of the state involved in performing a match resides in the
   * matcher, so many matchers can share the same pattern.
   *
   * <p> A typical invocation sequence is thus
   *
--- 43,54 ----
  /**
   * A compiled representation of a regular expression.
   *
   * <p> A regular expression, specified as a string, must first be compiled into
   * an instance of this class.  The resulting pattern can then be used to create
!  * a {@link Matcher} object that can match arbitrary {@linkplain
!  * java.lang.CharSequence character sequences} against the regular
   * expression.  All of the state involved in performing a match resides in the
   * matcher, so many matchers can share the same pattern.
   *
   * <p> A typical invocation sequence is thus
   *
*** 71,89 ****
   * <p> Instances of this class are immutable and are safe for use by multiple
   * concurrent threads.  Instances of the {@link Matcher} class are not safe for
   * such use.
   *
   *
!  * <a name="sum">
!  * <h4> Summary of regular-expression constructs </h4>
   *
   * <table border="0" cellpadding="1" cellspacing="0"
   *  summary="Regular expression constructs, and what they match">
   *
   * <tr align="left">
!  * <th bgcolor="#CCCCFF" align="left" id="construct">Construct</th>
!  * <th bgcolor="#CCCCFF" align="left" id="matches">Matches</th>
   * </tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="characters">Characters</th></tr>
   *
--- 71,88 ----
   * <p> Instances of this class are immutable and are safe for use by multiple
   * concurrent threads.  Instances of the {@link Matcher} class are not safe for
   * such use.
   *
   *
!  * <h3><a name="sum">Summary of regular-expression constructs</a></h3>
   *
   * <table border="0" cellpadding="1" cellspacing="0"
   *  summary="Regular expression constructs, and what they match">
   *
   * <tr align="left">
!  * <th align="left" id="construct">Construct</th>
!  * <th align="left" id="matches">Matches</th>
   * </tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="characters">Characters</th></tr>
   *
*** 126,153 ****
   *     <td headers="matches">The control character corresponding to <i>x</i></td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="classes">Character classes</th></tr>
   *
!  * <tr><td valign="top" headers="construct classes"><tt>[abc]</tt></td>
!  *     <td headers="matches"><tt>a</tt>, <tt>b</tt>, or <tt>c</tt> (simple class)</td></tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[^abc]</tt></td>
!  *     <td headers="matches">Any character except <tt>a</tt>, <tt>b</tt>, or <tt>c</tt> (negation)</td></tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[a-zA-Z]</tt></td>
!  *     <td headers="matches"><tt>a</tt> through <tt>z</tt>
!  *         or <tt>A</tt> through <tt>Z</tt>, inclusive (range)</td></tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[a-d[m-p]]</tt></td>
!  *     <td headers="matches"><tt>a</tt> through <tt>d</tt>,
!  *      or <tt>m</tt> through <tt>p</tt>: <tt>[a-dm-p]</tt> (union)</td></tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[a-z&&[def]]</tt></td>
!  *     <td headers="matches"><tt>d</tt>, <tt>e</tt>, or <tt>f</tt> (intersection)</tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[a-z&&[^bc]]</tt></td>
!  *     <td headers="matches"><tt>a</tt> through <tt>z</tt>,
!  *         except for <tt>b</tt> and <tt>c</tt>: <tt>[ad-z]</tt> (subtraction)</td></tr>
!  * <tr><td valign="top" headers="construct classes"><tt>[a-z&&[^m-p]]</tt></td>
!  *     <td headers="matches"><tt>a</tt> through <tt>z</tt>,
!  *          and not <tt>m</tt> through <tt>p</tt>: <tt>[a-lq-z]</tt>(subtraction)</td></tr>
   * <tr><th>&nbsp;</th></tr>
   *
   * <tr align="left"><th colspan="2" id="predef">Predefined character classes</th></tr>
   *
   * <tr><td valign="top" headers="construct predef"><tt>.</tt></td>
--- 125,152 ----
   *     <td headers="matches">The control character corresponding to <i>x</i></td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="classes">Character classes</th></tr>
   *
!  * <tr><td valign="top" headers="construct classes">{@code [abc]}</td>
!  *     <td headers="matches">{@code a}, {@code b}, or {@code c} (simple class)</td></tr>
!  * <tr><td valign="top" headers="construct classes">{@code [^abc]}</td>
!  *     <td headers="matches">Any character except {@code a}, {@code b}, or {@code c} (negation)</td></tr>
!  * <tr><td valign="top" headers="construct classes">{@code [a-zA-Z]}</td>
!  *     <td headers="matches">{@code a} through {@code z}
!  *         or {@code A} through {@code Z}, inclusive (range)</td></tr>
!  * <tr><td valign="top" headers="construct classes">{@code [a-d[m-p]]}</td>
!  *     <td headers="matches">{@code a} through {@code d},
!  *      or {@code m} through {@code p}: {@code [a-dm-p]} (union)</td></tr>
!  * <tr><td valign="top" headers="construct classes">{@code [a-z&&[def]]}</td>
!  *     <td headers="matches">{@code d}, {@code e}, or {@code f} (intersection)</tr>
!  * <tr><td valign="top" headers="construct classes">{@code [a-z&&[^bc]]}</td>
!  *     <td headers="matches">{@code a} through {@code z},
!  *         except for {@code b} and {@code c}: {@code [ad-z]} (subtraction)</td></tr>
!  * <tr><td valign="top" headers="construct classes">{@code [a-z&&[^m-p]]}</td>
!  *     <td headers="matches">{@code a} through {@code z},
!  *          and not {@code m} through {@code p}: {@code [a-lq-z]}(subtraction)</td></tr>
   * <tr><th>&nbsp;</th></tr>
   *
   * <tr align="left"><th colspan="2" id="predef">Predefined character classes</th></tr>
   *
   * <tr><td valign="top" headers="construct predef"><tt>.</tt></td>
*** 173,212 ****
   * <tr><td valign="top" headers="construct predef"><tt>\w</tt></td>
   *     <td headers="matches">A word character: <tt>[a-zA-Z_0-9]</tt></td></tr>
   * <tr><td valign="top" headers="construct predef"><tt>\W</tt></td>
   *     <td headers="matches">A non-word character: <tt>[^\w]</tt></td></tr>
   * <tr><th>&nbsp;</th></tr>
!  * <tr align="left"><th colspan="2" id="posix">POSIX character classes</b> (US-ASCII only)<b></th></tr>
   *
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Lower}</tt></td>
!  *     <td headers="matches">A lower-case alphabetic character: <tt>[a-z]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Upper}</tt></td>
!  *     <td headers="matches">An upper-case alphabetic character:<tt>[A-Z]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{ASCII}</tt></td>
!  *     <td headers="matches">All ASCII:<tt>[\x00-\x7F]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Alpha}</tt></td>
!  *     <td headers="matches">An alphabetic character:<tt>[\p{Lower}\p{Upper}]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Digit}</tt></td>
!  *     <td headers="matches">A decimal digit: <tt>[0-9]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Alnum}</tt></td>
!  *     <td headers="matches">An alphanumeric character:<tt>[\p{Alpha}\p{Digit}]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Punct}</tt></td>
!  *     <td headers="matches">Punctuation: One of <tt>!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~</tt></td></tr>
!  *     <!-- <tt>[\!"#\$%&'\(\)\*\+,\-\./:;\<=\>\?@\[\\\]\^_`\{\|\}~]</tt>
!  *          <tt>[\X21-\X2F\X31-\X40\X5B-\X60\X7B-\X7E]</tt> -->
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Graph}</tt></td>
!  *     <td headers="matches">A visible character: <tt>[\p{Alnum}\p{Punct}]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Print}</tt></td>
!  *     <td headers="matches">A printable character: <tt>[\p{Graph}\x20]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Blank}</tt></td>
!  *     <td headers="matches">A space or a tab: <tt>[ \t]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Cntrl}</tt></td>
!  *     <td headers="matches">A control character: <tt>[\x00-\x1F\x7F]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{XDigit}</tt></td>
!  *     <td headers="matches">A hexadecimal digit: <tt>[0-9a-fA-F]</tt></td></tr>
!  * <tr><td valign="top" headers="construct posix"><tt>\p{Space}</tt></td>
!  *     <td headers="matches">A whitespace character: <tt>[ \t\n\x0B\f\r]</tt></td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2">java.lang.Character classes (simple <a href="#jcc">java character type</a>)</th></tr>
   *
   * <tr><td valign="top"><tt>\p{javaLowerCase}</tt></td>
--- 172,211 ----
   * <tr><td valign="top" headers="construct predef"><tt>\w</tt></td>
   *     <td headers="matches">A word character: <tt>[a-zA-Z_0-9]</tt></td></tr>
   * <tr><td valign="top" headers="construct predef"><tt>\W</tt></td>
   *     <td headers="matches">A non-word character: <tt>[^\w]</tt></td></tr>
   * <tr><th>&nbsp;</th></tr>
!  * <tr align="left"><th colspan="2" id="posix"><b>POSIX character classes (US-ASCII only)</b></th></tr>
   *
!  * <tr><td valign="top" headers="construct posix">{@code \p{Lower}}</td>
!  *     <td headers="matches">A lower-case alphabetic character: {@code [a-z]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Upper}}</td>
!  *     <td headers="matches">An upper-case alphabetic character:{@code [A-Z]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{ASCII}}</td>
!  *     <td headers="matches">All ASCII:{@code [\x00-\x7F]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Alpha}}</td>
!  *     <td headers="matches">An alphabetic character:{@code [\p{Lower}\p{Upper}]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Digit}}</td>
!  *     <td headers="matches">A decimal digit: {@code [0-9]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Alnum}}</td>
!  *     <td headers="matches">An alphanumeric character:{@code [\p{Alpha}\p{Digit}]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Punct}}</td>
!  *     <td headers="matches">Punctuation: One of {@code !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~}</td></tr>
!  *     <!-- {@code [\!"#\$%&'\(\)\*\+,\-\./:;\<=\>\?@\[\\\]\^_`\{\|\}~]}
!  *          {@code [\X21-\X2F\X31-\X40\X5B-\X60\X7B-\X7E]} -->
!  * <tr><td valign="top" headers="construct posix">{@code \p{Graph}}</td>
!  *     <td headers="matches">A visible character: {@code [\p{Alnum}\p{Punct}]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Print}}</td>
!  *     <td headers="matches">A printable character: {@code [\p{Graph}\x20]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Blank}}</td>
!  *     <td headers="matches">A space or a tab: {@code [ \t]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Cntrl}}</td>
!  *     <td headers="matches">A control character: {@code [\x00-\x1F\x7F]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{XDigit}}</td>
!  *     <td headers="matches">A hexadecimal digit: {@code [0-9a-fA-F]}</td></tr>
!  * <tr><td valign="top" headers="construct posix">{@code \p{Space}}</td>
!  *     <td headers="matches">A whitespace character: {@code [ \t\n\x0B\f\r]}</td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2">java.lang.Character classes (simple <a href="#jcc">java character type</a>)</th></tr>
   *
   * <tr><td valign="top"><tt>\p{javaLowerCase}</tt></td>
*** 218,240 ****
   * <tr><td valign="top"><tt>\p{javaMirrored}</tt></td>
   *     <td>Equivalent to java.lang.Character.isMirrored()</td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="unicode">Classes for Unicode scripts, blocks, categories and binary properties</th></tr>
!  * * <tr><td valign="top" headers="construct unicode"><tt>\p{IsLatin}</tt></td>
   *     <td headers="matches">A Latin&nbsp;script character (<a href="#usc">script</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>\p{InGreek}</tt></td>
   *     <td headers="matches">A character in the Greek&nbsp;block (<a href="#ubc">block</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>\p{Lu}</tt></td>
   *     <td headers="matches">An uppercase letter (<a href="#ucc">category</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>\p{IsAlphabetic}</tt></td>
   *     <td headers="matches">An alphabetic character (<a href="#ubpc">binary property</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>\p{Sc}</tt></td>
   *     <td headers="matches">A currency symbol</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>\P{InGreek}</tt></td>
   *     <td headers="matches">Any character except one in the Greek block (negation)</td></tr>
!  * <tr><td valign="top" headers="construct unicode"><tt>[\p{L}&&[^\p{Lu}]]&nbsp;</tt></td>
   *     <td headers="matches">Any letter except an uppercase letter (subtraction)</td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="bounds">Boundary matchers</th></tr>
   *
--- 217,239 ----
   * <tr><td valign="top"><tt>\p{javaMirrored}</tt></td>
   *     <td>Equivalent to java.lang.Character.isMirrored()</td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="unicode">Classes for Unicode scripts, blocks, categories and binary properties</th></tr>
!  * * <tr><td valign="top" headers="construct unicode">{@code \p{IsLatin}}</td>
   *     <td headers="matches">A Latin&nbsp;script character (<a href="#usc">script</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code \p{InGreek}}</td>
   *     <td headers="matches">A character in the Greek&nbsp;block (<a href="#ubc">block</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code \p{Lu}}</td>
   *     <td headers="matches">An uppercase letter (<a href="#ucc">category</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code \p{IsAlphabetic}}</td>
   *     <td headers="matches">An alphabetic character (<a href="#ubpc">binary property</a>)</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code \p{Sc}}</td>
   *     <td headers="matches">A currency symbol</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code \P{InGreek}}</td>
   *     <td headers="matches">Any character except one in the Greek block (negation)</td></tr>
!  * <tr><td valign="top" headers="construct unicode">{@code [\p{L}&&[^\p{Lu}]]}</td>
   *     <td headers="matches">Any letter except an uppercase letter (subtraction)</td></tr>
   *
   * <tr><th>&nbsp;</th></tr>
   * <tr align="left"><th colspan="2" id="bounds">Boundary matchers</th></tr>
   *
*** 374,385 ****
   * </table>
   *
   * <hr>
   *
   *
!  * <a name="bs">
!  * <h4> Backslashes, escapes, and quoting </h4>
   *
   * <p> The backslash character (<tt>'\'</tt>) serves to introduce escaped
   * constructs, as defined in the table above, as well as to quote characters
   * that otherwise would be interpreted as unescaped constructs.  Thus the
   * expression <tt>\\</tt> matches a single backslash and <tt>\{</tt> matches a
--- 373,383 ----
   * </table>
   *
   * <hr>
   *
   *
!  * <h3><a name="bs">Backslashes, escapes, and quoting</a></h3>
   *
   * <p> The backslash character (<tt>'\'</tt>) serves to introduce escaped
   * constructs, as defined in the table above, as well as to quote characters
   * that otherwise would be interpreted as unescaped constructs.  Thus the
   * expression <tt>\\</tt> matches a single backslash and <tt>\{</tt> matches a
*** 403,414 ****
   * word boundary.  The string literal <tt>"&#92;(hello&#92;)"</tt> is illegal
   * and leads to a compile-time error; in order to match the string
   * <tt>(hello)</tt> the string literal <tt>"&#92;&#92;(hello&#92;&#92;)"</tt>
   * must be used.
   *
!  * <a name="cc">
!  * <h4> Character Classes </h4>
   *
   *    <p> Character classes may appear within other character classes, and
   *    may be composed by the union operator (implicit) and the intersection
   *    operator (<tt>&amp;&amp;</tt>).
   *    The union operator denotes a class that contains every character that is
--- 401,411 ----
   * word boundary.  The string literal <tt>"&#92;(hello&#92;)"</tt> is illegal
   * and leads to a compile-time error; in order to match the string
   * <tt>(hello)</tt> the string literal <tt>"&#92;&#92;(hello&#92;&#92;)"</tt>
   * must be used.
   *
!  * <h3><a name="cc">Character Classes</a></h3>
   *
   *    <p> Character classes may appear within other character classes, and
   *    may be composed by the union operator (implicit) and the intersection
   *    operator (<tt>&amp;&amp;</tt>).
   *    The union operator denotes a class that contains every character that is
*** 433,453 ****
   *      <tr><th>4&nbsp;&nbsp;&nbsp;&nbsp;</th>
   *        <td>Union</td>
   *        <td><tt>[a-e][i-u]</tt></td></tr>
   *      <tr><th>5&nbsp;&nbsp;&nbsp;&nbsp;</th>
   *        <td>Intersection</td>
!  *        <td><tt>[a-z&&[aeiou]]</tt></td></tr>
   *    </table></blockquote>
   *
   *    <p> Note that a different set of metacharacters are in effect inside
   *    a character class than outside a character class. For instance, the
   *    regular expression <tt>.</tt> loses its special meaning inside a
   *    character class, while the expression <tt>-</tt> becomes a range
   *    forming metacharacter.
   *
!  * <a name="lt">
!  * <h4> Line terminators </h4>
   *
   * <p> A <i>line terminator</i> is a one- or two-character sequence that marks
   * the end of a line of the input character sequence.  The following are
   * recognized as line terminators:
   *
--- 430,449 ----
   *      <tr><th>4&nbsp;&nbsp;&nbsp;&nbsp;</th>
   *        <td>Union</td>
   *        <td><tt>[a-e][i-u]</tt></td></tr>
   *      <tr><th>5&nbsp;&nbsp;&nbsp;&nbsp;</th>
   *        <td>Intersection</td>
!  *        <td>{@code [a-z&&[aeiou]]}</td></tr>
   *    </table></blockquote>
   *
   *    <p> Note that a different set of metacharacters are in effect inside
   *    a character class than outside a character class. For instance, the
   *    regular expression <tt>.</tt> loses its special meaning inside a
   *    character class, while the expression <tt>-</tt> becomes a range
   *    forming metacharacter.
   *
!  * <h3><a name="lt">Line terminators</a></h3>
   *
   * <p> A <i>line terminator</i> is a one- or two-character sequence that marks
   * the end of a line of the input character sequence.  The following are
   * recognized as line terminators:
   *
*** 478,492 ****
   * of the entire input sequence. If {@link #MULTILINE} mode is activated then
   * <tt>^</tt> matches at the beginning of input and after any line terminator
   * except at the end of input. When in {@link #MULTILINE} mode <tt>$</tt>
   * matches just before a line terminator or the end of the input sequence.
   *
!  * <a name="cg">
!  * <h4> Groups and capturing </h4>
   *
!  * <a name="gnumber">
!  * <h5> Group number </h5>
   * <p> Capturing groups are numbered by counting their opening parentheses from
   * left to right.  In the expression <tt>((A)(B(C)))</tt>, for example, there
   * are four such groups: </p>
   *
   * <blockquote><table cellpadding=1 cellspacing=0 summary="Capturing group numberings">
--- 474,486 ----
   * of the entire input sequence. If {@link #MULTILINE} mode is activated then
   * <tt>^</tt> matches at the beginning of input and after any line terminator
   * except at the end of input. When in {@link #MULTILINE} mode <tt>$</tt>
   * matches just before a line terminator or the end of the input sequence.
   *
!  * <h3><a name="cg">Groups and capturing</a></h3>
   *
!  * <h4><a name="gnumber">Group number</a></h4>
   * <p> Capturing groups are numbered by counting their opening parentheses from
   * left to right.  In the expression <tt>((A)(B(C)))</tt>, for example, there
   * are four such groups: </p>
   *
   * <blockquote><table cellpadding=1 cellspacing=0 summary="Capturing group numberings">
*** 505,516 ****
   * <p> Capturing groups are so named because, during a match, each subsequence
   * of the input sequence that matches such a group is saved.  The captured
   * subsequence may be used later in the expression, via a back reference, and
   * may also be retrieved from the matcher once the match operation is complete.
   *
!  * <a name="groupname">
!  * <h5> Group name </h5>
   * <p>A capturing group can also be assigned a "name", a <tt>named-capturing group</tt>,
   * and then be back-referenced later by the "name". Group names are composed of
   * the following characters. The first character must be a <tt>letter</tt>.
   *
   * <ul>
--- 499,509 ----
   * <p> Capturing groups are so named because, during a match, each subsequence
   * of the input sequence that matches such a group is saved.  The captured
   * subsequence may be used later in the expression, via a back reference, and
   * may also be retrieved from the matcher once the match operation is complete.
   *
!  * <h4><a name="groupname">Group name</a></h4>
   * <p>A capturing group can also be assigned a "name", a <tt>named-capturing group</tt>,
   * and then be back-referenced later by the "name". Group names are composed of
   * the following characters. The first character must be a <tt>letter</tt>.
   *
   * <ul>
*** 535,545 ****
   *
   * <p> Groups beginning with <tt>(?</tt> are either pure, <i>non-capturing</i> groups
   * that do not capture text and do not count towards the group total, or
   * <i>named-capturing</i> group.
   *
!  * <h4> Unicode support </h4>
   *
   * <p> This class is in conformance with Level 1 of <a
   * href="http://www.unicode.org/reports/tr18/"><i>Unicode Technical
   * Standard #18: Unicode Regular Expression</i></a>, plus RL2.1
   * Canonical Equivalents.
--- 528,538 ----
   *
   * <p> Groups beginning with <tt>(?</tt> are either pure, <i>non-capturing</i> groups
   * that do not capture text and do not count towards the group total, or
   * <i>named-capturing</i> group.
   *
!  * <h3> Unicode support </h3>
   *
   * <p> This class is in conformance with Level 1 of <a
   * href="http://www.unicode.org/reports/tr18/"><i>Unicode Technical
   * Standard #18: Unicode Regular Expression</i></a>, plus RL2.1
   * Canonical Equivalents.
*** 566,596 ****
   * the input has the property <i>prop</i>, while <tt>\P{</tt><i>prop</i><tt>}</tt>
   * does not match if the input has that property.
   * <p>
   * Scripts, blocks, categories and binary properties can be used both inside
   * and outside of a character class.
!  * <a name="usc">
   * <p>
!  * <b>Scripts</b> are specified either with the prefix {@code Is}, as in
   * {@code IsHiragana}, or by using  the {@code script} keyword (or its short
   * form {@code sc})as in {@code script=Hiragana} or {@code sc=Hiragana}.
   * <p>
   * The script names supported by <code>Pattern</code> are the valid script names
   * accepted and defined by
   * {@link java.lang.Character.UnicodeScript#forName(String) UnicodeScript.forName}.
!  * <a name="ubc">
   * <p>
!  * <b>Blocks</b> are specified with the prefix {@code In}, as in
   * {@code InMongolian}, or by using the keyword {@code block} (or its short
   * form {@code blk}) as in {@code block=Mongolian} or {@code blk=Mongolian}.
   * <p>
   * The block names supported by <code>Pattern</code> are the valid block names
   * accepted and defined by
   * {@link java.lang.Character.UnicodeBlock#forName(String) UnicodeBlock.forName}.
   * <p>
!  * <a name="ucc">
!  * <b>Categories</b> may be specified with the optional prefix {@code Is}:
   * Both {@code \p{L}} and {@code \p{IsL}} denote the category of Unicode
   * letters. Same as scripts and blocks, categories can also be specified
   * by using the keyword {@code general_category} (or its short form
   * {@code gc}) as in {@code general_category=Lu} or {@code gc=Lu}.
   * <p>
--- 559,589 ----
   * the input has the property <i>prop</i>, while <tt>\P{</tt><i>prop</i><tt>}</tt>
   * does not match if the input has that property.
   * <p>
   * Scripts, blocks, categories and binary properties can be used both inside
   * and outside of a character class.
!  *
   * <p>
!  * <b><a name="usc">Scripts</a></b> are specified either with the prefix {@code Is}, as in
   * {@code IsHiragana}, or by using  the {@code script} keyword (or its short
   * form {@code sc})as in {@code script=Hiragana} or {@code sc=Hiragana}.
   * <p>
   * The script names supported by <code>Pattern</code> are the valid script names
   * accepted and defined by
   * {@link java.lang.Character.UnicodeScript#forName(String) UnicodeScript.forName}.
!  *
   * <p>
!  * <b><a name="ubc">Blocks</a></b> are specified with the prefix {@code In}, as in
   * {@code InMongolian}, or by using the keyword {@code block} (or its short
   * form {@code blk}) as in {@code block=Mongolian} or {@code blk=Mongolian}.
   * <p>
   * The block names supported by <code>Pattern</code> are the valid block names
   * accepted and defined by
   * {@link java.lang.Character.UnicodeBlock#forName(String) UnicodeBlock.forName}.
   * <p>
!  *
!  * <b><a name="ucc">Categories</a></b> may be specified with the optional prefix {@code Is}:
   * Both {@code \p{L}} and {@code \p{IsL}} denote the category of Unicode
   * letters. Same as scripts and blocks, categories can also be specified
   * by using the keyword {@code general_category} (or its short form
   * {@code gc}) as in {@code general_category=Lu} or {@code gc=Lu}.
   * <p>
*** 598,609 ****
   * <a href="http://www.unicode.org/unicode/standard/standard.html">
   * <i>The Unicode Standard</i></a> in the version specified by the
   * {@link java.lang.Character Character} class. The category names are those
   * defined in the Standard, both normative and informative.
   * <p>
!  * <a name="ubpc">
!  * <b>Binary properties</b> are specified with the prefix {@code Is}, as in
   * {@code IsAlphabetic}. The supported binary properties by <code>Pattern</code>
   * are
   * <ul>
   *   <li> Alphabetic
   *   <li> Ideographic
--- 591,602 ----
   * <a href="http://www.unicode.org/unicode/standard/standard.html">
   * <i>The Unicode Standard</i></a> in the version specified by the
   * {@link java.lang.Character Character} class. The category names are those
   * defined in the Standard, both normative and informative.
   * <p>
!  *
!  * <b><a name="ubpc">Binary properties</a></b> are specified with the prefix {@code Is}, as in
   * {@code IsAlphabetic}. The supported binary properties by <code>Pattern</code>
   * are
   * <ul>
   *   <li> Alphabetic
   *   <li> Ideographic
*** 627,638 ****
   * </i></a>, when {@link #UNICODE_CHARACTER_CLASS} flag is specified.
   * <p>
   * <table border="0" cellpadding="1" cellspacing="0"
   *  summary="predefined and posix character classes in Unicode mode">
   * <tr align="left">
!  * <th bgcolor="#CCCCFF" align="left" id="classes">Classes</th>
!  * <th bgcolor="#CCCCFF" align="left" id="matches">Matches</th>
   *</tr>
   * <tr><td><tt>\p{Lower}</tt></td>
   *     <td>A lowercase character:<tt>\p{IsLowercase}</tt></td></tr>
   * <tr><td><tt>\p{Upper}</tt></td>
   *     <td>An uppercase character:<tt>\p{IsUppercase}</tt></td></tr>
--- 620,631 ----
   * </i></a>, when {@link #UNICODE_CHARACTER_CLASS} flag is specified.
   * <p>
   * <table border="0" cellpadding="1" cellspacing="0"
   *  summary="predefined and posix character classes in Unicode mode">
   * <tr align="left">
!  * <th align="left" id="predef_classes">Classes</th>
!  * <th align="left" id="predef_matches">Matches</th>
   *</tr>
   * <tr><td><tt>\p{Lower}</tt></td>
   *     <td>A lowercase character:<tt>\p{IsLowercase}</tt></td></tr>
   * <tr><td><tt>\p{Upper}</tt></td>
   *     <td>An uppercase character:<tt>\p{IsUppercase}</tt></td></tr>
*** 647,659 ****
   * <tr><td><tt>\p{Punct}</tt></td>
   *     <td>A punctuation character:<tt>p{IsPunctuation}</tt></td></tr>
   * <tr><td><tt>\p{Graph}</tt></td>
   *     <td>A visible character: <tt>[^\p{IsWhite_Space}\p{gc=Cc}\p{gc=Cs}\p{gc=Cn}]</tt></td></tr>
   * <tr><td><tt>\p{Print}</tt></td>
!  *     <td>A printable character: <tt>[\p{Graph}\p{Blank}&&[^\p{Cntrl}]]</tt></td></tr>
   * <tr><td><tt>\p{Blank}</tt></td>
!  *     <td>A space or a tab: <tt>[\p{IsWhite_Space}&&[^\p{gc=Zl}\p{gc=Zp}\x0a\x0b\x0c\x0d\x85]]</tt></td></tr>
   * <tr><td><tt>\p{Cntrl}</tt></td>
   *     <td>A control character: <tt>\p{gc=Cc}</tt></td></tr>
   * <tr><td><tt>\p{XDigit}</tt></td>
   *     <td>A hexadecimal digit: <tt>[\p{gc=Nd}\p{IsHex_Digit}]</tt></td></tr>
   * <tr><td><tt>\p{Space}</tt></td>
--- 640,652 ----
   * <tr><td><tt>\p{Punct}</tt></td>
   *     <td>A punctuation character:<tt>p{IsPunctuation}</tt></td></tr>
   * <tr><td><tt>\p{Graph}</tt></td>
   *     <td>A visible character: <tt>[^\p{IsWhite_Space}\p{gc=Cc}\p{gc=Cs}\p{gc=Cn}]</tt></td></tr>
   * <tr><td><tt>\p{Print}</tt></td>
!  *     <td>A printable character: {@code [\p{Graph}\p{Blank}&&[^\p{Cntrl}]]}</td></tr>
   * <tr><td><tt>\p{Blank}</tt></td>
!  *     <td>A space or a tab: {@code [\p{IsWhite_Space}&&[^\p{gc=Zl}\p{gc=Zp}\x0a\x0b\x0c\x0d\x85]]}</td></tr>
   * <tr><td><tt>\p{Cntrl}</tt></td>
   *     <td>A control character: <tt>\p{gc=Cc}</tt></td></tr>
   * <tr><td><tt>\p{XDigit}</tt></td>
   *     <td>A hexadecimal digit: <tt>[\p{gc=Nd}\p{IsHex_Digit}]</tt></td></tr>
   * <tr><td><tt>\p{Space}</tt></td>
*** 674,686 ****
   * <p>
   * <a name="jcc">
   * Categories that behave like the java.lang.Character
   * boolean is<i>methodname</i> methods (except for the deprecated ones) are
   * available through the same <tt>\p{</tt><i>prop</i><tt>}</tt> syntax where
!  * the specified property has the name <tt>java<i>methodname</i></tt>.
   *
!  * <h4> Comparison to Perl 5 </h4>
   *
   * <p>The <code>Pattern</code> engine performs traditional NFA-based matching
   * with ordered alternation as occurs in Perl 5.
   *
   * <p> Perl constructs not supported by this class: </p>
--- 667,679 ----
   * <p>
   * <a name="jcc">
   * Categories that behave like the java.lang.Character
   * boolean is<i>methodname</i> methods (except for the deprecated ones) are
   * available through the same <tt>\p{</tt><i>prop</i><tt>}</tt> syntax where
!  * the specified property has the name <tt>java<i>methodname</i></tt></a>.
   *
!  * <h3> Comparison to Perl 5 </h3>
   *
   * <p>The <code>Pattern</code> engine performs traditional NFA-based matching
   * with ordered alternation as occurs in Perl 5.
   *
   * <p> Perl constructs not supported by this class: </p>
*** 1021,1045 ****
       * (2) There is complement node of Category or Block
       */
      private transient boolean hasSupplementary;
  
      /**
!      * Compiles the given regular expression into a pattern.  </p>
       *
       * @param  regex
       *         The expression to be compiled
!      *
       * @throws  PatternSyntaxException
       *          If the expression's syntax is invalid
       */
      public static Pattern compile(String regex) {
          return new Pattern(regex, 0);
      }
  
      /**
       * Compiles the given regular expression into a pattern with the given
!      * flags.  </p>
       *
       * @param  regex
       *         The expression to be compiled
       *
       * @param  flags
--- 1014,1038 ----
       * (2) There is complement node of Category or Block
       */
      private transient boolean hasSupplementary;
  
      /**
!      * Compiles the given regular expression into a pattern.
       *
       * @param  regex
       *         The expression to be compiled
!      * @return the given regular expression compiled into a pattern
       * @throws  PatternSyntaxException
       *          If the expression's syntax is invalid
       */
      public static Pattern compile(String regex) {
          return new Pattern(regex, 0);
      }
  
      /**
       * Compiles the given regular expression into a pattern with the given
!      * flags.
       *
       * @param  regex
       *         The expression to be compiled
       *
       * @param  flags
*** 1047,1056 ****
--- 1040,1050 ----
       *         {@link #CASE_INSENSITIVE}, {@link #MULTILINE}, {@link #DOTALL},
       *         {@link #UNICODE_CASE}, {@link #CANON_EQ}, {@link #UNIX_LINES},
       *         {@link #LITERAL}, {@link #UNICODE_CHARACTER_CLASS}
       *         and {@link #COMMENTS}
       *
+      * @return the given regular expression compiled into a pattern with the given flags
       * @throws  IllegalArgumentException
       *          If bit values other than those corresponding to the defined
       *          match flags are set in <tt>flags</tt>
       *
       * @throws  PatternSyntaxException
*** 1060,1070 ****
          return new Pattern(regex, flags);
      }
  
      /**
       * Returns the regular expression from which this pattern was compiled.
-      * </p>
       *
       * @return  The source of this pattern
       */
      public String pattern() {
          return pattern;
--- 1054,1063 ----
*** 1082,1092 ****
          return pattern;
      }
  
      /**
       * Creates a matcher that will match the given input against this pattern.
-      * </p>
       *
       * @param  input
       *         The character sequence to be matched
       *
       * @return  A new matcher for this pattern
--- 1075,1084 ----
*** 1101,1111 ****
          Matcher m = new Matcher(this, input);
          return m;
      }
  
      /**
!      * Returns this pattern's match flags.  </p>
       *
       * @return  The match flags specified when this pattern was compiled
       */
      public int flags() {
          return flags;
--- 1093,1103 ----
          Matcher m = new Matcher(this, input);
          return m;
      }
  
      /**
!      * Returns this pattern's match flags.
       *
       * @return  The match flags specified when this pattern was compiled
       */
      public int flags() {
          return flags;
*** 1131,1141 ****
       * @param  regex
       *         The expression to be compiled
       *
       * @param  input
       *         The character sequence to be matched
!      *
       * @throws  PatternSyntaxException
       *          If the expression's syntax is invalid
       */
      public static boolean matches(String regex, CharSequence input) {
          Pattern p = Pattern.compile(regex);
--- 1123,1133 ----
       * @param  regex
       *         The expression to be compiled
       *
       * @param  input
       *         The character sequence to be matched
!      * @return whether or not the regular expression matches on the input
       * @throws  PatternSyntaxException
       *          If the expression's syntax is invalid
       */
      public static boolean matches(String regex, CharSequence input) {
          Pattern p = Pattern.compile(regex);
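Annotation (illustrative only, not part of the change): the `@return` tag added above documents the boolean produced by `Pattern.matches(String, CharSequence)`. The sketch below shows how that convenience method relates to the compile-then-match sequence visible in the method body. The class name `MatchesDemo`, the helper names, and the sample regex and input are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

public class MatchesDemo {
    // Convenience form: the regex is compiled on every call.
    static boolean viaStatic(String regex, CharSequence input) {
        return Pattern.matches(regex, input);
    }

    // Long form: compile once, then match the entire input.
    // The compiled Pattern could be kept and reused across calls.
    static boolean viaCompiled(String regex, CharSequence input) {
        Pattern p = Pattern.compile(regex);
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(viaStatic("a*b", "aaaaab"));
        System.out.println(viaCompiled("a*b", "aaaaab"));
    }
}
```

Both helpers should agree for any input, since `matches` succeeds only when the whole character sequence matches the expression.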
*** 1168,1180 ****
       * <p> The input <tt>"boo:and:foo"</tt>, for example, yields the following
       * results with these parameters:
       *
       * <blockquote><table cellpadding=1 cellspacing=0
       *              summary="Split examples showing regex, limit, and result">
!      * <tr><th><P align="left"><i>Regex&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th><P align="left"><i>Limit&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th><P align="left"><i>Result&nbsp;&nbsp;&nbsp;&nbsp;</i></th></tr>
       * <tr><td align=center>:</td>
       *     <td align=center>2</td>
       *     <td><tt>{ "boo", "and:foo" }</tt></td></tr>
       * <tr><td align=center>:</td>
       *     <td align=center>5</td>
--- 1160,1172 ----
       * <p> The input <tt>"boo:and:foo"</tt>, for example, yields the following
       * results with these parameters:
       *
       * <blockquote><table cellpadding=1 cellspacing=0
       *              summary="Split examples showing regex, limit, and result">
!      * <tr><th align="left"><i>Regex&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th align="left"><i>Limit&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th align="left"><i>Result&nbsp;&nbsp;&nbsp;&nbsp;</i></th></tr>
       * <tr><td align=center>:</td>
       *     <td align=center>2</td>
       *     <td><tt>{ "boo", "and:foo" }</tt></td></tr>
       * <tr><td align=center>:</td>
       *     <td align=center>5</td>
*** 1251,1262 ****
       * <p> The input <tt>"boo:and:foo"</tt>, for example, yields the following
       * results with these expressions:
       *
       * <blockquote><table cellpadding=1 cellspacing=0
       *              summary="Split examples showing regex and result">
!      * <tr><th><P align="left"><i>Regex&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th><P align="left"><i>Result</i></th></tr>
       * <tr><td align=center>:</td>
       *     <td><tt>{ "boo", "and", "foo" }</tt></td></tr>
       * <tr><td align=center>o</td>
       *     <td><tt>{ "b", "", ":and:f" }</tt></td></tr>
       * </table></blockquote>
--- 1243,1254 ----
       * <p> The input <tt>"boo:and:foo"</tt>, for example, yields the following
       * results with these expressions:
       *
       * <blockquote><table cellpadding=1 cellspacing=0
       *              summary="Split examples showing regex and result">
!      * <tr><th align="left"><i>Regex&nbsp;&nbsp;&nbsp;&nbsp;</i></th>
!      *     <th align="left"><i>Result</i></th></tr>
       * <tr><td align=center>:</td>
       *     <td><tt>{ "boo", "and", "foo" }</tt></td></tr>
       * <tr><td align=center>o</td>
       *     <td><tt>{ "b", "", ":and:f" }</tt></td></tr>
       * </table></blockquote>
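Annotation (illustrative only, not part of the change): the two tables touched by these hunks describe `split` results for the input `"boo:and:foo"`. A minimal sketch that reproduces the rows shown in the diff follows. The class name `SplitDemo` and the helper names are assumptions, not code from `Pattern.java`.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // One-argument form: trailing empty strings are dropped from the result.
    static String[] splitAll(String regex, String input) {
        return Pattern.compile(regex).split(input);
    }

    // Two-argument form: a positive limit caps the length of the result array.
    static String[] splitLimited(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        String input = "boo:and:foo";
        System.out.println(Arrays.toString(splitAll(":", input)));
        System.out.println(Arrays.toString(splitAll("o", input)));
        System.out.println(Arrays.toString(splitLimited(":", input, 2)));
    }
}
```

The returned arrays should correspond to the `{ "boo", "and", "foo" }`, `{ "b", "", ":and:f" }`, and `{ "boo", "and:foo" }` rows in the Javadoc tables above.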