--- old/src/share/classes/java/util/regex/MatchResult.java 2013-07-08 22:34:00.000000000 -0700
+++ new/src/share/classes/java/util/regex/MatchResult.java 2013-07-08 22:34:00.000000000 -0700
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2003, 2004, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -77,7 +77,7 @@
 public int start(int group);

 /**
- * Returns the offset after the last character matched.
+ * Returns the offset after the last character matched.
  *
  * @return The offset after the last character matched
  *
--- old/src/share/classes/java/util/regex/Matcher.java 2013-07-08 22:34:01.000000000 -0700
+++ new/src/share/classes/java/util/regex/Matcher.java 2013-07-08 22:34:01.000000000 -0700
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 1999, 2012, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -28,8 +28,8 @@
 import java.util.Objects;

 /**
- * An engine that performs match operations on a {@link java.lang.CharSequence
- * character sequence} by interpreting a {@link Pattern}.
+ * An engine that performs match operations on a {@linkplain java.lang.CharSequence
+ * character sequence} by interpreting a {@link Pattern}.
  *
  * A matcher is created from a pattern by invoking the pattern's {@link
  * Pattern#matcher matcher} method. Once created, a matcher can be used to
@@ -330,7 +330,7 @@
 }

 /**
- * Returns the start index of the previous match.
+ * Returns the start index of the previous match.
  *
  * @return The index of the first character matched
  *
@@ -402,7 +402,7 @@
 }

 /**
- * Returns the offset after the last character matched.
+ * Returns the offset after the last character matched.
  *
  * @return The offset after the last character matched
  *
@@ -647,6 +647,7 @@
  * invocations of the {@link #find()} method will start at the first
  * character not matched by this match.
  *
+ * @param start the index to start searching for a match
  * @throws IndexOutOfBoundsException
  * If start is less than zero or if start is greater than the
  * length of the input sequence.
@@ -736,8 +737,8 @@
  * captured during the previous match: Each occurrence of
  * ${name} or $g
  * will be replaced by the result of evaluating the corresponding
- * {@link #group(String) group(name)} or {@link #group(int) group(g)}
- * respectively. For $g,
+ * {@link #group(String) group(name)} or {@link #group(int) group(g)}
+ * respectively. For $g,
  * the first number after the $ is always treated as part of
  * the group reference. Subsequent numbers are incorporated into g if
  * they would form a legal group reference. Only the numerals '0'
--- old/src/share/classes/java/util/regex/Pattern.java 2013-07-08 22:34:01.000000000 -0700
+++ new/src/share/classes/java/util/regex/Pattern.java 2013-07-08 22:34:01.000000000 -0700
@@ -45,8 +45,8 @@
  *
  *

A regular expression, specified as a string, must first be compiled into
  * an instance of this class. The resulting pattern can then be used to create
- * a {@link Matcher} object that can match arbitrary {@link
- * java.lang.CharSequence character sequences} against the regular
+ * a {@link Matcher} object that can match arbitrary {@linkplain
+ * java.lang.CharSequence character sequences} against the regular
  * expression. All of the state involved in performing a match resides in the
  * matcher, so many matchers can share the same pattern.
  *
@@ -73,15 +73,14 @@
  * such use.
  *
  *
- *
- *
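As a minimal sketch of the compile-then-match flow described in this Javadoc paragraph: one compiled `Pattern` backs several independent `Matcher` objects, each holding its own match state. The class name `SharedPatternSketch`, the regex, and the sample inputs are illustrative assumptions and are not part of the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SharedPatternSketch {
    // Compiled once; the same Pattern instance can create many matchers.
    static final Pattern WORD = Pattern.compile("[a-z]+");

    // Counts matches using a fresh Matcher, which carries all per-match state.
    static int countWords(CharSequence input) {
        Matcher m = WORD.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Matcher first = WORD.matcher("foo bar");
        // Any CharSequence works as input, not only String.
        Matcher second = WORD.matcher(new StringBuilder("baz"));

        // Advancing one matcher leaves the other untouched.
        System.out.println(first.find() + " " + first.group());
        System.out.println(second.matches());
        System.out.println(countWords("one two three"));
    }
}
```

Keeping the pattern in a `static final` field is one common way to avoid recompiling it on every call.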

Summary of regular-expression constructs

+ *

Summary of regular-expression constructs

* * * * - * - * + * + * * * * @@ -128,24 +127,24 @@ * * * - * - * - * - * - * - * - * - * - * - * - * - * - * - * + * + * + * + * + * + * + * + * + * + * + * + * + * + * * * * @@ -175,36 +174,36 @@ * * * - * + * * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * * * * @@ -220,19 +219,19 @@ * * * - * * + * * * - * + * * - * + * * - * + * * - * + * * - * + * * - * + * * * * @@ -376,8 +375,7 @@ *
* * - * - *

Backslashes, escapes, and quoting

+ *

Backslashes, escapes, and quoting

* *

The backslash character ('\') serves to introduce escaped * constructs, as defined in the table above, as well as to quote characters @@ -405,8 +403,7 @@ * (hello) the string literal "\\(hello\\)" * must be used. * - * - *

Character Classes

+ *

Character Classes

* *

Character classes may appear within other character classes, and * may be composed by the union operator (implicit) and the intersection @@ -435,7 +432,7 @@ *

* * - * + * *
ConstructMatchesConstructMatches
 
 
Character classes
[abc]            a, b, or c (simple class)
[^abc]           Any character except a, b, or c (negation)
[a-zA-Z]         a through z or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
{@code [abc]}            {@code a}, {@code b}, or {@code c} (simple class)
{@code [^abc]}           Any character except {@code a}, {@code b}, or {@code c} (negation)
{@code [a-zA-Z]}         {@code a} through {@code z} or {@code A} through {@code Z}, inclusive (range)
{@code [a-d[m-p]]}       {@code a} through {@code d}, or {@code m} through {@code p}: {@code [a-dm-p]} (union)
{@code [a-z&&[def]]}     {@code d}, {@code e}, or {@code f} (intersection)
{@code [a-z&&[^bc]]}     {@code a} through {@code z}, except for {@code b} and {@code c}: {@code [ad-z]} (subtraction)
{@code [a-z&&[^m-p]]}    {@code a} through {@code z}, and not {@code m} through {@code p}: {@code [a-lq-z]} (subtraction)
 
Predefined character classes
\W    A non-word character: [^\w]
 
POSIX character classes (US-ASCII only)
POSIX character classes (US-ASCII only)
\p{Lower}     A lower-case alphabetic character: [a-z]
\p{Upper}     An upper-case alphabetic character: [A-Z]
\p{ASCII}     All ASCII: [\x00-\x7F]
\p{Alpha}     An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}     A decimal digit: [0-9]
\p{Alnum}     An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}     Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}     A visible character: [\p{Alnum}\p{Punct}]
\p{Print}     A printable character: [\p{Graph}\x20]
\p{Blank}     A space or a tab: [ \t]
\p{Cntrl}     A control character: [\x00-\x1F\x7F]
\p{XDigit}    A hexadecimal digit: [0-9a-fA-F]
\p{Space}     A whitespace character: [ \t\n\x0B\f\r]
{@code \p{Lower}}     A lower-case alphabetic character: {@code [a-z]}
{@code \p{Upper}}     An upper-case alphabetic character: {@code [A-Z]}
{@code \p{ASCII}}     All ASCII: {@code [\x00-\x7F]}
{@code \p{Alpha}}     An alphabetic character: {@code [\p{Lower}\p{Upper}]}
{@code \p{Digit}}     A decimal digit: {@code [0-9]}
{@code \p{Alnum}}     An alphanumeric character: {@code [\p{Alpha}\p{Digit}]}
{@code \p{Punct}}     Punctuation: One of {@code !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~}
{@code \p{Graph}}     A visible character: {@code [\p{Alnum}\p{Punct}]}
{@code \p{Print}}     A printable character: {@code [\p{Graph}\x20]}
{@code \p{Blank}}     A space or a tab: {@code [ \t]}
{@code \p{Cntrl}}     A control character: {@code [\x00-\x1F\x7F]}
{@code \p{XDigit}}    A hexadecimal digit: {@code [0-9a-fA-F]}
{@code \p{Space}}     A whitespace character: {@code [ \t\n\x0B\f\r]}
 
java.lang.Character classes (simple java character type)
 
Classes for Unicode scripts, blocks, categories and binary properties
\p{IsLatin}
{@code \p{IsLatin}}           A Latin script character (script)
\p{InGreek}
{@code \p{InGreek}}           A character in the Greek block (block)
\p{Lu}
{@code \p{Lu}}                An uppercase letter (category)
\p{IsAlphabetic}
{@code \p{IsAlphabetic}}      An alphabetic character (binary property)
\p{Sc}
{@code \p{Sc}}                A currency symbol
\P{InGreek}
{@code \P{InGreek}}           Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]]
{@code [\p{L}&&[^\p{Lu}]]}    Any letter except an uppercase letter (subtraction)
 
[a-e][i-u]
5    Intersection    [a-z&&[aeiou]]
{@code [a-z&&[aeiou]]}
* *

Note that a different set of metacharacters are in effect inside
@@ -444,8 +441,7 @@
  * character class, while the expression - becomes a range
  * forming metacharacter.
  *
- *
- *
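The union, intersection, and subtraction forms listed in the summary table, and the point that metacharacters behave differently inside a character class, can be checked with a short sketch. The class name `CharClassSketch` and the helper `in` are hypothetical names chosen for illustration; only `Pattern.matches` comes from the library.

```java
import java.util.regex.Pattern;

public class CharClassSketch {
    // True if the whole input matches the given character class expression.
    static boolean in(String charClass, String s) {
        return Pattern.matches(charClass, s);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p.
        System.out.println(in("[a-d[m-p]]", "n"));
        // Intersection: only d, e, or f.
        System.out.println(in("[a-z&&[def]]", "e"));
        // Subtraction: a through z except b and c.
        System.out.println(in("[a-z&&[^bc]]", "b"));
        // Subtraction: a through z and not m through p.
        System.out.println(in("[a-z&&[^m-p]]", "q"));
        // Inside a class, '.' is a literal dot rather than "any character".
        System.out.println(in("[.]", "a"));
        // Between two characters inside a class, '-' forms a range.
        System.out.println(in("[a-c]", "b"));
    }
}
```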

Line terminators

+ *

Line terminators

* *

A line terminator is a one- or two-character sequence that marks
  * the end of a line of the input character sequence. The following are
@@ -480,11 +476,9 @@
  * except at the end of input. When in {@link #MULTILINE} mode $
  * matches just before a line terminator or the end of the input sequence.
  *
- *
- *

Groups and capturing

+ *

Groups and capturing

* - * - *
Group number
+ *

Group number

*

Capturing groups are numbered by counting their opening parentheses from
  * left to right. In the expression ((A)(B(C))), for example, there
  * are four such groups:
@@ -507,8 +501,7 @@
  * subsequence may be used later in the expression, via a back reference, and
  * may also be retrieved from the matcher once the match operation is complete.
  *
- *
- *
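The numbering rule and the two uses of a captured subsequence (back reference, retrieval from the matcher) can be sketched as follows. The class name `GroupNumberSketch` and its helpers are hypothetical; the input `"ABC"` is an assumed sample chosen so that the expression `((A)(B(C)))` matches in full.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberSketch {
    static final Pattern NESTED = Pattern.compile("((A)(B(C)))");

    // Returns the text captured by the given group number for the input "ABC".
    static String capture(int group) {
        Matcher m = NESTED.matcher("ABC");
        if (!m.matches()) {
            throw new IllegalStateException("no match");
        }
        return m.group(group);
    }

    // Uses a back reference (\1) to require the same word character twice.
    static boolean isDoubled(String s) {
        return Pattern.matches("(\\w)\\1", s);
    }

    public static void main(String[] args) {
        Matcher m = NESTED.matcher("ABC");
        System.out.println(m.groupCount());
        // Groups are numbered by the position of their opening parenthesis.
        for (int g = 1; g <= 4; g++) {
            System.out.println(g + " -> " + capture(g));
        }
        System.out.println(isDoubled("aa") + " " + isDoubled("ab"));
    }
}
```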
Group name
+ *

Group name

*

A capturing group can also be assigned a "name", a named-capturing group,
  * and then be back-referenced later by the "name". Group names are composed of
  * the following characters. The first character must be a letter.
@@ -537,7 +530,7 @@
  * that do not capture text and do not count towards the group total, or
  * named-capturing group.
  *
- *
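A small sketch of named-capturing groups, assuming the `(?<name>X)` and `\k<name>` syntax of `java.util.regex`. The class name `NamedGroupSketch`, the group names `year`, `month`, and `c`, and the date-like input are illustrative assumptions, not taken from the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupSketch {
    // Two named groups; (?:-) is a non-capturing group and does not count toward the total.
    static final Pattern DATE = Pattern.compile("(?<year>\\d{4})(?:-)(?<month>\\d{2})");

    // Returns the text of the "year" group, or null when the input does not match.
    static String year(String s) {
        Matcher m = DATE.matcher(s);
        return m.matches() ? m.group("year") : null;
    }

    // Back-references the named group "c": the same lower-case letter twice.
    static boolean isDoubled(String s) {
        return Pattern.matches("(?<c>[a-z])\\k<c>", s);
    }

    public static void main(String[] args) {
        Matcher m = DATE.matcher("2013-07");
        if (m.matches()) {
            // A named group is still numbered, so both lookups return the same text.
            System.out.println(m.group("year") + " " + m.group(1));
            System.out.println(m.group("month") + " " + m.group(2));
            System.out.println(m.groupCount());
        }
        System.out.println(isDoubled("zz"));
    }
}
```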

Unicode support

+ *

Unicode support

* *

This class is in conformance with Level 1 of Unicode Technical @@ -568,18 +561,18 @@ *

  * Scripts, blocks, categories and binary properties can be used both inside
  * and outside of a character class.
- *
+ *
  *
- * Scripts are specified either with the prefix {@code Is}, as in
+ * Scripts are specified either with the prefix {@code Is}, as in
  * {@code IsHiragana}, or by using the {@code script} keyword (or its short
  * form {@code sc})as in {@code script=Hiragana} or {@code sc=Hiragana}.
  *
  * The script names supported by Pattern are the valid script names
  * accepted and defined by
  * {@link java.lang.Character.UnicodeScript#forName(String) UnicodeScript.forName}.
- *
+ *
  *
- * Blocks are specified with the prefix {@code In}, as in
+ * Blocks are specified with the prefix {@code In}, as in
  * {@code InMongolian}, or by using the keyword {@code block} (or its short
  * form {@code blk}) as in {@code block=Mongolian} or {@code blk=Mongolian}.
  *
@@ -587,8 +580,8 @@
  * accepted and defined by
  * {@link java.lang.Character.UnicodeBlock#forName(String) UnicodeBlock.forName}.
  *
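The script and block spellings described in this Javadoc text can be exercised with a brief sketch. The class name `UnicodePropertySketch` and the helper `matches` are hypothetical; the sample characters (U+3042 HIRAGANA LETTER A and U+1820 MONGOLIAN LETTER A) are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

public class UnicodePropertySketch {
    // True if the whole input matches the given property expression.
    static boolean matches(String property, String s) {
        return Pattern.matches(property, s);
    }

    public static void main(String[] args) {
        String hiraganaA = "\u3042";
        String mongolianA = "\u1820";

        // Three equivalent ways to name a script.
        System.out.println(matches("\\p{IsHiragana}", hiraganaA));
        System.out.println(matches("\\p{script=Hiragana}", hiraganaA));
        System.out.println(matches("\\p{sc=Hiragana}", hiraganaA));

        // Three equivalent ways to name a block.
        System.out.println(matches("\\p{InMongolian}", mongolianA));
        System.out.println(matches("\\p{block=Mongolian}", mongolianA));
        System.out.println(matches("\\p{blk=Mongolian}", mongolianA));

        // The same properties can be used inside a character class.
        System.out.println(matches("[\\p{IsHiragana}\\p{InMongolian}]+", hiraganaA + mongolianA));
    }
}
```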

- *
- * Categories may be specified with the optional prefix {@code Is}:
+ *
+ * Categories may be specified with the optional prefix {@code Is}:
  * Both {@code \p{L}} and {@code \p{IsL}} denote the category of Unicode
  * letters. Same as scripts and blocks, categories can also be specified
  * by using the keyword {@code general_category} (or its short form
@@ -600,8 +593,8 @@
  * {@link java.lang.Character Character} class. The category names are those
  * defined in the Standard, both normative and informative.
  *
- *
- * Binary properties are specified with the prefix {@code Is}, as in
+ *
+ * Binary properties are specified with the prefix {@code Is}, as in
  * {@code IsAlphabetic}. The supported binary properties by Pattern
  * are
  *