[ ... ] |
Match any character in the set. e.g. [aeiou] matches any lower-case vowel. A contiguous set can be defined using a dash between the starting and ending characters. e.g. [a-z] matches any lower-case letter. To include a dash (-) in a set, use it as the first or last character of the set. To include a closing bracket in a set, use it as the first character of the set. e.g. [][] will match either [ or ]. Note that special characters do not retain their special meanings inside a set. The exception is the backslash: \\, \^, \-, \[ and \] each match the escaped character inside a set. |
[^ ... ] |
Match any character not in the set. e.g. [^0-9] matches any non-digit. To include a literal caret (^) in a set, put it anywhere other than the first position of the set, or escape it (\^). |
[:class:] |
Match a character in the given class of characters. Valid classes are: alpha (any alphabetic character), alnum (any alphanumeric character), lower (any lower-case letter), upper (any upper-case letter), digit (any decimal digit 0-9), xdigit (any hexadecimal digit, 0-9, A-F, a-f), space (any whitespace character), blank (only a space or tab), print (any printable character), graph (any printable character except spaces), cntrl (any control character [ASCII 127 or <32]) or punct (any punctuation character). The class name is written inside a set, so [0-9] is equivalent to [[:digit:]]. |
[^:class:] |
Match any character not in the class. The caret negates only when it is the first character. |
( ... ) |
Group. The elements in the group are treated in order and can be repeated together. e.g. (ab)+ will match "ab" or "abab", but not "aba". A group will also store the text matched for use in back-references and in the array returned by the function, depending on flag value. |
(?i) |
Case-insensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-insensitive matching from that point on. |
(?-i) |
(default) Case-sensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-sensitive matching from that point on. |
(?i ... ) |
Case-insensitive group. Behaves just like a normal group, but performs case-insensitive matches within the group. |
(?-i ... ) |
Case-sensitive group. Behaves just like a normal group, but performs case-sensitive matches within the group. Primarily for use after the (?i) flag or inside a case-insensitive group. |
(?: ... ) |
Non-capturing group. Behaves just like a normal group, but the matched text is not recorded in the returned array and cannot be used for back-referencing. |
(?i: ... ) |
Case-insensitive non-capturing group. Behaves just like a non-capturing group, but performs case-insensitive matches within the group. |
(?-i: ... ) |
Case-sensitive non-capturing group. Behaves just like a non-capturing group, but performs case-sensitive matches within the group. |
(?m) |
Multiline flag. ^ and $ match at the start and end of each line within the data, not only at the start and end of the whole string. |
(?s) |
Single-line flag. "." matches any character, including newline. (By default "." does not match newline.) |
(?x) |
Extended flag. Whitespace in the pattern and comments starting with # are ignored. |
(?U) |
Invert the greediness of quantifiers: they find the smallest match by default and the largest match when followed by ?. |
. |
Match any single character (except newline). |
| |
Or. The expression on one side or the other can be matched. |
\ |
Escape a special character (have it match the actual character) or introduce a special character type (see below). |
\\ |
Match an actual backslash (\). |
\a |
Match an alarm character, that is, the BEL character (chr(7)). |
\A |
Match only at beginning of string. |
\b |
Matches at a word boundary. |
\B |
Matches when not at a word boundary. |
\c |
Match a control character, based on the next character. For example, \cM matches ctrl-M. |
\d |
Match any digit (0-9). |
\D |
Match any non-digit. |
\e |
Match an escape character (chr(27)). |
\E |
End case modification. Also marks the end of a sequence quoted with \Q. |
\f |
Match a formfeed character (chr(12)). |
\h |
Match any horizontal whitespace character. |
\H |
Match any character that is not a horizontal whitespace character. |
\n |
Match a linefeed (@LF, chr(10)). |
\Q |
Quote (disable) pattern metacharacters until \E. |
\r |
Match a carriage return (@CR, chr(13)). |
\s |
Match any whitespace character: Chr(9) through Chr(13) which are Horizontal Tab, Line Feed, Vertical Tab, Form Feed, and Carriage Return, and the standard space ( Chr(32) ). |
\S |
Match any non-whitespace character. |
\t |
Match a tab character (chr(9)). |
\v |
Match any vertical whitespace character. |
\V |
Match any character that is not a vertical whitespace character. |
\w |
Match any "word" character: a-z, A-Z, 0-9 or underscore (_). |
\W |
Match any non-word character. |
\### |
Match a back-reference or the ASCII character whose code is given in octal (up to 3 octal digits). If a prior group with the given number exists, this is a back-reference: it matches exactly the text that group matched. For example, ([[:alpha:]])\1 would match a double letter. |
\x## |
Match the ASCII character whose code is given in hexadecimal. Can be up to 2 digits. |
\z |
Match only at end of string. |
\Z |
Match only at end of string, or before newline at the end. |
{x} |
Repeat the previous character, set or group exactly x times. |
{x,} |
Repeat the previous character, set or group at least x times. |
{0,x} |
Repeat the previous character, set or group at most x times. |
{x,y} |
Repeat the previous character, set or group between x and y times, inclusive. |
* |
Repeat the previous character, set or group 0 or more times. Equivalent to {0,} |
+ |
Repeat the previous character, set or group 1 or more times. Equivalent to {1,} |
? |
The previous character, set or group may or may not appear. Equivalent to {0,1} |
? (after a repeating character) |
Find the smallest match instead of the largest. |
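
A few of the entries above can be tried out with a short sketch. It uses Python's `re` module purely as a stand-in, which is an assumption of this example and not the engine this table documents. Only syntax that behaves the same way in Python is used; several entries (for example the [:class:] names, \h, \Q ... \E and (?U)) are not available in Python's `re` and are left out. All variable names and sample strings are illustrative.

```python
import re

# Sets and negated sets: [aeiou] and [^0-9]
vowels = re.findall(r"[aeiou]", "regular")
non_digits = re.findall(r"[^0-9]+", "ab12cd")

# Group repetition: (ab)+ covers "abab" completely, but not "aba"
full_abab = re.fullmatch(r"(ab)+", "abab") is not None
full_aba = re.fullmatch(r"(ab)+", "aba") is not None

# Back-reference: \1 must repeat exactly what group 1 matched (a double letter)
double = re.search(r"([a-z])\1", "letter").group(0)

# Non-capturing group: (?:ab) groups for repetition but stores nothing
captured = re.search(r"(?:ab)+(c)", "ababc").groups()

# Case-insensitive non-capturing group: only "abc" ignores case, "def" does not
scoped_hit = re.fullmatch(r"(?i:abc)def", "ABCdef") is not None
scoped_miss = re.fullmatch(r"(?i:abc)def", "ABCDEF") is not None

# Greedy versus lazy: a trailing ? asks for the smallest match
greedy = re.search(r"<.+>", "<a><b>").group(0)
lazy = re.search(r"<.+?>", "<a><b>").group(0)

# (?m): ^ matches at the start of every line
line_starts = re.findall(r"(?m)^\w+", "one 1\ntwo 2")

# (?s): "." also matches a newline
dot_default = re.search(r"a.b", "a\nb") is not None
dot_all = re.search(r"(?s)a.b", "a\nb") is not None

# {x,y}: between x and y repetitions, inclusive
counted = re.findall(r"\b\d{2,3}\b", "1 22 333 4444")

print(vowels, non_digits, double, captured, greedy, lazy, line_starts, counted)
```

The greedy and lazy pair is the one most worth experimenting with: both patterns start matching at the same position, and only the amount of text consumed by the quantifier differs.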