.NET Regular Expression Reference
Favourite Regexes
Email Address
^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$
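As a rough illustration of how this pattern behaves, here is a minimal sketch in Python rather than .NET. Python's `re` module rejects `[\w-\.]` as a bad character range, so the sketch assumes an adapted class `[\w.-]`, which is intended to accept the same characters. The helper name `looks_like_email` and the sample addresses are invented for the example.

```python
import re

# Adapted from the pattern above: Python's `re` rejects "[\w-\.]" as a bad
# character range, so the hyphen is moved to the end of the class.
EMAIL_PATTERN = re.compile(
    r"^([\w.-]+)@"
    r"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
)


def looks_like_email(text):
    """Return True if text matches the adapted pattern (hypothetical helper)."""
    return EMAIL_PATTERN.match(text) is not None


# Invented sample inputs, mapped to whether the pattern accepts them.
checks = {
    candidate: looks_like_email(candidate)
    for candidate in (
        "jane.doe@example.com",   # ordinary domain
        "user@[192.168.0.1]",     # bracketed IP literal
        "no-at-sign",             # rejected: no @
        "user@example.museum",    # rejected: final label longer than 4 letters
    )
}
```

The last sample shows a limit of the pattern: the final label is capped at `{2,4}` letters, so longer top-level domains are rejected.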
Log4Net Match Thread Id
^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[10\].*?$
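The pattern above hard-codes thread id 10. A minimal sketch of using it to filter a log, written in Python as a stand-in for .NET, might look like the following. The helper `lines_for_thread` and the sample log lines are invented, and the log layout (timestamp, then thread id in square brackets) is an assumption taken from the pattern itself. `re.MULTILINE` plays the role of `RegexOptions.Multiline` so that `^` and `$` apply to each line.

```python
import re


def lines_for_thread(log_text, thread_id):
    """Return every log line written by the given thread (hypothetical helper)."""
    pattern = re.compile(
        r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \["
        + re.escape(str(thread_id))
        + r"\].*?$",
        re.MULTILINE,  # let ^ and $ match at every line, not just the whole string
    )
    return pattern.findall(log_text)


# Invented sample log in the layout the pattern expects.
SAMPLE_LOG = (
    "2017-01-01 20:05:01,123 [10] INFO  Worker - started\n"
    "2017-01-01 20:05:01,456 [7] DEBUG Worker - polling\n"
    "2017-01-01 20:05:02,789 [10] ERROR Worker - failed\n"
)

thread_10_lines = lines_for_thread(SAMPLE_LOG, 10)
thread_7_lines = lines_for_thread(SAMPLE_LOG, 7)
thread_1_lines = lines_for_thread(SAMPLE_LOG, 1)  # "[1]" must not match "[10]"
```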
Characters that Match Location in Strings
Character | Description |
---|---|
^ | Specifies that the match must begin at the first character of the string. With RegexOptions.Multiline, ^ also matches at the first character of any line. |
$ | Specifies that the match must end at the last character of the string or at the last character before \n at the end of the string. With RegexOptions.Multiline, $ also matches at the end of any line. |
\A | Specifies that the match must begin at the first character of the string (and ignores multiple lines). |
\Z | Specifies that the match must end at either the last character of the string or the last character before \n at the end of the string (and ignores multiple lines). |
\z | Specifies that the match must end at the last character of the string (and ignores multiple lines). |
\G | Specifies that the match must occur at the point where the previous match ended. When used with Match.NextMatch, this arrangement ensures that matches are all contiguous. |
\b | Specifies that the match must occur on a boundary between \w (alphanumeric) and \W (non-alphanumeric) characters. The match must occur on word boundaries, which are the first or last characters in words separated by any non-alphanumeric characters. |
\B | Specifies that the match must not occur on a \b boundary. |
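The anchors in this table can be tried out with a short sketch. It uses Python's `re` as a stand-in, where `re.MULTILINE` corresponds to `RegexOptions.Multiline`. `\Z` and `\G` are left out because Python's `re` treats `\Z` differently from .NET and has no `\G`. The sample strings are invented.

```python
import re

TEXT = "first line\nsecond line\n"

# ^ matches only at the start of the string unless multiline mode is on.
starts_default = re.findall(r"^\w+", TEXT)
starts_multiline = re.findall(r"^\w+", TEXT, re.MULTILINE)

# \A ignores multiline mode and still matches only at the very start.
starts_absolute = re.findall(r"\A\w+", TEXT, re.MULTILINE)

# $ matches before the final \n; in multiline mode it matches at every line end.
ends_default = re.findall(r"\w+$", TEXT)
ends_multiline = re.findall(r"\w+$", TEXT, re.MULTILINE)

# \b requires a word boundary; \B requires that there is none.
WORDS = "line inline lines"
whole_word_hits = re.findall(r"\bline\b", WORDS)
inside_word_starts = [m.start() for m in re.finditer(r"\Bline", WORDS)]
```

Here `whole_word_hits` contains only the standalone "line", while `\Bline` finds the occurrence embedded in "inline".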
Character Escapes Used in Regular Expressions
Character | Description |
---|---|
\a | Matches a bell (alarm) \u0007. |
\b | In a regular expression, \b denotes a word boundary (between \w and \W characters) except within a [] character class, where \b refers to the backspace character. In a replacement pattern, \b always denotes a backspace. |
\t | Matches a tab \u0009. |
\r | Matches a carriage return \u000D. |
\v | Matches a vertical tab \u000B. |
\f | Matches a form feed \u000C. |
\n | Matches a new line \u000A. |
\e | Matches an escape \u001B. |
\040 | Matches an ASCII character as octal (up to three digits); numbers with no leading zero are backreferences if they have only one digit or if they correspond to a capturing group number. For example, the character \040 represents a space. |
\x20 | Matches an ASCII character using hexadecimal representation (exactly two digits). |
\cC | Matches an ASCII control character—for example, \cC is control-C. |
\u0020 | Matches a Unicode character using hexadecimal representation (exactly four digits). |
\ | When followed by a character that is not recognised as an escaped character, matches that character. For example, \* represents an asterisk (rather than matching repeating characters), and \\ represents a single backslash. |
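A few of these escapes can be checked with a short sketch, again using Python's `re` as a stand-in for .NET. Python does not accept `\e` or `\cC`, so those two rows are not covered. The sample strings are invented.

```python
import re

# Octal, hexadecimal and Unicode escapes can all denote the same character.
space_spellings = [r"\040", r"\x20", r"\u0020"]
all_match_space = all(
    re.fullmatch(pattern, " ") is not None for pattern in space_spellings
)

# A backslash turns a metacharacter into a literal.
literal_stars = re.findall(r"\*", "2*3*4")
literal_backslashes = re.findall(r"\\", r"C:\temp\logs")

# \b is a word boundary outside a character class...
boundary_hits = re.findall(r"\bcat\b", "cat concat cat")
# ...and a backspace (\u0008) inside one.
backspace_hits = re.findall(r"[\b]", "a\x08b")

# \t matches a tab character.
tab_fields = re.split(r"\t", "name\tvalue")
```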
Wildcard and Character Ranges Used in Regular Expressions
Character | Description |
---|---|
* | Matches the preceding character or subexpression zero or more times. For example, “zo*” matches “z” and “zoo”. The “*” character is equivalent to “{0,}”. |
+ | Matches the preceding character or subexpression one or more times. For example, “zo+” matches “zo” and “zoo”, but not “z”. The “+” character is equivalent to “{1,}”. |
? | Matches the preceding character or subexpression zero or one time. For example, “do(es)?” matches “do” or “does”. The ? character is equivalent to “{0,1}”. |
{n} | The n is a non-negative integer. Matches exactly n times. For example, “o{2}” does not match the “o” in “Bob” but does match the two “o”s in “food”. |
{n,} | The n is a non-negative integer. Matches at least n times. For example, “o{2,}” does not match the “o” in “Bob” and does match all the “o”s in “foooood”. The sequence “o{1,}” is equivalent to “o+”. The sequence “o{0,}” is equivalent to “o*”. |
{n,m} | The m and n are non-negative integers, where “n ≤ m”. Matches at least n and at most m times. For example, “o{1,3}” matches the first three “o”s in “fooooood”. “o{0,1}” is equivalent to “o?”. Note that you cannot put a space between the comma and the numbers. |
? | When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. For example, in the string “oooo”, “o+?” matches a single “o”, whereas “o+” matches all “o”s. |
. | Matches any single character except “\n”. To match any character including the “\n”, use a pattern such as “[\s\S]” or specify RegexOptions.Singleline. |
x|y | Matches either x or y. For example, “z|food” matches “z” or “food”. “(z|f)ood” matches “zood” or “food”. |
[xyz] | A character set. Matches any one of the enclosed characters. For example, “[abc]” matches the “a” in “plain”. |
[a-z] | A range of characters. Matches any character in the specified range. For example, “[a-z]” matches any lowercase alphabetic character in the range “a” through “z”. |
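The examples in this table can be reproduced with a short sketch. It uses Python's `re` as a stand-in; for these constructs its behaviour is expected to agree with .NET. The variable names are invented.

```python
import re


def found(pattern, text):
    """Return the first match of pattern in text, or None (hypothetical helper)."""
    match = re.search(pattern, text)
    return match.group() if match else None


results = {
    # * allows zero repetitions, + needs at least one.
    "zo* on z": found(r"zo*", "z"),
    "zo+ on z": found(r"zo+", "z"),
    "zo+ on zoo": found(r"zo+", "zoo"),
    # Counted repetition.
    "o{2} on Bob": found(r"o{2}", "Bob"),
    "o{2} on food": found(r"o{2}", "food"),
    "o{2,} on foooood": found(r"o{2,}", "foooood"),
    "o{1,3} on fooooood": found(r"o{1,3}", "fooooood"),
    # Greedy versus non-greedy.
    "o+ on oooo": found(r"o+", "oooo"),
    "o+? on oooo": found(r"o+?", "oooo"),
    # Alternation with and without grouping.
    "z|food on zood": re.fullmatch(r"z|food", "zood") is not None,
    "(z|f)ood on zood": re.fullmatch(r"(z|f)ood", "zood") is not None,
    # . stops at \n, [\s\S] does not.
    "a.b across newline": found(r"a.b", "a\nb"),
    "a[\\s\\S]b across newline": found(r"a[\s\S]b", "a\nb"),
    # Character set.
    "[abc] on plain": found(r"[abc]", "plain"),
}
```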
Characters Used in Regular Expressions
Character | Description |
---|---|
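\d | Matches a digit character. Equivalent to “[0-9]” when RegexOptions.ECMAScript is specified; otherwise .NET also matches other Unicode decimal digits. |
\D | Matches a non-digit character. Equivalent to “[^0-9]” when RegexOptions.ECMAScript is specified. |
\s | Matches any white-space character, including space, tab, and form feed. Equivalent to “[ \f\n\r\t\v]” when RegexOptions.ECMAScript is specified; otherwise .NET also matches other Unicode white-space characters. |
\S | Matches any non-white-space character. Equivalent to “[^ \f\n\r\t\v]” when RegexOptions.ECMAScript is specified. |
\w | Matches any word character, including underscore. Equivalent to “[A-Za-z0-9_]” when RegexOptions.ECMAScript is specified; otherwise .NET also matches other Unicode letters and digits. |
\W | Matches any non-word character. Equivalent to “[^A-Za-z0-9_]” when RegexOptions.ECMAScript is specified. |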
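A minimal sketch of these classes, using Python's `re` as a stand-in for .NET. The `re.ASCII` flag is assumed here to play roughly the role of an ASCII-only mode, so the classes behave like the bracketed equivalents in the table. The sample string is invented.

```python
import re

SAMPLE = "Order_42 shipped\t2017-01-01"

digits = re.findall(r"\d+", SAMPLE, re.ASCII)
words = re.findall(r"\w+", SAMPLE, re.ASCII)
whitespace = re.findall(r"\s", SAMPLE, re.ASCII)
non_word = re.findall(r"\W", SAMPLE, re.ASCII)
non_digit_runs = re.findall(r"\D+", "a1bc22d", re.ASCII)

# Without the ASCII restriction, \d also accepts non-ASCII decimal digits,
# for example ARABIC-INDIC DIGIT THREE (U+0663).
ARABIC_THREE = "\u0663"
unicode_digit_matches = re.fullmatch(r"\d", ARABIC_THREE) is not None
ascii_digit_matches = re.fullmatch(r"\d", ARABIC_THREE, re.ASCII) is not None
```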
regular_expression_reference.txt · Last modified: 2017/01/01 20:05 by 127.0.0.1