| Char | Name | Description | Input | Result | Reduced |
|---|---|---|---|---|---|
| . | dot | This is the wildcard character. It matches any single character except line break characters (\r and \n). | |||
| * | asterisk | An asterisk requires that the preceding character appear zero or more times. When matching, the asterisk will be greedy, including as many characters as possible. For example, for the string “a word here, a word there,” the pattern "a.*word" will match “a word here, a word.” In order to make a minimal match (just “a word”), use the question mark character (explained below). | |||
| + | plus | This character requires that the preceding character appears one or more times. When matching, the plus will be greedy (just like the asterisk, described above) unless you use the question mark character (explained below). | |||
| ? | question | This character makes the preceding character optional. If placed after a plus or an asterisk, it instead dictates that the match for this preceding symbol will be a minimal match, including as few characters as possible. | |||
| ^ | caret | The caret matches the start of the string. This does not include any characters—it considers merely the position itself. | |||
| $ | dollar | A dollar character matches the end of the string. This does not include any characters—it considers merely the position itself. | |||
| \| | pipe | The pipe causes the regular expression to match either the pattern on the left of the pipe or the pattern on the right. | | | |
| () | parens | Round brackets define a group of characters that must occur together, to which you can then apply a modifier like *, +, or ? by placing it after the closing bracket. You can also refer to a bracketed portion of a regular expression later to obtain the portion of the string that it matched. | |||
| [] | brackets | Square brackets define a character class. A character class matches one character out of those listed within the square brackets. A character class can include an explicit list of characters (for instance, [aqz], which is the same as (a\|q\|z)), or a range of characters (such as [a-z], which is the same as (a\|b\|c\|…\|z)). A character class can also be defined so that it matches one character that’s not listed in the brackets. To do this, simply insert a caret (^) after the opening square bracket (so [^a] will match any single character except “a”). | | | |
| \ | backslash | If you want to use one of these special characters as a literal character to be matched by the regular expression pattern, escape it by placing a backslash (\) before it (for example, 1\+1=2 will match “1+1=2”). | |||
| \n | LF | This sequence matches a newline character. | |||
| \r | CR | This matches a carriage return character. | |||
| \t | tab | This matches a tab character. | |||
| \s | | This sequence matches any whitespace character; equivalent to [ \n\r\t]. | | | |
| \S | | This matches any non-whitespace character; equivalent to [^ \n\r\t]. | | | |
| \d | | This matches any digit; equivalent to [0-9]. | | | |
| \D | | This sequence matches anything but a digit; equivalent to [^0-9]. | | | |
| \w | | This matches any “word” character; equivalent to [a-zA-Z0-9_]. | | | |
| \W | | This sequence matches any “non-word” character; equivalent to [^a-zA-Z0-9_]. | | | |
| \b | | This code is a little special because it doesn’t actually match a character. Instead, it matches a word boundary: the start or end of a word. | | | |
| \B | | Like \b, this doesn’t actually match a character. Rather, it matches a position in the string that is not a word boundary. | | | |
| \\ | | This matches an actual backslash character. So if you want to match the string “\n” exactly, your regular expression would be \\n, not \n (which matches a newline character). Similarly, if you wanted to match the string “\\” exactly, your regular expression would be \\\\. | | | |
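
Several of the rows above can be tried out in a few lines of code. The sketch below uses Python's standard `re` module purely as an illustration; the table does not name a regex engine, so that choice, the sample strings, and the variable names are all assumptions, and other engines may differ in details such as what `.` and `\s` match.

```python
import re

text = "a word here, a word there"

# Greedy: .* takes as many characters as possible before the last "word".
greedy = re.search(r"a.*word", text).group()

# Minimal: .*? takes as few characters as possible, stopping at the first "word".
lazy = re.search(r"a.*?word", text).group()

# Character class and negated character class.
vowels = re.findall(r"[aeiou]", "regex")
digits_only = re.sub(r"[^0-9]", "", "tel: 555-0199")

# Escaping a special character so it is matched literally.
literal_plus = re.search(r"1\+1=2", "so 1+1=2 holds") is not None

# Word boundaries: "cat" matches only as a whole word, not inside "concat".
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

# Anchors plus an optional character.
spellings = [bool(re.search(r"^colou?r$", s)) for s in ("color", "colour", "colouur")]

# \\n matches a backslash followed by "n", not a newline character.
backslash_n = re.search(r"\\n", r"literal \n here") is not None
real_newline = re.search(r"\\n", "line\nbreak") is not None

print(greedy)        # a word here, a word
print(lazy)          # a word
print(vowels)        # ['e', 'e']
print(digits_only)   # 5550199
print(whole_words)   # ['cat', 'cat']
print(spellings)     # [True, True, False]
```

The patterns are written as raw strings (`r"..."`) so that Python passes each backslash through to the regex engine unchanged; without the `r` prefix, every backslash in a pattern would need to be doubled again.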