Inverse string matching is not currently supported. For example, to match all strings that do not contain hamsters, you cannot use: !(hamsters) You can, however, use inverse matching for specific character classes, such as: [^A] to match any string that contains any characters that are not the letter A. |
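
The difference between the two kinds of inversion can be sketched as follows. This is only an illustration using Python's `re` module, which is an assumption: the product's own engine may differ in details. It shows `[^A]` matching individual characters that are not `A`, and a whole-string inversion performed by the calling code rather than inside the expression. The helper name `lacks_substring` is hypothetical.

```python
import re

# [^A] matches one character at a time, each of which is not the letter A.
non_a = re.findall(r"[^A]", "ABBA-A")
print(non_a)  # ['B', 'B', '-']


def lacks_substring(text: str, word: str) -> bool:
    """Hypothetical helper: invert a match in calling code, not in the regex."""
    return re.search(re.escape(word), text) is None


print(lacks_substring("I like gerbils", "hamsters"))   # True
print(lacks_substring("I like hamsters", "hamsters"))  # False
```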
Notation | Function | Sample Matches |
Anything except *.|^$?+\(){}[] | Literal match, except if the character is part of a: • capture group • back-reference (e.g. $0 or \1) • other regular expression token (e.g. \w) | Text: My cat catches things. Regular expression: cat Matches: cat Depending on whether the feature looks for all instances, it may also match “cat” in the beginning of “catches”. |
\ | Escape character. If it is followed by: • An alphanumeric character, the alphanumeric character is not matched literally as usual. Instead, it is interpreted as a regular expression token. For example, \w matches a word character, as defined by the locale. • Any regular expression special character: *.|^$?+\(){}[]\ the special character is not interpreted as a regular expression token, and is instead treated as a literal character. For example, \\ matches: \ | Text: /url?parameter=value Regular expression: \?param Matches: ?param |
(?i) | Turns on case-insensitive matching for subsequent evaluation, until it is turned off or the evaluation completes. | Text: /url?Parameter=value Regular expression: (?i)param Matches: Param Would also match pArAM etc. |
\n | Matches a new line (also called a line feed). Microsoft Windows platforms typically use \r\n at the end of each line. Linux, Unix, and Mac OS X platforms typically use \n. Classic Mac OS (version 9 and earlier) used \r. | Text: My cat catches things. Regular expression: \n Matches: The end of the text on Linux, Mac OS X, and other Unix-like platforms, only part of the line ending on Windows, and nothing on classic Mac OS. |
\r | Matches a carriage return. | Text: My cat catches things. Regular expression: \r Matches: Part of the line ending on Windows, nothing on Linux, Unix, or Mac OS X, and the whole line ending on classic Mac OS (version 9 and earlier). |
\s | Matches a space, non-breaking space, tab, line ending, or other white space character. Tip: Many languages do not separate words with white space. Even in languages that usually use a white space separator, words can be separated with many other characters such as: \/-”’"“‘.,><—:; and new lines. In these cases, you should usually include those in addition to \s in a match set ( [] ) or may need to use \b (word boundary) instead. | Text: <a href=‘http://www.example.com’> Regular expression: www\.example\.com\s Matches: Nothing. The host name is followed by the final ’, which creates a word boundary but is not white space, so \s does not match. The regular expression should be: www\.example\.com\b |
\S | Matches a character that is not white space, such as A or 9. | Text: My cat catches things. Regular expression: \S Matches: Mycatcatchesthings. |
\d | Matches a decimal digit such as 9. | Text: /url?parameterA=value1 Regular expression: \d Matches: 1 |
\D | Matches a character that is not a digit, such as A or b or É. | |
\w | Matches a single word character: any one character from the set [a-zA-Z0-9_]. To match a whole word (an uninterrupted run of one or more word characters between two word boundaries such as a space, new line, or :), use \w+ instead. It does not match Unicode characters that are equivalent, such as 三, ٣ or 光. | Text: Yahoo! Regular expression: \w Matches: Yahoo (each letter is a separate match; use \w+ to match the whole word at once) Does not match the terminal exclamation point, which is not a word character. |
\W | Matches a character that is not a word character. | Text: Sell?!?~ Regular expression: \W Matches: ?!?~ |
. | Matches any single character except \r or \n. Note: If the character is written by combining two Unicode code points, such as à where the core letter is encoded separately from the accent mark, this will not match the entire character: it will only match one of the code points. | Text: My cat catches things. Regular expression: c.t Matches: cat cat |
+ | Repeatedly matches the previous character or capture group, 1 or more times. By default it matches as many times as possible (also called “greedy” matching); if followed by a question mark ( ? ), it instead matches as few times as possible (also called “lazy” matching). Does not match if there is not at least 1 instance. | Text: www.example.com Regular expression: w+ Matches: www Would also match “w”, “ww”, “wwww”, or any number of uninterrupted repetitions of the character “w”. |
* | Repeatedly matches the previous character or capture group, 0 or more times. Depending on its combination with other special characters, this token could be either: • * — Match as many times as possible (also called “greedy” matching). • *? — Match as few times as possible (also called “lazy” matching). | Text: www.example.com Regular expression: .* Matches: www.example.com All of any text, except line endings (\r and \n). |
Text: www.example.com Regular expression: (w)*?\.example\.com Matches: www.example.com The lazy quantifier repeats “w” only as many times as needed for the rest of the expression to match. Would also match common typos where the “w” was repeated too few or too many times, such as “w” in w.example.com or “wwww” in wwww.example.com. It would still match, however, if there were no “w” at all. | ||
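? except when followed by = | Makes the preceding character or capture group optional (it matches 0 or 1 times). When it directly follows another quantifier such as * or +, it instead makes that quantifier match as few times as possible (also called “lazy” matching). | Text: www.example.com Regular expression: (www\.)?example\.com Matches: www.example.com Would also match example.com. |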
?= | Looks ahead to see whether the next character or capture group matches, and evaluates the match based upon it, but does not include those next characters in the returned match string (if any). This can be useful for back-references where you do not want to include permutations of the final few characters, such as matching “cat” when it is part of “cats” but not when it is part of “catch”. | Text: /url?parameter=valuepack Regular expression: p(?=arameter) Matches: p, but only in “parameter”, not in “pack”, where it is not followed by “arameter”. |
() | Creates a capture group or sub-pattern for back-reference or to denote order of operations. See also “Example: Inserting & deleting body text” and “What are back-references?”. | Text: /url/app/app/mapp Regular expression: (/app)* Matches: /app/app |
Text: /url?paramA=valueA&paramB=valueB Regular expression: (param)A=(value)A&\0B=\1B Matches: paramA=valueA&paramB=valueB | ||
| | Matches either the character/capture group before or after the pipe ( | ). | Text: Host: www.example.com Regular expression: (\r\n)|\n|\r Matches: The line ending, regardless of platform. |
^ | Matches either: • the position of the beginning of the string (or, in multiline mode, the beginning of each line), not the first character itself • the inverse of a character, but only if ^ is the first character in a character class, such as [^A] This is useful if you want to match a word, but only when it occurs at the start of the line, or when you want to match anything that is not a specific character. | Text: /url?parameter=value Regular expression: ^/url Matches: /url, but only if it is at the beginning of the path string. It will not match “/url” in subdirectories. |
Text: /url?parameter=value Regular expression: [^u] Matches: /rl?parameter=vale | ||
$ | Matches the position of the end of the string (or, in multiline mode, the end of each line), not the last character itself. | |
[] | Defines a set of characters that are acceptable matches. To define a set via a whole range instead of listing every possible match, separate the first and last character in the range with a hyphen. Note: Character ranges are matched according to their numerical code point in the encoding. For example, [@-B] matches any UTF-8 code points from 0x40 to 0x42 (hexadecimal) inclusive: @AB | Text: /url?parameter=value1 Regular expression: [012] Matches: 1 Would also match 0 or 2. |
Text: /url?parameter=valueB Regular expression: [A-C] Matches: B Would also match “A” or “C”. It would not match “b”. | ||
{} | Quantifies the number of times the previous character or capture group may be repeated continuously. To define a varying number of repetitions, separate the minimum and maximum with a comma. | Text: 1234567890 Regular expression: \d{3} Matches: 123 |
Text: www.example.com Regular expression: w{1,4} Matches: www If the string contained a typo such as “ww” or “wwww”, it would also match that. |
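
Several of the rows above can be checked with a short script. The following is a minimal sketch using Python's `re` module, which is an assumption: it is not the product's engine, and some details differ (for example, Python's `\w` is Unicode-aware by default for text strings). The sample texts are taken from the table; the variable names are illustrative only.

```python
import re

# Greedy vs. lazy: + takes as many "w" as possible, +? as few as possible.
greedy = re.search(r"w+", "www.example.com").group()
lazy = re.search(r"w+?", "www.example.com").group()
print(greedy, lazy)  # www w

# Optional group: the "www." prefix may be present or absent.
optional = [bool(re.fullmatch(r"(www\.)?example\.com", host))
            for host in ("www.example.com", "example.com")]
print(optional)  # [True, True]

# Look-ahead: match "p" only where "arameter" follows; the "p" in "pack" is skipped.
lookahead = [m.start() for m in re.finditer(r"p(?=arameter)", "/url?parameter=valuepack")]
print(lookahead)  # [5]

# \w is a single word character; \w+ is an uninterrupted run of them.
word_chars = re.findall(r"\w", "Yahoo!")
words = re.findall(r"\w+", "Yahoo!")
print(word_chars, words)  # ['Y', 'a', 'h', 'o', 'o'] ['Yahoo']

# ^ anchors at the start of the string, or at each line in multiline mode.
single = re.findall(r"^/url", "/url\n/url")
multi = re.findall(r"^/url", "/url\n/url", flags=re.MULTILINE)
print(len(single), len(multi))  # 1 2

# Bounded repetition, and an alternation that covers every line-ending style.
three_digits = re.search(r"\d{3}", "1234567890").group()
endings = re.findall(r"(?:\r\n)|\n|\r", "a\r\nb\nc\r")
print(three_digits, endings)  # 123 ['\r\n', '\n', '\r']
```

In the last expression, `\r\n` is listed first so that a Windows line ending is consumed as one match instead of two.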