Regex Cheatsheet
The regular expression engine starts as soon as it can, grabs as much as it can, then tries to finish as soon as it can, while taking the first decision available to it.
Anchors
char - | - usage |
^ | Start of string, or start of line in multi-line pattern |
\A | Start of string |
$ | End of string, or end of line in multi-line pattern |
\Z | End of string |
\b | Word boundary |
\B | Not word boundary |
\< | Start of word |
\> | End of word |
Character Classes
char - | - usage |
\c | Control character |
\s | White space (space or tab) |
\S | Not white space |
\d | Digit, same as [0-9] |
\D | Not digit, same as [^0-9] |
\w | Alphanumeric (letters, numbers, underscore) |
\W | Not alphanumeric |
\x | Hexade cimal digit |
\O | Octal digit |
POSIX Classes
char - | - usage |
[:upper:] | Upper case letters |
[:lower:] | Lower case letters |
[:alpha:] | All letters |
[:alnum:] | Digits and letters |
[:digit:] | Digits |
[:xdigit:] | Hexade cimal digits |
[:punct:] | Punctuation |
[:blank:] | Space and tab |
[:space:] | Blank characters |
[:cntrl:] | Control characters |
[:graph:] | Printed characters |
[:print:] | Printed characters and spaces |
[:word:] | Digits, letters and underscore |
Assertions
char - | - usage |
?= | Lookahead assertion |
?! | Negative lookahead |
?<= | Lookbehind assertion |
?!= | or ?<!-- Negative lookbehind |
?--> | Once-only Subexpression |
?() | Condition [if then] |
?()| | Condition [if then else] |
?# | Comment |
Quantifiers
char - | - usage |
* | 0 or more |
+ | 1 or more |
? | 0 or 1 |
{3} | Exactly 3 |
{3,} | 3 or more |
{,5} | at most 5 |
{3,5} | 3, 4 or 5 |
Tip | Add a ? to a quantifier to make it ungreedy. |
Escape Sequences
char - | - usage |
\ | Escape following character |
\Q | Begin literal sequence |
\E | End literal sequence |
Tip | Within a literal sequence, no need to escape Metacharacters |
Metacharacters in regex need to be escaped in order to let regex recognize and match them.
Common Metacharacters: ^ [ . $ { * ( \ + ) | ? < >
Special Characters
char - | - usage |
\n | New line |
\r | Carriage return |
\t | Tab |
\v | Vertical tab |
\f | Form feed |
\xxx | Octal character xxx |
\xhh | Hex character hh |
To match above special characters, need to escape the backslash like this \\n
in a regex expression.
Groups and Ranges
char - | - usage |
. | Any character except new line (\n) |
(a|b) | a or b |
(...) | Group |
(?:...) | Passive (non-capturing) group |
[abc] | Range (a or b or c) |
[^abc] | Not (a or b or c) |
[a-q] | Lower case letter from a to q |
[A-Q] | Upper case letter from A to Q |
[0-7] | Digit from 0 to 7 |
\x | Group/subpattern number "x" |
Tip | Ranges are inclusive |
Pattern Modifiers
char - | - usage |
g | Global match |
i * | Case-insensitive |
m * | Multiple lines |
s * | Treat string as single line |
x * | Allow comments and whitespace in pattern |
e * | Evaluate replacement |
U * | Ungreedy pattern |
Tip | starred (*) are Perl-compatible Regular Expressions (PCRE) modifiers |
| https://www.pcre.org/original/doc/html/index.html |
String Replacement
char - | - usage |
$n | nth non-passive group |
$2 | "xyz" in /^(abc(xyz))$/ |
$1 | "xyz" in /^(?:abc)(xyz)$/ |
$` | Before matched string |
$' | After matched string |
$+ | Last matched string |
$& | Entire matched string |
Tip | Some regex implem ent ations use '\' instead of '$' (i.e. regex used by Splunk) |